Weeks have passed since Apple's announcement of the Mac Pro, and while we wanted to conclude our look at it much earlier, like many Mac Pro users we ran into some serious performance issues under Windows XP.

With the Mac Pro performance issues resolved and some more time with the system under our belts, we're able to bring you the final part in our Mac Pro coverage. This time we're focusing on upgrading the memory and CPUs in the Mac Pro, as well as looking at its performance as a PC running Windows XP.

As a high end Xeon based machine that can run both Mac OS X and Windows XP, the Mac Pro has the potential to be the power user's dream. Today our task is to find out just how upgradable this machine is, how well it runs XP, and whether it can truly be your only system if you're both a Mac and PC user.

FBD Revisited

Thus far the only real downside we've seen to the Mac Pro is its use of Fully Buffered DIMMs (FBD). As we mentioned in our initial article discussing the Mac Pro's specifications, the FBD spec calls for a serial interface between the memory controller and the memory modules, while allowing the chips on the memory modules themselves to be regular mainstream DDR2 devices. An FBD memory controller talks to an AMB (Advanced Memory Buffer) on each memory module, which acts as a translation hub and buffer for all communication between the DDR2 devices on the module and the memory controller.

The major benefit of FBD is the ability to feature more memory modules per channel (up to 8 per channel), offering greater capacity for high end servers and workstations than even registered DDR2. The downside to FBD is that there is significant overhead and latency introduced by using a packetized interface and using the AMBs to translate from one interface technology to another (FBD to DDR2).

As we showed in our previous articles, the number and configuration of FB-DIMMs in your Mac Pro can affect performance. The Intel 5000X chipset in the Mac Pro features two 144-bit FBD branches, each being the width of two FB-DIMMs (effectively giving the chipset four "channels"). You therefore need at least two FB-DIMMs in the system (the width of a single FBD branch), but ideally you'd want at least four, one per channel, to have a hope of attaining peak bandwidth.
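To make the branch and channel arithmetic concrete, here's a minimal Python sketch. The two 144-bit branches come from the chipset description above; the 72-bit module width is simply what you get when a 144-bit branch is the width of two FB-DIMMs. The function names are our own for illustration, and the even spread of modules across channels is an assumption, not a statement about how any particular machine populates its slots.

```python
# Figures taken from the description of the Intel 5000X above.
BRANCHES = 2              # two FBD branches
BRANCH_WIDTH_BITS = 144   # each branch is 144 bits wide
DIMM_WIDTH_BITS = 72      # derived: a branch is the width of two FB-DIMMs


def channels_per_branch(branch_width=BRANCH_WIDTH_BITS,
                        dimm_width=DIMM_WIDTH_BITS):
    """How many FB-DIMM-wide channels fit in one branch."""
    return branch_width // dimm_width


def total_channels(branches=BRANCHES):
    """Total number of channels across all branches."""
    return branches * channels_per_branch()


def dimms_per_channel(dimm_count):
    """Average modules per channel, assuming an even spread (our assumption)."""
    return dimm_count / total_channels()


if __name__ == "__main__":
    print("Channels per branch:", channels_per_branch())
    print("Total channels:", total_channels())
    for count in (2, 4, 8):
        print(count, "FB-DIMMs ->", dimms_per_channel(count), "per channel")
```

With two modules only half of the channels have a module on them, four modules give one per channel, and eight give two per channel.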

As some of our readers (and Intel) pointed out, the story doesn't end at needing four FB-DIMMs. The rank of the FB-DIMMs can impact performance as well, and ideally each of your FB-DIMMs would be a dual rank module. The rank of a DIMM is determined by dividing the combined width of all of the devices on the module by the width of the module itself. For example, a single rank FB-DIMM would have 9 DDR2 devices, each being 8 bits wide. A dual rank FB-DIMM would be composed of 18 DDR2 devices, each still being 8 bits wide. All of our 512MB FB-DIMMs are single rank modules, while our 1GB and 2GB modules are dual rank.
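The rank calculation is simple enough to sketch in a few lines of Python. The 72-bit module width here is inferred from the single rank example (9 devices at 8 bits each), and `rank_count` is a hypothetical helper name used only for illustration.

```python
MODULE_WIDTH_BITS = 72  # inferred from the single rank example: 9 devices x 8 bits


def rank_count(device_count, device_width_bits,
               module_width_bits=MODULE_WIDTH_BITS):
    """Ranks = combined width of all devices / width of the module."""
    total_width = device_count * device_width_bits
    if total_width % module_width_bits != 0:
        # The devices should fill a whole number of ranks.
        raise ValueError("device widths do not add up to whole ranks")
    return total_width // module_width_bits


if __name__ == "__main__":
    print("9 x8 devices: ", rank_count(9, 8), "rank(s)")
    print("18 x8 devices:", rank_count(18, 8), "rank(s)")
```

Running it on the two examples from the text gives one rank for the 9-device module and two ranks for the 18-device module.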

The story doesn't end with rank though. Because of the dedicated read and write lanes between the memory controller and the AMBs on FB-DIMMs, you can be reading from one FB-DIMM while writing to another. So in theory, if you're running an application (or combination of applications) that has a lot of concurrent reads and writes going on, you could stand to benefit from having more than one FB-DIMM per channel.

Based on all of the above information, it would seem like your best bet is to stick as many dual rank FB-DIMMs as you can afford in your system, and if that were the case then we'd be able to move on from here. Unfortunately it's not, because as we mentioned in previous articles, the more FB-DIMMs you have in your system, the higher the access latencies to those additional FB-DIMMs will be.

What we then end up with is a tradeoff between more bandwidth and higher latency, so which route do you take? We've done a lot of testing and most of our tests seem to favor the configuration with four dual rank FB-DIMMs, but the right number and configuration of modules really depends on your particular needs. We're still testing to figure out what the tangible real world performance differences are between the multitude of memory configurations. For now, just know that if you need maximum bandwidth you'll want 8 dual rank FB-DIMMs, while if you want lower latency you'll want fewer modules. Whether or not you'll see a performance difference will depend mostly on the application(s) you're running.