nForce 4 SLI X16 Unleashed
We certainly evaluate and showcase our fair share of ASUS products here at HotHardware, and it’s no great mystery why: ASUS is typically one of the first out of the blocks with a new chipset technology. The company’s new A8N32 SLI Deluxe motherboard is no exception. NVIDIA’s new nForce 4 SLI X16 chipset has been out the door at NVIDIA for months now, but retail boards have yet to appear on the scene. We’ve heard ABIT and others may have boards on the horizon based on this new high end NVIDIA chipset, but the A8N32 SLI Deluxe was the first to hit our test bench, and it looks like it will be the first to hit retail sometime this week. NVIDIA has also launched the Intel version of the nForce 4 SLI X16, and we’re working with that board, dubbed the P5N32 SLI Deluxe, right now.
The new NVIDIA nForce 4 SLI X16 chipset takes the same basic feature set of the hugely successful nForce 4 SLI and expands upon its PCI Express lane configuration, offering two full X16 graphics slots for dual graphics SLI configurations. This adds another 16 lanes to the chipset, as well as 16 additional 2.5G PCI Express SerDes (serializer/deserializer) interfaces, which in turn adds to chip real estate and pin count. As a result, in this incarnation, the nForce 4 SLI X16 has become a discrete two chip solution for north and southbridge functionality. We’ll give you more on the chipset later; for now let’s run down what the ASUS A8N32 SLI Deluxe has to offer.
We’ve also grown accustomed to ASUS delivering nicely appointed bundles with its “Deluxe” and “AI Life” branded boards, and the A8N32 SLI Deluxe’s setup is pretty much cut from the same cloth. There is an abundance of SATA data and power cables to enable various configurations, as well as various USB, Game Port, Firewire, and serial IO back plate modules. ASUS bundles the WinDVD suite with the board along with the drivers and utilities disk. In addition, you get an SLI bridge connector, but it looks like ASUS has had to go back to the flexible ribbon cable version with the A8N32 SLI Deluxe. This isn’t quite as robust, in terms of staying fastened to the cards, as the rigid PCB version. Unfortunately, there wasn’t much of an option here if you wanted to populate either of the two PCI slots in between the board’s two PCI Express graphics slots. That brings us to another small shortcoming of the A8N32 SLI Deluxe, which we’ll discuss shortly.
In this version of the chipset, NVIDIA expanded PCI Express connectivity to a total of 32 lanes specifically dedicated to graphics for full support of two X16 PCI Express Graphics cards in SLI mode. No longer must the cards split a single X16 connection into a pair of X8 electrical connections in each slot. Now each PEG slot is fully connected with an X16 link to the root complex, in this case what is pictured as the “SPP” or System Platform Processor in the block diagram above. This is what can be classically thought of as the northbridge of the chipset, while the NVIDIA “MCP” or Media Communications Processor pictured above handles the I/O and Storage connectivity, for support of Serial ATA drives also in RAID 0, 1, 0+1 and 5 configurations, as well as USB2.0, Firewire, Gig E, and NVIDIA ActiveArmor Secure Networking and Firewall features. Below is a table that details all of the various incarnations of the entire nForce 4 family, for your reference.
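To put the X8 versus X16 difference in numbers, here is a rough back-of-the-envelope sketch. It assumes first-generation PCI Express signalling (the 2.5G SerDes mentioned earlier) with 8b/10b line encoding, and it ignores packet and protocol overhead, so treat the figures as theoretical ceilings rather than measured throughput. The function name is ours, purely for illustration.

```python
# Rough per-direction bandwidth of a first-generation PCI Express link.
# Assumption: 2.5 GT/s per lane with 8b/10b encoding, no protocol overhead.

LANE_RATE_GTPS = 2.5        # raw signalling rate per lane, gigatransfers/s
LINE_BITS_PER_BYTE = 10     # 8b/10b: every data byte costs 10 bits on the wire


def link_bandwidth_mb_s(lanes: int) -> float:
    """Theoretical peak bandwidth in MB/s, one direction only."""
    per_lane_mb_s = LANE_RATE_GTPS * 1000 / LINE_BITS_PER_BYTE
    return lanes * per_lane_mb_s


for width in (8, 16):
    print(f"x{width}: {link_bandwidth_mb_s(width):.0f} MB/s per direction")
```

Under these assumptions, each slot's ceiling doubles when it moves from an X8 to an X16 electrical connection.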
The A8N32 SLI Deluxe board itself is an impressive assembly of technology to behold. Chipset cooling on the board is a totally passive solution, employing the company’s heat pipe technology along with its “Stack Cool” fanless design to cool both the SPP and MCP as well as the CPU power array. ASUS also offers “optional” cooling fans that clip on top of the power array heatsink fin assembly, but this is not a requirement for normal operation or even overclocking conditions with air cooling for the CPU. Rather, ASUS recommends the use of these fans in extreme cases where watercooling or some other air free method of cooling, such as Vapor Phase Change, is used.
However, the location of two of the board’s standard PCI slots makes it a very tight squeeze for anyone who actually runs this board with a Dual Graphics SLI setup. And isn’t that the whole intent of the product, with its extra X16 PCIe connection? If you would actually like to use those two PCI slots for expansion, say a TV Tuner card or something of that nature, air flow between the PCIe Graphics slots will be very restricted. In addition, with any dual slot cooled graphics card (ASUS’ own N7800 GTX Top for example), you’ll be limited to a single half height PCI card in between the PEG slots, and the end PCI slot on the board is completely obstructed as well. We were left wondering what physical layout limitation caused ASUS to drop not one but two PCI slots in between the PEG slots, when perhaps they could have been placed at the end of the line where they could be more easily accessed and utilized.
… or “CPU Parameter Recall”, in the event you set up a non bootable configuration. If it is detected that the CPU locked up, timings will be set back to default parameters on the next boot attempt. This feature is also accompanied by a large selection of DRAM and CPU voltage settings, in 0.125V increments. Standard 1MHz CPU timings are present, as well as a full assortment of multiplier options for both CPU and DRAM.
There is, however, a unique BIOS menu option called “DDR DRAM Skew,” which is driven from an on board PLL from ICS that allows programmable advance or delay of the DRAM clock skew in 150ps (picosecond) increments. Frankly, we’re not sure exactly how useful this feature is, in that DDR DRAM timing has very tight skew and jitter characteristics that must be met to ensure stable performance at any speed. However, perhaps this setting, when trying to stabilize things in extreme overclocking scenarios, will have some sort of normalizing effect on the clock signal. We played a bit with this setting and weren’t definitively able to determine if it bought us any more DRAM timing margin while overclocking. Still, more control is a good thing, and we plan on exploring this feature a bit more in the weeks ahead.
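For a sense of scale, the sketch below compares a 150ps skew step with the timing window of DDR400 memory. It assumes a 200MHz memory clock and treats half a clock period as one data bit time, since DDR transfers on both clock edges. The step counts in the loop are hypothetical; we are not claiming the BIOS exposes exactly this range.

```python
# How large is one 150 ps skew step relative to a DDR400 data bit time?
# Assumptions: 200 MHz memory clock; the step counts below are illustrative.

DRAM_CLOCK_MHZ = 200    # DDR400 runs a 200 MHz clock
SKEW_STEP_PS = 150      # granularity of the "DDR DRAM Skew" option


def clock_period_ps(freq_mhz: float) -> float:
    """Clock period in picoseconds for a frequency given in MHz."""
    return 1_000_000 / freq_mhz


def skew_fraction_of_bit_time(steps: int, freq_mhz: float = DRAM_CLOCK_MHZ) -> float:
    """Skew offset as a fraction of one data bit time (half a clock period)."""
    bit_time_ps = clock_period_ps(freq_mhz) / 2
    return steps * SKEW_STEP_PS / bit_time_ps


for steps in (-2, -1, 0, 1, 2):
    offset_ps = steps * SKEW_STEP_PS
    print(f"{offset_ps:+5d} ps -> {skew_fraction_of_bit_time(steps):+.1%} of a bit time")
```

At these settings a single step moves the clock by a few percent of a bit time, which is small but not negligible against tight DDR timing budgets.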
You’ll notice in the screen shot above that our motherboard temp, which is typically a reading taken from the PWM circuit on the motherboard, is showing a somewhat toasty 44°C. Although this was taken with an open air bench setup, it also was a reading we got while overclocking our Athlon 64 FX 57 processor to over 3GHz. Regardless, with the nature of the fanless heat pipe design of this motherboard, we do recommend a well ventilated case.
Of course, our well ventilated HotHardware test bench is a perfect environment for overclocking, so we dialed up the ever capable BIOS menu options of the A8N32 SLI Deluxe and, like others before it in its lineage, the board was as smooth as silk under pressure.
How we configured our test systems: When configuring the test systems for this showcase, we first entered the system BIOS and set each board to its “Optimized” or “High Performance Defaults.” We then manually configured our system RAM to run at 200MHz (DDR400), with the timings set to CL 2,2,2,5 settings. The hard drive was then formatted and Windows XP Professional (SP2) was installed. Then we installed all of the necessary drivers and removed Windows Messenger from the system altogether. Auto Updating and System Restore were also disabled, and we set up a 768MB permanent page file on the same partition as the Windows installation. Lastly, we set Windows XP’s Visual Effects to “best performance,” installed all of our benchmarking software, defragged the hard drive, and ran all of the tests. Our World Bench 5 tests were done on a completely clean install as the very first batch of each test we ran.
We began our testing with SiSoftware’s SANDRA, the System ANalyzer, Diagnostic and Reporting Assistant. SANDRA consists of a set of information and diagnostic utilities that can provide a host of useful information about your hardware and operating system. We ran three of the built in subsystem tests that partially comprise the SANDRA 2005 suite (CPU, Multimedia, and Memory). All of these tests were run with our A64 FX 53 processor set to its default clock speed of 2.4GHz (12x200MHz).
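The default clock quoted above is simply the multiplier times the reference clock. The short sketch below reproduces that arithmetic; the higher reference clocks in the loop are hypothetical values chosen for illustration, not settings we tested.

```python
# CPU core clock = multiplier x reference clock.
# The 12 x 200 MHz default comes from the text; other values are hypothetical.


def cpu_clock_mhz(multiplier: float, reference_mhz: float) -> float:
    """Core clock in MHz for a given multiplier and reference clock."""
    return multiplier * reference_mhz


STOCK_MHZ = cpu_clock_mhz(12, 200)
print(f"Default: {STOCK_MHZ:.0f} MHz")

for reference in (210, 220, 230):  # hypothetical reference clocks
    clock = cpu_clock_mhz(12, reference)
    gain = (clock / STOCK_MHZ - 1) * 100
    print(f"12 x {reference} MHz = {clock:.0f} MHz ({gain:+.1f}% vs. default)")
```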
Sandra is more of a quick sanity check for us than anything else. The results shown here are pretty much as expected for the processor we chose, as well. What’s perhaps somewhat interesting and impressive are the memory scores we took at DDR400 CAS 2,2,2,5 settings with the A8N32 SLI Deluxe in excess of 6000MB/sec. This is right in line with what we’ve seen on the A8N SLI Premium historically, as well. Moving along to the Sandra hard disk benchmark, we see a strong 40MB/sec driven from the WD Raptor drive, which again is right on top of what we’ve seen in previous versions of this board with the standard nForce 4 SLI single chip solution. It seems at least as far as this base level test goes, the extra latency of having discrete north and southbridge components with the new nForce 4 SLI X16 chipset has no immediately apparent effect on performance. However, there is still a lot more testing to be done so let’s move on.
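To put the memory score in context, here is a quick sketch of the theoretical peak for DDR400. It assumes a dual channel configuration with a 64-bit bus per channel, which is our assumption about the platform rather than something measured here, and it uses 6000MB/sec as a floor because the score was reported only as being in excess of that figure.

```python
# Theoretical peak bandwidth for DDR400 versus the reported SANDRA result.
# Assumptions: dual channel, 64-bit bus per channel; 6000 MB/s is a lower bound.


def peak_bandwidth_mb_s(transfers_mt_s: int, bus_bits: int = 64, channels: int = 2) -> int:
    """Peak bandwidth in MB/s: transfers per second x bytes per transfer x channels."""
    return transfers_mt_s * (bus_bits // 8) * channels


peak = peak_bandwidth_mb_s(400)   # DDR400 = 400 million transfers per second
measured_floor = 6000             # "in excess of 6000MB/sec"
efficiency = measured_floor / peak

print(f"Theoretical peak: {peak} MB/s")
print(f"Measured result is at least {efficiency:.1%} of peak")
```

Under those assumptions the board is running close to the theoretical limit of the memory, which is consistent with the discrete two chip design not adding a visible penalty in this test.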
“The Memory test suite is a collection of tests that isolate the performance of the memory subsystem. The memory subsystem consists of various devices on the PC. This includes the main memory, the CPU internal cache (known as the L1 cache) and the external cache (known as the L2 cache). As it is difficult to find applications that only stress the memory; we explicitly developed a set of tests geared for this purpose. The tests are written in C++ and assembly. They include: Reading data blocks from memory, writing data blocks to memory, performing copy operations on data blocks, random access to data items, and latency testing.” Courtesy FutureMark Corp.
Interestingly enough, in both the Photoshop 7 and Office XP SP2 WorldBench 5 tests, we see the new A8N32 SLI Deluxe fall behind the rest of the pack by a small margin, 1 to 2% to be exact. While this certainly doesn’t constitute a significant, end user perceivable variance, it does speak somewhat to what could be the slightly higher latency characteristics of NVIDIA’s discrete MCP chip in the new nForce 4 SLI X16 chipset versus the totally integrated single chip approach in the standard nForce 4 SLI solution. Is it anything to quibble over? Absolutely not. Again, a difference of 5 to 7 seconds overall, given the number of tests these benchmarks run, is nearly within the margin of error for any given test run.
3DMark05’s built in CPU test is a DirectX game engine performance metric that’s useful for comparing relative performance between similarly equipped systems. This test consists of two different 3D scenes that are generated with a software renderer, which is dependent on the host CPU’s performance. This means that the calculations normally reserved for your 3D accelerator are instead sent to the central host processor. The number of frames generated per second in each test is used to determine the final score based on a weighted average.
With 3DMark05’s CPU test, the ASUS A8N32 SLI Deluxe took the lead position by a more comfortable margin. It’s important to note that this was achieved with a single GeForce 6800 GT installed, so the board’s additional X16 graphics slot isn’t affecting the results in any way. What we’re looking at here is a 5% differential between the fastest score, turned out by the A8N32 SLI Deluxe, and the slowest score, offered by the A8N SLI Deluxe. Amongst the three nForce 4 SLI classic boards it’s a virtual dead heat, but the A8N32 SLI Deluxe seems to have slightly better overall bandwidth in this test that mainly highlights CPU polygon throughput.
In this test, the tables turn in favor of the ASUS A8N32 SLI Deluxe, but again, the performance delta is so marginal that it’s really not worth considering. So when looking back at things here, the moral of the story is that as far as standard usage models in general CPU and Memory bandwidth measurements go, the new A8N32 SLI Deluxe and its nForce4 SLI X16 two chip set offers every bit of performance that the legacy single chip solution does today. So let’s look at a couple of areas where the board may perhaps have more of a competitive advantage.
Well now, we do see a distinct pattern here with our high quality Doom 3 tests. Without AA or Aniso Filtering turned on, the ASUS A8N32 SLI Deluxe, with its dual X16 PEG slots, takes a more significant 9% lead over the MSI K8N Neo4 Platinum SLI at 1280x1024 and approximately 4% at 1600x1200. Enable AA and AF, and the two boards are right on top of each other, with the A8N32 SLI Deluxe scoring a few extra frames at 1280 res. So it seems that the more GPU bound we are, the less of a lead the new nForce 4 SLI X16 chipset has, but as frame rates increase, dual X16 PCI Express graphics has its advantages. We asked NVIDIA about the results we were seeing for a more complete explanation, and this is what we were told.
From Nick Stam, Director, Tech Marketing NVIDIA:
“You may see better scaling at 1x/1x (AA /AF) on x16 vs x8 SLI due to a few reasons, in descending order of most impact.
1) The higher frame rates obtained at 1x/1x (or even at lower resolution) actually requires more data to be exchanged between the two GPU cards across PCIe per unit time (more frames per second means the cards must exchange data more frequently). Resolution variations actually don’t have as much impact across PCIe as the increased frame rates do in the GPU-to-GPU exchanges over PCIe. While much data does exchange through the SLI bridge connector, there is still some data that goes across PCIe between GPUs.
Between 1280 and 1600 res, we see very little variance in performance overall between the two motherboard solutions we tested. The ASUS A8N32 SLI Deluxe does, however, have a slight advantage consistent with what we saw in our Doom 3 tests, without Anti Aliasing or Anisotropic Filtering enabled. Again, this is due largely to the fact that there is more inter GPU communication going on at these higher frame rates. With AA and AF turned on, however, the scores are completely leveled.
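The frame rate argument can be illustrated with a toy model. Everything numeric here is an assumption for illustration only: the per-frame transfer size is a made-up placeholder, not an NVIDIA figure or something we measured, and the link budgets are theoretical first-generation PCI Express ceilings of 2000 and 4000MB/sec for X8 and X16 links. The point is only that, for a fixed amount of data per frame, traffic grows linearly with frame rate and eats a larger share of the narrower link.

```python
# Toy model: inter-GPU PCIe traffic grows linearly with frame rate.
# All numbers are hypothetical placeholders, not measured or vendor figures.

X8_BUDGET_MB_S = 2000.0            # assumed theoretical ceiling of an x8 link
X16_BUDGET_MB_S = 4000.0           # assumed theoretical ceiling of an x16 link
HYPOTHETICAL_MB_PER_FRAME = 10.0   # made-up amount of data exchanged per frame


def pcie_traffic_mb_s(fps: float, mb_per_frame: float = HYPOTHETICAL_MB_PER_FRAME) -> float:
    """Traffic per second when a fixed amount of data is exchanged every frame."""
    return fps * mb_per_frame


def link_utilisation(fps: float, budget_mb_s: float) -> float:
    """Share of a link's theoretical budget consumed at a given frame rate."""
    return pcie_traffic_mb_s(fps) / budget_mb_s


for fps in (40, 80, 120):
    x8 = link_utilisation(fps, X8_BUDGET_MB_S)
    x16 = link_utilisation(fps, X16_BUDGET_MB_S)
    print(f"{fps:3d} fps: x8 link {x8:.0%} busy, x16 link {x16:.0%} busy")
```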
Next we’ll plug in a pair of GeForce 7800 GTX cards and turn on SLI AA to see if we can invoke even more GPU to GPU traffic over PCI Express, with the blending that is required in this type of AA rendering technique.
Article Updated 11/11/05 BIOS Incompatibility Found
When our article first launched on 10/18/05, we reported our findings based on the hardware and related driver versions we had in hand at the time. Unfortunately, the Asus A8N SLI Deluxe motherboard we tested in comparison to the A8N32 SLI Deluxe needed a new BIOS update to take advantage of NVIDIA’s recently released 6.82 nForce chipset driver. This driver version was released in support of the nForce 4 SLI X16 chipset, and we came to find out that the legacy A8N SLI Deluxe had performance issues with this new driver. Asus sent an updated BIOS for the A8N SLI Deluxe to us on 11/2, and we had to rerun our numbers in order to confirm our initial performance findings between the two motherboards. Here is what Asus reported to us with this BIOS release.
“A8N SLI Deluxe 1015.006 > Support the NVIDIA 81.82 graphics driver and nForce 6.82 driver to increase X16 performance.”
At the time, we were testing with the 1013 version of the A8N SLI Deluxe BIOS. Our first pass benchmark findings can still be seen here, but clearly they are erroneous because the A8N SLI Deluxe’s performance was hampered by this chipset/BIOS incompatibility. The following are our updated SLI AA benchmarks with Far Cry.