Multi-GPU SLI/CF Scaling: Lynnfield's Blemish

When running in single-GPU mode, the on-die PCIe controller maintains a full x16 connection to your graphics card:


Hooray.

In multi-GPU mode, the 16 lanes have to be split in two:

To support this the motherboard maker needs to put down ~$3 worth of PCIe switches:

Now SLI and Crossfire can work, although the motherboard maker also needs to pay NVIDIA a few dollars to legally make SLI work.

The question is whether you give up any performance going with Lynnfield's 2 x8 implementation vs. Bloomfield/X58's 2 x16 PCIe configuration. In short: at the high end, yes.
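To put the x16 vs. x8 split in numbers, here is a rough sketch of the theoretical link bandwidth. It assumes PCIe 2.0 signalling (5 GT/s per lane with 8b/10b encoding), which this section does not state explicitly. The function name is purely illustrative, and real-world throughput will land below these peak figures.

```python
# Assumption: PCIe 2.0 signalling, 5 GT/s per lane with 8b/10b encoding.
TRANSFERS_PER_SECOND = 5.0e9      # raw bits per second on one lane, one direction
ENCODING_EFFICIENCY = 8 / 10      # 8 payload bits for every 10 bits on the wire


def link_bandwidth_gb_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCIe 2.0 link, in GB/s."""
    payload_bits_per_second = lanes * TRANSFERS_PER_SECOND * ENCODING_EFFICIENCY
    return payload_bits_per_second / 8 / 1e9  # bits -> bytes -> gigabytes


for lanes in (16, 8):
    print(f"x{lanes}: {link_bandwidth_gb_s(lanes):.1f} GB/s per direction")
```

Under those assumptions each GPU on a 2 x8 board has half the peak link bandwidth of a GPU on a 2 x16 board. Whether that matters depends on how much data the game actually pushes across the bus.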

I looked at scaling in two games that scaled the best with multiple GPUs: Crysis Warhead and FarCry 2. I ran all settings at their max, resolution at 2560 x 1600 but with no AA.

I included two multi-GPU configurations. A pair of GeForce GTX 275s from EVGA for NVIDIA:


A coupla GPUs and a few cores can go a long way

And to really stress things, I looked at two Radeon HD 4870 X2s from Sapphire. Note that each card has two GPUs so this is actually a 4-GPU configuration, enough to really stress a PCIe x8 interface.

First, the dual-GPU results from NVIDIA.

| NVIDIA GeForce GTX 275 | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) | FarCry 2 Playback Demo Action |
|---|---|---|---|---|
| Intel Core i7 975 (X58) - 1 GPU | 20.8 fps | 23.0 fps | 21.4 fps | 41.0 fps |
| Intel Core i7 870 (P55) - 1 GPU | 20.8 fps | 22.9 fps | 21.5 fps | 40.5 fps |
| Intel Core i7 975 (X58) - 2 GPUs | 38.4 fps | 42.3 fps | 38.0 fps | 73.2 fps |
| Intel Core i7 870 (P55) - 2 GPUs | 38.0 fps | 41.9 fps | 37.4 fps | 65.9 fps |

The important data is in the next table. What you're looking at here is the % speedup from one to two GPUs on X58 vs. P55. In theory, X58 should have higher percentages because each GPU gets 16 PCIe lanes while Lynnfield only provides 8 per GPU.

| GTX 275 -> GTX 275 SLI Scaling | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) | FarCry 2 Playback Demo Action |
|---|---|---|---|---|
| Intel Core i7 975 (X58) | 84.6% | 83.9% | 77.6% | 78.5% |
| Intel Core i7 870 (P55) | 82.7% | 83.0% | 74.0% | 62.7% |

For the most part, the X58 platform was only a couple of percentage points better in scaling. That changes with the Far Cry 2 results, where X58 manages 78.5% scaling while P55 delivers only 62.7%. It's clearly not the most common case, but it can happen. If you're going to be building a high-end dual-GPU setup, X58 is probably worth it.
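For anyone who wants to check the math, the scaling percentages are just the two-GPU frame rate divided by the one-GPU frame rate, minus one. Here is a minimal sketch using the GTX 275 frame rates from the table above; the variable and function names are my own.

```python
# Frame rates (fps) from the GTX 275 results table.
# Column order: ambush, avalanche, frost, Far Cry 2.
fps = {
    "X58": {"1gpu": [20.8, 23.0, 21.4, 41.0], "2gpu": [38.4, 42.3, 38.0, 73.2]},
    "P55": {"1gpu": [20.8, 22.9, 21.5, 40.5], "2gpu": [38.0, 41.9, 37.4, 65.9]},
}


def scaling_percent(single: float, multi: float) -> float:
    """Percent speedup going from the single-GPU to the multi-GPU result."""
    return (multi / single - 1.0) * 100.0


scaling = {
    platform: [
        round(scaling_percent(single, multi), 1)
        for single, multi in zip(results["1gpu"], results["2gpu"])
    ]
    for platform, results in fps.items()
}

for platform, values in scaling.items():
    print(platform, " ".join(f"{v:.1f}%" for v in values))
```

Perfect scaling would be 100%, so anything in the 80s means the second card is pulling most of its weight.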

Next, the quad-GPU results from AMD:

| AMD Radeon HD 4870 X2 | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) | FarCry 2 Playback Demo Action |
|---|---|---|---|---|
| Intel Core i7 975 (X58) - 2 GPUs | 25.8 fps | 31.3 fps | 27.0 fps | 70.9 fps |
| Intel Core i7 870 (P55) - 2 GPUs | 24.4 fps | 31.1 fps | 26.6 fps | 71.4 fps |
| Intel Core i7 975 (X58) - 4 GPUs | 27.0 fps | 57.4 fps | 47.9 fps | 117.9 fps |
| Intel Core i7 870 (P55) - 4 GPUs | 24.2 fps | 50.0 fps | 36.5 fps | 116 fps |

Again, what we really care about is the scaling. Note how single-card (2-GPU) performance is nearly identical between Bloomfield and Lynnfield, but quad-GPU performance is noticeably lower on Lynnfield. This isn't going to be good:

| 4870 X2 -> 4870 X2 CF Scaling | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) | FarCry 2 Playback Demo Action |
|---|---|---|---|---|
| Intel Core i7 975 (X58) | 4.7% | 83.4% | 77.4% | 66.3% |
| Intel Core i7 870 (P55) | -1.0% | 60.8% | 37.2% | 62.5% |

Ouch. Maybe Lynnfield is human after all. Almost across the board, the quad-GPU results significantly favor X58. It makes sense given how data hungry these GPUs are. Again, the conclusion here is that for a high-end multi-GPU setup you'll want to go with X58/Bloomfield.
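Scaling is one way to look at it. Another is to ask how much raw frame rate P55 gives up to X58 with all four GPUs running. Here is a quick sketch using the 4-GPU rows from the Radeon HD 4870 X2 table above; the names are mine and the figures are simple ratios of the published frame rates.

```python
# 4-GPU frame rates (fps) from the Radeon HD 4870 X2 results table.
benchmarks = ["ambush", "avalanche", "frost", "Far Cry 2"]
x58_4gpu = [27.0, 57.4, 47.9, 117.9]
p55_4gpu = [24.2, 50.0, 36.5, 116.0]


def deficit_percent(reference: float, other: float) -> float:
    """How far `other` falls short of `reference`, as a percent of `reference`."""
    return (1.0 - other / reference) * 100.0


deficits = {
    name: deficit_percent(x58, p55)
    for name, x58, p55 in zip(benchmarks, x58_4gpu, p55_4gpu)
}

for name, value in deficits.items():
    print(f"{name}: P55 trails X58 by {value:.1f}%")
```

The gap ranges from barely measurable in Far Cry 2 to nearly a quarter of the frame rate in the frost benchmark, which is why the verdict is "almost" across the board.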

A Quick Look at GPU Limited Gaming

With all of our CPU reviews we try to strike a balance between CPU and GPU limited game tests in order to show which CPU is truly faster at running game code. In fact all of our CPU tests are designed to figure out which CPUs are best at a number of tasks.

However, the vast majority of games today will be limited by whatever graphics card you have in your system. The performance differences we talked about earlier will all but disappear in these scenarios. Allow me to present data from Crysis Warhead running at 2560 x 1600 with maximum quality settings:

| NVIDIA GeForce GTX 275 | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) |
|---|---|---|---|
| Intel Core i7 975 | 20.8 fps | 23.0 fps | 21.4 fps |
| Intel Core i7 870 | 20.8 fps | 22.9 fps | 21.5 fps |
| AMD Phenom II X4 965 BE | 20.9 fps | 23.0 fps | 21.5 fps |

They're all the same. This shouldn't come as a surprise to anyone; it's always been the case. Any CPU near the high end, when faced with the same GPU bottleneck, will perform the same in game.
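To put a number on "all the same", here is a small sketch that measures the spread between the fastest and slowest CPU in each benchmark from the table above. The names are illustrative, and treating a sub-1% spread as noise is my own rule of thumb rather than a formal threshold.

```python
# Frame rates (fps) from the GPU-limited Crysis Warhead table.
# Column order: ambush, avalanche, frost.
benchmarks = ["ambush", "avalanche", "frost"]
fps = {
    "Core i7 975": [20.8, 23.0, 21.4],
    "Core i7 870": [20.8, 22.9, 21.5],
    "Phenom II X4 965 BE": [20.9, 23.0, 21.5],
}

spread_percent = {}
for index, name in enumerate(benchmarks):
    results = [cpu_results[index] for cpu_results in fps.values()]
    slowest, fastest = min(results), max(results)
    # Gap between fastest and slowest CPU, relative to the slowest.
    spread_percent[name] = (fastest - slowest) / slowest * 100.0

for name, value in spread_percent.items():
    print(f"{name}: {value:.2f}% between fastest and slowest CPU")
```

Every benchmark comes in under half a percent, which is a 0.1 fps difference at these frame rates.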

Now that doesn't mean you should ignore performance data and buy a slower CPU. You always want to purchase the best performing CPU you can at any given price point. That ensures you're left with the best performance possible regardless of the CPU/GPU balance in your applications and games.

The Test

Motherboard:        Intel DP55KG (Intel P55)
                    Intel DX58SO (Intel X58)
                    Intel DX48BT2 (Intel X48)
                    Gigabyte GA-MA790FXT-UD5P (790FX)
Chipset:            Intel X48
                    Intel X58
                    Intel P55
                    AMD 790FX
Chipset Drivers:    Intel 9.1.1.1015 (Intel)
                    AMD Catalyst 9.8
Hard Disk:          Intel X25-M SSD (80GB)
Memory:             Qimonda DDR3-1066 4 x 1GB (7-7-7-20)
                    Corsair DDR3-1333 4 x 1GB (7-7-7-20)
                    Patriot Viper DDR3-1333 2 x 2GB (7-7-7-20)
Video Card:         eVGA GeForce GTX 280
Video Drivers:      NVIDIA ForceWare 190.62 (Win764)
                    NVIDIA ForceWare 180.43 (Vista64)
                    NVIDIA ForceWare 178.24 (Vista32)
Desktop Resolution: 1920 x 1200
OS:                 Windows Vista Ultimate 32-bit (for SYSMark)
                    Windows Vista Ultimate 64-bit
                    Windows 7 64-bit

Turbo mode is enabled for the P55 and X58 platforms.


343 Comments


  • jonup - Tuesday, September 8, 2009

    Unfortunately, people in the corporate world do not distinguish between an HD4500 and a GX790. As long as the Intel part can display spreadsheets, it's good enough (or better) than a GTX295/HD4890X2, because it is Intel. You can change ignorance when it works.
  • PassingBy - Tuesday, September 8, 2009

    My horizons are broad enough, thank you. The needs of many corporate desktops/laptops will be met by Clarkdale/Arrandale and no, nobody will go blind or suffer eyestrain (by virtue of the IGP anyway).
  • PassingBy - Tuesday, September 8, 2009

    No edit function, so, as I point out later in the thread, people reading this review presumably won't be interested in IGPs anyway, given that these processors now have no IGP market. Wait for Clarkdale before trying to compare IGPs.
  • dragunover - Tuesday, September 8, 2009

    Thanks for the review, if not as soon as I wanted it!
  • Boobs McGee - Tuesday, September 8, 2009

    Do you guys have plans to do a motherboard review roundup for P55?
    If not, please do.
  • Gary Key - Tuesday, September 8, 2009

    I actually have three roundups planned, we have 15 boards here ranging from the $100 uATX items up to the $300 EVGA Classified series. We are only testing with retail products, released BIOS', and retail processors so the delivery of more than 70% of the boards late last week has created a small logjam. ;)
    The first article should be up on Thursday with a couple of my favorite boards and then a rather large one up on Monday and the last one a few days after that. Raja is working on a separate roundup of the top three boards targeted for the more extreme OC community. We will also have a P55 memory specific article shortly.
  • ClagMaster - Tuesday, September 8, 2009

    Looking forward to reading these P55 motherboard roundups.
  • Anand Lal Shimpi - Tuesday, September 8, 2009

    Yes, Gary is nearly complete with his. Give him another day and it'll be up :)

    Take care,
    Anand
  • Comdrpopnfresh - Tuesday, September 8, 2009

    By creating a new socket- they're providing a disincentive for early adopters of bloomfield. This chip is literally a humpty-dumpty that stands to benefit intel with everyone suffering a small loss of their own. The benefits of lynnfield vs bloomfield come from shuffling the architectural deck of nehalem. In reality, it only shows the possibilities of an inflexible architecture.

    The turbo mode isn't cutting it in day-to-day power consumption reduction. On the scale of a day, the average shmoe who is ass enough to leave a computer on for no reason gains no benefit. Lower the reach of a voltage plane, and reduce the number of components sucking juice, that only present benefits under certain situations (a third memory channel), and shmoe is happier.

    If it was in the article, I apologize, but with the pci-e controller being on the un-core... what happens on a chipset with integrated graphics? Will the igp be linked to the processor now, rather than a bridge chip? If ati or nvidia made their own supporting chipsets with an igp- would the igp represent a chip onto itself, solely connected to the cpu, or would it have to work through dmi, and leave those on-die pci-e lanes for domestic usage?

    It seems this is the warning rattle to nvidia that they chose their place with ion, and are stuck in it. When the change to 32nm comes, and the gpu is integrated into the cpu- what kind of robust 3rd party chipsets could exist in the budget end? Sure, you can always add a dedicated, off-die, gpu... but for budget boards used to eons of making room for a cpu and working a bridge chip around an igp- either horrible inefficiencies will creep up, or higher prices.
    My money is on westmere having at least three power planes.
    I'd like to know: with the pci-e controller on-die now, what impact does this have on graphics cards with more on-card memory? Does it strengthen or minimize it?
    And can the cpu now share the gpu's memory as a way to extend cache, after years of being forced to share the system pool? That 16gb/s link to gddr5 looks mouthwatering. I'd like to see performance tests run with the pci-e variant ssds floating around out there saddled to the on-die pci-e lanes, and a graphics card running off of the chipset. Rather than elevating a horse-power driven graphics subsystem, I think the benefits of supplying more 'torque' by freeing mass storage ssds from the SATA interface would be far more substantial, and in all applications of the PC. You already have the means for nearly 2+2/3 times the theoretical bandwidth of SATA-6, which up until now seems rather bug-ridden and defunct.

    Also interested in the outcomes of usb3 with this, as usb is built on the foundations of pci-e, is it not? If usb3 can allow for pci-e externally, and you remove the latency issue of usb signaling traveling from some peripheral bridge chip to the cpu, and just jack the usb3 communications into the cpu... could one use usb3 as a computer-to-computer pseudo qpi teaming/networking bridge for inter-desktop cpu communication? Skip the entire bottleneck of client-level software implementation, and the subsystem communication buses for out-of-box signaling too...
  • plague911 - Tuesday, September 8, 2009


    The market just got a little more crowded, so hopefully this will bring a reduction in prices of the 920. But..

    “The Core i7 870 gets close enough to the Core i7 975 that I'm having a hard time justifying the LGA-1366 platform at all. As I see it, LGA-1366 has a few advantages:
    1) High-end multi-GPU Performance
    2) Stock Voltage Overclocking
    3) Future support for 6-core Gulftown CPUs”

    You're exactly right. 1366, I think, is going to be the best option to “future proof” my system; however, the new chips make the 920 seem a little low on features. With the goal of “performance on a budget” I feel like we are stuck either getting a board with a socket which won't compete in the future, or a chip which is weaker than its lower class cousins. Unfortunately I don't see any of this being fixed in the next few cycles. I'd like to see a low clocked Gulftown (to save cost), feature rich with good OC potential, that's on the lower end of the price scale. To me this would be a good follow up to the 920, but it doesn't seem like that will be coming out for several cycles. Unless of course I'm missing something, which is probably the case.
