The Radeon HD 7970 Reprise: PCIe Bandwidth, Overclocking, & The State Of Anti-Aliasing
by Ryan Smith on January 27, 2012 4:30 PM EST
With the release of AMD’s Radeon HD 7970 it’s clear that AMD has regained the single-GPU performance crown. But while the 7970’s place in the current GPU hierarchy is well established, we’re still trying to better understand the ins and outs of AMD’s new Graphics Core Next architecture. What does it perform well at, and where is it weak? How might GCN scale with future GPUs?
Next week we’ll be taking a look at CrossFire performance and the performance of AMD’s first driver update. But in the meantime we wanted to examine a few other facets of the 7970: the impact of PCIe bandwidth on performance, overclocking our reference 7970 (and the performance impact thereof), and what AMD is doing for anti-aliasing with the surprise addition of SSAA for DX10+ along with an interesting technical demo implementing MSAA and complex lighting side-by-side. So let’s get started.
PCIe Bandwidth: When Do You Have Enough?
With the release of PCIe 3 we wanted to take a look at what impact the additional bandwidth would have. Historically new PCIe revisions have come out well ahead of hardware that truly needs the bandwidth, and with the 7970 and PCIe 3 this once again appears to be the case. In our original 7970 review we saw that there were a small number of existing computational applications that could immediately benefit from the greater bandwidth, but what about gaming? We sat down with our benchmark suite and ran it at a number of different PCIe bandwidths in order to find an answer.
PCIe Bandwidth Comparison (Each Direction)
Lanes | PCIe 1.x | PCIe 2.x | PCIe 3.0
x1 | 250MB/sec | 500MB/sec | 1GB/sec
x2 | 500MB/sec | 1GB/sec | 2GB/sec
x4 | 1GB/sec | 2GB/sec | 4GB/sec
x8 | 2GB/sec | 4GB/sec | 8GB/sec
x16 | 4GB/sec | 8GB/sec | 16GB/sec
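The figures in this table can be roughly derived from each generation's per-lane signalling rate and line encoding. The sketch below is illustrative rather than part of our testing: it assumes the commonly quoted rates (2.5, 5, and 8 GT/s) and encodings (8b/10b for PCIe 1.x and 2.x, 128b/130b for PCIe 3.0), ignores packet and protocol overhead, and uses a made-up function name. Note that by this math PCIe 3.0 lands slightly under the rounded 1GB/sec per lane shown above.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency per PCIe generation.
# These are the commonly quoted figures; protocol overhead is deliberately ignored.
PCIE_GENERATIONS = {
    "1.x": (2.5, 8 / 10),     # 8b/10b encoding
    "2.x": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def pcie_bandwidth_mb_per_sec(generation: str, lanes: int) -> float:
    """Approximate usable bandwidth in MB/sec, each direction (hypothetical helper)."""
    transfer_rate_gt, efficiency = PCIE_GENERATIONS[generation]
    # GT/s * efficiency = usable Gbit/s per lane; / 8 = GB/s; * 1000 = MB/s.
    per_lane_mb = transfer_rate_gt * efficiency / 8 * 1000
    return per_lane_mb * lanes


if __name__ == "__main__":
    print("Lanes | " + " | ".join(f"PCIe {g}" for g in PCIE_GENERATIONS))
    for lanes in (1, 2, 4, 8, 16):
        cells = [f"{pcie_bandwidth_mb_per_sec(g, lanes):.0f}MB/sec"
                 for g in PCIE_GENERATIONS]
        print(f"x{lanes} | " + " | ".join(cells))
```

The 128b/130b encoding is why PCIe 3.0 can double the effective bandwidth of PCIe 2.x without doubling the signalling rate.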
For any given game the amount of data sent per frame is largely constant regardless of resolution, so we’ve opted to test everything at 1680x1050. At the higher framerates this resolution offers on our 7970, this should generate more PCIe traffic than higher, more GPU limited resolutions, and make the impact of different amounts of PCIe bandwidth more obvious.
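To make that reasoning concrete: if the data sent per frame is roughly constant, then bus traffic scales with framerate. The sketch below is purely illustrative. The per-frame payload and the framerates are hypothetical placeholders, not measurements from our testing, and the function name is our own.

```python
def pcie_traffic_mb_per_sec(mb_per_frame: float, frames_per_sec: float) -> float:
    """Bus traffic implied by a constant per-frame payload (hypothetical model)."""
    return mb_per_frame * frames_per_sec


if __name__ == "__main__":
    # HYPOTHETICAL numbers chosen only to illustrate the scaling.
    mb_per_frame = 20.0
    for label, fps in (("lower resolution", 120.0), ("higher resolution", 60.0)):
        traffic = pcie_traffic_mb_per_sec(mb_per_frame, fps)
        print(f"{label}: {fps:.0f}fps -> {traffic:.0f}MB/sec")
```

Under this simple model, halving the framerate halves the bus traffic, which is why a lower, less GPU-limited resolution is the more demanding case for the PCIe link.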
At the high end the results are not surprising. In our informal testing ahead of the 7970 launch we didn’t see any differences between PCIe 2 and PCIe 3 worth noting, and our formal testing backs this up. Under gaming there is absolutely no appreciable difference in performance between PCIe 3 x16 (16GB/sec) and PCIe 2 x16 (8GB/sec). Nor was there any difference between PCIe 3 x8 (8GB/sec) and the other aforementioned bandwidth configurations.
Going forward, for Ivy Bridge owners this will be good news. Even with only 16 PCIe 3 lanes available from the CPU, there should be no performance penalty from utilizing x8 configurations in order to enable CrossFire or other uses that would rob a 7970 of 8 lanes. But how about existing Sandy Bridge systems that can only support PCIe 2? As it turns out things aren’t quite as good.
Moving from PCIe 2 x16 (8GB/sec) to PCIe 2 x8 (4GB/sec) does incur a generally small penalty on the 7970. However, as with most tests, this depends entirely on the game itself. With games like Metro 2033 the difference is non-existent, while Battlefield 3 and Crysis only lose 2-3%, and DiRT 3 suffers the most, losing 14% of its performance. DiRT 3’s minimum framerates look even worse, dropping by 19%. As DiRT 3 is one of our higher performing games in the first place the real world difference is not going to be that great – it’s still well above 60fps at all times – but it’s clear that in the wrong situation only having 4GB/sec of PCIe bandwidth can bottleneck a 7970.
Finally, if we take one further step to PCIe 3 x2 (2GB/sec), we see performance continue to drop on a game-by-game basis. Crysis, Metro, Civilization V, and Battlefield 3 still hold rather steady, having lost less than 5% of their performance versus PCIe 3 x16, but DiRT 3 continues to fall, while Total War: Shogun and Portal 2 begin to buckle. At these speeds DiRT 3 is only at 72% of its original performance, while Shogun and Portal 2 are at 81% and 92% respectively.
Ultimately what is clear is that 8GB/sec of bandwidth, either in the form of PCIe 2 x16 or PCIe 3 x8, will be necessary to completely feed the 7970. 16GB/sec (PCIe 3 x16) appears to be overkill for a single card at this time, and 4GB/sec or 2GB/sec will bottleneck the 7970 depending on the game. The good news is that even at 2GB/sec the bottlenecking is rather limited, and based on our selection of benchmarks it looks like only a handful of games will be bottlenecked. Still, there’s a good argument here that 7970 CrossFire owners are going to want a PCIe 3 system to avoid bottlenecking their cards – in fact this may be the greatest benefit of PCIe 3 right now, as it should provide enough bandwidth to make an x8/x8 configuration every bit as fast as an x16/x16 configuration, allowing for maximum GPU performance with Intel’s mainstream CPUs.
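The x8/x8 argument boils down to simple arithmetic, sketched below. This is an illustration rather than a model of real hardware: it uses the rounded per-lane figures from the table above, assumes 16 CPU lanes split evenly between cards, treats 8GB/sec as the point at which the 7970 is fully fed (our conclusion from the games tested), and all names are hypothetical.

```python
# Rounded per-lane bandwidth in GB/sec, each direction, as in the table above.
PER_LANE_GB = {"2.x": 0.5, "3.0": 1.0}

CPU_LANES = 16         # lanes available from a mainstream CPU (assumption)
FULLY_FED_GB = 8.0     # bandwidth at which the 7970 showed no penalty in our tests


def per_card_bandwidth_gb(generation: str, cards: int) -> float:
    """Bandwidth each card gets when the CPU lanes are split evenly (hypothetical helper)."""
    lanes_per_card = CPU_LANES // cards
    return PER_LANE_GB[generation] * lanes_per_card


def is_fully_fed(generation: str, cards: int) -> bool:
    """True if each card gets at least the 8GB/sec we found sufficient."""
    return per_card_bandwidth_gb(generation, cards) >= FULLY_FED_GB


if __name__ == "__main__":
    for generation in ("2.x", "3.0"):
        for cards in (1, 2):
            bandwidth = per_card_bandwidth_gb(generation, cards)
            status = "fully fed" if is_fully_fed(generation, cards) else "may bottleneck"
            print(f"PCIe {generation}, {cards} card(s): {bandwidth:.0f}GB/sec per card, {status}")
```

With these assumptions a single card is fully fed on either generation, while a two-card x8/x8 setup only reaches 8GB/sec per card on PCIe 3.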
47 Comments
dac7nco - Friday, January 27, 2012
I was wondering when we'd start seeing bandwidth restrictions from 2.0 x8; Looks like Ivy Bridge will be a better than anticipated upgrade for those Z68 boards with 3.0 slots.
Daimon
OblivionLord - Friday, January 27, 2012
You'll see the limitation on the current 5970, 590, 6990, Mars2 when used on 8x and 16x 2.0. You won't see any limitation on 3.0 8x and 16x with current cards.
If I had to guess then I'd say that in 2 years the highend videocards at that time will be powerful enough to finally show a limitation on 16x 2.0 and 8x 3.0, but not 16x 3.0
dragonsqrrl - Saturday, January 28, 2012
So you'll be limited by 16x 2.0, but not 8x 3.0? How does that work exactly?
Revdarian - Saturday, January 28, 2012
I think, and might be mistaken, that he refers to using those dual gpu cards on multiple card solutions.
In that case, well yeah, 8x and 16x 2.0 would be halve the bandwidth of the same setup with 3.0 (this is for cuadruple gpu solutions, a niche market)
Revdarian - Saturday, January 28, 2012
*half even... meh grammar nazi-ing my own post hehehe
Termie - Friday, January 27, 2012
Just as a counterpoint, Techspot just did an article on overclocking, and found that several mid-range cards hit around 15-17% overclock (and I believe this is on stock voltage). Link: http://www.techspot.com/review/486-graphics-card-o...
You may want to note that what's unique about the 7970 is not that it can get up to an 18% overclock on stock volts, but that it is a top-end card that has 18% headroom. The 6970 had nothing close to that as Techspot found, for instance, so with the more expensive 7970, the headroom should be factored into the cost equation - the price premium for top-end cards rarely come with this bonus.
As an example, the HD5850, which was introduced as a high mid-range card, typically could reach a 17% overclock at stock volts (both of mine do), and the GTX460 was similar in this regard. That's why they were such value cards. But there's nothing entirely new about this kind of overclocking headroom at stock volts - it's not reserved only for CPUs, as you suggested.
Ryan Smith - Friday, January 27, 2012
To clarify things, the point I was attempting to make was in reference to high end cards - the 580, 6970, 5870, and the like. Mid-range cards have traditionally overclocked better because there's plenty of thermal and power headroom to work with, which is consistent with Techspot's findings. In any case I've slightly edited the article to clarify this point.
darkswordsman17 - Friday, January 27, 2012
I think people will be disappointed in the overclocking part of this article, namely that you didn't do any voltage adjustments. I think people were wanting to see where the sweet spot for voltage is (best overclock without going too high, how increased voltage affects heat and power), like you often do with CPUs.
On the flipside, I would have liked to see about undervolting. I saw someone mention that they had dropped voltage and were able to maintain clocks which cut the power consumption by a fair margin with no loss in performance.
Ryan Smith - Friday, January 27, 2012
Considering that this is a reference card, I consider overclocking without voltage adjustment to be far more important. The 7970 is not an overengineered card like the 6990/5970 that was specifically built to be overvolted. It should be possible to give it some more voltage, but given the lack of design headroom in the power circuitry and the cooler, what you can achieve on stock voltage is much more important since it's all "free" performance.
Termie - Saturday, January 28, 2012
Ryan - as usual, thanks so much for being responsive to feedback. And thanks for putting this article together - very informative. That PCIe scaling analysis will be referenced for years to come, in my opinion.
By the way, I agree that stock voltage overclocking is something worthy of being explored. It is a totally separate beast from overvolted overclocking, which not everyone has the skill or knowledge to do. The promise of higher performance and essentially no risk of hardware damage is truly a freebie, as you noted.