nVidia GeForce GTX 580: The Fastest GPU Money Can Buy
By Jason Cross
It has been more than a year since nVidia revealed its new GPU architecture, called Fermi. The flagship GPU of the Fermi line, GF100, is a monster at more than 500 square millimeters and 3 billion transistors. Its size and complexity led to manufacturing problems that caused a six-month delay before it finally reached gamers in the GeForce GTX 480. Even after the delay, nVidia had to disable some parts of the GF100 chip and still had on its hands a graphics card that was widely criticized for being too hot and too noisy. Now, six months later, the GF110 GPU debuts in nVidia’s new flagship graphics card, the GeForce GTX 580. It is essentially a remaking of the GF100 that corrects the problems that plagued that chip earlier this year.
Let’s take a look at the specs for the new graphics card, matched against nVidia’s previous flagship graphics card and against AMD’s fastest two competing cards. The Radeon HD 5870, now a year old, is still the fastest AMD-based graphics card equipped with a single GPU. Though the Radeon HD 5970 is the fastest single graphics card from the AMD camp, it is essentially two 5870 graphics cards on the same board; call it “CrossFire on a stick.” This design yields high performance, but the HD 5970 is quite expensive in addition to being big, heavy, and hot.
The GeForce GTX 580 is very much like the GTX 480. The 480 had one of the GF100’s 16 shader modules disabled, which effectively removed 32 of the shader units (nVidia calls them CUDA cores), four of the texture processing units, and one of the geometry processing engines. The new GF110 chip in the GTX 580 is nearly the same, but this time nVidia fully enables all of the chip’s functional units. Note the discrepancy in number of shader units between the AMD and nVidia cards in the chart above; this reflects the fact that the numbers are not directly comparable. Due to the different ways in which the nVidia and AMD chips are designed, a single shader unit in nVidia’s chip can do more work than one in AMD’s chip. It is also larger, which explains why there aren’t as many of them in the GPU.
At the chip level, the GeForce GTX 580 is essentially the same as the GTX 480. The new chip that powers it, called GF110, is made using TSMC’s 40-nanometer manufacturing process. It’s architecturally similar to the GF100, with the same dimensions and the same transistor count. If you were to look at a block diagram of the chip, it would look identical. Features such as cache sizes and the composition of the shader processors are the same. But with the GF110, nVidia fully retooled the chip from the transistor level, fixing many of the problems that made the GF100 hard to manufacture. This enabled the company to release a chip that has all the functional units enabled and yet draws less power and produces less heat than its predecessor. Together with better manufacturing and an enhanced cooling system, the GTX 580 runs the GF110 chip at a somewhat higher clock speed than the GF100 runs in the GTX 480.
There are no major new technologies in the GF110 GPU. It doesn’t have support for new display output types, for instance. Cards will have two dual-link DVI connectors, one mini-HDMI connector, and no DisplayPort. There is no new video decoder unit and no additional render back ends. That’s not to say that nVidia didn’t take the opportunity of remaking the chip to sneak in a few enhancements.
Cooler and quieter: Reworking the GF110 GPU has permitted nVidia to run it at a roughly 10 percent faster clock speed while drawing less wattage (about 20 watts less, in our tests). This is analogous to when a CPU company like Intel produces a new “stepping” of its CPU: The hardware is functionally identical, but it runs cooler. A new vapor-chamber heat spreader and a quieter fan design allow nVidia to cool the GeForce GTX 580 cards more efficiently and quietly, too.
Full-speed FP16 texture filtering: In the GF100, 16-bit floating point textures, often used in high-dynamic-range lighting, were filtered at half speed. Later chips in the Fermi line–for instance, the GF104 that powers the GeForce GTX 460–made some tweaks to filter these textures at full speed. The tweaks were rolled up into the GF110 GPU.
Faster z-culling: Modern graphics chips have a feature called z-culling. With z-culling in place, the graphics chip checks the depth of each part of an object in a scene to see whether something closer to the camera obscures it. If so, the chip rejects that part of the object so that it doesn’t have to do all the work necessary to draw it–you can’t see it anyway. This hardware is slightly improved in the GF110.
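The idea behind z-culling can be sketched in a few lines of code. This is a simplified, hypothetical per-fragment model for illustration only; real GPUs such as the GF110 do this in fixed-function hardware, typically on whole tiles of pixels at once rather than one fragment at a time.

```python
# Illustrative sketch of z-culling: reject fragments hidden behind something
# already drawn closer to the camera, so they never reach the shading stage.
# (Hypothetical simplified model; real GPU hardware culls whole tiles.)

def z_cull(fragments, depth_buffer):
    """Keep only fragments not occluded by a closer surface.

    fragments: list of (x, y, depth) tuples; smaller depth = closer to camera.
    depth_buffer: dict mapping (x, y) -> closest depth seen so far.
    """
    survivors = []
    for x, y, depth in fragments:
        nearest = depth_buffer.get((x, y), float("inf"))
        if depth < nearest:
            # Fragment is in front: keep it and record the new nearest depth.
            depth_buffer[(x, y)] = depth
            survivors.append((x, y, depth))
        # Otherwise the fragment is occluded; the expensive work of
        # shading and texturing it is skipped entirely.
    return survivors

buf = {}
frags = [(0, 0, 5.0), (0, 0, 2.0), (0, 0, 9.0)]
print(z_cull(frags, buf))  # the depth-9.0 fragment is culled
```

The payoff is that rejected fragments cost almost nothing, which is why even a "slight" improvement to this hardware can help in scenes with heavy overdraw.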
Power draw safeguard: The GF100 and nearly every nVidia card made to date will run as fast as possible when in use; the power draw, heat output, and cooling setups are all geared toward a worst-case scenario in which the GPU is being worked extremely hard. In rare conditions, a GPU may be asked to do too much, may get crazy hot, and may draw too much power from the PCIe power plugs. A perfect example of such conditions is the synthetic FurMark test, which made nVidia graphics cards run unusually hot and draw far more power than they were designed to. A few games have simple 3D-rendered menu screens that impose no cap on frame rate and can similarly cause excessive heat and power draw. The GeForce GTX 580 has a new hardware feature that monitors power draw and will throttle the GPU if necessary. This protective mechanism won’t affect performance in regular game situations, but it could affect performance on benchmarks like FurMark, and it should keep the card from getting loud and hot in those odd games that have misbehaving menus.
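The logic of such a safeguard can be sketched as a simple control rule: when measured board power exceeds a budget, shed clock speed until it fits. Everything below is a hypothetical illustration; the names, the 244-watt budget, and the proportional-scaling assumption are ours, and the real GTX 580 implements this in dedicated monitoring hardware, not software.

```python
# Hedged sketch of a power-limiting rule: scale back the GPU clock when
# measured board power exceeds a budget. All numbers are hypothetical.

POWER_BUDGET_W = 244  # hypothetical board power limit

def throttle(clock_mhz, measured_power_w, budget_w=POWER_BUDGET_W):
    """Return an adjusted clock: reduced while power is over budget."""
    if measured_power_w > budget_w:
        # Assume power scales roughly linearly with clock, so shed
        # clock in proportion to the overdraw.
        overdraw = measured_power_w / budget_w
        return clock_mhz / overdraw
    return clock_mhz  # within budget: leave the clock alone

# A FurMark-like load drawing 300 W gets clocked down;
# a typical game at 200 W is untouched.
print(round(throttle(772, 300)))
print(throttle(772, 200))
```

This mirrors the behavior described above: games under the budget never see the limiter, while pathological loads like FurMark are pulled back to a safe level.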
Next: Impressive Performance
Performance: Synthetic Benchmarks
nVidia promises that this $500 card, its new flagship, will be the fastest GPU that money can buy. That’s hedging a little bit: Technically, the Radeon HD 5970 is a single graphics card, but it uses two GPUs. Still, we think it’s worthwhile to compare the GeForce GTX 580 to the Radeon 5970, if only because the latter is the fastest single graphics card you can buy that uses AMD’s technology; it’s also a longer card that doesn’t easily fit into a midsize desktop PC case. In addition, we’ll compare the GeForce GTX 580 to the Radeon HD 5870 (the fastest single-GPU card that uses AMD’s tech) and to the GeForce GTX 480 (nVidia’s previous best card, which uses a very similar GPU).
We performed all of our benchmarks on a system configured with an Intel Core i7-980X CPU and 6GB of RAM, running 64-bit Windows 7.
Let’s start with the Unigine Heaven benchmark, a synthetic test of a real DirectX 11 game engine, currently licensed by a number of smaller games. The test is rather strenuous and forward-looking, featuring high detail levels, dynamic lighting and shadows, and lots of tessellation. We ran the test at the middle “Normal” mode. This geometry-heavy test favors nVidia’s architecture, and the new GTX 580 did great on the measure–around 20 percent faster than the GTX 480 and nearly 80 percent faster than the Radeon HD 5870.
FurMark is a synthetic OpenGL-based test that renders a torus covered in fur. It’s rather simple, but no test we’re aware of stresses a GPU more thoroughly. It’s a great way to see just how hot your graphics card will get, and how much power it will use. In the test results, you can see the effect of the new power draw safeguard kicking in. During this test, the GeForce GTX 480 got extremely loud and hot, and drew far more power than it was supposed to. It sounded like a leaf blower, though it also ran very fast. In contrast, the GTX 580 limited the power draw and scaled everything back to a reasonable level. If anything, the power restraint was too aggressive, as the AMD chips significantly outpaced the GTX 580. This test isn’t a very useful example of real-world performance, but it does nicely illustrate the power safeguard in action.
Though it’s getting a little long in the tooth, 3DMark Vantage remains a commonly used standard among synthetic graphics benchmarks. The engine uses DirectX 10 only, though a new version of 3DMark geared for DirectX 11 should be coming soon. We present the 3DMark scores with standard settings for the “High” and “Extreme” profiles. AMD’s dual-GPU Radeon HD 5970 won this contest, but not by much. For its part, nVidia’s new card ran impressively fast, given that it is equipped with a single GPU. AMD’s best single-GPU card, the Radeon HD 5870, was left far behind.
Next: Real Game Performance
Synthetic tests can be useful for evaluating features that will be common in tomorrow’s games, but performance in real games is far more important. We tested with five modern games that can push a modern graphics card to the limit.
Codemasters’ rally racer Dirt 2, one of the first DirectX 11 games, features an excellent built-in benchmark. We used the demo version (whose benchmark track differs from the track in the retail game), so you can run the game at home and compare your results. We enabled DirectX 11 and turned all of the detail levels up to full. The GeForce GTX 580 delivered very strong performance here, easily outpacing the Radeon HD 5970 (by 25 to 40 percent) and the Radeon HD 5870 (by as much as 80 percent). The new GTX 580 is about 20 percent faster than the GTX 480.
Tom Clancy’s H.A.W.X. is a graphically rich arcade flight game that uses DirectX 10.1 to enable features such as Screen Space Ambient Occlusion (SSAO), God Rays, and Soft Particles. Again, we turned all of the detail levels up to the maximum for our testing. Historically, AMD’s cards have performed extremely well on this test, and the dual-GPU Radeon HD 5970 outpaced the GeForce GTX 580 as well (though both cards achieved extremely high frame rates). The GTX 580 was about 20 percent faster than the GTX 480 on H.A.W.X., and roughly 30 to 50 percent faster than the Radeon HD 5870.
World in Conflict is aging a bit, but it’s still a beautiful real-time strategy game with a DirectX 10 based graphics engine that can stress all but the most powerful graphics cards when you maximize the detail levels, as we did. This is another game that AMD cards usually handle quite well. In our tests, the GTX 580 ran about 15 percent faster than its top-of-the-line nVidia predecessor, and 25 to 40 percent faster than the Radeon HD 5870. Only the dual-chip 5970 outpaced it.
The S.T.A.L.K.E.R. series has always been on the leading edge of graphics technology. We used the demo benchmark for the Call of Pripyat sequel with DirectX 11 lighting enabled and all detail settings maximized. The scores charted below represent the average of the four tests that the benchmark runs. With antialiasing applied, nVidia’s new card matched the Radeon HD 5970, and it dramatically outperformed the single-GPU Radeon HD 5870.
Last but not least, we used the excellent benchmark built in to Just Cause 2. We maximized graphics settings and ran the Concrete Jungle test, the most strenuous of the game’s built-in tests. Again the 5970 performed well, thanks to its essentially combining two Radeon HD 5870s on a single long card. The GTX 580 handily beat the solo 5870, especially when we turned on antialiasing. Interestingly, on this game only, the new GTX 580 was no faster than the GTX 480 with antialiasing off. This odd behavior is probably attributable to immature drivers.
Next: Value and Efficiency
Value and Efficiency
Our test lineup consists entirely of high-end graphics cards–products for performance-oriented enthusiasts who aren’t terribly concerned with finding the best bargain available. The GeForce GTX 580 is likely to sell for a suggested price of $500. nVidia tells us that the supply of GTX 580 cards will be small for the first few weeks, so prices may temporarily go a bit higher. The GeForce GTX 480 has dropped to $450, while AMD has adjusted the prices on its high-end cards a little: The Radeon HD 5870 starts at about $340 and the dual-GPU Radeon HD 5970 took a big price cut down to $500 to match nVidia’s latest and greatest.
Nobody wants to spend more than is necessary, and everyone wants to know which product delivers the most bang for the buck. To find out, we averaged the benchmark results for all of our real-world game tests and then divided by the price to arrive at a metric we call dollars per frames per second. On the chart below, lower numbers are better: They signify spending less to get equivalent performance.
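The arithmetic behind the metric is straightforward: average frame rate across the game tests, divided into the card's price. A minimal sketch, using the prices quoted above but hypothetical placeholder frame rates rather than our actual results:

```python
# Dollars per frame per second: price / average fps (lower is better).
# Prices are the ones quoted in this article; the avg_fps values are
# hypothetical placeholders for illustration, not our measured results.

cards = {
    "GeForce GTX 580": {"price": 500, "avg_fps": 100.0},
    "GeForce GTX 480": {"price": 450, "avg_fps": 83.0},
    "Radeon HD 5870":  {"price": 340, "avg_fps": 62.0},
    "Radeon HD 5970":  {"price": 500, "avg_fps": 105.0},
}

def dollars_per_fps(price, avg_fps):
    """Cost of each frame per second of average performance."""
    return price / avg_fps

for name, d in cards.items():
    print(f"{name}: ${dollars_per_fps(d['price'], d['avg_fps']):.2f} per fps")
```

With placeholder numbers like these, the four cards land close together, which matches the "fairly similar performance per dollar" result discussed below.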
Thanks to AMD’s recent price cuts, all four cards deliver fairly similar performance per dollar. Though the Radeon HD 5870 is significantly slower than the other three cards, it is also considerably less expensive. The only clear advantage appears at the very high resolution of 2560 by 1600 with no antialiasing, where AMD’s cards offer more oomph for the money.
nVidia says that it worked hard to optimize power utilization on the GeForce GTX 580, and it shows. Despite running at a higher clock speed, the GTX 580 delivered power reductions of about 20 watts both at idle and under full load. The lower-performing Radeon HD 5870 took the crown for power use here; but among the faster and more-expensive cards, the GTX 580 doesn’t look bad at all. Somewhat surprisingly, it uses more power under load than AMD’s dual-GPU card, but it uses less power at idle.
By dividing the average frames per second for each card on all of our game tests by its power use under load from the previous chart, we arrive at a measure of watts per frames per second. Instead of simply identifying how fast the cards were or how much power they used, this chart calculates their power efficiency. Here again, lower numbers are better.
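This efficiency metric is the same kind of ratio as the value metric above, with load power in place of price. A quick sketch, again with hypothetical placeholder figures rather than our measured values:

```python
# Watts per frame per second: load power / average fps (lower is better).
# Both inputs below are hypothetical placeholders for illustration.

def watts_per_fps(load_watts, avg_fps):
    """Power cost of each frame per second of average performance."""
    return load_watts / avg_fps

# E.g., a card drawing 320 W under load with a 100-fps average:
print(f"{watts_per_fps(320, 100.0):.2f} W per fps")  # 3.20 W per fps
```

A card can win this metric either by drawing less power or by rendering more frames with the same power, which is why the GF110's combination of higher clocks and lower draw moves the needle so much.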
As you can see, the purple bar is consistently much shorter than the green bar. This indicates how much progress nVidia has made with the GF110 GPU. Despite being the same size as the GF100 and having the same transistor count, the GF110 enabled the GeForce GTX 580 to deliver significantly better performance than the GTX 480 did, while lowering power consumption. It even turned in better performance per watt than the Radeon HD 5870.
Next: The Fastest Graphics Card Around…for Now
The Fastest GPU Around, but No New Features
The GeForce GTX 480 was supposed to be last year’s graphics champ, but it didn’t launch until March of this year. If the new GeForce GTX 580’s quick arrival on the market (only six months after its predecessor) is surprising, that is only because the GTX 480 was so late. Under the circumstances, it’s hard not to be slightly disappointed by the GeForce GTX 580.
Its performance, mind you, is stellar. Thanks to a reworking of the GF100 GPU, nVidia can finally demonstrate what the architecture can accomplish when the chip is uncrippled and runs at a high clock speed. In our tests, the GTX 580 was roughly 20 to 30 percent faster than the GTX 480 (already quite a fast card) while drawing significantly less power; it’s quieter, too. We couldn’t be happier with its performance, and we can’t wait to see what AMD’s answer will be; the company’s high-end Radeon 6900 series is expected soon.
Our modest disappointment is that the GeForce GTX 580 is little more than a fixed GeForce GTX 480. It’s the graphics card that the GTX 480 should have been. nVidia is behind the curve on what we feel are important display options, such as DisplayPort support and the ability to drive three displays simultaneously from a single graphics card. New high-end products are the obvious place to introduce these types of features. With the exception of a couple of tweaks to texture filtering and z-culling, however, the GPU didn’t receive any architectural enhancements at all.
In some ways, it just goes to show how hard it is to bring a 3-billion-transistor, over-500-square-millimeter graphics processor to market. nVidia took six months longer than it expected to get its new design out the door, and another six months to get it right. Now that it has, we have no reservations in recommending the GeForce GTX 580 as an enthusiast-class graphics card for price-be-damned gamers. True, the Radeon HD 5970 is slightly faster on average, but cards that rely on two GPUs carry their own set of drawbacks–for instance, extremely long board length, high idle power use, and the inability to perform as well as they should on games that run in windowed mode (multiple-GPU cards work best in full-screen mode).
Though it took the company a year longer than intended, nVidia has finally released a graphics card that can fully display the Fermi architecture’s capabilities, and they’re impressive. Whatever qualms we may have about its lack of DisplayPort or its inability to drive more than two displays with a single card, the GeForce GTX 580 certainly makes good on nVidia’s promise to deliver the fastest single-GPU graphics card.