The Nvidia GeForce GTX 1080 is the first graphics card built using 16nm technology after GPUs stalled on 28nm for four long years. The performance and power efficiency gains are nothing short of astounding.
“It’s insane,” Nvidia CEO Jen-Hsun Huang proudly proclaimed at the GeForce GTX 1080’s reveal, holding the graphics card aloft. “The 1080 is insane. It’s almost irresponsible amounts of performance… the 1080 is the new king.”
He wasn’t joking. The long, desolate years of stalled GPU technology are over, and this beast is badass.
A giant leap for GPU-kind
As wondrous as it is, the outrageous performance leap of the GTX 1080 (starting at $599 MSRP, $699 Nvidia Founders Edition reviewed) doesn’t exactly come as a surprise.
Faltering graphics processor process technology left graphics cards from both Nvidia and AMD stranded on the 28-nanometer transistor node for four long years—an almost unfathomable length of time in the lightning-fast world of modern technology. Plans to move to 20nm GPUs fell by the wayside due to technical woes. That means the 16nm Pascal GPUs beating inside the GTX 1080’s heart (and AMD’s forthcoming 14nm Polaris GPUs) represent a leap of two full process generations.
That’s nuts, and it alone could create a big theoretical jump in performance. But Nvidia didn’t stop there.
Pascal GPUs adopted the advanced FinFET “3D” transistor technology that made its first mainstream appearance in Intel’s Ivy Bridge computer processors, and the GTX 1080 is the first graphics card powered by GDDR5X memory, a supercharged new version of the GDDR5 memory that’s come standard in graphics cards for a few years now.
On top of all that, Nvidia invested significantly in the new Pascal architecture itself, particularly in tweaking efficiencies to increase clock speeds while simultaneously reducing power requirements, as well as many more under-the-hood goodies that we’ll get to later—including enhanced asynchronous compute features that should help Nvidia’s cards perform better in DirectX 12 titles and combat a major Radeon advantage.
Oh, and did I mention all the new features and performance-enhancing software landing alongside the GTX 1080?
Note: Because this is a major GPU advancement, we’ll spend more time than usual discussing under-the-hood details and tech specs. If that’s not your thing, jump to page two for discussion on the GTX 1080’s big new technical wonders and page three for its new consumer-facing features. Performance talk starts on page four.
Let’s kick things off with an Nvidia-supplied spec sheet comparison of the GTX 1080 vs. its predecessor, the GTX 980. (Side note: The mere fact that the company’s comparing the GTX 1080 directly against the GTX 980 is noteworthy. Usually, GPU makers compare new graphics cards against GPUs two generations back in review materials. The GTX 960 was compared against the GTX 660—not the GTX 760—in Nvidia’s official materials, for example.)
Here, some of the benefits of switching to 16nm jump out immediately. While the “GP104” Pascal GPU’s 314mm² die is considerably smaller than the 398mm² die in the older GTX 980, it still manages to squeeze in 2 billion more transistors overall, as well as 25 percent more CUDA cores: 2,560 in the GTX 1080, versus 2,048 in the GTX 980.
And pick up your jaw! The GTX 1080 indeed rocks utterly ridonkulous 1,607MHz base clock and 1,733MHz (!!!!) boost clock speeds—and that’s just the stock speeds. We managed to crank it to over 2GHz on air without breaking a sweat or tinkering with the card’s voltage. Add it all up and the new graphics card blows its predecessor out of the water in both gaming performance and compute tasks, leaping from 4,981 GFLOPS in the GTX 980 all the way to 8,873 GFLOPS in the GTX 1080.
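Those GFLOPS figures can be sanity-checked from the core counts and boost clocks. The sketch below assumes the usual convention of two floating-point operations per CUDA core per clock (one fused multiply-add), and it assumes a 1,216MHz boost clock for the GTX 980, a figure that doesn’t appear in the text above.

```python
def peak_gflops(cuda_cores: int, boost_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical single-precision peak: cores x FLOPs per clock x clock rate."""
    return cuda_cores * flops_per_clock * boost_mhz / 1000.0


# Core counts and the 1,733MHz boost clock come from the review text.
gtx_1080 = peak_gflops(2560, 1733)
# The GTX 980's 1,216MHz boost clock is an assumption, not quoted in the text.
gtx_980 = peak_gflops(2048, 1216)

print(round(gtx_1080), round(gtx_980))
```

Under those assumptions the rounded results line up with the 8,873 and 4,981 GFLOPS figures quoted above, which suggests the jump comes from both the extra cores and the much higher clocks.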
Diving even deeper, each Pascal Streaming Multiprocessor (SM) features 128 CUDA cores, 256KB of register file capacity, a 96KB shared memory unit, 48KB of L1 cache, and eight texture units. Each SM is paired with a GP104 PolyMorph engine that handles vertex fetch, tessellation, viewport transformation, vertex attribute setup, perspective correction, and the intriguing new Simultaneous Multi-Projection technology (which we’ll get to later), according to Nvidia.
A group of five SM/PolyMorph engines with a dedicated raster engine forms a Graphics Processing Cluster, and there are four GPCs in the GTX 1080. The GPU also features eight 32-bit memory controllers for a 256-bit memory bus, with a total of 2,048KB L2 cache and 64 ROP units among them.
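The layout described above multiplies out neatly to the headline specs. This is just arithmetic on the numbers quoted in the text; the even split of L2 cache and ROPs across memory controllers is an assumption on my part.

```python
# Figures quoted in the review text.
GPCS = 4
SMS_PER_GPC = 5
CORES_PER_SM = 128
TEXTURE_UNITS_PER_SM = 8
MEMORY_CONTROLLERS = 8
BITS_PER_CONTROLLER = 32
L2_CACHE_KB = 2048
ROPS = 64

sm_count = GPCS * SMS_PER_GPC                      # SMs on the full GP104 chip
cuda_cores = sm_count * CORES_PER_SM               # matches the spec sheet's core count
texture_units = sm_count * TEXTURE_UNITS_PER_SM
bus_width_bits = MEMORY_CONTROLLERS * BITS_PER_CONTROLLER

# Assumption: L2 and ROPs are divided evenly among the controllers.
l2_per_controller_kb = L2_CACHE_KB // MEMORY_CONTROLLERS
rops_per_controller = ROPS // MEMORY_CONTROLLERS

print(sm_count, cuda_cores, texture_units, bus_width_bits)
```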
That segues nicely into another technological advance in Nvidia’s card: the memory. Despite rocking a 256-bit bus the same size as its predecessor’s, the GTX 1080 manages to push overall memory bandwidth all the way to 320GBps, up from 224GBps in the GTX 980. That’s thanks to the 8GB of cutting-edge Micron GDDR5X memory inside, which runs at a blistering 10Gbps, a full 3Gbps faster than the GTX 980’s already speedy memory. How fast is that, really? Nvidia’s GTX 1080 whitepaper sums it up:
“To put that speed of signaling in context, consider that light travels only about an inch in a 100 picosecond time interval. And the GDDR5X IO circuit has less than half that time available to sample a bit as it arrives, or the data will be lost as the bus transitions to a new set of values.”
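Both the bandwidth figures and the whitepaper’s light-speed comparison can be checked with a few lines of arithmetic. The sketch below assumes bandwidth is simply bus width times per-pin data rate, and uses the standard value for the speed of light in a vacuum; the helper names are my own.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458
METERS_PER_INCH = 0.0254


def bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth: bits moved per second across the bus, converted to bytes."""
    return bus_width_bits * data_rate_gbps / 8


def bit_time_ps(data_rate_gbps: float) -> float:
    """Time one bit occupies a pin, in picoseconds."""
    return 1000.0 / data_rate_gbps


def light_travel_inches(picoseconds: float) -> float:
    """Distance light covers in a vacuum during the given interval."""
    return SPEED_OF_LIGHT_M_PER_S * picoseconds * 1e-12 / METERS_PER_INCH


print(bandwidth_gb_per_s(256, 10))   # GTX 1080: GDDR5X at 10Gbps
print(bandwidth_gb_per_s(256, 7))    # GTX 980: GDDR5 at 7Gbps
print(bit_time_ps(10), round(light_travel_inches(100), 2))
```

At 10Gbps each bit lasts 100 picoseconds, which is the interval the whitepaper is talking about, and light covers a little over an inch in that time.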
Implementing such speedy memory required Nvidia to redesign both the GPU circuit architecture as well as the board channel between the GPU and memory dies to exacting specifications—a process that will also benefit graphics cards equipped with standard GDDR5 memory, Nvidia says.
Pascal achieves even greater data transfer capabilities thanks to enhanced memory compression technology. Specifically, it builds on the delta color compression already found in today’s Maxwell-based graphics cards, which reduces memory bandwidth demands by grouping like colors together. Here’s how Nvidia’s whitepaper describes the technology:
“With delta color compression, the GPU calculates the differences between pixels in a block and stores the block as a set of reference pixels plus the delta values from the reference. If the deltas are small then only a few bits per pixel are needed. If the packed together result of reference values plus delta values is less than half the uncompressed storage size, then delta color compression succeeds and the data is stored at half size (2:1 compression).”
The new Pascal GPUs perform 2:1 delta color compression more effectively, and added 4:1 and 8:1 delta color compression for scenarios where the per-pixel color variation is minimal, such as a darkened night sky. Those are targets of opportunity, though, since the compression needs to be lossless. Gamers and developers would gripe if GeForce cards started screwing with image quality.
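To make the idea concrete, here’s a toy model of delta color compression on a block of single-channel 8-bit pixels. It is not Nvidia’s hardware algorithm: the choice of the first pixel as the reference, the fixed bit width per delta, and the 4:1 and 8:1 thresholds (modeled by analogy with the 2:1 rule the whitepaper describes) are all assumptions for illustration.

```python
def signed_bits(delta: int) -> int:
    """Bits needed to hold delta in two's complement (a zero delta needs none)."""
    if delta == 0:
        return 0
    magnitude = delta if delta > 0 else -delta - 1
    return magnitude.bit_length() + 1


def compression_ratio(block: list[int], bits_per_pixel: int = 8) -> int:
    """Return 8, 4, 2, or 1 depending on how small reference + deltas pack."""
    reference = block[0]
    deltas = [pixel - reference for pixel in block[1:]]
    delta_bits = max((signed_bits(d) for d in deltas), default=0)

    packed_bits = bits_per_pixel + delta_bits * len(deltas)
    raw_bits = bits_per_pixel * len(block)

    # Try the most aggressive mode first; fall back to uncompressed (1:1).
    for ratio in (8, 4, 2):
        if packed_bits * ratio <= raw_bits:
            return ratio
    return 1


night_sky = [12] * 64                          # no variation at all
gradient = [100 + (i % 4) for i in range(64)]  # small per-pixel differences
noise = [(i * 37) % 256 for i in range(64)]    # large differences

print(compression_ratio(night_sky), compression_ratio(gradient), compression_ratio(noise))
```

Because the original values can be rebuilt exactly from the reference and the deltas, the scheme stays lossless, and blocks that don’t shrink enough are simply stored uncompressed.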
Using color compression to reduce memory needs isn’t new at all—AMD’s Radeon GPUs also do it—but Nvidia says that between this new, more effective form of compression and GDDR5X’s benefits, the GTX 1080 offers 1.7x the total effective memory bandwidth of the GTX 980. That’s not shabby at all, and it takes some of the sting out of the card’s lack of revolutionary high-bandwidth memory, which debuted in AMD’s Radeon Fury cards, albeit in capacities limited to 4GB.
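It’s possible to back out roughly how much of that 1.7x figure comes from compression rather than raw memory speed. The sketch assumes the two gains simply multiply together, which is my simplification rather than anything Nvidia states.

```python
# Raw bandwidth figures quoted earlier in the review (GBps).
raw_gain = 320 / 224

# Nvidia's claimed total effective bandwidth advantage over the GTX 980.
claimed_effective_gain = 1.7

# Assumption: effective gain = raw gain x compression gain.
implied_compression_gain = claimed_effective_gain / raw_gain

print(round(raw_gain, 2), round(implied_compression_gain, 2))
```

Under that assumption, GDDR5X accounts for about a 1.43x improvement and the improved compression for roughly another 1.19x.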
The Pascal GPU’s technological enhancements and leap to 16nm FinFET also make it incredibly power efficient. Despite firmly outpunching a Titan X, the GTX 1080 sips only 180 watts of power over a single 8-pin power connector. By comparison, the GTX 980 Ti sucks 250W through 6-pin and 8-pin connectors, while the 275W Fury X uses a pair of 8-pin connectors. The GTX 1080 delivers a lot more performance with a lot less power.
Next page: New features! Async compute, simultaneous multi-projection, and more
The dedicated async compute hardware in AMD’s Radeon GPUs essentially allows multiple tasks to be run concurrently. The async shaders didn’t provide much of an advantage in DirectX 11 games, which run tasks in a largely linear fashion, but they can give certain DX12 titles a major performance boost, as you’ll see in our Ashes of the Singularity benchmark results later. And it can make a major difference in the asynchronous timewarp feature that the Oculus Rift VR headset uses to keep you from blowing chunks if there’s a hiccup in processing.
Nvidia’s Maxwell GPU-based GeForce 900-series cards don’t have a hardware-based equivalent for that. Instead, they rely on software-based “pre-emption” that allows a GPU to pause a task to perform a more critical one, then switch back to the original task. (Think of it like a traffic light.) Maxwell’s pre-emption gets the job done, but nowhere near as well as AMD’s dedicated hardware (which behaves more like the flow of cars yielding in traffic).
Pascal introduces several new hardware and software features to beef up its async compute capabilities, though none behave exactly like the async hardware in Radeon GPUs.
The GeForce GTX 1080 adds flexibility in task execution with the introduction of dynamic load balancing, a new hardware-based feature that allows the GPU to adjust task partitioning on the fly rather than letting resources sit idle.
With the static partitioning technique used by all previous-generation GeForce cards, overlapping tasks each claimed a fixed portion of the available GPU resources: say, 50 percent for PhysX compute and 50 percent for graphics. But if the graphics task finishes first, the 50 percent of resources allocated to it sits idle until the compute portion also completes. The Pascal GPU’s new dynamic load balancing allows unfinished tasks to tap into idle GPU resources, so the PhysX task in the previous example gains access to the resources freed up when the graphics task wraps up, letting it finish sooner than it would under the older static partitioning scheme.
A fluid particle demo shown at Nvidia’s GTX 1080 Editors Day hit 78 frames per second with the feature disabled, and climbed to 94fps when it was turned on.
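The difference between the two schemes is easy to model. The sketch below treats the GPU as a pool with a throughput of 1.0 and measures work in “whole-GPU time units”; the 0.3 and 0.7 workloads are made-up numbers, and real scheduling is far more involved than this.

```python
def static_partition_time(graphics_work: float, compute_work: float,
                          graphics_share: float = 0.5) -> float:
    """Each task keeps its fixed slice; the frame ends when the slower one does."""
    graphics_done = graphics_work / graphics_share
    compute_done = compute_work / (1.0 - graphics_share)
    return max(graphics_done, compute_done)


def dynamic_balance_time(graphics_work: float, compute_work: float,
                         graphics_share: float = 0.5) -> float:
    """Once one task finishes, the other inherits the whole GPU."""
    compute_share = 1.0 - graphics_share
    first_finish = min(graphics_work / graphics_share, compute_work / compute_share)
    remaining = ((graphics_work - first_finish * graphics_share)
                 + (compute_work - first_finish * compute_share))
    return first_finish + remaining  # leftover work runs at full throughput


# Hypothetical workload: light graphics pass, heavier PhysX-style compute pass.
static_t = static_partition_time(0.3, 0.7)
dynamic_t = dynamic_balance_time(0.3, 0.7)
print(round(static_t, 2), round(dynamic_t, 2))

# Frame-rate gain in Nvidia's fluid particle demo, from the figures above.
demo_gain = 94 / 78 - 1
print(f"{demo_gain:.1%}")
```

In this toy case the static split wastes the graphics slice for most of the frame, while dynamic balancing finishes in the time the total work would take on the full GPU. The demo’s jump from 78fps to 94fps works out to a gain of about 20 percent.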
The Pascal GPU also adds “Pixel level pre-emption” and “Thread level pre-emption” to its bag of async tricks, which are designed to help minimize the cost of switching tasks on the fly when time-critical tasks (like Oculus’ asynchronous timewarp) come in hot.
Previously, pre-emption occurred at a fairly high level of the computing process, between rendering commands from the game engine. Each rendering command can consist of up to hundreds of individual draw calls in the command push buffer, Nvidia says, with each draw call containing hundreds of triangles, and each triangle requiring hundreds of individual pixels to be rendered. Performing all that work before switching tasks can take a long time. (Well, relatively speaking.)
Pixel level pre-emption—which is achieved using a blend of hardware and software, Nvidia says—allows Pascal GPUs to save their current workload at pixel-level granularity rather than the high rendering command state, switch to another time-critical task (like asynchronous timewarp), then pick up exactly where they left off. That lets the GTX 1080 pre-empt tasks quickly, with minimal overhead; Nvidia says pixel-level pre-emption takes under 100 microseconds to kick into gear. We’ll talk about real-world results with Pascal’s new async compute tools when we dive into our DirectX 12 testing with Ashes of the Singularity. (Spoiler alert: They’re impressive.)
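A rough model shows why granularity matters. The counts below take “hundreds” to mean exactly 100 at each level, which is purely illustrative, and the 90Hz refresh rate used for the frame budget is an assumption about the VR headset rather than a figure from the text.

```python
def worst_case_pixels_before_switch(draw_calls: int, triangles_per_draw: int,
                                    pixels_per_triangle: int, granularity: str) -> int:
    """Pixels that may still need rendering before the GPU can switch tasks."""
    if granularity == "command":
        # Must finish the entire rendering command first.
        return draw_calls * triangles_per_draw * pixels_per_triangle
    if granularity == "pixel":
        # Can stop after the pixel currently in flight.
        return 1
    raise ValueError(f"unknown granularity: {granularity}")


# Hypothetical stand-ins for "hundreds" at each level.
old = worst_case_pixels_before_switch(100, 100, 100, "command")
new = worst_case_pixels_before_switch(100, 100, 100, "pixel")
print(old, new)

# Share of a frame consumed by a 100-microsecond pre-emption at an assumed 90Hz.
frame_budget_us = 1_000_000 / 90
preempt_share = 100 / frame_budget_us
print(f"{preempt_share:.1%}")
```

Even with these modest stand-in numbers, a command-level switch could be stuck behind a million pixels of work, while a switch that takes under 100 microseconds eats less than 1 percent of a 90Hz frame.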
Thread level pre-emption will be available later this summer and performs similarly, but for CUDA computing tasks rather than graphical commands.
Simultaneous multi-projection (SMP) is a highly intriguing new technology that improves performance when a game needs to render multiple “viewports” for the same game, be it for a multi-monitor setup or the dual lenses inside a virtual reality headset. A more granular SMP feature can also greatly improve frame rates in games on standard displays by building on the groundwork laid by the multi-resolution shading feature already enabled in Nvidia’s Maxwell GPUs.
This fancy new technology’s at the heart of Nvidia’s claim that the GeForce GTX 1080 is faster than two GTX 980s configured in SLI. The card never hits that lofty milestone in traditional gaming benchmarks—though it can come pretty damn close in some titles. But it’s theoretically possible in VR applications coded to take advantage of SMP, which uses dedicated hardware inside the Pascal GPU’s PolyMorph engine hardware.
Displaying scenes on multiple displays traditionally involves some sort of compromise. In dual-lens VR, the scene has to have its geometry fully calculated and the scene fully rendered twice, once for each eye. Multi-monitor setups, on the other hand, tend to distort the imagery on the periphery screens, because they’re angled slightly to envelop the user, as shown above. Think of a straight line drawn across a piece of paper: Folding the paper in half makes the line appear slightly angled instead of truly straight.
Simultaneous multi-projection separates the geometry and rendering portions of creating a scene to fix both of those problems. The Pascal GPU calculates a scene’s geometry just once, then draws the scene to match the exact perspective of up to 16 different viewpoints as needed—a technique Nvidia calls “single-pass stereo.” Any parts of the scene that aren’t in view aren’t rendered.
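The savings from doing geometry once can be sketched as a simple cost model. The millisecond costs below are hypothetical, and the model ignores everything else SMP does (such as skipping geometry that’s out of view); the 16-view cap comes from the text.

```python
MAX_SMP_VIEWS = 16  # upper limit on viewpoints quoted in the review


def traditional_cost(views: int, geometry_ms: float, raster_ms: float) -> float:
    """Geometry and rasterization are both repeated for every view."""
    return views * (geometry_ms + raster_ms)


def smp_cost(views: int, geometry_ms: float, raster_ms: float) -> float:
    """Geometry runs once; only rasterization is repeated per view."""
    if not 1 <= views <= MAX_SMP_VIEWS:
        raise ValueError(f"views must be between 1 and {MAX_SMP_VIEWS}")
    return geometry_ms + views * raster_ms


# Hypothetical stereo VR frame: 4ms of geometry work, 6ms of rasterization per eye.
print(traditional_cost(2, 4.0, 6.0), smp_cost(2, 4.0, 6.0))
```

The more geometry-heavy the scene and the more views involved, the bigger the gap between the two approaches gets.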
If you’re using SMP with multi-monitors rather than a VR headset, new Perspective Surround settings in the Nvidia Control Panel will let you configure the output to match your specific setup, so those straight lines in games no longer appear angled and render as the developers intended. Sweet!
But that’s not all simultaneous multi-projection does. A technique called “lens-matched shading”—the part that builds on Maxwell’s multi-res shading—pre-distorts output images to match the warped, curved lenses on VR headsets, rendering the edges of the scene at lower resolution rather than rendering them at full fidelity and throwing all that work away. Like SMP’s single-pass stereo, the idea is to render only the parts of the image that will actually be seen by the user in order to improve efficiency.
Interestingly, lens-matched shading can also be used to improve overall frame rates even on traditional single-display setups. In a single screen demo of Obduction, Cyan Worlds’s upcoming spiritual successor to Myst, frame rates hovered around 42fps in a particular scene with SMP disabled at 4K resolution. Activating SMP caused frame rates to leap to the 60fps maximum supported by the display, and you could only notice the reduced pixel fidelity at the edges of the display if you were standing still and actively looking for blemishes.
Simultaneous multi-projection is fascinating, potentially portentous stuff—and that’s why it’s a major bummer that developers have to explicitly add support for it, and it works only on GeForce cards running on Pascal GPUs. It’s a killer selling point for the GTX 1080, but whether games will support a feature that excludes every graphics card sold up until today is a big question mark.
Ansel: The supercharged future of screenshots
Speaking of super cool features limited to Nvidia’s new graphics cards, there’s Ansel, which Nvidia calls “an in-game 3D camera” and I call the supercharged future of screenshots.
Rather than simply capturing a 2D image like Steam’s F12 functionality, Ansel lets you pause a game, then freely roam the environment with a floating camera (though developers will be able to disable free roaming in their games if desired). You’re able to apply several filters and effects to the scene using easy-to-use tools, as shown in the image below, as well as crank the resolution to ludicrous levels. Nvidia plans to release more filters as time goes on, plus a post-processing shader API so developers can create custom filters.
In a demo of Ansel running on The Witness, for example, I was able to jack the resolution to a whopping 61,440×34,560. Out of the box, the tool can support up to 4.5-gigapixel images.
Creating a masterpiece like that takes Ansel several minutes to stitch together files of considerably large size, however. Ansel snaps up to 3,600 smaller images to capture the entire scene—including 360-degree pictures that can be viewed in a VR headset or even Google Cardboard—and processes them with CUDA-based stitching technology to create a clean, final picture that doesn’t need any additional lighting or tone-mapping tweaks. It’s also capable of capturing RAW or EXR files from games, if you feel like tinkering around in HDR.
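For a sense of scale, here’s the arithmetic on that 61,440×34,560 capture. The only outside number is the 3,840×2,160 resolution I’m using to stand in for “4K.”

```python
width, height = 61_440, 34_560
pixels = width * height
gigapixels = pixels / 1e9

# Assumption: "4K" means a 3,840x2,160 display.
uhd_width, uhd_height = 3840, 2160
scale_per_axis = width // uhd_width
uhd_frames = pixels // (uhd_width * uhd_height)

print(pixels, round(gigapixels, 2), scale_per_axis, uhd_frames)
```

That works out to about 2.1 gigapixels, or 16 times the resolution of a 4K screen along each axis, so a single Ansel shot at that setting holds as many pixels as 256 4K frames.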
Ansel’s a driver-level tool, and games will need to explicitly code in support for it. On the plus side, doing so takes minimal effort—Nvidia says The Witness’s Ansel support required 40 lines of code, while Witcher 3’s integration took 150 lines. The company also plans to offer Ansel for Maxwell-based GeForce 700- and 900-series graphics cards. Look for The Division, The Witness, Lawbreakers, Witcher 3, Paragon, Unreal Tournament, Obduction, No Man’s Sky, and Fortnite to roll out Ansel support in the coming months.
How Fast Sync fixes latency and tearing
The GeForce GTX 1080 has a big problem: It’s almost too powerful, at least for the popular e-sports titles with modest visual demands. Running Counter-Strike: Global Offensive, League of Legends, or Dota 2 on a modern high-end graphics card can mean your hardware’s pumping out hundreds of frames per second, blowing away the refresh rates of most monitors.
That puts gamers in a pickle. The disparity between the monitor’s refresh rate and the extreme frame output can create screen tearing, a nasty artifact introduced when your monitor’s showing results from numerous frames at once. But enabling V-sync to fix the issue adds high latency to the game as it essentially tells the entire engine to slow down, and high latency in the fast-paced world of e-sports can put you at a serious competitive disadvantage.
The new Fast Sync option in the GTX 1080 aims to solve both problems by separating the rendering and display stages of the graphics process. Because V-sync isn’t enabled, the game engine spits out frames at full speed, which prevents latency issues, and the graphics card uses flip logic to determine which frames to scan to the display in full, eliminating screen tearing.
Some excess frames will be cast aside to maintain smooth frame pacing, Nvidia’s Tom Peterson says, but remember that Fast Sync’s made for games where the frame rendering rate output far exceeds the refresh rate of your monitor. In fact, enabling Fast Sync in games with standard frame rates could theoretically introduce stuttering. So yeah, don’t do that.
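The behavior described above can be approximated with a small simulation. This is a simplified model, not Nvidia’s flip logic: it assumes frames finish at perfectly regular intervals and that each refresh scans out the most recently completed frame.

```python
from fractions import Fraction


def simulate_fast_sync(render_fps: int, refresh_hz: int, seconds: int = 1):
    """Return (frames shown, frames dropped, refreshes that repeated a frame)."""
    shown = []
    repeats = 0
    for k in range(1, refresh_hz * seconds + 1):
        scanout_time = Fraction(k, refresh_hz)
        # Number of frames finished by this refresh; the newest one is displayed.
        latest = int(scanout_time * render_fps)
        if latest > 0 and (not shown or latest != shown[-1]):
            shown.append(latest)
        else:
            repeats += 1  # nothing new was ready, so the old image stays up
    dropped = render_fps * seconds - len(shown)
    return shown, dropped, repeats


shown, dropped, repeats = simulate_fast_sync(300, 60)
print(len(shown), dropped, repeats)

slow_shown, slow_dropped, slow_repeats = simulate_fast_sync(45, 60)
print(len(slow_shown), slow_dropped, slow_repeats)
```

At 300fps on a 60Hz display, every refresh gets a fresh, complete frame and the other 240 are discarded. At 45fps, nothing is dropped but a quarter of the refreshes repeat the previous frame at uneven intervals, which is the kind of stutter the text warns about.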
The results seem impressive. Here are Nvidia-supplied latency measurements tested with CS:GO.
Look for Fast Sync to expand beyond Pascal-based graphics cards in the future. “Expect [GPU support] to be fairly broad,” says Peterson.
GPU Boost 3.0
Nvidia’s rolling out a potentially killer new overclocking addition in the GTX 1080, dubbed GPU Boost 3.0.
The previous methods of overclocking are still supported, but GPU Boost 3.0 adds the ability to customize clock frequency offsets for individual voltage points in order to eke out every tiny little bit of overclocking headroom, rather than forcing you to use the same clock speed offset across the board. Overclocking tools will scan for your GPU’s theoretical maximum clock at numerous voltage points, then apply a custom V/F curve to match your specific card’s capabilities. It takes all the guesswork out of overclocking, letting you crank performance to 11 with minimal hassle.
Nvidia supplied reviewers with an early, mildly janky copy of a new EVGA Precision X build that supports GPU Boost 3.0, and finding then pushing your card’s limits proved pretty straightforward. Settings let you choose the minimum and maximum clock speed offset to test, as well as the “step” value, or how much the clock frequency increases from one offset to the next. After my card repeatedly crashed with Precision X’s normal OC scanner settings, decreasing the step value increase from 12.5MHz to 5MHz calmed things down—but also caused the scan session to become abominably slow.
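The scanning process can be pictured as a simple search at each voltage point. Everything in this sketch is hypothetical, including the voltage points, the per-point headroom, and the `is_stable` check; a real scanner stress-tests the card instead of consulting a table, and this is not how Precision X is actually implemented.

```python
def scan_offsets(voltage_points, is_stable, min_offset, max_offset, step):
    """Raise the clock offset at each voltage point until the card stops being stable."""
    curve = {}
    for volts in voltage_points:
        best = 0.0
        offset = min_offset
        while offset <= max_offset and is_stable(volts, offset):
            best = offset
            offset += step
        curve[volts] = best
    return curve


# Hypothetical headroom (MHz) at each voltage point for one particular card.
HEADROOM_MHZ = {0.80: 210, 0.90: 180, 1.00: 150, 1.05: 130}


def is_stable(volts, offset_mhz):
    """Stand-in for a real stress test."""
    return offset_mhz <= HEADROOM_MHZ[volts]


curve = scan_offsets(HEADROOM_MHZ, is_stable, min_offset=0.0, max_offset=250.0, step=12.5)
single_offset = min(curve.values())  # the old way: one offset for every point

print(curve)
print(single_offset)
```

With a single across-the-board offset, the whole card is held to its weakest voltage point. The per-point curve recovers the extra headroom elsewhere, and a smaller step value gets closer to each limit at the cost of many more test runs.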
If I’d had time to let it run in full, I would’ve been left with a highly granular overclocking profile specific to my individual GPU. But because the tool landed in my hands late in the testing process, I went the manual route, overclocking the GPU by hand with a copy of the Unigine Heaven benchmark. I’ll share the final results in the performance section.
HDR and DRM support
The GeForce GTX 1080 continues Nvidia’s tradition of supporting technology built for home theater PCs. After the GTX 960 and 950 became the first major graphics cards to support HDCP 2.2 for copyrighted 4K videos over HDMI, the GTX 1080 embraces high dynamic range video technology, a.k.a. HDR. HDR displays boost brightness to create more range between darkness and light. As simple as it sounds, the improvement in visual quality is borderline startling—I think the difference between HDR and non-HDR displays is much more impressive than the leap from 1080p resolution to 4K displays. AMD’s Polaris GPUs will also support HDR.
Pascal GPUs support HDR gaming, as well as HEVC HDR video encoding and decoding. Pairing the GTX 1080 (and its HEVC 10b encoding abilities) with an Nvidia Shield Android TV console (and its HEVC 10b decoding abilities) enables another nifty trick: GameStream HDR. Basically, you can stream a PC game from your Pascal GPU-equipped computer to your TV via the Nvidia Shield, and because both devices support HDR, those deep, deep blacks and vibrant colors will appear on your television screen just fine. It’s a smart way for Nvidia to leverage its ecosystem and skirt around the fact that HDR display support is limited to traditional televisions right now, though it won’t roll out until later this summer.
Currently, Obduction, The Witness, Lawbreakers, Rise of the Tomb Raider, Paragon, The Talos Principle, and Shadow Warrior 2 are the only games with pledged HDR support, though you can expect more titles to embrace the technology as hardware support for it becomes more widespread.
Pascal GPUs are also certified for Microsoft’s PlayReady 3.0, which allows protected 4K videos to be played on PCs. Presumably thanks to that, Pascal-based graphics cards will be able to stream 4K content from Netflix at some point later this year. Embracing 4K video on the PC means embracing Windows 10 and DRM as well, it seems.
To push out all those fancy new videos, the GTX 1080 packs a single HDMI 2.0b connection, a single dual-link DVI-D connector, and three full-sized DisplayPorts that are DP 1.2 certified, but ready for DP 1.3 and 1.4. That readiness enables support for 4K monitors running at 120Hz, 5K displays at 60Hz, and even 8K displays at 60Hz—though you’ll need a pair of cables to run that last scenario.
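A back-of-the-envelope calculation shows why 8K at 60Hz needs two cables. The sketch assumes 24 bits per pixel, ignores blanking intervals and any stream compression, and uses a 25.92Gbps payload rate for a DisplayPort 1.3-class link (four lanes at 8.1Gbps, minus 8b/10b encoding overhead), a figure that isn’t in the text above.

```python
import math

# Assumption (not from the review): usable payload of a DP 1.3-class link.
DP13_PAYLOAD_GBPS = 25.92


def video_gbps(width: int, height: int, refresh_hz: int, bits_per_pixel: int = 24) -> float:
    """Uncompressed pixel data rate, ignoring blanking overhead."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def cables_needed(gbps: float, link_gbps: float = DP13_PAYLOAD_GBPS) -> int:
    return math.ceil(gbps / link_gbps)


modes = {
    "4K @ 120Hz": (3840, 2160, 120),
    "5K @ 60Hz": (5120, 2880, 60),
    "8K @ 60Hz": (7680, 4320, 60),
}
for name, mode in modes.items():
    rate = video_gbps(*mode)
    print(name, round(rate, 2), cables_needed(rate))
```

The first two modes squeeze into a single link under these assumptions, while 8K at 60Hz needs nearly twice what one cable can carry.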
High-bandwidth SLI bridges
Nvidia’s making some big changes to the way it handles multi-GPU support in the wake of DirectX 12. Starting with the GTX 1080, Nvidia will offer rigidly constructed high-bandwidth bridges dubbed SLI HB, which occupy not one, but both SLI connectors on the graphics card to handle the heavy flow of information between the cards.
To match that design—and presumably to cut engineering costs on 3- and 4-way configurations that few people use—Nvidia’s graphics cards will officially support only 2-way SLI going forward, though 3- and 4-way configurations will be unofficially supported with help from an Nvidia tool you’ll have to download separately.
The SLI changes don’t matter in this review, as we have only a single Nvidia GeForce GTX 1080 Founders Edition to test. Confusion reigned in the wake of the Founders Edition’s hazy reveal, but in a nutshell: It’s what Nvidia used to call its reference design. There’s no hefty overclock or cherry-picked GPUs whatsoever. Here’s the twist: While the MSRP for the GTX 1080 is $600, the Founders Edition costs $700.
While there’s no doubt a bit of an early adopter’s fee going on here—the Founders Edition is the only GTX 1080 guaranteed by Nvidia to be available on May 27—the pricing isn’t as crazy as it seems at first blush.
Nvidia’s recent GeForce reference cards are marvels of premium engineering. The GTX 1080 continues that trend, with an angular die-cast aluminum body, vapor chamber cooling that blows air out of the rear of your machine, a low-profile backplate (with a section that can be removed for improved airflow in SLI setups), and new under-the-hood niceties like 5-phase dual-FET design and tighter electrical design. It screams “premium” and oozes quality, and the polygon-like design of the metal shroud is more attractive—and subtle—than early leaks indicated it would be.
But previous-gen Nvidia reference cards were paragons sold at a loss only during the first few weeks around launch in order to kickstart adoption of new GPUs. Nvidia plans to sell its Founders Edition cards for the GTX 1080’s lifetime. That lets Nvidia faithful buy directly from the company and allows boutique system sellers to certify a single GTX 1080 model for their PCs over the lifetime of the card, rather than worrying about the ever-changing specifications in product lineups from Nvidia partners like EVGA, Asus, MSI, and Zotac. In fact, Falcon Northwest owner Kelt Reeves told HardOCP that he actively lobbied Nvidia to create these cards for just that reason.
You’ll probably be able to find graphics cards from those board partners rocking hefty overclocks, additional power connectors, and custom cooling setups for the same $700 price as Nvidia’s Founders Edition once the GPU starts rolling out en masse. In other words, the Founders Edition probably won’t be a worthwhile purchase going forward if sheer price-to-performance is your major concern. But that $100 premium is steep enough to keep EVGA and its ilk from getting pissed about the newfound direct competition from Nvidia, while still allowing Nvidia to satisfy system builders.
But enough chit-chat. It’s time to see how badass this beast really is.
Along with upgrading the test system to Windows 10, we also updated our list of benchmarking games with a healthy mix of AMD Gaming Evolved and Nvidia-centric titles, which we’ll get into as we dive into performance results.
To see what the GTX 1080 Founders Edition is truly made of, we compared it against the reference $500 GTX 980 and $460 MSI Radeon 390X Gaming 8GB, and also the $650 Radeon Fury X and $1,000 Titan X. Because the GTX 980 Ti’s performance closely mirrors the Titan X’s—and Nvidia made a point to repeatedly compare the GTX 1080 against the Titan X—we didn’t test that card. Sadly, AMD never sent us a Radeon Pro Duo to test, so we can’t tell you whether the GTX 1080 or AMD’s dual-Fiji beast is the single most powerful graphics card in the world.
But the GTX 1080 is easily, hands-down, no-comparison the most powerful single-GPU graphics card ever released—especially when you overclock it.
Ignoring the auto-overclocking tools in the Precision X beta, I was able to manually boost the core clock speed by 250MHz and the memory clock speed by an additional 100MHz. Depending on what was going on in a given game’s given scene, and how Nvidia’s GPU Boost technology reacted to it, doing so resulted in clock speeds ranging from 1,873MHz to 2,088MHz. Yes, that’s clock speeds in excess of 2GHz on air, with no voltage tweaks.
In other words: Buckle up. This is going to be a wild ride.
Next page: The Division and Hitman benchmarks
First up: Ubisoft’s The Division, a third-person shooter/RPG that mixes elements of Destiny and Gears of War. The game’s set in a gorgeous and gritty recreation of post-apocalyptic New York, running on Ubisoft’s new Snowdrop engine. Despite incorporating Nvidia Gameworks features (which we disabled during benchmarking to even the playing field), the game scales well across all hardware and isn’t part of Nvidia’s “The Way It’s Meant to be Played” lineup. In fact, it tends to perform better on Radeon cards.
Until the GTX 1080 enters the fray.
As you can see, the reference GTX 1080 offers a whopping 71-percent performance increase over the GTX 980 at 4K resolution and Ultra graphics settings. The GTX 1080 is designed as a generational replacement for the GTX 980, remember—not the Titan X. That said, the GTX 1080 outpunches the Titan X by 34.7 percent at 4K, and 24 percent at 1440p resolution, despite costing $400 less than Nvidia’s flagship.
Hitman’s Glacier engine is heavily optimized for AMD hardware, and in this glorious murder-simulating sandbox Radeon cards significantly outpunch their GeForce GTX 900-series counterparts, especially at higher resolutions. Because of that, while the GTX 1080 offers a significant performance leap over the GTX 980 (72.7 percent at 4K) and Titan X (33.8 percent), the performance gain over the Fury X is much more modest (8.8 percent) with all settings cranked to Ultra and FXAA enabled. That drives home how important in-engine support for a particular graphics architecture can be.
Note that these results are using DirectX 11. Hitman theoretically supports DirectX 12, but a recent update broke it, and the game refused to launch in DX12 on both PCWorld’s GPU testbed and my personal gaming rig despite ostensibly being fixed. Alas.
Rise of the Tomb Raider
The gorgeous Rise of the Tomb Raider scales fairly well across all GPU hardware, though it clearly prefers the Titan X to the Fury X once you reach the upper echelon of graphics cards. But that doesn’t really matter, because the performance gains with the GTX 1080 are insane—especially once you overclock it.
The GTX 1080 pushes 70.5 percent more frames than the GTX 980 at 4K resolution on the Very High graphics setting (Nvidia’s HBAO+ and AMD’s PureHair technology disabled). The gap increases to a whopping 94.5 percent after overclocking. That’s damn near twice the performance.
Wow. Just wow.
The performance increase over the Titan X is a more modest 29 percent, but that leaps to a full 47 percent overclocked. The Fury and Fury X’s defeat here is likely due to HBM’s 4GB memory capacity, as the game specifically warns that enabling Very High textures can cause problems on cards with 4GB or less of memory.
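Those percentages can be cross-checked against each other. If two cards are both measured against the same baseline, dividing one ratio by the other gives their relative standing; the helper below is my own, and the inputs are the 4K figures quoted above.

```python
def relative_gain(gain_a_pct: float, gain_b_pct: float) -> float:
    """If A beats a baseline by gain_a and B beats it by gain_b, how much does A beat B?"""
    return ((1 + gain_a_pct / 100) / (1 + gain_b_pct / 100) - 1) * 100


# Overclocked vs. stock GTX 1080, derived two ways.
oc_gain_via_980 = relative_gain(94.5, 70.5)   # using the GTX 980 as the baseline
oc_gain_via_titan = relative_gain(47, 29)     # using the Titan X as the baseline

# Implied Titan X lead over the GTX 980 in this test.
titan_over_980 = relative_gain(70.5, 29)

print(round(oc_gain_via_980, 1), round(oc_gain_via_titan, 1), round(titan_over_980, 1))
```

Both routes put the overclock at roughly 14 percent faster than stock, so the quoted numbers hang together, and they imply the Titan X runs this test about 32 percent faster than the GTX 980.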
We didn’t include performance results from RoTR’s DirectX 12 mode here because running it actually causes average frame rates to drop across the board, but it’s important to note that DX12 also caused minimum frame rates to increase by double-digits across the board. That means playing the game in DX12 results in fewer frames, but less stutter.
Far Cry Primal
Far Cry Primal is another Ubisoft game, but running on the latest version of the long-respected Dunia engine that’s been underpinning the series for years now. We tested these GPUs with the game’s free 4K HD texture pack enabled.
It scales well, though the Fiji GPU’s super-fast high-bandwidth memory gives AMD’s Fury cards an edge at higher resolutions. The tables turn at lower resolutions. At 4K/Ultra, the GTX 1080 offers a 78-percent performance increase over the GTX 980, and a 33-percent increase over the Titan X and Fury X.
Ashes of the Singularity and DX12
We were hoping to test the GTX 1080’s DirectX 12 performance in several games, but Hitman and Rise of the Tomb Raider’s DX12 implementations left us wanting for the reasons previously discussed. Windows Store-only DirectX 12 games aren’t really usable for benchmarking due to the inherent limitations of Windows Store apps. That left us with a single DX12 game to test: Ashes of the Singularity, running on Oxide’s custom Nitrous engine.
AoTS was an early flag-bearer for DirectX 12, and the performance gains AoTS offers in DX12 over DX11 are mind-blowing, at least for AMD cards. AoTS’s DX12 implementation makes heavy use of asynchronous compute features, which are supported by dedicated hardware in Radeon GPUs, but not GTX 900-series Nvidia cards. In fact, the software pre-emption workaround that Maxwell-based Nvidia cards use to mimic async compute capabilities tanks performance so hard that Oxide’s game is coded to ignore async compute when it detects a GeForce GPU.
That creates some interesting takeaways in performance benchmarks. Maxwell-based Nvidia GPUs actually perform worse in DirectX 12 mode, while AMD’s Radeon cards see massive performance gains with DX12 enabled—to the point that the Fury X in DX12 is able to essentially equal and sometimes even outpunch the reference GTX 1080’s baseline DX11 results, even though the GTX 1080 clobbers the Fury X’s DX11 results. That’s a big win for AMD.
That said, once you take the pedal off the metal and look at results below 4K/crazy, the GTX 1080 starts to see decent performance increases when moving from DX11 to DX12, though it never nears the mammoth leaps that Radeon graphics cards enjoy. At 1440p/high settings, shifting to DirectX 12 gives the GTX 1080 a 20.3-percent performance leap. Therefore, even though AoTS explicitly disables basic async compute in Nvidia cards, the new async compute enhancements Nvidia’s built into Pascal can indeed provide tangible benefits in DX12 games with heavy async compute utilization.
Looking directly at Nvidia-to-Nvidia performance, the GTX 1080 provides frame rate increases similar to what we’ve seen in other games: Roughly 72 percent more performance than the GTX 980, and 35 to 40 percent over the Titan X.
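Those two percentages also imply how far apart the older cards sit from each other. The sketch below is back-of-the-envelope arithmetic on the rounded figures quoted above, not a separate measurement, and the helper names are made up for illustration.

```python
def uplift_to_multiplier(percent_gain):
    """Convert 'X percent more performance' into a speed multiplier."""
    return 1.0 + percent_gain / 100.0


def implied_gap_percent(gain_over_slower, gain_over_faster):
    """Percent lead of the faster older card over the slower older card,
    given one new card's gain over each of them."""
    ratio = (uplift_to_multiplier(gain_over_slower)
             / uplift_to_multiplier(gain_over_faster))
    return (ratio - 1.0) * 100.0


# Figures from the text: roughly 72 percent over the GTX 980,
# and 35 to 40 percent over the Titan X.
low = implied_gap_percent(72, 40)
high = implied_gap_percent(72, 35)
print(f"Implied Titan X lead over GTX 980: {low:.0f} to {high:.0f} percent")
```

Note that percentages don’t subtract cleanly: the implied Titan X lead over the GTX 980 comes from dividing the multipliers (1.72 by 1.35 or 1.40), not from taking 72 minus 35 or 40.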
SteamVR benchmark and virtual reality
The biggest bummer for me in this review is that VR benchmarks haven’t been able to keep up with graphics technology.
Nvidia’s biggest claim to performance fame with the GTX 1080 lies in virtual reality. While the traditional performance gains are sizeable enough, Nvidia’s loftiest performance claims (faster than two GTX 980s in SLI! 2.7x performance increases!) are firmly tied to VR games that make full use of Nvidia software like simultaneous multi-projection. Unfortunately, the granular VR benchmark tools coming from Crytek and Basemark haven’t hit the streets, and no released VR games support the GTX 1080’s new software features yet. That leaves us with no way to quantify the GTX 1080’s potential performance increase over the competition except for the SteamVR benchmark, which is better for determining whether your rig is capable of VR than for direct head-to-head GPU comparisons.
Oh well. In any case, the GTX 1080 is the first graphics card to ever push the PCWorld graphics testbed all the way to 11 in the SteamVR benchmark—though the Titan X and GTX 980 Ti came damn close before.
3DMark Fire Strike and Fire Strike Ultra
We also tested the GTX 1080 using 3DMark’s highly respected Fire Strike and Fire Strike Ultra synthetic benchmarks. Fire Strike runs at 1080p, while Fire Strike Ultra renders the same scene, but with more intense effects, at 4K resolution.
No surprise here after seeing the in-game results: Nvidia’s new card absolutely whomps on all comers, becoming the first graphics card to crack the 5000 score barrier in Fire Strike Ultra. Heck, the Fury X is the only other card to even break 4000, and even then just barely.
Power and heat
Finally, let’s take a look at the GTX 1080’s power and thermal results.
All of AMD’s recent graphics cards use vastly more power than their Nvidia counterparts, full stop. The GTX 1080 only uses 20 watts more power under extreme load than the GTX 980, and considerably less power than a hot-and-bothered Titan X, even though it pushes out significantly more performance. The Pascal architecture is indeed incredibly power-efficient, in other words.
Power is measured by plugging the entire system into a Watts Up meter, then running a stress test with Furmark for 15 minutes. It’s basically a worst-case scenario, pushing graphics cards to their limits.
On the flip side of the coin, the GTX 1080 runs slightly hotter than the GTX 980 under a Furmark load, hitting temperatures on a par with the Titan X. (The water-cooled Fury X is the outlier in the results.) Furmark truly is a worst-case scenario here, though: temperatures during actual gameplay tended to hover around 70 to 75 degrees Celsius, and the cooling systems on reference cards are never as efficient as custom third-party solutions. Look for temperatures to drop significantly in custom versions of the card.
A new king
So we’re back to where we started: Nvidia’s first long-awaited Pascal-based graphics card truly is a beast in every sense of the word, smashing performance records while veritably sipping on power.
The leap over the GTX 980 is nothing short of insane. While the GTX 980 delivered frame rates roughly 18 to 35 percent higher than its direct predecessor, the GTX 780, the new GeForce GTX 1080 surges ahead by a whopping 70-plus percent in every game tested. That’s crazy. The entire time I was testing this monster, I felt like David Bowman at the end of 2001: A Space Odyssey, staring wide-eyed into a new world full of stars. Moving on from 28nm GPUs is every bit as wonderful as gamers had hoped, and the GTX 1080 is everything Nvidia promised and more.
Hail the conquering hero, indeed.
Nvidia’s powerhouse isn’t quite capable of hitting 60fps at 4K in every game, as the results from The Division and Far Cry Primal show, but it’s awfully close, especially if you invest in a G-Sync monitor to smooth out sub-60fps gameplay. And it’s worth noting that game engine optimization can play a big role in potential performance, as the disparity in Hitman and AoTS results between AMD and Nvidia cards clearly shows. Regardless, the GTX 1080 annihilates everything you throw at it.
Don’t rush to bend your knee to the freshly crowned champion just yet, though. Be patient. Give it a few weeks.
AMD promised that its Polaris GPUs would show up in the middle of the year, and numerous leaks hint that its big unveiling will come at Computex in early June. Every indication from the company seems to suggest that its initial Polaris salvo will target more mainstream prices rather than starting from the top like the GTX 1080, but who knows? AMD’s Radeon cards are leaping to 14nm FinFET technology as well, and if Team Red has something up its sleeve, things could get very interesting, very quickly. That goes doubly so if Radeon cards continue to hold a commanding lead over Nvidia in DX12 async compute performance.
Waiting a few weeks will also likely give custom cards from Nvidia’s board partners time to trickle into the market, and who knows how high those beasts will be able to fly? Plus, the GTX 1070 is scheduled to hit the streets on June 10 with what Nvidia claims is Titan X-level performance—and at $370, it’s almost half the price of the GTX 1080 itself. Considering that the GTX 1080 delivers roughly a third more performance than the Titan X, the GTX 1070 may assume the GTX 970’s position as the enthusiast price/performance sweet spot in Nvidia’s lineup. Patience is a virtue!
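For a rough sense of that sweet spot, here is a quick sketch using only numbers already quoted in this review. The relative-performance values are assumptions: the GTX 1070 figure rests on Nvidia’s unverified Titan X-level claim, and "roughly a third more" is taken as exactly 4/3.

```python
# Prices quoted in this review (US dollars).
GTX_1080_MSRP = 599
GTX_1080_FOUNDERS = 699
GTX_1070_PRICE = 370

# Relative performance with the Titan X as 1.0 (assumptions, see above).
GTX_1070_RELATIVE_PERF = 1.0    # Nvidia's claim, not yet tested
GTX_1080_RELATIVE_PERF = 4 / 3  # "roughly a third more" than the Titan X


def price_ratio(price, reference_price):
    """Fraction of the reference price that `price` represents."""
    return price / reference_price


def perf_per_dollar(relative_perf, price):
    """Relative performance bought per dollar spent."""
    return relative_perf / price


value_1070 = perf_per_dollar(GTX_1070_RELATIVE_PERF, GTX_1070_PRICE)
value_1080 = perf_per_dollar(GTX_1080_RELATIVE_PERF, GTX_1080_MSRP)

print(f"1070 vs 1080 MSRP price: {price_ratio(GTX_1070_PRICE, GTX_1080_MSRP):.0%}")
print(f"1070 vs 1080 Founders price: {price_ratio(GTX_1070_PRICE, GTX_1080_FOUNDERS):.0%}")
print(f"1070 value edge over 1080 MSRP: {value_1070 / value_1080:.2f}x")
```

On these assumptions, the GTX 1070 costs about 62 percent of the $599 GTX 1080 and about 53 percent of the $699 Founders Edition, so "almost half" fits the Founders Edition comparison better. Either way it would deliver more performance per dollar, if Nvidia’s claim holds up in testing.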
Similarly, I’d counsel GTX 980 Ti and Titan X owners to sit tight before bolting out the door for a GTX 1080. Sure, Nvidia’s new card outpunches the 900-series heavyweights, but it doesn’t render them obsolete—and can you imagine how mind-blowing a full-fat GTX 1080 Ti or Pascal-based Titan would be, if the GTX 1080 can do this?
Don’t feel bad if you can’t resist the urge to grab a GTX 1080, though. A Pascal-based Titan would be unlikely to launch before spring 2017, and Nvidia’s new flagship is hands-down the new paragon for graphics cards. The GTX 1080 is badass incarnate.