NVIDIA is releasing the GeForce GTX 1080 Ti, a $699 GPU with performance and specifications similar to that of the NVIDIA Titan X:
Unveiled last week at GDC and launching [March 10th] is the GeForce GTX 1080 Ti. Based on NVIDIA's GP102 GPU – aka Bigger Pascal – the job of GTX 1080 Ti is to serve as a mid-cycle refresh of the GeForce 10 series. Like the GTX 980 Ti and GTX 780 Ti before it, that means taking advantage of improved manufacturing yields and reduced costs to push out a bigger, more powerful GPU to drive this year's flagship video card. And, for NVIDIA and their well-executed dominance of the high-end video card market, it's a chance to run up the score even more. With the GTX 1080 Ti, NVIDIA is aiming for what they're calling their greatest performance jump yet for a modern Ti product – around 35% on average. This would translate into a sizable upgrade for GeForce GTX 980 Ti owners and others for whom GTX 1080 wasn't the card they were looking for.
[...] Going by the numbers then, the GTX 1080 Ti offers just over 11.3 TFLOPS of FP32 performance. This puts the expected shader/texture performance of the card 28% ahead of the current GTX 1080, while the ROP throughput advantage stands 26%, and memory bandwidth at a much greater 51.2%. Real-world performance will of course be influenced by a blend of these factors, with memory bandwidth being the real wildcard. Otherwise, relative to the NVIDIA Titan X, the two cards should end up quite close, trading blows now and then.
Speaking of the Titan, on an interesting side note, NVIDIA isn't going to be doing anything to hurt the compute performance of the GTX 1080 Ti to differentiate the card from the Titan, which has proven popular with GPU compute customers. Crucially, this means that the GTX 1080 Ti gets the same 4:1 INT8 performance ratio of the Titan, which is critical to the cards' high neural networking inference performance. As a result the GTX 1080 Ti actually has slightly greater compute performance (on paper) than the Titan. And NVIDIA has been surprisingly candid in admitting that unless compute customers need the last 1GB of VRAM offered by the Titan, they're likely going to buy the GTX 1080 Ti instead.
The card includes 11 GB of Micron's second-generation GDDR5X memory operating at 11 Gbps compared to 12 GB of GDDR5X at 10 Gbps for the Titan X.
Previously: GDDR5X Standard Finalized by JEDEC
Nvidia Announces Tesla P100, the First Pascal GPU
Nvidia Unveils GTX 1080 and 1070 "Pascal" GPUs
Related Stories
JEDEC has finalized the GDDR5X SGRAM specification:
The new technology is designed to improve bandwidth available to high-performance graphics processing units without fundamentally changing the memory architecture of graphics cards or memory technology itself, similar to other generations of GDDR, although these new specifications are arguably pushing the phyiscal[sic] limits of the technology and hardware in its current form. The GDDR5X SGRAM (synchronous graphics random access memory) standard is based on the GDDR5 technology introduced in 2007 and first used in 2008. The GDDR5X standard brings three key improvements to the well-established GDDR5: it increases data-rates by up to a factor of two, it improves energy efficiency of high-end memory, and it defines new capacities of memory chips to enable denser memory configurations of add-in graphics boards or other devices. What is very important for developers of chips and makers of graphics cards is that the GDDR5X should not require drastic changes to designs of graphics cards, and the general feature-set of GDDR5 remains unchanged (and hence why it is not being called GDDR6).
[...] The key improvement of the GDDR5X standard compared to the predecessor is its all-new 16n prefetch architecture, which enables up to 512 bit (64 Bytes) per array read or write access. By contrast, the GDDR5 technology features 8n prefetch architecture and can read or write up to 256 bit (32 Bytes) of data per cycle. Doubled prefetch and increased data transfer rates are expected to double effective memory bandwidth of GDDR5X sub-systems. However, actual performance of graphics cards will depend not just on DRAM architecture and frequencies, but also on memory controllers and applications. Therefore, we will need to test actual hardware to find out actual real-world benefits of the new memory.
What purpose does GDDR5X serve if superior 1st and 2nd generation High Bandwidth Memory (HBM) are around? GDDR5X memory will be cheaper than HBM and its use is more of an evolutionary than revolutionary change from existing GDDR5-based hardware.
During a keynote at GTC 2016, Nvidia announced the Tesla P100, a 16nm FinFET Pascal graphics processing unit with 15.3 billion transistors intended for high performance and cloud computing customers. The GPU includes 16 GB of High Bandwidth Memory 2.0 with 720 GB/s of memory bandwidth and a unified memory architecture. It also uses the proprietary NVLink, an interconnect with 160 GB/s of bandwidth, rather than the slower PCI-Express.
Nvidia claims the Tesla P100 will reach 5.3 teraflops of FP64 (double precision) performance, along with 10.6 teraflops of FP32 and 21.2 teraflops of FP16. 3584 of a maximum possible 3840 stream processors are enabled on this version of the GP100 die.
At the keynote, Nvidia also announced a 170 teraflops (FP16) "deep learning supercomputer" or "datacenter in a box" called DGX-1. It contains eight Tesla P100s and will cost $129,000. The first units will be going to research institutions such as the Massachusetts General Hospital.
Nvidia revealed key details about its upcoming "Pascal" consumer GPUs at a May 6th event. These GPUs are built using a 16nm FinFET process from TSMC rather than the 28nm processes that were used for several previous generations of both Nvidia and AMD GPUs.
The GeForce GTX 1080 will outperform the GTX 980, GTX 980 Ti, and Titan X cards. Nvidia claims that GTX 1080 can reach 9 teraflops of single precision performance, while the GTX 1070 will reach 6.5 teraflops. A single GTX 1080 will be faster than two GTX 980s in SLI.
Both the GTX 1080 and 1070 will feature 8 GB of VRAM. Unfortunately, neither card contains High Bandwidth Memory 2.0 like the Tesla P100 does. Instead, the GTX 1080 has GDDR5X memory while the 1070 is sticking with GDDR5.
The GTX 1080 starts at $599 and is available on May 27th. The GTX 1070 starts at $379 on June 10th. Your move, AMD.
Nvidia has announced the Titan V, a $3,000 Volta-based flagship GPU capable of around 15 teraflops single-precision and 110 teraflops of "tensor performance (deep learning)". It has slightly greater performance but less VRAM than the Tesla V100, a $10,000 GPU aimed at professional users.
Would you consider it a card for "consumers"?
It seems like Nvidia announces the fastest GPU in history multiple times a year, and that's exactly what's happened again today; the Titan V is "the most powerful PC GPU ever created," in Nvidia's words. It represents a more significant leap than most products that have made that claim, however, as it's the first consumer-grade GPU based around Nvidia's new Volta architecture.
That said, a liberal definition of the word "consumer" is in order here — the Titan V sells for $2,999 and is focused around AI and scientific simulation processing. Nvidia claims 110 teraflops of performance from its 21.1 billion transistors, with 12GB of HBM2 memory, 5120 CUDA cores, and 640 "tensor cores" that are said to offer up to 9x the deep-learning performance of its predecessor.
Previously: Nvidia Releases the GeForce GTX 1080 Ti: 11.3 TFLOPS of FP32 Performance
More Extreme in Every Way: The New Titan Is Here – NVIDIA TITAN Xp
(Score: -1, Spam) by Anonymous Coward on Saturday March 11 2017, @03:43AM
Thug niggas.
(Score: 3, Interesting) by martyb on Saturday March 11 2017, @11:44AM (2 children)
As a bit of a gray-going-on-white beard, and having followed the computer industry since the early 1970's I'm curious how this graphics card's performance compares to a Cray-1. I'm very much short on time so can't look it up myself; I would appreciate it if someone could check the TOP-500 web site and do the math.
At the rate we have been going with cell phone performance, I bet the old tongue-in-cheek chestnut "Is that a Cray in your pocket?" becoming a reality! Or has that happened already?
Wit is intellect, dancing.
(Score: 3, Informative) by opinionated_science on Saturday March 11 2017, @01:57PM (1 child)
the FP64 is gimped on these devices (1/32).
They do make the P100(?) which is 1/2, like all other major chips.
https://www.top500.org/system/167094 [top500.org]
There y'go, 1.41 Gflops. This card is perhaps 300Gflops (DP), so 2 orders of magnitude...;-)
What is more impressive is the power (250W?). I imagine the major research on the Cray machines was keeping them cool!!!
The thing is as a molecular modeller , I'm still waiting for the next revision... the threshold for having sufficient computational power to calculate entirely within the GPU is almost upon us!!!
(Score: 2) by martyb on Monday March 13 2017, @11:08PM
thanks for the reply! Yeah, I had figured we were well past a cray by now, but didn't realize by how much; nice!
I don't suppose when the next generation card comes around with upgraded memory capacity, that you'll wish it had even more memory so that you could do a finer grained analysis? =)
Wit is intellect, dancing.
(Score: 0) by Anonymous Coward on Saturday March 11 2017, @07:32PM
that is all