NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator
Like the Volta reveal 3 years ago – and as is now traditional for NVIDIA GTC reveals – today's focus is on the very high end of the market. In 2017 NVIDIA launched the Volta-based GV100 GPU, and with it the V100 accelerator. V100 was a massive success for the company, greatly expanding their datacenter business on the back of the Volta architecture's novel tensor cores and the sheer brute force that can only be provided by an 800mm2+ GPU. Now in 2020, the company is looking to continue that growth with Volta's successor, the Ampere architecture.
[...] Designed to be the successor to the V100 accelerator, the A100 aims just as high, just as we'd expect from NVIDIA's new flagship accelerator for compute. The leading Ampere part is built on TSMC's 7nm process and incorporates a whopping 54 billion transistors, 2.5x as many as the V100 before it. NVIDIA has put the full density improvements offered by the 7nm process in use, and then some, as the resulting GPU die is 826mm2 in size, even larger than the GV100. NVIDIA went big on the last generation, and in order to top themselves they've gone even bigger this generation.
We'll touch more on the individual specifications a bit later, but at a high level it's clear that NVIDIA has invested more in some areas than others. FP32 performance is, on paper, only modestly improved from the V100. Meanwhile tensor performance is greatly improved – almost 2.5x for FP16 tensors – and NVIDIA has greatly expanded the formats that can be used, with INT8/INT4 support as well as a new FP32-ish format called TF32. Memory bandwidth is also significantly expanded, with multiple stacks of HBM2 memory delivering a total of 1.6TB/second of bandwidth to feed the beast that is Ampere.
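The TF32 format mentioned above keeps FP32's 8-bit exponent range but cuts the mantissa down to FP16's 10 bits. A minimal sketch of that precision loss, assuming simple truncation of a float32's low 13 mantissa bits (the GPU's actual rounding behavior may differ):

```python
import struct

def to_tf32(x: float) -> float:
    """Approximate TF32 by zeroing the low 13 of a float32's 23
    mantissa bits, leaving 10 mantissa bits and the full 8-bit
    exponent. Illustrative only, not NVIDIA's exact rounding."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= ~((1 << 13) - 1)  # keep only the top 10 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# FP32 keeps ~7 decimal digits of pi; TF32 keeps roughly 3
print(to_tf32(3.14159265))  # 3.140625
```

The point of the format is that existing FP32 code can run on the tensor cores with the same dynamic range, trading mantissa precision for throughput.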
See also: Nvidia's first Ampere GPU is designed for data centers and AI, not your PC
Nvidia unveils Ampere GPU architecture for AI boost, and the first target is coronavirus
Previously: NVIDIA's Volta Architecture Unveiled: GV100 and Tesla V100
Related Stories
NVIDIA has detailed the full GV100 GPU as well as the first product based on the GPU, the Tesla V100:
The Volta GV100 GPU uses the 12nm TSMC FFN process, has over 21 billion transistors, and is designed for deep learning applications. We're talking about an 815mm2 die here, which pushes the limits of TSMC's current capabilities. Nvidia said it's not possible to build a larger GPU on the current process technology. The GP100 was the largest GPU that Nvidia ever produced before the GV100. It took up a 610mm2 surface area and housed 15.3 billion transistors. The GV100 is more than 30% larger.
Volta's full GV100 GPU sports 84 SMs (each SM [streaming multiprocessor] features four texture units, 64 FP32 cores, 64 INT32 cores, and 32 FP64 cores) fed by 128KB of shared L1 cache per SM that can be configured to varying texture cache and shared memory ratios. The GP100 featured 60 SMs and a total of 3840 CUDA cores. The Volta SMs also feature a new type of core, the Tensor Core, which specializes in 4x4 matrix operations for deep learning. The GV100 contains eight Tensor cores per SM and delivers a total of 120 TFLOPS for training and inference operations. To save you some math, this brings the full GV100 GPU to an impressive 5,376 FP32 and INT32 cores, 2,688 FP64 cores, and 336 texture units.
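The "save you some math" totals above follow directly from the per-SM figures. A quick sketch of that arithmetic:

```python
# Per-SM resources quoted for the full GV100 die (84 SMs)
sms = 84
fp32_cores = sms * 64     # 5376
int32_cores = sms * 64    # 5376
fp64_cores = sms * 32     # 2688
texture_units = sms * 4   # 336
tensor_cores = sms * 8    # 672

print(fp32_cores, int32_cores, fp64_cores, texture_units, tensor_cores)
# 5376 5376 2688 336 672
```

Note that shipped products typically enable fewer SMs than the full die for yield reasons, so a given Tesla V100's counts come in below these maximums.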
[...] GV100 also features four HBM2 memory stacks, like GP100, with each stack controlled by a pair of memory controllers. Speaking of which, there are eight 512-bit memory controllers (giving this GPU a total memory bus width of 4,096 bits). Each memory controller is attached to 768KB of L2 cache, for a total of 6MB of L2 cache (vs 4MB for Pascal).
The Tesla V100 has 16 GB of HBM2 memory with 900 GB/s of memory bandwidth. NVLink interconnect bandwidth has been increased to 300 GB/s.
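The memory figures above are internally consistent, and the quoted 900 GB/s also pins down the implied per-pin data rate. A quick check of that arithmetic (the per-pin rate is derived here, not quoted in the source):

```python
controllers = 8
bus_width = controllers * 512      # bits -> 4096-bit total bus
l2_kb = controllers * 768          # KB   -> 6144 KB = 6 MB of L2

# 900 GB/s across a 4096-bit bus implies roughly this per-pin rate
pin_rate_gbps = 900e9 * 8 / bus_width / 1e9

print(bus_width, l2_kb // 1024, round(pin_rate_gbps, 2))
# 4096 6 1.76
```

That ~1.76 Gb/s per pin is in line with the HBM2 speed grades available at the V100's launch.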
Note the "120 TFLOPS" for machine learning operations. Microsoft is "doubling down" on AI, and NVIDIA's sales to data centers have tripled in a year. Sales of automotive-oriented GPUs (more machine learning) also increased.
IBM Unveils New AI Software, Will Support Nvidia Volta
Also at AnandTech and HPCWire.
(Score: 4, Informative) by takyon on Thursday May 14 2020, @04:13PM
Nvidia picked AMD for its supercomputer in a box:
NVIDIA Ditches Intel Xeon, Goes All Onboard With AMD’s EPYC CPUs With Next-Gen Ampere GPUs [wccftech.com]
(Score: -1, Redundant) by Anonymous Coward on Thursday May 14 2020, @08:07PM
fuck you, nvidia!
(Score: 0) by Anonymous Coward on Thursday May 14 2020, @11:23PM
Is anybody running any of these compute cards on their home rig? If so, what for?