posted by martyb on Monday December 03 2018, @07:41PM
from the moah-powah dept.

Nvidia has announced its $2,500 Turing-based Titan RTX GPU. It is said to have a single precision performance of 16.3 teraflops and "tensor performance" of 130 teraflops. Double precision performance has been cut to 0.51 teraflops, down from 6.9 teraflops for last year's Volta-based Titan V.

The card includes 24 gigabytes of GDDR6 VRAM clocked at 14 Gbps, for a total memory bandwidth of 672 GB/s.
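
The bandwidth figure follows from the per-pin data rate and the width of the memory bus. The sketch below is a rough check, not something taken from the announcement: the 384-bit bus for the Titan RTX and the 352-bit, 14 Gbps configuration for the RTX 2080 Ti are assumptions, and the helper name `memory_bandwidth_gb_s` is made up for illustration.

```python
def memory_bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin rate (Gbit/s) times bus width (bits), divided by 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


# 14 Gbps comes from the announcement; the 384-bit bus width is an assumption.
titan_rtx_bw = memory_bandwidth_gb_s(14, 384)

# Assumed RTX 2080 Ti configuration: 14 Gbps GDDR6 on a 352-bit bus.
rtx_2080_ti_bw = memory_bandwidth_gb_s(14, 352)

# Relative bandwidth advantage of the Titan RTX, in percent.
uplift_percent = (titan_rtx_bw / rtx_2080_ti_bw - 1) * 100

print(f"Titan RTX:   {titan_rtx_bw:.0f} GB/s")
print(f"RTX 2080 Ti: {rtx_2080_ti_bw:.0f} GB/s")
print(f"Uplift:      {uplift_percent:.1f}%")
```

Under those assumptions the Titan RTX lands on the quoted 672 GB/s, roughly 9% ahead of the 2080 Ti.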

Drilling a bit deeper, there are really three legs to Titan RTX that set it apart from NVIDIA's other cards, particularly the GeForce RTX 2080 Ti. Raw performance is certainly one of those; we're looking at about 15% better performance in shading, texturing, and compute, and around a 9% bump in memory bandwidth and pixel throughput.

However, arguably the lynchpin to NVIDIA's true desired market of data scientists and other compute users is the tensor cores. Present on all of NVIDIA's Turing cards and the heart and soul of NVIDIA's success in the AI/neural networking field, NVIDIA gave the GeForce cards a singular limitation that is nonetheless very important to the professional market. In their highest-precision FP16 mode, Turing is capable of accumulating at FP32 for greater precision; however, on the GeForce cards this operation is limited to half-speed throughput. This limitation has been removed for the Titan RTX, and as a result it's capable of full-speed FP32 accumulation throughput on its tensor cores.

Given that NVIDIA's tensor cores have nearly a dozen modes, this may seem like an odd distinction to make between the GeForce and the Titan. However, for data scientists it's quite important; FP32 accumulate is frequently necessary for neural network training (FP16 accumulate doesn't have enough precision), especially in the big-money fields that will shell out for cards like the Titan and the Tesla. So this small change is a big part of the value proposition to data scientists, as NVIDIA does not offer a cheaper card with the chart-topping 130 TFLOPS of tensor performance that Titan RTX can hit.
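
To see why the accumulator width matters, here is a minimal sketch that emulates a dot product with FP16 inputs, rounding the running sum to either FP16 or FP32 after every addition. It is a simplified software model and not how the tensor cores are actually built: the helper names (`to_fp16`, `to_fp32`, `dot`) and the all-ones test vectors are assumptions chosen for illustration, and rounding is done through Python's `struct` half- and single-precision formats.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(a, b, round_acc):
    """Dot product with FP16 inputs; the running sum is rounded by round_acc after each add."""
    acc = 0.0
    for x, y in zip(a, b):
        # The product of two FP16 values is exact in a Python float (double).
        prod = to_fp16(x) * to_fp16(y)
        acc = round_acc(acc + prod)
    return acc


# Hypothetical test data: 4096 products of 1.0 * 1.0, so the exact answer is 4096.
a = [1.0] * 4096
b = [1.0] * 4096

fp16_result = dot(a, b, to_fp16)
fp32_result = dot(a, b, to_fp32)

print("FP16 accumulate:", fp16_result)
print("FP32 accumulate:", fp32_result)
```

In this model the FP16 accumulator stalls at 2048: above that value adjacent FP16 numbers are 2 apart, so adding 1.0 rounds back to the same sum. The FP32 accumulator reaches the exact 4096. Long reductions over small terms, which are common in training, run into the same kind of loss.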

Previously: More Extreme in Every Way: The New Titan Is Here – NVIDIA TITAN Xp
Nvidia Announces Titan V
Nvidia Announces Turing Architecture With Focus on Ray-Tracing and Lower-Precision Operations
Nvidia Announces RTX 2080 Ti, 2080, and 2070 GPUs, Claims 25x Increase in Ray-Tracing Performance
Nvidia's Turing GPU Pricing and Performance "Poorly Received"


  • (Score: 3, Informative) by takyon (881) on Tuesday December 04 2018, @01:41AM (#769380)

    130 TFLOPS is lower precision "tensor performance", not the number given by LINPACK. It's useful to machine learning users, but misleading.

    However if a technology like this [soylentnews.org] pans out, we could see 1 petaflops smartphone SoCs or maybe 1 exaflops desktop PCs.

    --
    [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
  • (Score: 2) by opinionated_science (4031) on Tuesday December 04 2018, @02:22PM (#769547)

    indeed a more appropriate label would be "Titan RTX now 510 GFlops Double precision!!!!".

    If you want a real comparison, the ASCI Red machine (1997) had a peak of 1.3 Tflops (LINPACK) and remained number 1 for 7 years (2 upgrades).

    That was an x86 chip and they used 9152 of them.

    two decades supercomputer to desktop.

    Let that sink in as you have your coffee....