
posted by chromas on Thursday September 30 2021, @03:54AM   Printer-friendly
from the but-still-no-gpus dept.

AMD wants to make its chips 30 times more energy-efficient by 2025

Today, [AMD] announced its most ambitious goal yet—to increase the energy efficiency of its Epyc CPUs and Instinct AI accelerators 30 times by 2025. This would help data centers and supercomputers achieve high performance with significant power savings over current solutions.

If it achieves this goal, the savings would add up to billions of kilowatt-hours of electricity in 2025 alone, and the energy required to perform a single calculation in high-performance computing tasks will have decreased by 97 percent.

Increasing energy efficiency this much will involve a lot of engineering wizardry, including AMD's stacked 3D V-Cache chiplet technology. The company acknowledges the difficult task ahead of it, now that "energy-efficiency gains from process node advances are smaller and less frequent."
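
That 97 percent figure falls straight out of the 30x goal: each operation would use 1/30 of the baseline energy. A quick check (a minimal sketch; units are arbitrary):

    # A 30x efficiency gain means each operation uses 1/30 the energy.
    baseline = 1.0                       # energy per operation today (arbitrary units)
    improved = baseline / 30             # energy per operation at the 2025 goal
    reduction = 1 - improved / baseline
    print(f"Reduction in energy per operation: {reduction:.1%}")  # ~96.7%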

What does it mean?

In addition to compute-node performance-per-watt measurements, AMD makes the goal relevant to worldwide energy use by factoring in segment-specific data center power usage effectiveness (PUE) with equipment utilization taken into account. The energy-consumption baseline extrapolates the industry's 2015-2020 energy-per-operation improvement rates out to 2025. Each segment's measured energy-per-operation improvement from 2020-2025 is then weighted by projected worldwide volumes multiplied by the Typical Energy Consumption (TEC) of that segment, yielding a meaningful metric of actual worldwide energy-usage improvement.
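
In other words, segments that ship more units and consume more energy count for more. A minimal sketch of that weighting, using made-up placeholder figures rather than AMD's actual segment data:

    # Volume-and-TEC-weighted energy-per-operation improvement across
    # segments, as AMD describes. All figures below are hypothetical
    # placeholders, not AMD's data.
    segments = [
        # (name, improvement factor 2020-2025, projected volume, TEC in kWh/yr)
        ("HPC",          30.0, 1_000_000,  8_000),
        ("AI training",  25.0, 2_000_000, 10_000),
        ("AI inference", 20.0, 5_000_000,  4_000),
    ]

    total_weight = sum(vol * tec for _, _, vol, tec in segments)
    weighted_improvement = sum(
        imp * (vol * tec) / total_weight for _, imp, vol, tec in segments
    )
    print(f"Volume/TEC-weighted improvement: {weighted_improvement:.1f}x")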

See the 25x20 Initiative from a few years ago.

See also: NVIDIA CEO Jensen Huang to unveil new AI technologies and products at GTC Keynote in November


Original Submission

 
  • (Score: 2) by Tork (3914) Subscriber Badge on Thursday September 30 2021, @10:14PM (#1183212) (2 children)
    Question: What does an AI accelerator do? What sort of problem is being sped up? Is it so different that it can't be done on a modern 3D card?
    --
    🏳️‍🌈 Proud Ally 🏳️‍🌈
  • (Score: 2) by takyon (881) <takyonNO@SPAMsoylentnews.org> on Friday October 01 2021, @12:35AM (#1183245) Journal (1 child)

    "AI accelerators" are typically more efficient at accelerating machine learning training and algorithms than general-purpose GPUs. To the point where smartphone SoCs have separate AI/ML/NPU/TPU/whatever cores (taking up their own die space) in addition to the GPU cores.

    For example, look at Apple's Neural Engine [wikipedia.org]:

    A11 Bionic (2017): 2-core, 0.6 trillion operations per second (TOPS)
    A12 Bionic (2018): 8-core, 5.0 TOPS
    A13 Bionic (2019): 8-core, 5.5 TOPS
    A14 Bionic (2020): 16-core, 11.0 TOPS
    A15 Bionic (2021): 16-core, 15.8 TOPS
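
    That works out to throughput growing a bit more than 2x per year. A quick calculation from the endpoints listed above:

        # Compound annual growth of Neural Engine throughput, 2017-2021,
        # using the TOPS figures from the list above.
        tops_2017, tops_2021 = 0.6, 15.8
        years = 2021 - 2017
        cagr = (tops_2021 / tops_2017) ** (1 / years) - 1
        print(f"Average growth: {cagr:.0%} per year")  # ~127%, i.e. ~2.3x/year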

    On-device machine learning capabilities are arguably more relevant to smartphones and augmented reality, but will still be coming to new consumer desktop and laptop CPUs over the next couple of years.

    AMD's Instinct AI accelerators compete with products like Nvidia's A100 [nvidia.com]. From that page:

    For AI training, recommender system models like DLRM have massive tables representing billions of users and billions of products. A100 80GB delivers up to a 3x speedup, so businesses can quickly retrain these models to deliver highly accurate recommendations.

    The A100 80GB also enables training of the largest models with more parameters fitting within a single HGX-powered server such as GPT-2, a natural language processing model with superhuman generative text capability. This eliminates the need for data or model parallel architectures that can be time consuming to implement and slow to run across multiple nodes.

    [...] On a big data analytics benchmark for retail in the terabyte-size range, the A100 80GB boosts performance up to 2x, making it an ideal platform for delivering rapid insights on the largest of datasets. Businesses can make key decisions in real time as data is updated dynamically.

    For scientific applications, such as weather forecasting and quantum chemistry, the A100 80GB can deliver massive acceleration. Quantum Espresso, a materials simulation, achieved throughput gains of nearly 2x with a single node of A100 80GB.
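
    To put the 80 GB in perspective, here's a rough back-of-envelope memory estimate. GPT-2's 1.5 billion parameters are the published figure; the embedding-table shape is a made-up illustration of the "billions of users" claim:

        GB = 1024**3

        # GPT-2 weights alone, held in fp32 (1.5e9 published parameters).
        gpt2_weights_gb = 1.5e9 * 4 / GB    # ~5.6 GB, fits easily in 80 GB

        # Hypothetical DLRM-style embedding table: 1 billion rows of
        # 64-dimensional fp32 embeddings (shape chosen for illustration).
        table_gb = 1e9 * 64 * 4 / GB        # ~238 GB, too big for one GPU

        print(f"GPT-2 fp32 weights:     {gpt2_weights_gb:.1f} GB")
        print(f"1B-row embedding table: {table_gb:.0f} GB")

    Hence the appeal of fatter per-GPU memory: the more of a model that fits on one card, the less data or model parallelism you need to bolt on.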

    --
    [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 2) by Tork (3914) Subscriber Badge on Friday October 01 2021, @03:23PM (#1183392)
      Thank you!
      --
      🏳️‍🌈 Proud Ally 🏳️‍🌈