AMD Announces Milan-X Epyc With 3D V-Cache, Bergamo, and First MCM GPU: Instinct MI200

AMD Reveals ‘Instinct’ for Machine Intelligence 15 comments

Arthur T Knackerbracket has found the following story:

At the AMD Tech Summit in Sonoma, Calif., last week (Dec. 7-9), CEO Lisa Su unveiled the company's vision to accelerate machine intelligence over the next five to ten years with an open and heterogeneous computing approach and a new suite of hardware and open-source software offerings.
The roots for this strategy can be traced back to the company's acquisition of graphics chipset manufacturer ATI in 2006 and the subsequent launch of the CPU-GPU hybrid Fusion generation of computer processors. In 2012, the Fusion platform matured into the Heterogeneous Systems Architecture (HSA), now owned and maintained by the HSA Foundation.
Ten years since launching Fusion, AMD believes it has found the killer app for heterogeneous computing in machine intelligence, which is driven by exponential data surges.
"We generate 2.5 quintillion bytes of data every single day – whether you're talking about Tweets, YouTube videos, Facebook, Instagram, Google searches or emails," said Su. "We have incredible amounts of data out there. And the thing about this data is it's all different – text, video, audio, monitoring data. With all this different data, you really are in a heterogeneous system and that means you need all types of computing to satisfy this demand. You need CPUs, you need GPUs, you need accelerators, you need ASICs, you need fast interconnect technology. The key to it is it's a heterogeneous computing architecture.
"Why are we so excited about this? We've actually been talking about heterogeneous computing for the last ten years," Su continued. "This is the reason we wanted to bring CPUs and GPUs together under one roof and we were doing this when people didn't understand why we were doing this and we were also learning about what the market was and where the market needed these applications, but it's absolutely clear that for the machine intelligence era, we need heterogeneous compute."
Aiming to boost the performance, efficiency, and ease of implementation of deep learning workloads, AMD is introducing a brand-new hardware platform, Radeon Instinct, and new Radeon open source software solutions.
[...] "We are going to address key verticals that leverage a common infrastructure," said Raja Koduri, senior vice president and chief architect of Radeon Technologies Group. "The building block is our Radeon Instinct hardware platform, and above that we have the completely open source Radeon software platform. On top of that we're building optimized machine learning frameworks and libraries."
AMD is also investing in open interconnect technologies for heterogeneous accelerators; the company is a founding member of CCIX, Gen-Z and OpenCAPI.
[...] The AMD Tech Summit is a follow-on to the inaugural summit that debuted last December (2015). That first event was initiated by Raja Koduri as a team-building activity for the newly minted Radeon Technologies Group. The initial team of about 80, essentially hand-picked by Koduri to focus on graphics, met in Sonoma along with about 15 members of the press. The event was expanded this year to accommodate other AMD departments and nearly 100 media and analyst representatives.

-- submitted from IRC

Original Submission

AMD Launches "Milan" Epyc Server CPUs, with Zen 3 and up to 64 Cores 16 comments

takyon writes:

AMD Unveils EPYC 'Milan' 7003 CPUs, Zen 3 Comes to 64-Core Server Chips

AMD unveiled its EPYC 7003 'Milan' processors today, claiming that the chips, which bring the company's powerful Zen 3 architecture to the server market for the first time, take the lead as the world's fastest server processor with its flagship 64-core 128-thread EPYC 7763. Like the rest of the Milan lineup, this chip comes fabbed on the 7nm process and is drop-in compatible with existing servers. AMD claims it brings up to twice the performance of Intel's competing Xeon Cascade Lake Refresh chips in HPC, Cloud, and enterprise workloads, all while offering a vastly better price-to-performance ratio.
Milan's agility lies in the Zen 3 architecture and its chiplet-based design. This microarchitecture brings many of the same benefits that we've seen with AMD's Ryzen 5000 series chips that dominate the desktop PC market, like a 19% increase in IPC and a larger unified L3 cache. Those attributes, among others, help improve AMD's standing against Intel's venerable Xeon lineup in key areas, like single-threaded work, and offer a more refined performance profile across a broader spate of applications.

One interesting new SKU is the EPYC 7663, a 56-core, 112-thread CPU with 7 working cores on each of the 8-core chiplets. There is also a 28-core EPYC 7453.

Next up, Zen 4 "Genoa".

Also at AnandTech, The Next Platform, Phoronix, and Ars Technica.

Original Submission

AMD at Computex 2021: 5000G APUs, 6000M Mobile GPUs, FidelityFX Super Resolution, and 3D Chiplets 10 comments

takyon writes:

AMD's Ryzen 5000G APUs now have a release date for the DIY market: August 5th. The 8-core Ryzen 7 5700G has a suggested price of $359, while the 6-core Ryzen 5 5600G will be $259.

AMD announced the Radeon RX 6800M, 6700M, and 6600M discrete GPUs for laptops, promising better performance and efficiency, including better performance when running on battery power. The Radeon RX 6800M is a 40 compute unit design (equivalent to the Radeon RX 6700 XT on desktop) with 12 GB of VRAM.

AMD's biggest announcements were the introduction of FidelityFX Super Resolution (FSR) and the demonstration of a 3D chiplet design. FSR uses a spatial scaling algorithm to upscale game graphics for higher frame rates at a given resolution. The algorithm competes with Nvidia's Deep Learning Super Sampling (DLSS), but will be released as open source and will work with some older AMD GPUs and integrated graphics, as well as competing products from Nvidia and Intel (it was shown running on an Nvidia GTX 1060).

AMD CEO Lisa Su also showed off a modified, delidded Ryzen 9 5900X CPU prototype with "3D V-Cache technology". It was identical to the standard 5900X except for additional L3 cache stacked on top using through-silicon vias (TSVs). This gave the 5900X prototype 192 MB of total L3 cache (96 MB per 8-core chiplet) instead of 64 MB. AMD claims it can run games with an average of 15% higher performance (simply due to the larger cache size), and some version of this will appear in products that are starting production at the end of 2021.
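
The cache figures quoted above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the standard 64 MB of L3 is split evenly across the chiplets, and all variable names are made up for this example.

```python
# Figures quoted in the summary above (all in MB).
total_l3_stacked_mb = 192        # prototype with 3D V-Cache
total_l3_standard_mb = 64        # standard Ryzen 9 5900X
l3_per_chiplet_stacked_mb = 96   # per chiplet, with stacked cache

# Number of chiplets implied by the total and per-chiplet figures.
chiplets = total_l3_stacked_mb // l3_per_chiplet_stacked_mb

# Assumption: the standard L3 is divided evenly between the chiplets.
base_l3_per_chiplet_mb = total_l3_standard_mb // chiplets

# Extra cache stacked on each chiplet, and the overall capacity increase.
stacked_per_chiplet_mb = l3_per_chiplet_stacked_mb - base_l3_per_chiplet_mb
capacity_ratio = total_l3_stacked_mb / total_l3_standard_mb

print(f"chiplets: {chiplets}")
print(f"base L3 per chiplet: {base_l3_per_chiplet_mb} MB")
print(f"stacked L3 per chiplet: {stacked_per_chiplet_mb} MB")
print(f"total L3 capacity ratio: {capacity_ratio:.1f}x")
```

Under that assumption, each chiplet gains 64 MB of stacked cache on top of 32 MB of base L3, tripling total L3 capacity.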

Original Submission

AMD Unveils New Ryzen V-Cache Details at HotChips 33 7 comments

owl writes:

AMD Unveils New Ryzen V-Cache Details at HotChips 33:

AMD gave us more information about its upcoming V-Cache at Hot Chips this year, the annual conference where semiconductor engineers from all over the industry come together to ~~crow over~~ disclose details regarding their technical achievements in the past 12 months.
Earlier this year, AMD announced that it would not advance directly from Zen 3 to Zen 4. Instead, it would iterate on the Zen 3 core by stacking a full 64MB of 7nm L3 cache vertically on the core. AMD claims this can improve performance by up to 15 percent based on 1080p gaming results. The improvement in other applications is unknown.

Original Submission

AMD Aims to Increase Energy Efficiency of Epyc CPUs and Instinct AI Accelerators 30x by 2025 17 comments

takyon writes:

AMD wants to make its chips 30 times more energy-efficient by 2025

Today, [AMD] announced its most ambitious goal yet—to increase the energy efficiency of its Epyc CPUs and Instinct AI accelerators 30 times by 2025. This would help data centers and supercomputers achieve high performance with significant power savings over current solutions.
If it achieves this goal, the savings would add up to billions of kilowatt-hours of electricity saved in 2025 alone, meaning the power required to perform a single calculation in high-performance computing tasks will have decreased by 97 percent.
Increasing energy efficiency this much will involve a lot of engineering wizardry, including AMD's stacked 3D V-Cache chiplet technology. The company acknowledges the difficult task ahead of it, now that "energy-efficiency gains from process node advances are smaller and less frequent."
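
The 97 percent figure follows directly from the 30x goal: if efficiency (work per unit of energy) rises by a factor of N, the energy needed per calculation falls to 1/N of its original value. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def energy_reduction_fraction(efficiency_gain: float) -> float:
    """Fraction by which energy per operation drops when
    efficiency (operations per joule) improves by `efficiency_gain`x."""
    if efficiency_gain <= 0:
        raise ValueError("efficiency gain must be positive")
    return 1.0 - 1.0 / efficiency_gain


# AMD's stated goal: a 30x efficiency improvement by 2025.
reduction = energy_reduction_fraction(30.0)
print(f"Energy per calculation falls by {reduction:.1%}")
```

This gives roughly 96.7 percent, which rounds to the 97 percent quoted in the article.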

What does it mean?

In addition to compute node performance/Watt measurements, to make the goal particularly relevant to worldwide energy use, AMD uses segment-specific datacenter power utilization effectiveness (PUE) with equipment utilization taken into account. The energy consumption baseline uses the same industry energy per operation improvement rates as from 2015-2020, extrapolated to 2025. The measure of energy per operation improvement in each segment from 2020-2025 is weighted by the projected worldwide volumes multiplied by the Typical Energy Consumption (TEC) of each computing segment to arrive at a meaningful metric of actual energy usage improvement worldwide.

See the 25x20 Initiative from a few years ago.

Original Submission

Frontier Supercomputer Breaks the ExaFLOPS Barrier, Tops the TOP500 List 19 comments

takyon writes:

The Frontier supercomputer at the Oak Ridge National Laboratory (ORNL) has exceeded 1.1 exaFLOPS (Rmax), leading the June 2022 TOP500 list as the world's fastest supercomputer and the first truly "exascale" system.

Frontier uses 9,408 64-core Epyc 7A53 CPUs and 37,632 AMD Instinct MI250X GPUs. It has 4.6 petabytes each of DDR4 and High Bandwidth Memory.
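
As a quick sanity check on those component counts, the sketch below derives the GPU-to-CPU ratio and total CPU core count. It uses only the figures given above; the variable names are illustrative.

```python
# Component counts quoted in the summary above.
cpus = 9_408
cores_per_cpu = 64
gpus = 37_632

# The GPU count divides evenly by the CPU count.
assert gpus % cpus == 0
gpus_per_cpu = gpus // cpus

total_cpu_cores = cpus * cores_per_cpu

print(f"GPUs per CPU: {gpus_per_cpu}")
print(f"Total CPU cores: {total_cpu_cores:,}")
```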

Frontier also reached #2 on the June 2022 Green500 list at 52.227 gigaFLOPS/Watt, behind the smaller Frontier Test & Development System:

Previously, Frontier had been characterized as a two peak exaflops system, but its first Top500 benchmark measures some 1.686 peak exaflops. (Oak Ridge said that there remains "much higher headroom on the GPUs and the CPUs" to achieve the two peak exaflops target.) Outside of Linpack and the Top500, the system benchmarks at 6.88 exaflops of mixed-precision performance on HPL-AI. The team ran out of time and was not able to submit an HPCG benchmark.
[...] Frontier also achieved another win out of the gate: second place on the spring 2022 Green500 list, which ranks supercomputers by their flops per watt. The Oak Ridge team accomplished this by delivering those 1.102 Linpack exaflops in a 21.1-megawatt power envelope, an efficiency of 52.23 gigaflops per watt (which works out to one exaflops at 19.15 megawatts). This puts the system well within the 20-megawatt exascale power envelope target set by DARPA in 2008—a target that had been viewed with much skepticism over the ensuing 14 years. Frontier was only outpaced in efficiency by its own test and development system (Frontier TDS, aka "Crusher"), which delivered 62.68 gigaflops per watt.
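
The efficiency numbers in the excerpt can be reproduced from the quoted Linpack result, peak figure, and power draw. The sketch below is a back-of-the-envelope check using the rounded values from the text, so small differences in the last digit are to be expected; variable names are illustrative.

```python
# Rounded figures quoted in the excerpt above.
rmax_flops = 1.102e18      # Linpack (Rmax) result
rpeak_flops = 1.686e18     # theoretical peak (Rpeak) for this run
power_watts = 21.1e6       # power envelope during the run

# Energy efficiency in gigaflops per watt.
gflops_per_watt = rmax_flops / power_watts / 1e9

# Power needed to sustain exactly one exaflops at that efficiency, in MW.
mw_per_exaflops = 1e18 / (gflops_per_watt * 1e9) / 1e6

# Fraction of theoretical peak achieved on Linpack.
linpack_efficiency = rmax_flops / rpeak_flops

print(f"Efficiency: {gflops_per_watt:.2f} gigaflops/W")
print(f"Power per exaflops: {mw_per_exaflops:.2f} MW")
print(f"Rmax / Rpeak: {linpack_efficiency:.1%}")
```

With these inputs the check yields about 52.23 gigaflops per watt and about 19.15 MW per exaflops, matching the quoted values and falling under the 20-megawatt target.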

#10: 30.05 petaflops (Nov. 2021) → 46.10 petaflops (June 2022)
#100: 4.79 petaflops → 5.39 petaflops
#500: 1.65 petaflops → 1.65 petaflops (both are Lenovo C1040, Xeon E5-2673v4 20C 2.3GHz systems)

Previously: New TOP500 List Released -- Fugaku Holds Top Spot, Exascale Remains Elusive; Green500 Released Too!
Top500: No Exascale, Fugaku Still Reigns, Polaris Debuts at #12

Original Submission
