
posted by janrinok on Saturday March 18, @12:44AM
from the nuka-flops dept.

Getting To Zettascale Without Needing Multiple Nuclear Power Plants:

There's no resting on your laurels in the HPC world, no time to sit back and bask in a hard-won accomplishment that was years in the making. The ticker tape has only now been swept up in the wake of the long-awaited celebration last year of finally reaching the exascale computing level, with the Frontier supercomputer housed at Oak Ridge National Laboratory breaking that barrier.

With that in the rear-view mirror, attention is turning to the next challenge: Zettascale computing, some 1,000 times faster than what Frontier is running. In the heady months after his heralded 2021 return to Intel as CEO, Pat Gelsinger made headlines by saying the giant chip maker was looking at 2027 to reach zettascale.

Lisa Su, the chief executive officer who has led the remarkable turnaround at AMD, Intel's chief rival, took the stage at ISSCC 2023 to talk about zettascale computing, laying out a much more conservative – some would say reasonable – timeline.

Looking at supercomputer performance trends over the past two-plus decades and the ongoing innovation in computing – think advanced packaging technologies, CPUs and GPUs, chiplet architectures, and the pace of AI adoption, among others – Su calculated that the industry could reach zettascale within the next 10 years or so.

"We just recently passed a very significant milestone last year, which was the first exascale supercomputer," she said during her talk, noting that Frontier – built using HPE systems running on AMD chips – is "using a combination of CPUs and GPUs. Lots of technology in there. We were able to achieve an exascale of supercomputing, both from a performance standpoint and, more importantly, from an efficiency standpoint. Now we draw the line, assuming that [we can] keep that pace of innovation going. ... That's a challenge for all of us to think through. How might we achieve that?"

Supercomputing efficiency is doubling every 2.2 years, but that still projects to a zettascale system around 2035 consuming 500 megawatts at 2,140 gigaflops per watt (a nuclear power plant produces roughly 1 gigawatt).
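The projection above can be sanity-checked with a little arithmetic. The 2,140 GF/W and ~500 MW figures are from the talk; the Frontier baseline of ~52 GF/W (about 1.1 exaflops at roughly 21 MW) is an assumption taken from public TOP500/Green500 reporting:

```python
import math

# Power a 1-zettaflops machine would draw at the projected efficiency.
ZETTAFLOPS_IN_GFLOPS = 1e12          # 1 zettaflops = 10^21 FLOPS = 10^12 GF
target_eff_gf_per_w = 2140.0         # projected efficiency from the talk

power_mw = ZETTAFLOPS_IN_GFLOPS / target_eff_gf_per_w / 1e6
print(f"~{power_mw:.0f} MW")         # close to the ~500 MW quoted

# How long the 2.2-year efficiency doubling takes to get there from Frontier:
frontier_eff_gf_per_w = 52.0         # assumed ~1.1 EF / ~21 MW baseline
doublings = math.log2(target_eff_gf_per_w / frontier_eff_gf_per_w)
years = doublings * 2.2
print(f"~{years:.1f} years of doubling")   # from 2022, lands in the mid-2030s
```

Both results line up with the article's numbers: roughly 467 MW, and just under 12 years of doubling from Frontier's 2022 debut.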

Intel to Explore RISC-V Architecture for Zettascale Supercomputers
Intel CEO Pat Gelsinger Says Moore's Law is Back
Supercomputers with Non-Von Neumann Architectures Could Reach "Zettascale" and "Yottascale"

Original Submission

Related Stories

Supercomputers with Non-Von Neumann Architectures Could Reach "Zettascale" and "Yottascale" 23 comments

In an interview posted just before the release of the latest TOP500 list, high performance computing expert Dr. Thomas Sterling (one of the two builders of the original "Beowulf cluster") had this to say about the possibility of reaching "zettascale" (beyond 1,000 exaflops):

I'll close here by mentioning two other possibilities that, while not widely considered currently, are nonetheless worthy of research. The first is superconducting supercomputing and the second is non-von Neumann architectures. Interestingly, the two at least in some forms can serve each other, making both viable and highly competitive with respect to future post-exascale computing designs. Niobium Josephson Junction-based technologies cooled to four Kelvins can operate beyond 100 and 200 GHz and have slowly evolved over two or more decades. When once such cold temperatures were considered a show stopper, now quantum computing – or at least quantum annealing – typically is performed at 40 milli-Kelvins or lower, where four Kelvins would appear like a balmy day on the beach. But latencies measured in cycles grow proportionally with clock rate, and superconducting supercomputing must take a very distinct form from typical von Neumann cores; this is a controversial view, by the way.

Possible alternative non-von Neumann architectures that would address this challenge are cellular automata and data flow, both with their own problems, of course – nothing is easy. I introduce this thought not to necessarily advocate for a pet project – it is a pet project of mine – but to suggest that the view of the future possibilities as we enter the post-exascale era is a wide and exciting field at a time where we may cross a singularity before relaxing once again on a path of incremental optimizations.

I once said in public and in writing that I predicted we would never get to zettaflops computing. Here, I retract this prediction and contribute a contradicting assertion: zettaflops can be achieved in less than 10 years if we adopt innovations in non-von Neumann architecture. With a change to cryogenic technologies, we can reach yottaflops by 2030.

The rest of the interview covers a number of interesting topics, such as China's increased presence on the supercomputing list.

Also at NextBigFuture.

Previously: Thomas Sterling: 'I Think We Will Never Reach Zettaflops' (2012)

Related: IBM Reduces Neural Network Energy Consumption Using Analog Memory and Non-Von Neumann Architecture
IEEE Releases the International Roadmap for Devices and Systems (IRDS)
June 2018 TOP500 List: U.S. Claims #1 and #3 Spots

Original Submission

Intel CEO Pat Gelsinger Says Moore's Law is Back 24 comments

Intel Targeting Zettascale (1000 Exaflops) by 2027?

'We will not rest until the periodic table is exhausted' says Intel CEO on quest to keep Moore's Law alive

[Intel CEO Pat Gelsinger] showed a chart tracking the semiconductor giant progressing along a trend line to 1 trillion transistors per device by 2030. "Today we are predicting that we will maintain or even go faster than Moore's law for the next decade,"[*] Gelsinger said.

[...] In a Q&A session after his keynote, Gelsinger revealed that achieving zettascale computing using Intel technology "in 2027 is a huge internal initiative."

Intel Aims For Zettaflops By 2027, Pushes Aurora Above 2 Exaflops

"But to me, the other thing that's really exciting in the space is our Zetta Initiative, where we have said we are going to be the first to zettascale by a wide margin," Gelsinger told The Next Platform. "And we are laying out as part of the Zetta Initiative what we have to do in the processor, in the fabric, in the interconnect, and in the memory architecture — what we have to do for the accelerators, and the software architecture to do it. So, zettascale in 2027 is a huge internal initiative that is going to bring many of our technologies together. 1,000X in five years? That's pretty phenomenal."

[...] If you built a zettaflops Aurora machine today, assuming all of the information that we have is correct, it would take 411.5X as many nodes to do the job. So, that would be somewhere around 3.7 million nodes with 7.4 million CPUs and 22.2 million GPUs burning a mind-sizzling 24.7 gigawatts. Yes, gigawatts. Clearly, we are going to need some serious Moore's Law effects in transistors and packaging.
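The node and power arithmetic above can be reproduced from a small baseline sketch. The per-node configuration (2 CPUs, 6 GPUs) and the ~9,000-node, ~60 MW Aurora baseline are assumptions drawn from public reporting, not figures stated in the article:

```python
# Reproducing the back-of-the-envelope zettaflops-Aurora numbers above.
base_nodes = 9_000        # assumed Aurora node count
cpus_per_node = 2         # assumed: 2 CPUs per node
gpus_per_node = 6         # assumed: 6 GPUs per node
base_power_mw = 60.0      # assumed total draw of the baseline machine
scale_factor = 411.5      # from the article: nodes needed for 1 zettaflops

nodes = base_nodes * scale_factor
cpus = nodes * cpus_per_node
gpus = nodes * gpus_per_node
power_gw = base_power_mw * scale_factor / 1000

print(f"{nodes/1e6:.1f} M nodes, {cpus/1e6:.1f} M CPUs, "
      f"{gpus/1e6:.1f} M GPUs, {power_gw:.1f} GW")
```

With those assumptions the sketch lands exactly on the quoted figures: 3.7 million nodes, 7.4 million CPUs, 22.2 million GPUs, and 24.7 gigawatts.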

Intel to Explore RISC-V Architecture for Zettascale Supercomputers 20 comments

From Tom's Hardware:

Intel and the Barcelona Supercomputing Centre (BSC) said they would invest €400 million (around $426 million) in a laboratory that will develop RISC-V-based processors that could be used to build zettascale supercomputers. However, the lab will not focus solely on CPUs for next-generation supercomputers but also on processor uses for artificial intelligence applications and autonomous vehicles.

The research laboratory will presumably be set up in Barcelona, Spain, and will receive €400 million from Intel and the Spanish Government over 10 years. The fundamental purpose of the joint research laboratory is to develop chips based on the open-source RISC-V instruction set architecture (ISA) that could be used for a wide range of applications, including AI accelerators, autonomous vehicles, and high-performance computing.

The creation of the joint laboratory does not automatically mean that Intel will use RISC-V-based CPUs developed in the lab for its first-generation zettascale supercomputing platform, but rather indicates that the company is willing to make additional investments in RISC-V. After all, last year Intel tried to buy SiFive, a leading developer of RISC-V CPUs, and it is among the top sponsors of RISC-V International, a non-profit organization supporting the ISA.

[...] throughout its history, Intel invested hundreds of millions in non-x86 architectures (including RISC-based i960/i860 designs in the 1980s, Arm in the 2000s, and VLIW-based IA64/Itanium in the 1990s and the 2000s). Eventually, those architectures were dropped, but technologies developed for them found their way into x86 offerings.

I would observe that a simple, well-designed instruction set could require less silicon, which might allow more cores per chip on the same fabrication process, or leave room for more speculative execution and branch prediction hardware. I would mention compiler back ends, but that is a subject best not discussed in public.

Original Submission

  • (Score: 2) by krishnoid on Saturday March 18, @04:13AM (1 child)

    by krishnoid (1156) on Saturday March 18, @04:13AM (#1296808)

    At least we'd then have a big player backing, supporting, and marketing wider acceptance of nuclear power. Unless they decide to run the datacenters on coal-fired steam engines.

    • (Score: 3, Insightful) by takyon on Saturday March 18, @07:20AM

      by takyon (881) on Saturday March 18, @07:20AM (#1296832) Journal

      Zettascale is an arbitrary milestone like all the others. They're not going to want to use much more than 50 megawatts to reach it, and less would be preferred. The more efficient computers become, the more processing power you get.

  • (Score: 2) by Ken_g6 on Saturday March 18, @06:32AM (4 children)

    by Ken_g6 (3706) on Saturday March 18, @06:32AM (#1296827)

    ...how big a "chip" can be made out of chiplets? A square foot? More?
    ...whether they'll finally integrate FPGAs into chips? They're not fast, but they can implement any ASIC you need, which can be much more efficient than a general-purpose processor.
    ...if they'll use superconductors?

    • (Score: 3, Interesting) by takyon on Saturday March 18, @07:32AM (2 children)

      by takyon (881) on Saturday March 18, @07:32AM (#1296834) Journal

      3D packaging is the way forward. The flatlands of 2D planar chips will become a city of skyscrapers. All the memory needed will have to go into the chip to hit efficiency targets.

      Optical interconnect to speed up non-stacked chiplet communication.

      Everything and the kitchen sink can be thrown into consumer and server chips. FPGAs but mostly ASICs I think. ML accelerators should be coming to Intel 14th/15th gen and Zen 5 desktop (already in Phoenix mobile).

      The Wafer Scale Engine approach could be used by supercomputers if it's worth it, but with 3D stacking as well.

      • (Score: 3, Informative) by guest reader on Saturday March 18, @08:05AM (1 child)

        by guest reader (26132) Subscriber Badge on Saturday March 18, @08:05AM (#1296835)

        More information about FPGAs and accelerators in HPC can be found in the paper Myths and Legends of High-Performance Computing, written by Satoshi Matsuoka, the head of Japan's largest supercomputing center.

        "Myth 3: Extreme Specialization as Seen in Smartphones Will Push Supercomputers Beyond Moore’s Law!"

        [...]In fact, the only successful “accelerator” in the recent history of HPC is a GPU.

        [...]The reason for the acceleration is primarily that the majority of the HPC workloads are memory bandwidth bound (Domke et al. 2021).

        [...]In fact, there are mainly three reasons why the plethora of customized accelerated hardware approach would fail. The first is the most important, in that acceleration via SoC integration of various SFU is largely to enable strong scaling at a compute node level, and will be subject to the limitations of the Amdahl’s law, i.e., reducing the time to solution, the potential speedup is bound by the ratio of accelerated and non-accelerable fractions of the algorithm, which quickly limits the speedup.

        "Myth 4: Everything Will Run on Some Accelerator!"

        By proper analysis of the workloads, we may find that CPUs may continue to play a dominant role, with accelerator being an important but less dominant sidekick.

        "Myth 5: Reconfigurable Hardware Will Give You 100X Speedup!"

        The question of whether reconfigurable logic can replace or augment GPUs as accelerators is interesting. FPGAs will certainly have a harder time due to their high flexibility that comes at a cost. Units built from reconfigurable logic are 10–20x less energy and performance efficient in silicon area.
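The Amdahl's-law bound mentioned under "Myth 3" above can be sketched in a few lines; the fractions used here are illustrative, not figures from the paper:

```python
# Illustrating the Amdahl's-law limit: no matter how fast a special-function
# unit is, overall speedup is capped by the fraction of the workload it
# cannot accelerate.

def amdahl_speedup(accelerable_fraction, accel_speedup):
    """Overall speedup when only part of the runtime is accelerated."""
    serial = 1.0 - accelerable_fraction
    return 1.0 / (serial + accelerable_fraction / accel_speedup)

# A 100x accelerator covering 90% of the runtime gives well under 10x:
print(round(amdahl_speedup(0.90, 100.0), 2))   # 9.17
# Even an infinitely fast accelerator is capped at 1/0.10 = 10x:
print(round(amdahl_speedup(0.90, 1e15), 2))    # 10.0
```

This is why the paper argues that node-level SoC specialization "quickly limits the speedup": the non-accelerable fraction dominates as the accelerated part shrinks toward zero time.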

        • (Score: 0) by Anonymous Coward on Monday March 20, @07:52AM

          by Anonymous Coward on Monday March 20, @07:52AM (#1297135)
          Of what use are all those teraflops if you don't also have a very-high-bandwidth interconnect capable of feeding the system enough numbers to crunch to keep it busy? High-performance I/O is every bit as much a part of HPC as fast computation.
    • (Score: 2) by turgid on Saturday March 18, @11:36AM

      by turgid (4318) Subscriber Badge on Saturday March 18, @11:36AM (#1296853) Journal

      In the embedded world the Programmable System on a Chip has been popular for quite a few years now. You get maybe two ARM cores plus an FPGA on the same chip. The manufacturers all provide Linux ports for them plus often things like FreeRTOS.