Date: 2022-06-20

The new Fugaku supercomputer is bigger than Summit in practically every way. It has roughly 3x the cores, scores 2.8x higher in the official LINPACK tests, and consumes 2.8x the power. It also marks the first time that an Arm-based system sits at number one on the TOP500 list.

High-performance computing has reached the point where taking the number one spot requires very powerful, very efficient hardware, lots of it, and the capability to deploy it at scale. A single rack of servers totaling a couple of thousand cores isn't going to cut it. The former #1 supercomputer, Summit, is built from 22-core IBM Power9 CPUs paired with NVIDIA GV100 accelerators, totaling 2.4 million cores and consuming 10 megawatts of power. The new Fugaku supercomputer, built at RIKEN in partnership with Fujitsu, takes the top spot on the June 2020 TOP500 list, with 7.3 million cores and a power draw of 28 megawatts.

Fujitsu Fugaku report by Jack Dongarra (3.3 MB PDF)

The Fujitsu A64FX is a 64-bit ARM CPU with 48 compute cores plus 2 to 4 assistant cores for the operating system. It uses 32 GiB of on-package High Bandwidth Memory 2 (HBM2). There are no GPUs or accelerators in the Fugaku supercomputer.

Fugaku can reach as high as 537 petaflops of FP64 (boost mode), or 1.07 exaflops of FP32, 2.15 exaflops of FP16, and 4.3 exaOPS of INT8. Theoretical peak memory bandwidth is 163 petabytes per second.
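The lower-precision figures above follow a simple pattern: each halving of the operand width doubles the peak rate. The sketch below reproduces them from the quoted FP64 number under that assumption (the doubling rule is inferred from the figures themselves, not taken from Fujitsu documentation), and also derives the ratio of peak memory bandwidth to peak FP64 compute. The names are illustrative only.

```python
# Peak figures quoted above (boost mode).
FP64_PEAK_PFLOPS = 537.0
PEAK_MEM_BW_PB_PER_S = 163.0

# Assumption: halving the operand width doubles peak throughput.
WIDTH_FACTOR = {"FP64": 1, "FP32": 2, "FP16": 4, "INT8": 8}


def peak_pflops(fmt):
    """Peak rate in peta-operations per second for the given format."""
    return FP64_PEAK_PFLOPS * WIDTH_FACTOR[fmt]


def bytes_per_fp64_flop():
    """Peak memory bytes available per peak FP64 operation (PB/s over PFlop/s)."""
    return PEAK_MEM_BW_PB_PER_S / FP64_PEAK_PFLOPS


for fmt in WIDTH_FACTOR:
    # Divide by 1000 to express peta-ops as exa-ops.
    print(f"{fmt}: {peak_pflops(fmt) / 1000:.2f} exa-ops/s")

print(f"bytes per FP64 flop: {bytes_per_fp64_flop():.2f}")
```

Under these assumptions the derived values round to the 1.07, 2.15, and 4.3 figures quoted above, and the machine has about 0.3 bytes of peak memory bandwidth per peak FP64 operation.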

Rmax of #10 system: 18.2 petaflops (November 2019), 21.23 petaflops (June 2020)

Rmax of #100 system: 2.57 petaflops (November 2019), 2.802 petaflops (June 2020)

Rmax of #500 system: 1.142 petaflops (November 2019), 1.23 petaflops (June 2020)
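To put those entry thresholds in perspective, here is a small sketch that computes the six-month growth at each rank from the numbers listed above. The dictionary layout and function name are just illustrative choices.

```python
# Rmax in petaflops at each rank: (November 2019, June 2020), as listed above.
RMAX_BY_RANK = {
    10: (18.2, 21.23),
    100: (2.57, 2.802),
    500: (1.142, 1.23),
}


def growth_percent(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100.0


for rank, (nov_2019, jun_2020) in RMAX_BY_RANK.items():
    pct = growth_percent(nov_2019, jun_2020)
    print(f"#{rank}: {nov_2019} -> {jun_2020} petaflops ({pct:.1f}% in six months)")
```

By these figures the bar rose fastest near the top: roughly 17% at #10, versus about 9% at #100 and 8% at #500.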

Every six months TOP500.org announces its list of the top 500 fastest supercomputers. The new TOP500 list -- their 55th -- was announced today with a brand new system at the top.

Installed at the RIKEN Center for Computational Science, the system is named Fugaku. It is built from Fujitsu A64FX SoCs, each of which sports 48 cores at 2.2 GHz and is based on the ARM architecture. In total, it has 7,299,072 cores and attains an Rmax of 415.5 PFlop/s on the High Performance Linpack benchmark.

The previous top system is now in 2nd place. Summit is located at Oak Ridge National Laboratory and was built by IBM. Each node has two 22-core 3.07 GHz Power9 CPUs and six NVIDIA Tesla V100 GPUs. With a total of 2,414,592 cores, it is rated at an Rmax of 148.6 PFlop/s.
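For a quick side-by-side, the sketch below computes the Fugaku-to-Summit ratios and a rough energy-efficiency figure from the numbers quoted in this article. The power values are the rounded 28 MW and 10 MW figures, so the efficiency results are approximate, and the dictionary and function names are illustrative only.

```python
# Figures quoted in the article; power values are rounded megawatts.
SYSTEMS = {
    "Fugaku": {"cores": 7_299_072, "rmax_pflops": 415.5, "power_mw": 28.0},
    "Summit": {"cores": 2_414_592, "rmax_pflops": 148.6, "power_mw": 10.0},
}


def ratio(metric):
    """Fugaku's value for a metric divided by Summit's."""
    return SYSTEMS["Fugaku"][metric] / SYSTEMS["Summit"][metric]


def gflops_per_watt(name):
    """Rmax per watt; 1 PFlop/s = 1e6 GFlop/s and 1 MW = 1e6 W, so the factors cancel."""
    system = SYSTEMS[name]
    return system["rmax_pflops"] / system["power_mw"]


for metric in ("cores", "rmax_pflops", "power_mw"):
    print(f"{metric}: {ratio(metric):.2f}x")

for name in SYSTEMS:
    print(f"{name}: {gflops_per_watt(name):.1f} GFlop/s per watt")
```

With these rounded inputs the two machines come out nearly identical in Linpack performance per watt, a little under 15 GFlop/s per watt each. Fugaku's lead comes from scale rather than from a large efficiency gap.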

Rounding out the top 3 is Sierra, which was also built by IBM. It has 22-core POWER9 CPUs running at 3.1 GHz and NVIDIA Volta GV100 GPUs. Its Rmax is 94.6 PFlop/s.

When the list was first published in June of 1993, the top system on the list, installed at Los Alamos National Laboratory, was a CM-5/1024 by Thinking Machines Corporation. With 1,024 cores, it was rated at an Rmax of 59.7 GFlop/s. (Going by Rmax, it would take roughly 7 million of them to match the compute power of today's number one system.) In June 1993, #100 was a Cray Y-MP8/8128 installed at Lawrence Livermore National Laboratory and rated at an Rmax of 2.1 GFlop/s. On that first list, 500th place went to an HPE C3840 having 4 cores and an Rmax of 0.4 GFlop/s. Yes, that is 400 MFlop/s.
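The scale of that gap is easy to check with a unit conversion. The sketch below divides Fugaku's Rmax by the Rmax of each 1993 system mentioned above. It is plain division on the quoted Rmax values and ignores the fact that real Linpack performance does not scale linearly across machines. The names are illustrative.

```python
GFLOPS_PER_PFLOPS = 1_000_000

# Fugaku's June 2020 Rmax, converted from PFlop/s to GFlop/s.
FUGAKU_RMAX_GFLOPS = 415.5 * GFLOPS_PER_PFLOPS

# Rmax in GFlop/s for the June 1993 systems mentioned above.
FIRST_LIST_RMAX_GFLOPS = {
    "#1 CM-5/1024": 59.7,
    "#100 Cray Y-MP8/8128": 2.1,
    "#500 C3840": 0.4,
}


def machines_needed(rmax_gflops):
    """How many machines of the given Rmax add up to Fugaku's Rmax."""
    return FUGAKU_RMAX_GFLOPS / rmax_gflops


for name, rmax in FIRST_LIST_RMAX_GFLOPS.items():
    millions = machines_needed(rmax) / 1_000_000
    print(f"{name}: about {millions:,.1f} million machines")
```

On these numbers it takes about 7 million CM-5/1024 systems, or just over a billion of the 1993 list's 500th-place machine, to equal Fugaku's Rmax.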

I wonder how today's cell phones would rate against that first list?

For the curious, the benchmark code can be downloaded from http://www.netlib.org/benchmark/hpl/.