The U.S. leads the June 2018 TOP500 list with a 122.3 petaflops system:
The TOP500 celebrates its 25th anniversary with a major shakeup at the top of the list. For the first time since November 2012, the US claims the most powerful supercomputer in the world, leading a significant turnover in which four of the five top systems were either new or substantially upgraded.
Summit, an IBM-built supercomputer now running at the Department of Energy's (DOE) Oak Ridge National Laboratory (ORNL), captured the number one spot with a performance of 122.3 petaflops on High Performance Linpack (HPL), the benchmark used to rank the TOP500 list. Summit has 4,356 nodes, each one equipped with two 22-core Power9 CPUs, and six NVIDIA Tesla V100 GPUs. The nodes are linked together with a Mellanox dual-rail EDR InfiniBand network.
[...] Sierra, a new system at the DOE's Lawrence Livermore National Laboratory took the number three spot, delivering 71.6 petaflops on HPL. Built by IBM, Sierra's architecture is quite similar to that of Summit, with each of its 4,320 nodes powered by two Power9 CPUs plus four NVIDIA Tesla V100 GPUs and using the same Mellanox EDR InfiniBand as the system interconnect.
The #100 system has an Rmax of 1.703 petaflops, up from 1.283 petaflops in November. The #500 system has an Rmax of 715.6 teraflops, up from 548.7 teraflops in November.
273 systems have a performance of at least 1 petaflops, up from 181 systems. The combined performance of the top 500 systems is 1.22 exaflops, up from 845 petaflops.
On the Green500 list, Shoubu system B's efficiency has been adjusted to 18.404 gigaflops per Watt from 17.009 GFLOPS/W. The Summit supercomputer, #1 on TOP500, debuts at #5 on the Green500 with 13.889 GFLOPS/W. Japan's AI Bridging Cloud Infrastructure (ABCI) supercomputer, #5 on TOP500 (19.88 petaflops Rmax), is #8 on the Green500 with 12.054 GFLOPS/W.
Previously: TOP500 List #50 and Green500 List #21: November 2017
Related Stories
The fiftieth TOP500 list has been released. Although there has been little change at the top of the list, China now dominates the list in terms of the number of systems, rising to 202 from 160 in June, with the U.S. falling to 143 systems from 169. However, this seems to be the result of Chinese vendors pushing more commercial systems to get on the list:
An examination of the new systems China is adding to the list indicates concerted efforts by Chinese vendors Inspur, Lenovo, Sugon and more recently Huawei to benchmark loosely coupled Web/cloud systems that strain the definition of HPC. To wit, 68 out of the 96 systems that China introduced onto the latest list utilize 10G networking and none are deployed at research sites. The benchmarking of Internet and telecom systems for Top500 glory is not new. You can see similar fingerprints on the list (current and historical) from HPE and IBM, but China has doubled down. For comparison's sake, the US put 19 new systems on the list and eight of those rely on 10G networking. [...] Snell provided additional perspective: "What we're seeing is a concerted effort to list systems in China, particularly from China-based system vendors. The submission rules allow for what is essentially benchmarking by proxy. If Linpack is run and verified on one system, the result can be assumed for other systems of the same (or greater) configuration, so it's possible to put together concerted efforts to list more systems, whether out of a desire to show apparent market share, or simply for national pride."
Sunway TaihuLight continues to lead the list at just over 93 petaflops. The Gyoukou supercomputer has jumped from #69 (~1.677 petaflops) in the June list to #4 (~19.136 petaflops). Due to its use of PEZY "manycore" processors, Gyoukou is now the supercomputer with the highest number of cores in the list's history (19,860,000). The Trinity supercomputer has been upgraded with Xeon Phi processors, more than tripling the core count and bringing performance to ~14.137 petaflops (#7) from ~8.1 petaflops (#10). Each of the top 10 supercomputers now has a measured LINPACK performance of at least 10 petaflops.
The #100 system has an Rmax of 1.283 petaflops, up from 1.193 petaflops in June. The #500 system has an Rmax of 548.7 teraflops, up from 432.2 teraflops in June. 181 systems have a performance of at least 1 petaflops, up from 138 systems. The combined performance of the top 500 systems is 845 petaflops, up from 749 petaflops.
Things are a little more interesting on the Green500 list. The Shoubu system B has an efficiency of 17.009 gigaflops per Watt, up from TSUBAME3.0's 14.11 GFLOPS/W at the #1 position in June (TSUBAME3.0 quadrupled its performance while its efficiency dipped to 13.704 GFLOPS/W (#6) on the new list). The top 4 systems all exceed 15 GFLOPS/W. #5 on the Green500 list is Gyoukou, which is #4 on the TOP500. Piz Daint is hanging in there at #10 on the Green500 list and #3 on the TOP500.
All of the new top 3 systems on the Green500 list (and Gyoukou at #5) use the PEZY-SC2 manycore processor. The SC2 has 2,048 cores and 8 threads per core, and has a single-precision peak performance of about 8.192 TFLOPS. Each SC2 also includes six MIPS management cores, making it possible to eliminate the need for an Intel Xeon host processor, although that has not been done in any of the new systems.
At 17 GFLOPS/W, it would take about 58.8 megawatts to power a 1 exaflops supercomputer. 20-25 MW is the preferred power level for initial exascale systems, although we may see a 40 MW system.
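As a quick sanity check of that figure in Python (17 GFLOPS/W is roughly the Green500-leading efficiency quoted above; 1 exaflops is the target):

# Back-of-the-envelope power estimate: flops target divided by
# efficiency (flops per watt) gives the power draw in watts.
target_flops = 1e18   # 1 exaflops
efficiency = 17e9     # 17 GFLOPS/W, roughly the Green500 leader above

power_watts = target_flops / efficiency
print(f"{power_watts / 1e6:.1f} MW")  # -> 58.8 MW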
Previously: New List of TOP500 Supercomputers [Updated]
In an interview posted just before the release of the latest TOP500 list, high performance computing expert Dr. Thomas Sterling (one of the two builders of the original "Beowulf cluster") had this to say about the possibility of reaching "zettascale" (beyond 1,000 exaflops):
I'll close here by mentioning two other possibilities that, while not widely considered currently, are nonetheless worthy of research. The first is superconducting supercomputing and the second is non-von Neumann architectures. Interestingly, the two at least in some forms can serve each other, making both viable and highly competitive with respect to future post-exascale computing designs. Niobium Josephson Junction-based technologies cooled to four Kelvins can operate beyond 100 and 200 GHz and have slowly evolved over two or more decades. Where once such cold temperatures were considered a show stopper, now quantum computing – or at least quantum annealing – typically is performed at 40 milli-Kelvins or lower, where four Kelvins would appear like a balmy day on the beach. But latencies measured in cycles grow proportionally with clock rate, and superconducting supercomputing must take a very distinct form from typical von Neumann cores; this is a controversial view, by the way.
Possible alternative non-von Neumann architectures that would address this challenge are cellular automata and data flow, both with their own problems, of course – nothing is easy. I introduce this thought not to necessarily advocate for a pet project – it is a pet project of mine – but to suggest that the view of the future possibilities as we enter the post-exascale era is a wide and exciting field at a time where we may cross a singularity before relaxing once again on a path of incremental optimizations.
I once said in public and in writing that I predicted we would never get to zettaflops computing. Here, I retract this prediction and contribute a contradicting assertion: zettaflops can be achieved in less than 10 years if we adopt innovations in non-von Neumann architecture. With a change to cryogenic technologies, we can reach yottaflops by 2030.
The rest of the interview covers a number of interesting topics, such as China's increased presence on the supercomputing list.
Also at NextBigFuture.
Previously: Thomas Sterling: 'I Think We Will Never Reach Zettaflops' (2012)
Related: IBM Reduces Neural Network Energy Consumption Using Analog Memory and Non-Von Neumann Architecture
IEEE Releases the International Roadmap for Devices and Systems (IRDS)
June 2018 TOP500 List: U.S. Claims #1 and #3 Spots
(Score: 2) by opinionated_science on Monday June 25 2018, @11:21PM (5 children)
Anyone know the Linpack score for just the Power9 chips? Or even some real results?
I toured the ORNL facility a few months back, but all the focus was on the GPUs, which Nvidia has plenty of info on!!
(Score: 2) by takyon on Tuesday June 26 2018, @12:00AM (4 children)
I was under the impression that Xeons and Power9s just managed the GPUs and/or manycore chips.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 0) by Anonymous Coward on Tuesday June 26 2018, @12:13AM (2 children)
Do you really need 44 cores running at 4 GHz to manage 6 V100s?
(Score: 3, Funny) by bob_super on Tuesday June 26 2018, @12:17AM (1 child)
Let me tell you about that wonderful Javascript/Ruby Framework I just found for your application ...
(Score: 2) by c0lo on Tuesday June 26 2018, @12:34AM
Sooo outdated, pops!
We now have [wikipedia.org]: Dart, Go, Python, Rust, Shell, TypeScript
https://www.youtube.com/watch?v=aoFiw2jMy-0
(Score: 2) by opinionated_science on Wednesday June 27 2018, @11:36AM
Well, that was certainly the case for Titan (I know the guys who LINPACK'd it).
I was curious because a bit of arithmetic puts each GPU at 4.679 TFlop/s (28 TF/node).
Nvidia's marketing blurb puts the V100 at 7.8 TF, so this is approximately 60% efficiency.
Since many applications still use the CPU for a lot of work, I was wondering what the Power9 pulls on its own until I get to play with it.
I managed to get the Xeon Phis to give 990 GFlops (almost 1 TF), though you could use it to heat an apartment!!!
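In Python, for anyone who wants to check the arithmetic (node and GPU counts are from the TOP500 summary above; this assumes essentially all of the flops come from the GPUs):

# Per-GPU share of Summit's HPL result, CPU contribution ignored.
rmax = 122.3e15       # Summit's Rmax, in flops
nodes = 4356
gpus_per_node = 6
v100_peak = 7.8e12    # Nvidia's quoted V100 double-precision peak

per_node = rmax / nodes             # ~28.1 TF per node
per_gpu = per_node / gpus_per_node  # ~4.68 TF per GPU
print(f"{per_gpu / 1e12:.3f} TF/GPU = {per_gpu / v100_peak:.0%} of peak")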
(Score: 2) by takyon on Tuesday June 26 2018, @12:32AM (15 children)
Pretty soon, all of the top 500 systems will exceed 1 petaflops. No more crappy terascale machines!
And finally, all 500 systems combined total over 1 exaflops. If you compare the combined top-500 performance to the #1 system, the ratio gives you an estimate of how many combined exaflops we'll have by the time the #1 system hits 1 exaflops:
1.22 exaflops / 122.3 petaflops ~= 10
845 petaflops / 93 petaflops ~= 9 (Nov. 2017)
566.7 petaflops / 93 petaflops ~= 6 (Jun. 2016)
420 petaflops / 33.9 petaflops ~= 12.4 (Nov. 2015)
We'll probably have a combined 5-8 exaflops when the first 1 exaflops system lands. Maybe more if two or three countries try to rush to claim the milestone.
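Checking those ratios in Python (figures as quoted above, all in petaflops):

# Ratio of combined TOP500 Rmax to the #1 system's Rmax.
lists = [
    ("Jun. 2018", 1220.0, 122.3),
    ("Nov. 2017", 845.0, 93.0),
    ("Jun. 2016", 566.7, 93.0),
    ("Nov. 2015", 420.0, 33.9),
]

for date, combined, top in lists:
    print(f"{date}: {combined / top:.1f}x")  # 10.0x, 9.1x, 6.1x, 12.4x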
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 2) by c0lo on Tuesday June 26 2018, @12:39AM (13 children)
And what exactly are 'we' doing with the combined exaflop?
other than listing them in the top 500, like a sorta pissing contest
My point: combined 'flopping power' seems a meaningless metric.
https://www.youtube.com/watch?v=aoFiw2jMy-0
(Score: 3, Informative) by takyon on Tuesday June 26 2018, @12:45AM
It depends on the system or country. China is said to have had trouble fully utilizing Sunway TaihuLight, but most U.S. systems likely exceed 90% utilization most of the time:
http://www.nersc.gov/users/live-status/ [nersc.gov]
https://portal.tacc.utexas.edu [utexas.edu]
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 2) by bob_super on Tuesday June 26 2018, @01:19AM
> combined 'flopping power' seems a meaningless metric.
Wait a couple of years for the Italians to come back; synchronized flopping will amaze you.
/world_cup
(Score: 2) by c0lo on Tuesday June 26 2018, @01:27AM (8 children)
So there's no 'we' as in 'we use the exaflopping for a common purpose/project'; it's "each one for oneself".
How's that 'combined exaflop comp-power' meaningful in these circumstances?
https://www.youtube.com/watch?v=aoFiw2jMy-0
(Score: 2) by takyon on Tuesday June 26 2018, @02:29AM
Wow, someone is a standard deviation more cynical than usual today.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 2) by MostCynical on Tuesday June 26 2018, @03:05AM (1 child)
World peace!
No more disease!
No more famine!
Nup, just moar! We have a bigger one! Yay team!
"I guess once you start doubting, there's no end to it." -Batou, Ghost in the Shell: Stand Alone Complex
(Score: 3, Touché) by takyon on Tuesday June 26 2018, @03:09AM
-1, Not as Cynical as Nu-c0lo
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 1) by khallow on Tuesday June 26 2018, @04:13AM (4 children)
So... one really big computer for everyone? What happens when some admin decides I can't read SN anymore because they're using the computer for more important stuff?
(Score: 2) by c0lo on Tuesday June 26 2018, @04:41AM (3 children)
I didn't imply that 'we should have a single super-computer'. I was just making the point that we don't have one.
And I made this point to set the context for the question: what use is the 'total-compute-power' if we have neither a single computer nor a single project that requires the use of all of them?
And it's a genuine question, no implication that we should or should not use the 'total-compute-power'.
I hope it's clearer now. And, as usual, any pertinent (to my mind) answer will get the deserved upmod from me.
https://www.youtube.com/watch?v=aoFiw2jMy-0
(Score: 1) by khallow on Tuesday June 26 2018, @04:50AM (2 children)
Why wouldn't we be interested in understanding the quantity of computing power available?
(Score: 2) by c0lo on Tuesday June 26 2018, @05:06AM (1 child)
I don't know why I would not.
I don't know why I would, either.
This is why I asked, "what would be the use of this metric?"
https://www.youtube.com/watch?v=aoFiw2jMy-0
(Score: 1) by khallow on Tuesday June 26 2018, @11:14AM
For example, if you're thinking about building a machine that does computations that would be closely measured by the High Performance Linpack benchmark, this total would give you some idea of where your machine would stack up against current registered competition.
And who knows, there are real world problems that one can throw all that computing power at. Perhaps some day, one of those problems will become important enough to do so.
(Score: 4, Interesting) by martyb on Tuesday June 26 2018, @11:18AM (1 child)
Well, not entirely meaningless...
Let's look at the TOP500 list for 1993 [top500.org] which was the first time the list was published.
According to my calculations, the sum of the "Rpeak" for all systems on the list in June of 1993 came to 1798.1 GFlop/s.
tl;dr: If you gathered every single one of the 500 fastest computers on planet Earth in 1993 and put them all in one place, and somehow found a way to make them all work together, you would have a combined *peak* performance of ~1.8 teraflops... OR... you could just buy a *single* graphics *card* today.
Though effectively useless in an analytical sense, it quite effectively gives me a subjective sense of the amazing march of computer processing power over the past quarter century.
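For anyone checking, the comparison in Python (1798.1 GFlop/s is the Rpeak sum above; the single-card figure, a V100's ~7.8 TF double-precision peak, is just an illustrative stand-in for "a graphics card today"):

# One modern accelerator vs. the entire June 1993 TOP500 list.
list_1993_rpeak = 1798.1e9  # combined Rpeak of the 1993 list, in flops
gpu_peak = 7.8e12           # e.g. a V100's double-precision peak (illustrative)

print(f"{gpu_peak / list_1993_rpeak:.1f}x")  # -> ~4.3x the whole 1993 list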
Wit is intellect, dancing.
(Score: 2) by c0lo on Tuesday June 26 2018, @11:40AM
Thanks, the angle is interesting.
See, they tried to do it 1 year later with the Beowulf cluster [wikipedia.org]... and inadvertently triggered the creation of the green site; the meme absolutely needed a place to come into existence.
https://www.youtube.com/watch?v=aoFiw2jMy-0
(Score: 3, Informative) by martyb on Tuesday June 26 2018, @06:12PM
Well, THAT was interesting!
I downloaded all the TOP500 reports from the beginning (June of 1993).
Then, for each of those reports, I computed the sum of both Rmax and Rpeak.
Here is what I found:
So, the top system in June of 1993 had Rmax of 59 (GFlops). The sum of Rmax for all 500 systems on that list totalled 1127. That means the top system had 5.2% of the Rmax performance of all 500 computers on that list, combined. Similarly, the top system had Rpeak of 131 (GFlops). The sum of Rpeak for all 500 systems totalled 1798. Thus, the top system had 7.3% of the Rpeak performance of all the systems, combined.
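A sketch of that computation in Python, assuming each list was exported to a CSV with "Rmax" and "Rpeak" columns (the file name and column labels here are illustrative, not TOP500's official export format):

# Per-list totals and the #1 system's share, for one saved list.
import csv

def summarize(path):
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    for col in ("Rmax", "Rpeak"):
        values = [float(r[col]) for r in rows]
        total = sum(values)
        print(f"{col}: sum {total:.0f}, top share {max(values) / total:.1%}")

summarize("top500_199306.csv")
# June 1993 -> Rmax: sum 1127, top share 5.2%
#              Rpeak: sum 1798, top share 7.3%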
Wit is intellect, dancing.
(Score: 2) by jasassin on Tuesday June 26 2018, @12:40AM
Yeah, but can it run Gnome with decent performance?
jasassin@gmail.com GPG Key ID: 0x663EB663D1E7F223
(Score: 2) by Gaaark on Tuesday June 26 2018, @01:01AM (1 child)
Wonder what the super-secret NSA computers can do?
How fast is a de-encrypto-flop?
--- Please remind me if I haven't been civil to you: I'm channeling MDC. ---Gaaark 2.0 ---
(Score: 2) by takyon on Tuesday June 26 2018, @01:24AM
An order of magnitude or two isn't going to make a big difference. What they want is a working quantum computer. Do it right and 4,096-bit RSA is dead.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 0) by Anonymous Coward on Tuesday June 26 2018, @03:41AM (1 child)
And yet, still no sign of emergence...
(Score: 2) by FatPhil on Tuesday June 26 2018, @01:13PM
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves