A new list has been published on top500.org. Note that the NSA, Google, Amazon, Microsoft, etc. do not submit information to this list. Currently, the top two places are occupied by China, with a comfortable lead of 400% in peak performance and 370% in Rmax over third place (Switzerland). The US appears at rank 4, Japan at rank 7, and Germany is not in the top ten at all.
All operating systems in the top 10 are Linux and its derivatives. It seems obvious that, since this is highly optimized hardware, the only viable operating systems are those that can be fine-tuned (so, either open source or with vendor support for such customizations). Still, I would have thought that, since a lot of effort needs to be invested anyway, other systems (BSD?) might be equally suited to the task.
takyon: TSUBAME3.0 leads the Green500 list with 14.110 gigaflops per Watt. Piz Daint is #3 on the TOP500 and #6 on the Green500 list, at 10.398 gigaflops per Watt.
According to TOP500, this is only the second time in the history of the list that the U.S. has not secured one of the top 3 positions.
The #100 and #500 positions on the June 2017 list have an Rmax of 1.193 petaflops and 432.2 teraflops respectively. Compare that with 1.0733 petaflops and 349.3 teraflops on the November 2016 list.
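To put those entry thresholds in perspective, here is a minimal Python sketch that works out the six-month growth from the four Rmax figures quoted above. The helper name `pct_increase` and the dictionary layout are illustrative choices, not anything published by TOP500; the petaflops values are converted to teraflops so both lists use the same unit.

```python
def pct_increase(old, new):
    """Return the percentage increase going from old to new."""
    return (new - old) / old * 100.0

# Rmax thresholds quoted in the summary, all expressed in teraflops.
nov_2016 = {"#100": 1073.3, "#500": 349.3}
jun_2017 = {"#100": 1193.0, "#500": 432.2}

# Growth of each threshold between the two lists (six months apart).
growth = {rank: pct_increase(nov_2016[rank], jun_2017[rank]) for rank in nov_2016}

for rank, pct in growth.items():
    print(f"{rank}: +{pct:.1f}% over six months")
```

By this arithmetic, the bar for #100 rose by roughly 11% and the bar for #500 by roughly 24% between the two lists.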
[Update: Historical lists can be found on https://www.top500.org/lists/. There was a time when you only needed 0.4 gigaflops to make the original Top500 list — how do today's mobile phones compare? --martyb]
(Score: 2) by TheRaven on Wednesday June 21 2017, @08:46AM
IBM used to use a proprietary BSD derivative, but switched to Linux because of the brand recognition. Talking to a friend who runs a few of these at Argonne, they're not even particularly interested in clever OpenMP runtimes for the same reason: the job of the OS and OpenMP runtime is to get out of the way while the carefully optimised code runs. If your OpenMP task scheduler is a bit more clever, you'll still probably lose overall from spending more CPU time in it (this may change with more accelerators, if you can designate a CPU core to running profiling and scheduling tasks and run all of the real work on more throughput-optimised cores).
sudo mod me up