
posted by martyb on Tuesday June 20 2017, @03:34PM
from the Is-that-a-Cray-in-your-pocket? dept.

A new list has been published at top500.org. It may be noteworthy that the NSA, Google, Amazon, Microsoft, etc. do not submit information to this list. Currently, the top two places are occupied by China, with a comfortable head start of roughly 400% in peak performance and 370% in Rmax over third place (Switzerland). The US first appears at rank 4, Japan at rank 7, and Germany is not in the top ten at all.

All operating systems in the top 10 are Linux and derivatives. It seems obvious that, on such highly optimized hardware, only operating systems that can be fine-tuned are viable (so, either open source or with vendor support for such customizations). Still, I would have thought that, since a lot of effort has to be invested anyway, other systems (BSD?) might be equally suited to the task.

Rank | Site | System | Cores | Rmax (TFlop/s) | Rpeak (TFlop/s) | Power (kW)
1 | China: National Supercomputing Center in Wuxi | Sunway TaihuLight - Sunway MPP, Sunway SW26010 260C 1.45GHz, Sunway - NRCPC | 10,649,600 | 93,014.6 | 125,435.9 | 15,371
2 | China: National Super Computer Center in Guangzhou | Tianhe-2 (MilkyWay-2) - TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P - NUDT | 3,120,000 | 33,862.7 | 54,902.4 | 17,808
3 | Switzerland: Swiss National Supercomputing Centre (CSCS) | Piz Daint - Cray XC50, Xeon E5-2690v3 12C 2.6GHz, Aries interconnect, NVIDIA Tesla P100 - Cray Inc. | 361,760 | 19,590.0 | 25,326.3 | 2,272
4 | U.S.: DOE/SC/Oak Ridge National Laboratory | Titan - Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x - Cray Inc. | 560,640 | 17,590.0 | 27,112.5 | 8,209
5 | U.S.: DOE/NNSA/LLNL | Sequoia - BlueGene/Q, Power BQC 16C 1.60GHz, Custom - IBM | 1,572,864 | 17,173.2 | 20,132.7 | 7,890
6 | U.S.: DOE/SC/LBNL/NERSC | Cori - Cray XC40, Intel Xeon Phi 7250 68C 1.4GHz, Aries interconnect - Cray Inc. | 622,336 | 14,014.7 | 27,880.7 | 3,939
7 | Japan: Joint Center for Advanced High Performance Computing | Oakforest-PACS - PRIMERGY CX1640 M1, Intel Xeon Phi 7250 68C 1.4GHz, Intel Omni-Path - Fujitsu | 556,104 | 13,554.6 | 24,913.5 | 2,719
8 | Japan: RIKEN Advanced Institute for Computational Science (AICS) | K computer, SPARC64 VIIIfx 2.0GHz, Tofu interconnect - Fujitsu | 705,024 | 10,510.0 | 11,280.4 | 12,660
9 | U.S.: DOE/SC/Argonne National Laboratory | Mira - BlueGene/Q, Power BQC 16C 1.60GHz, Custom - IBM | 786,432 | 8,586.6 | 10,066.3 | 3,945
10 | U.S.: DOE/NNSA/LANL/SNL | Trinity - Cray XC40, Xeon E5-2698v3 16C 2.3GHz, Aries interconnect - Cray Inc. | 301,056 | 8,100.9 | 11,078.9 | 4,233

takyon: TSUBAME3.0 leads the Green500 list with 14.110 gigaflops per Watt. Piz Daint is #3 on the TOP500 and #6 on the Green500 list, at 10.398 gigaflops per Watt.

According to TOP500, this is only the second time in the history of the list that the U.S. has not secured one of the top 3 positions.

The #100 and #500 positions on June 2017's list have an Rmax of 1.193 petaflops and 432.2 teraflops respectively. Compare to 1.0733 petaflops and 349.3 teraflops for the November 2016 list.

[Update: Historical lists can be found on https://www.top500.org/lists/. There was a time when you only needed 0.4 gigaflops to make the original Top500 list — how do today's mobile phones compare? --martyb]


Original Submission

 
  • (Score: 0) by Anonymous Coward on Tuesday June 20 2017, @04:40PM (#528574) (1 child)

    Doesn't "Piz Daint Cray" sound like something you'd hear on the streets of Philadelphia?

    Seriously now, how do you design software and toolchains to take advantage of a large number of cores? Do you need to use specialized languages?

    How do you manage I/O? I/O is a bottleneck in desktop computing, so it must be a serious consideration in supercomputing.

    How hot does the room get when you have all of those cores running at once? If you wanted to build a supercomputer that could run at room temperature with minimal air conditioning, how big a performance hit would you have to take?

  • (Score: 1, Insightful) by Anonymous Coward on Tuesday June 20 2017, @06:32PM (#528636)

    [...] how do you design software and toolchains to take advantage of a large number of cores? Do you need to use specialized languages?

    How do you manage I/O? I/O is a bottleneck in desktop computing, so it must be a serious consideration in supercomputing.

    How hot does the room get when you have all of those cores running at once? If you wanted to build a supercomputer that could run at room temperature with minimal air conditioning, how big a performance hit would you have to take?

    1) A lot can be done with fairly standard languages. Fortran does very well in the HPC (high-performance computing) space. Many younger people think of Fortran as an outdated dinosaur, but that couldn't be further from the truth. Fortran is a modern language with features such as various kinds of closures, object-oriented support, and much, much more. Now, I don't personally care for some aspects of the syntax, but Fortran is no dog. In fact, because arrays in Fortran are first-class language constructs (unlike C, where everything is just a bare pointer to a chunk of memory plus an offset), Fortran code often (nearly always) outperforms other languages like C and C++: more information is available to the compiler during optimization. The compiler just can't "see" as much of what's going on in C code, with bare pointers all over the place. That said, more and more HPC codes are being written in or ported to C/C++ now than ever before. C and C++ work just fine on supercomputers as long as the code is written very carefully. Start with MPI and OpenMP in Fortran or plain C if you want the easiest possible entry point into programming supercomputers; a minimal sketch follows below. The fun part is that you can even run these MPI+OpenMP codes on your home desktop to test them out (at a tiny scale). -- That said, there are many custom languages, frameworks, libraries, etc. available on these machines, if one feels like getting really fancy.
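
    As a rough sketch of that MPI + OpenMP starting point (assuming an MPI installation with the usual mpicc wrapper and an OpenMP-capable compiler; the file name and launch command are just illustrative), a minimal hybrid "hello world" in C might look like this:

        /* hello_hybrid.c -- each MPI rank spawns OpenMP threads and every
         * thread reports its identity.
         * Build:  mpicc -fopenmp hello_hybrid.c -o hello_hybrid
         * Run:    mpirun -np 4 ./hello_hybrid
         */
        #include <mpi.h>
        #include <omp.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            int provided, rank, nranks;

            /* Request a threading level that allows OpenMP threads inside
             * each MPI rank (FUNNELED is enough here: only the master
             * thread makes MPI calls). */
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nranks);

            #pragma omp parallel
            {
                printf("rank %d of %d, thread %d of %d\n",
                       rank, nranks,
                       omp_get_thread_num(), omp_get_num_threads());
            }

            MPI_Finalize();
            return 0;
        }

    The same binary runs on a laptop with a few ranks and threads or across many nodes of a cluster; only the job launcher and the rank/thread counts change.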

    2) I/O is a big issue. Cray tends to use the Lustre parallel filesystem on its machines, but other supercomputers use different parallel filesystems such as GPFS. If you've ever set up an NFS server, you can think of what supercomputers use as the same basic idea, but "on steroids": one filesystem spans many servers and many disks, so as to provide a high level of parallelism to the highly parallel application code. Cray also offers nodes with "burst buffers", which are just SSDs sitting on the compute nodes along with some nice software to expose them to the compute processes in an easy-to-use way. That said, I should make a comment for those not in the HPC space: the best way to speed up I/O is not to do it. At supercomputer scales, many people take the stance that nothing should touch disk unless it has to, so data is not passed around in temp files on disk. Instead, data is communicated directly between compute nodes, in memory, over the high-speed interconnect whenever possible. I/O is a large and complicated subject, and I'm just scratching the surface here; a minimal MPI-IO sketch follows below.
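
    As a rough sketch of the "one shared file, written in parallel" idea (the file name, block size, and data below are made-up examples, and a real code would check return values), MPI-IO lets every rank write its own slice of a single file at a rank-dependent offset, which the parallel filesystem (Lustre, GPFS, ...) then services in parallel:

        /* write_shared.c -- each rank writes BLOCK doubles into a single
         * shared file at an offset determined by its rank.
         * Build:  mpicc write_shared.c -o write_shared
         * Run:    mpirun -np 4 ./write_shared
         */
        #include <mpi.h>

        #define BLOCK 1024              /* doubles per rank (example size) */

        int main(int argc, char **argv)
        {
            int rank;
            double buf[BLOCK];
            MPI_File fh;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            /* Fill the local block with rank-specific data. */
            for (int i = 0; i < BLOCK; i++)
                buf[i] = rank + i * 1e-6;

            /* All ranks open the same file; the collective write lets the
             * MPI-IO layer coordinate access to the parallel filesystem. */
            MPI_File_open(MPI_COMM_WORLD, "output.dat",
                          MPI_MODE_CREATE | MPI_MODE_WRONLY,
                          MPI_INFO_NULL, &fh);

            MPI_Offset offset = (MPI_Offset)rank * BLOCK * sizeof(double);
            MPI_File_write_at_all(fh, offset, buf, BLOCK, MPI_DOUBLE,
                                  MPI_STATUS_IGNORE);

            MPI_File_close(&fh);
            MPI_Finalize();
            return 0;
        }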

    3) I'm going to guess that this is mostly a cost issue, and a lot of it comes down to compute density, in terms of floor space in the datacenter. Air cooling isn't a problem per se, but the same amount of compute power then takes more floor space, because it can't be packed as densely and still be cooled as well as with liquid cooling. So there's no real performance reason that these machines are liquid-cooled; it's just that you can build denser and fit the same performance into a smaller datacenter footprint. Remember that the datacenter can at times cost as much as, if not more than, the machine itself (depending on the machine and datacenter in question).