
posted by martyb on Tuesday June 20 2017, @03:34PM   Printer-friendly
from the Is-that-a-Cray-in-your-pocket? dept.

A new list has been published on top500.org. It may be noteworthy that the NSA, Google, Amazon, Microsoft, etc. do not submit information to this list. Currently, the top two places are occupied by China, with a comfortable ~400% lead in peak performance and ~370% lead in Rmax over third place (Switzerland). The US appears at rank 4, Japan at rank 7, and Germany is not in the top ten at all.

All operating systems in the top 10 are Linux or derivatives. It seems obvious that, since this is highly optimized hardware, only operating systems that can be fine-tuned are viable (so, either open source or with vendor support for such customizations). Still, I would have thought that, since a lot of effort needs to be invested anyway, other systems (BSD?) could be equally suited to the task.

Rank | Site | System | Cores | Rmax (TFlop/s) | Rpeak (TFlop/s) | Power (kW)
1 | China: National Supercomputing Center in Wuxi | Sunway TaihuLight - Sunway MPP, Sunway SW26010 260C 1.45GHz, Sunway - NRCPC | 10,649,600 | 93,014.6 | 125,435.9 | 15,371
2 | China: National Super Computer Center in Guangzhou | Tianhe-2 (MilkyWay-2) - TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P - NUDT | 3,120,000 | 33,862.7 | 54,902.4 | 17,808
3 | Switzerland: Swiss National Supercomputing Centre (CSCS) | Piz Daint - Cray XC50, Xeon E5-2690v3 12C 2.6GHz, Aries interconnect, NVIDIA Tesla P100 - Cray Inc. | 361,760 | 19,590.0 | 25,326.3 | 2,272
4 | U.S.: DOE/SC/Oak Ridge National Laboratory | Titan - Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x - Cray Inc. | 560,640 | 17,590.0 | 27,112.5 | 8,209
5 | U.S.: DOE/NNSA/LLNL | Sequoia - BlueGene/Q, Power BQC 16C 1.60GHz, Custom - IBM | 1,572,864 | 17,173.2 | 20,132.7 | 7,890
6 | U.S.: DOE/SC/LBNL/NERSC | Cori - Cray XC40, Intel Xeon Phi 7250 68C 1.4GHz, Aries interconnect - Cray Inc. | 622,336 | 14,014.7 | 27,880.7 | 3,939
7 | Japan: Joint Center for Advanced High Performance Computing | Oakforest-PACS - PRIMERGY CX1640 M1, Intel Xeon Phi 7250 68C 1.4GHz, Intel Omni-Path - Fujitsu | 556,104 | 13,554.6 | 24,913.5 | 2,719
8 | Japan: RIKEN Advanced Institute for Computational Science (AICS) | K computer, SPARC64 VIIIfx 2.0GHz, Tofu interconnect - Fujitsu | 705,024 | 10,510.0 | 11,280.4 | 12,660
9 | U.S.: DOE/SC/Argonne National Laboratory | Mira - BlueGene/Q, Power BQC 16C 1.60GHz, Custom - IBM | 786,432 | 8,586.6 | 10,066.3 | 3,945
10 | U.S.: DOE/NNSA/LANL/SNL | Trinity - Cray XC40, Xeon E5-2698v3 16C 2.3GHz, Aries interconnect - Cray Inc. | 301,056 | 8,100.9 | 11,078.9 | 4,233

takyon: TSUBAME3.0 leads the Green500 list with 14.110 gigaflops per Watt. Piz Daint is #3 on the TOP500 and #6 on the Green500 list, at 10.398 gigaflops per Watt.
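A rough efficiency figure can also be derived directly from the TOP500 table above (Green500 figures use their own power-measurement methodology, so they come out somewhat higher; this is only a back-of-the-envelope check):

```python
# Back-of-the-envelope power efficiency from the TOP500 table.
# Rmax is in TFlop/s and Power in kW, so TFlop/s / kW == GFlop/s / W.
def gflops_per_watt(rmax_tflops, power_kw):
    return rmax_tflops / power_kw

piz_daint = gflops_per_watt(19590.0, 2272)    # ~8.6 GFlop/s per W
taihulight = gflops_per_watt(93014.6, 15371)  # ~6.1 GFlop/s per W
print(f"Piz Daint:  {piz_daint:.2f} GFlop/s/W")
print(f"TaihuLight: {taihulight:.2f} GFlop/s/W")
```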

According to TOP500, this is only the second time in the history of the list that the U.S. has not secured one of the top 3 positions.

The #100 and #500 positions on June 2017's list have an Rmax of 1.193 petaflops and 432.2 teraflops respectively. Compare to 1.0733 petaflops and 349.3 teraflops for the November 2016 list.
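The growth of those two cut-off points between lists is easy to quantify:

```python
# Growth of the entry bar between the Nov 2016 and Jun 2017 lists,
# using the Rmax figures quoted above (in TFlop/s).
def pct_growth(new, old):
    return 100.0 * (new - old) / old

rank_100 = pct_growth(1193.0, 1073.3)
rank_500 = pct_growth(432.2, 349.3)
print(f"#100 cut-off grew {rank_100:.1f}%")  # ~11%
print(f"#500 cut-off grew {rank_500:.1f}%")  # ~24%
```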

[Update: Historical lists can be found on https://www.top500.org/lists/. There was a time when you only needed 0.4 gigaflops to make the original Top500 list — how do today's mobile phones compare? --martyb]
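As a toy answer to that question: a 2017 flagship phone GPU is often quoted in the low hundreds of FP32 GFLOPS. Taking that as an assumed ballpark (not a measured value), a phone clears the original 1993 cut-off many times over, while today's #500 machine is still thousands of times faster:

```python
# Hypothetical comparison against the original Top500 entry bar
# (0.4 GFlop/s) and the current #500 (432.2 TFlop/s).
# PHONE_GFLOPS is an ASSUMED ballpark for a 2017 flagship SoC (FP32),
# not a benchmark result.
PHONE_GFLOPS = 250.0

vs_original_list = PHONE_GFLOPS / 0.4    # multiples of the 1993 cut-off
vs_current_500 = 432.2e3 / PHONE_GFLOPS  # phones "needed" to match #500
print(f"One phone ~ {vs_original_list:.0f}x the original Top500 cut-off")
print(f"~ {vs_current_500:.0f} such phones to match today's #500 "
      f"(ignoring the interconnect entirely)")
```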


Original Submission

  • (Score: 2) by LoRdTAW on Tuesday June 20 2017, @05:05PM (2 children)

    I have a few:
    1) What's next in terms of supercomputing hardware? Are we still building what amounts to an Intel PC with a video card and a fancy interconnect? Or will we see more exotic hardware like those Google AI chips or the proposed DARPA CPU: https://soylentnews.org/article.pl?sid=17/06/12/1959259 [soylentnews.org]? What about Xeon Phis, Arm, AMD, GPU/APU, FPGAs, or ASICs?

    2) What bottlenecks do you currently have to deal with and how do you get around them? E.g. I/O, bandwidth, storage, memory, CPU/GPU, etc.?

    3) Is AI becoming a factor in supercomputing?

    4) Lastly, outside of AI and large government research projects, do you see any future applications for supercomputers?

  • (Score: 2, Interesting) by Anonymous Coward on Tuesday June 20 2017, @06:09PM (1 child)

    1) What's next in terms of supercomputing hardware? Are we still building what amounts to an Intel PC with a video card and a fancy interconnect? Or will we see more exotic hardware like those Google AI chips or the proposed DARPA CPU: https://soylentnews.org/article.pl?sid=17/06/12/1959259 [soylentnews.org]? What about Xeon Phis, Arm, AMD, GPU/APU, FPGAs, or ASICs?

    2) What bottlenecks do you currently have to deal with and how do you get around them? E.g. I/O, bandwidth, storage, memory, CPU/GPU, etc.?

    3) Is AI becoming a factor in supercomputing?

    4) Lastly, outside of AI and large government research projects, do you see any future applications for supercomputers?

    1) I have to mostly recuse myself from this particular question due to NDA concerns. However, I can point you to coverage of a recent announcement from the US government on funding for exascale research: Six Exascale PathForward Vendors Selected; DoE Providing $258M [hpcwire.com]. I can also say that Moore's Law and Dennard scaling are showing signs of slowing down, which means the easy performance gains from moving to a smaller manufacturing node (making transistors smaller) are not coming in like they used to. That leaves a little more room for clever hardware design instead of just shrinking the same old things, and perhaps a little more room for a real HPC-oriented CPU instead of just commodity/server parts, just maybe. -- Cray does currently sell systems with the Xeon Phi: consider Cori, #6 on the list.

    2) Yes. All of those are issues, and they all matter. If I were to pick one to focus on, I would probably pick the interconnect. I would say something like: a large number of CPUs in the same room does not a supercomputer make. That's just a lot of individual computers. To make a true supercomputer, you need all those tens of thousands of CPUs working together on the same problem, and this requires a very high-bandwidth, low-latency interconnect. Cray has historically placed a very strong emphasis on the interconnect, and has created several custom interconnects in the past when commodity parts were simply not good enough. Today, one can limp along with EDR InfiniBand and do OK on the smaller end of the supercomputer market; at the top end, IB actually gets very expensive. Cray's Aries interconnect still holds its own on real-world workloads, despite being a smidge old now. I can't comment on whether or when Cray plans to introduce a new interconnect as a follow-on to Aries. In addition to tackling the communications issues with a high-performance interconnect, a lot of work is also done on the software side. Cray has an optimized MPI library (if you want to learn to drive a supercomputer, learn MPI and OMP), and decades of experience scaling and optimizing codes. To give just one tip, always write your code to overlap communication and computation as much as possible. That is, initiate some communication, do some other work without waiting for the comms to finish, then wait for confirmation that the previously initiated communication has completed only after you've done said computation. Writing code that does a good job of this comm/comp overlap, where possible, is a good place to start.
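    The comm/comp overlap pattern described above can be sketched without an actual MPI cluster. In this toy Python version, a background thread stands in for a non-blocking MPI_Isend/MPI_Irecv pair and an event stands in for MPI_Wait (a pattern illustration, not real MPI code):

```python
import threading
import time

def fake_comm(duration, done):
    # Stand-in for an in-flight non-blocking MPI transfer.
    time.sleep(duration)
    done.set()

def step_overlapped(comm_s=0.2, comp_s=0.2):
    done = threading.Event()
    # "Initiate" the communication (like MPI_Isend/MPI_Irecv) ...
    t = threading.Thread(target=fake_comm, args=(comm_s, done))
    t.start()
    # ... do useful local computation while the transfer is in flight ...
    time.sleep(comp_s)  # stands in for real number crunching
    # ... and only then wait for completion (like MPI_Wait).
    done.wait()
    t.join()

start = time.perf_counter()
step_overlapped()
elapsed = time.perf_counter() - start
# Overlapped cost ~ max(comm, comp), not comm + comp.
print(f"overlapped step took {elapsed:.2f}s (sequential would be ~0.4s)")
```

    Because the transfer and the computation run concurrently, the step costs roughly max(comm, comp) instead of their sum; on a real machine that difference is where scaling efficiency comes from.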

    3) Yes. While NVIDIA likes to tell stories about being able to put a "supercomputer on a desk" or make a "supercomputer fit in a PCIe slot", people actually in the HPC (High Performance Computing) industry tend to laugh at this in private. While Deep Learning / Machine Learning does run very well on GPUs, a computer with 16 GPUs is not a supercomputer. Consider that Piz Daint, #3 on the list, has 5,320 P100 GPUs. Cray is actually uniquely positioned to have some of the highest-performing Deep Learning training runs in the world take place on some machines they've built. This has a lot to do with the interconnect and communications stack, but that's not Cray's only advantage here. Note that the size of the Deep Neural Networks in use in industry is increasing. These nets and their training data sets are growing very quickly, and soon they may not fit well on small machines that can't scale efficiently to 1000s of nodes, whether each of those nodes is a CPU, GPU, or other accelerator.
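    The scaling point above is essentially about data-parallel training: each node computes gradients on its own data shard, and an allreduce averages them so every node applies the same update. Here is a pure-Python toy of that averaging step, standing in for what the interconnect (e.g. an MPI allreduce) does on a real machine:

```python
# Toy data-parallel gradient averaging. Each "node" holds a gradient
# vector computed on its own shard; allreduce_mean sums element-wise,
# divides by the node count, and hands every node the same result.
def allreduce_mean(per_node_grads):
    n = len(per_node_grads)
    summed = [sum(vals) for vals in zip(*per_node_grads)]
    mean = [s / n for s in summed]
    return [mean[:] for _ in range(n)]  # every node gets a copy

grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]  # 4 nodes
result = allreduce_mean(grads)
print(result[0])  # the averaged gradient, identical on every node
```

    On thousands of nodes, this collective is executed every training step, which is why interconnect bandwidth and latency dominate large-scale training performance.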

    4) Yes. Actually, I'm very bullish about the future of the supercomputing market, particularly in the commercial (non-gov) space. More and more companies are increasing their investment in, and reliance on, computing in general, and this is also true at the high end of the market. From oil and gas (reservoir simulation) to traditional manufacturing (CFD, etc.), supercomputing is no longer solely an activity of national governments. While the largest machines may continue to be owned by governments, more and more companies are realizing the competitive advantages that can come from effective utilization of true supercomputer-class machines.