
from the but-does-it-run...-OK,-you've-heard-it-before dept.
The TOP500 List of the world's fastest supercomputers for June 2015 has been released. China's Tianhe-2 remains the leader with 33.86 petaflops on the LINPACK benchmark. It has topped the list since June 2013. The only new supercomputer in the top 10 is the Shaheen II in Saudi Arabia, a 5.536 PFlop/s Cray XC40 system using 196,608 Intel Xeon E5-2698v3 cores.
The Platform has an analysis of the results. Although performance growth is slowing, pre-exascale supercomputers (100+ petaflops) can be expected within the next two to three years. The U.S. Department of Energy's Aurora supercomputer will deliver 180 petaflops of performance in 2018. Around the same time, the Summit supercomputer is expected to reach 150-300 petaflops while Sierra will reach 100+ petaflops. Exascale (~1 exaflops) supercomputers are expected to appear around 2018-2022.
The June 2015 Green500 list ranking supercomputers by megaflops per watt will be available sometime later in the month. Here is the November 2014 Green500 list. The Piz Daint supercomputer appears within the top 10 on both lists.
Stats from the press release:
The United States remains the top country in terms of overall systems with 233, up from 231 six months ago, the same as in June 2014, and down from 265 on the November 2013 list; the U.S. is nearing its historical low number on the list. The number of European systems rose to 141, up from 130 on the last list, while the number of systems across Asia dropped to 108 from 120. The number of Chinese systems on the list also dropped to 37, compared to 61 last November; China has only half as many systems on the newest list as it did one year ago. Japan continues to increase its count on the list, claiming 39 spots this time, up from 32 last November. However, China's role in high performance computing is increasing in the manufacturing arena, with Lenovo now being counted among the vendors of systems on the TOP500 list. Three new systems are attributed solely to Lenovo, while 20 systems previously listed as IBM are now labeled jointly as IBM and Lenovo.
Cray Inc., a company long associated with supercomputers, is resurgent and emerges in the latest list as the clear leader in performance, claiming a 24 percent share of installed total performance (up from 18.2 percent). IBM takes the second spot with a 22.2 percent share, down from 28 percent last November. On the latest edition of the list, the No. 500 system recorded a performance of 153.6 teraflop/s (trillions of calculations per second), up from 133.7 teraflop/s six months ago. The last system on the newest list was listed at position 421 in the previous TOP500. This represents the lowest turnover rate in the list in two decades.
- Total combined performance of all 500 systems has grown to 363 Pflop/s, compared to 309 Pflop/s last November and 274 Pflop/s one year ago. This increase in installed performance also exhibits a noticeable slowdown in growth compared to the previous long-term trend.
- There are 68 systems with performance greater than 1 petaflop/s on the list, up from 50 last November.
- A total of 88 systems on the list are using accelerator/co-processor technology, up from 75 in November 2014. Fifty-two (52) of these use NVIDIA chips, four use ATI Radeon, and there are now 33 systems with Intel MIC technology (Xeon Phi). Four systems use a combination of NVIDIA and Intel Xeon Phi accelerators/co-processors.
- HP has the lead in the total number of systems with 178 (35.6 percent) compared to IBM with 111 systems (22.2 percent). Last November, HP had 179 systems and IBM had 153 systems. In the system category, Cray remains third with 71 systems (14.2 percent).
Original Submission
Related Stories
The Register's new sister site, The Platform, broke news of an upcoming 180 petaflops supercomputer named "Aurora" to be installed at the Argonne National Laboratory. The system will reportedly use 2.7x the power (from 4.8 megawatts to 13 megawatts) to deliver 18x the peak performance of Argonne's existing Mira supercomputer (more detail here).
Aurora will use Intel's upcoming 10nm "Knights Hill" Xeon Phi processors and a second-generation Omni-Path optical interconnect with far greater bandwidth than current designs. The storage capacity will exceed 150 petabytes. Cray Inc. will manufacture the system, which will cost $200 million and round out the CORAL trio of supercomputers, including the 150-300 PFLOPS Summit at Oak Ridge National Laboratory and the 100+ PFLOPS Sierra at Lawrence Livermore National Laboratory. The other two systems will use IBM Power9 and NVIDIA Volta chips.
An 8.5 petaflops, 1.7 MW secondary system named Theta will be built in 2016.
According to Intel and Argonne National Laboratory:
Research goals for the Aurora system include: more powerful, efficient and durable batteries and solar panels; improved biofuels and more effective disease control; improving transportation systems and enabling production of more highly efficient and quieter engines; and wind turbine design and placement for improved efficiency and reduced noise.
Editor's Note: For the purists, and from a maintainer of the TOP500 list, What is a Mflop/s?:
Mflop/s is a rate of execution, millions of floating point operations per second. Whenever this term is used it will refer to 64 bit floating point operations and the operations will be either addition or multiplication. Gflop/s refers to billions of floating point operations per second and Tflop/s refers to trillions of floating point operations per second.
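To put those units in context, here is a back-of-the-envelope sketch (not the official HPL benchmark code) of how a LINPACK-style Rmax and a Green500-style Mflop/s-per-watt figure fall out of an operation count, a wall-clock time, and a power draw. The problem size, runtime, and wattage below are made-up placeholders, not measurements from any listed system.

# Rough flop-rate arithmetic in the spirit of HPL/LINPACK: solving a dense
# n x n linear system by LU factorization costs about (2/3)*n^3 + 2*n^2
# double-precision floating point operations.
def hpl_flops(n):
    return (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2

n = 100_000          # hypothetical problem size
seconds = 3600.0     # hypothetical wall-clock time for the solve
watts = 100.0        # hypothetical average power draw during the run

rate = hpl_flops(n) / seconds
print(f"Rmax ~ {rate / 1e9:.0f} Gflop/s")                         # ~185 Gflop/s
print(f"Efficiency ~ {rate / 1e6 / watts:.0f} Mflop/s per watt")  # ~1852 Mflop/s/W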
The Platform reports that CPU export restrictions to Chinese supercomputing centers may have backfired. Tianhe-2 has remained the world's top supercomputer for the last five iterations of the TOP500 list using a heterogeneous architecture that mixes Intel's Xeon and Xeon Phi chips. Tianhe-2 will likely be upgraded to Tianhe-2A within the next year (rather than by the end of 2015 as originally planned), nearly doubling its peak performance from 54.9 petaflops to around 100 petaflops, while barely raising peak power usage. However, instead of using a new Intel Xeon Phi chip, a homegrown "China Accelerator" and novel architecture will be used.
A few details about the accelerator are known:
Unlike other [digital signal processor (DSP)] efforts that were aimed at snapping into supercomputing systems, this one is not a 32-bit part, but is capable of supporting 64-bit and further, it can also support both single (as others do) and double-precision. As seen below, the performance for both single and double precision is worth remarking upon (around 2.4 single, 4.8 double teraflops for one card) in a rather tiny power envelope. It will support high bandwidth memory as well as PCIe 3.0. In other words, it gives GPUs and Xeon Phi a run for the money—but the big question has far less to do with hardware capability and more to do with how the team at NUDT will be able to build out the required software stack to support applications that can gobble millions of cores on what is already by far the most core-dense machine on the planet.
Original Submission
(Score: 2) by Gaaark on Monday July 13 2015, @04:10PM
Did I miss something?
Are they all running linux (the top 10 are)?
Didn't it used to show percentages of linux/bsd/windows machines? Do you have to peruse EVERY link from every machine to see what it runs? Is linux now the default?
My enquiring mind wants to know? Statistics are EVERYTHING!
--- Please remind me if I haven't been civil to you: I'm channeling MDC. ---Gaaark 2.0 ---
(Score: 2) by Gaaark on Monday July 13 2015, @04:12PM
Top 25!
--- Please remind me if I haven't been civil to you: I'm channeling MDC. ---Gaaark 2.0 ---
(Score: 4, Informative) by takyon on Monday July 13 2015, @04:18PM
http://top500.org/statistics/list/ [top500.org]
"Operating system family". 486 run Linux, 12 Unix, 1 Windows, 1 Mixed.
"Operating system" has more details:
Linux 359
Cray Linux Environment 42
SUSE Linux Enterprise Server 11 27
CentOS 19
AIX 11
bullx SCS 6
Bullx Linux 5
Scientific Linux 4
RHEL 6.2 4
Redhat Enterprise Linux 6.4 4
Redhat Enterprise Linux 6.5 4
etc.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: -1, Flamebait) by Anonymous Coward on Monday July 13 2015, @04:23PM
These top machines are just rooms full of "pc"s / giant Beowulf clusters, and if the list doesn't include distributed projects (protein folding, seti, distributed.net) and movie studios (render farms), it is incomplete and meaningless.
Then again, it does seem to be "mine is bigger than yours" between countries.
(Score: 5, Informative) by takyon on Monday July 13 2015, @04:34PM
I don't think there's any reason it can't include movie studio clusters. The main requirement is that it has to run LINPACK, which is seen as a burden as it can now take up to a few days.
Intelligence agencies and movie studios may be reluctant to give up details on their supercomputers. But that's not always the case:
http://www.engadget.com/2014/10/18/disney-big-hero-6/ [engadget.com]
http://www.hpcwire.com/2012/10/08/dreamworks_outsources_animation_work_to_chinese_petaflopper/ [hpcwire.com]
Distributed projects have latency and other problems that aren't found in the standalone supercomputers. But there are attempts to quantify their size and "speed". If you click the petaflops link in the summary, distributed computing records [wikipedia.org] show that Folding@home has reached 20+ petaflops and other large projects are in the hundreds of teraflops.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 3, Funny) by nukkel on Monday July 13 2015, @08:24PM
I hear the latest Dreamworks title is a giant flop!
Sorry, gotta get back to Folding@home, I'm hosting a poker game and getting lame hands all night ...
(Score: 2) by cafebabe on Monday July 13 2015, @10:15PM
What is the Mean Time Between Failure [wikipedia.org] of 196,608 Xeons? About 20 minutes?
1702845791×2
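For anyone who wants to put a rough number on the MTBF question: assuming independent, exponentially distributed failures, a series system's MTBF is the per-unit MTBF divided by the number of units. The per-unit figure below is a made-up placeholder, not a published Intel spec.

# Back-of-the-envelope series-system MTBF. 196,608 is Shaheen II's core
# count; counting 16-core packages instead would shrink the denominator
# (and grow the result) by a factor of 16.
units = 196_608
per_unit_mtbf_hours = 1_000_000.0   # hypothetical per-unit MTBF
print(f"system MTBF ~ {per_unit_mtbf_hours / units:.1f} hours")   # ~5.1 hours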
(Score: 2) by zugedneb on Monday July 13 2015, @11:03PM
http://ark.intel.com/products/81060/Intel-Xeon-Processor-E5-2698-v3-40M-Cache-2_30-GHz [intel.com]
Intel® Xeon® Processor E5-2698 v3
(40M Cache, 2.30 GHz)
so, an underclocked cpu with ecc memory =)
not as good as AMD, but pretty safe...
old saying: "a troll is a window into the soul of humanity" + also: https://en.wikipedia.org/wiki/Operation_Ajax
(Score: 2) by takyon on Tuesday July 14 2015, @12:48AM
12,288 Xeons x 16 cores.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 0) by Anonymous Coward on Tuesday July 14 2015, @04:06AM
can i get an ip list of some of these machines, please?
(Score: 2, Disagree) by CirclesInSand on Tuesday July 14 2015, @07:28AM
I really think that this is an extremely dishonest way to report the speed of supercomputers. Many of the most computationally demanding math problems don't easily lend themselves to parallel execution. The fastest computer in the world would be the one which executes the fastest serial process, not this hype about duct taping hundreds of computers together and pretending that you have accomplished something.
This doesn't even begin to take into account that the worst bottleneck in computing is RAM access time (usually HD access time can be avoided with enough RAM), which is mitigated by cache but not negated.
It is possible to use multiple "cores" to speed up a serial computation. Basically you run several steps simultaneously, and each step "fills in" the last result of the previous step at the last possible moment. Modern pipelined processors do this. But without proper scheduling and fast connections, this will slow down your computations.
And finally, no one who is doing serious computations should ever use floating point (or double, etc). Those are only used by lazy graduate students who run "models" and make "predictions" to try to get their "degree". Any serious scientific or mathematical or especially engineering application should use guaranteed confidence intervals (inclusive and exclusive) for any measured calculation. Floating point is for video games.
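For readers wondering what the parent means by guaranteed intervals, here is a minimal interval-arithmetic sketch (Python 3.9+ for math.nextafter) that rounds outward after every operation, so the exact real-number result can never escape the computed bounds. The input values are arbitrary examples.

import math

def widen(lo, hi):
    # Step one ulp outward so rounding error cannot push the true value outside.
    return math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf)

def interval(v):
    # Enclose a decimal constant that may not be exactly representable in binary.
    return widen(v, v)

def iadd(a, b):
    return widen(a[0] + b[0], a[1] + b[1])

def imul(a, b):
    p = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return widen(min(p), max(p))

# The classic example: plain floats give 0.1 + 0.2 == 0.30000000000000004,
# but the interval sum is guaranteed to bracket the exact answer, 0.3.
print(iadd(interval(0.1), interval(0.2)))
# Multiplication works the same way; this interval is guaranteed to contain 0.3.
print(imul(interval(0.1), interval(3.0)))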