Once a seething cauldron of competition, the twice-yearly Top500 listing of the world’s most powerful supercomputers has grown nearly stagnant of late.
In the most recent Top500 compilation, released Monday ( http://top500.org/lists/2014/11/ ), the Chinese National University of Defense Technology’s Tianhe-2 has retained its position as the world’s fastest system for the fourth time in a row.
Tianhe-2 is no more powerful than when it debuted atop the Top500 in June 2013 ( http://www.computerworld.com/article/2497811/computer-hardware/china-trounces-us-in-top500-supercomputer-race.html ): On the Linpack benchmark, it steamed along at 33.86 petaflop/s of computing power. A petaflop/s is one quadrillion floating point operations per second.
Only one new entrant debuted in the top 10 in this Top500—a system built by Cray for an unnamed U.S. government agency, which brought 3.57 petaflop/s to the table.
http://www.pcworld.com/article/2848812/top500-supercomputer-race-loses-momentum.html
Related Stories
For years, Linux has dominated supercomputing. The November 2014 Top 500 supercomputer ranking found 485 out of the world's fastest 500 computers running Linux. That's 97 percent for those of you without a calculator at hand.
I became a little curious about what distro supercomputers run, and ran across a distro targeted directly at them: Rocks. The fastest supercomputer in the world today, Tianhe-2, runs a distro called Kylin which, interestingly, used to be based on FreeBSD but is now Linux-based.
[Ed's note: See our earlier story: Top-500 Supercomputer Race Goes Cold.]
(Score: 1, Interesting) by Anonymous Coward on Tuesday November 18 2014, @02:55PM
"...a system built by Cray for an unnamed U.S. government agency..."
My curiosity about the race to the top was piqued by the unknown.
Who is the unnamed agency?
(Score: 2, Funny) by Anonymous Coward on Tuesday November 18 2014, @02:57PM
It's probably the Postal Service. I said move along!
(Score: 1) by TheB on Tuesday November 18 2014, @03:30PM
The agency is only unnamed because the U.S. government is running out of 3-letter acronyms. The shift to 4-letter words, while arguably more descriptive of the agencies' practices, is still meeting resistance from their PR departments. Adding to the problem, many inter-agency forms only have room for three characters, and the transition to ACRO4 will take some time to complete. This ACRO3 shortage is slowing the creation of new government agencies and poses a significant threat to the security of our nation.
(Score: 1) by PiMuNu on Tuesday November 18 2014, @03:34PM
Are distributed systems replacing big centralised clusters? Particle physics/LHC, for example, has opted for a big distributed computing system spread over many sites...
(Score: 2) by Rivenaleem on Tuesday November 18 2014, @03:37PM
Every gamer knows that current processors are 'good enough' now for all top-end activities, and real performance is to be found in the GPU. Making a faster computer is not really necessary anymore. They now only need to make it more efficient for the activities it is intended to perform.
(Score: 2) by Covalent on Tuesday November 18 2014, @03:38PM
But is it possible that really powerful supercomputers aren't as valuable / necessary as they used to be? Perhaps our government and universities are writing more effective algorithms, or are networking these machines together. Perhaps they are doing distributed computing over a wide range of ordinary machines.
You can't rationally argue somebody out of a position they didn't rationally get into.
(Score: 2) by VLM on Tuesday November 18 2014, @04:45PM
Another interesting interpretation might be workload. You made millions off better CFD simulations in the 80s, but now a desktop is "good enough" and it's not like air has changed its properties much, so once you calculate a decent airfoil for 450 knots, you don't need to again. On the other hand, you can make money on HFT by lowering latency. So all the money and minds go into latency, while this list just reports long-term average throughput.
The stereotypical "supercomputer takes a CPU-bound problem and makes it I/O-bound" doesn't really sound like a winning HFT strategy.
By careful pipelining you can always increase your average throughput by trading off latency, both at the ALU pipeline level and the giant cluster level, but if average long-term throughput is not as financially valuable anymore compared to latency...
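[Ed's note: a toy numeric model of the tradeoff VLM describes, in C. The dispatch overhead and per-item cost are invented purely for illustration; the point is just that batching amortizes a fixed cost, raising throughput while worsening latency.]

    /* Toy model of the batching/pipelining tradeoff: a fixed dispatch
     * overhead is amortized over the batch, so average throughput improves
     * while per-item latency gets worse. All costs are invented. */
    #include <stdio.h>

    int main(void) {
        const double overhead_us = 10.0; /* fixed cost per dispatch (assumed) */
        const double work_us = 1.0;      /* cost per item (assumed) */
        const int batches[] = {1, 8, 64, 512};

        for (int i = 0; i < 4; i++) {
            int b = batches[i];
            double batch_time = overhead_us + b * work_us; /* time for one batch */
            printf("batch %3d: throughput %.3f items/us, latency %5.1f us\n",
                   b, b / batch_time, batch_time);
        }
        return 0;
    }

Under these made-up numbers, batch size 1 gives the best latency (11 us) but the worst throughput, while batch size 512 approaches 1 item/us at a latency of 522 us.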
(Score: 3, Interesting) by novak on Tuesday November 18 2014, @05:52PM
You make millions off better CFD simulations today. No, air hasn't changed its properties much, but 'calculating a decent airfoil' is rarely even a CFD problem. Today, most CFD customers (of the ones I've worked with) are interested in more complex questions. This includes:
-> 3D CFD. Airfoils are usually simple enough that you don't even run actual CFD, and even a 3D wing does not always use CFD in the preliminary design phase. In my limited experience, for wing design, something like vortex lattice methods are used, possibly a reduced CFD.
-> Advanced turbulence models. LES turbulence models do not just introduce more simultaneous equations; they also require two orders of magnitude more mesh cells, meaning that your problem requires hundreds of times the memory.
-> Time-accurate CFD. Time accuracy lets you see eddies or variations in time which are simply averaged out by steady-state CFD. Running time-accurate CFD is very disk-space intensive, but it also requires very long run times, because there is an upper bound on the time step you can take without degrading accuracy (a sketch of that bound follows this list).
-> Reacting flow. In high-temperature gases, chemical reactions are a very important part of what's happening, even for solving the fluid dynamics, because they influence how much energy is in the fluid. Chemical reactions also take place on an even smaller timescale, so usually you have to use different time steps for the fluid and the chemical reactions.
-> Optimization. Once you've done something once, maybe you would like to recalculate it hundreds or thousands of times for an optimization study.
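[Ed's note: a minimal sketch of the time-step bound mentioned above, assuming the standard CFL condition used by explicit compressible solvers, dt <= CFL * dx / (|u| + c). The per-cell values below are invented for illustration and are not from any real solver.]

    /* Minimal sketch of the CFL time-step bound for an explicit
     * compressible solver: dt <= CFL * dx / (|u| + c) per cell,
     * and the global step is the minimum over the mesh.
     * Cell values below are invented for illustration. */
    #include <stdio.h>
    #include <math.h>

    int main(void) {
        double dx[] = {0.010, 0.005, 0.002};  /* cell spacing [m] (assumed) */
        double u[]  = {50.0, 120.0, 300.0};   /* flow speed [m/s] (assumed) */
        double c[]  = {340.0, 340.0, 340.0};  /* sound speed [m/s] (assumed) */
        const double cfl = 0.5;               /* safety factor < 1 */

        double dt = INFINITY;
        for (int i = 0; i < 3; i++) {
            double local = cfl * dx[i] / (fabs(u[i]) + c[i]);
            if (local < dt) dt = local;
        }
        printf("max stable time step: %.3e s\n", dt);
        return 0;
    }

Halving the finest cell halves the allowed step, which is why fine, time-accurate meshes mean very long run times.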
If you think I'm describing a hypothetical situation, I'm not. All of these are used in gas turbine design, and are becoming more common every day. No one who actually does CFD does it without a cluster if they are analyzing anything complex at all. Unlike in most fields, the algorithms for CFD are not always getting more efficient, because people are instead using increased computing resources to make them more accurate at the cost of efficiency.
I can't speak for other kinds of HPC but in an infiniband cluster (or even gigE for a very small cluster), the bottleneck on CFD is not I/O but memory bandwidth.
If you think people aren't making millions off of better CFD, take a look at how much the licensing costs. For ANSYS/CFX, a popular industry package, in my experience it costs about as much to buy licenses for one year as it does to buy the cluster (that's going to depend on how many people are using the cluster for what, of course).
novak
(Score: 2) by VLM on Tuesday November 18 2014, @06:26PM
ANSYS/CFX
I'm aware of their HFSS product, famous for the academic discount being like $1K and list price being about $80K.
(Score: 2) by novak on Tuesday November 18 2014, @06:38PM
Their CFD product is also incredibly pricey, but apparently it's one of the gas turbine industry favorites. You pay somewhere around $5k per preprocessor license, similar for a postprocessor license, and the solver licenses will cost you per core that you want to run on. The only way that isn't prohibitively expensive is that they sell license 'packs' which are basically a multiplier for the number of cores you can use.
novak
(Score: 0) by Anonymous Coward on Wednesday November 19 2014, @11:52AM
Why don't you pirate it and run it on a cluster in Russia?
(Score: 3, Funny) by marcello_dl on Tuesday November 18 2014, @03:44PM
Maybe they succeeded in running Crysis at max settings?
(Score: 3, Interesting) by SrLnclt on Tuesday November 18 2014, @05:01PM
Not every new supercomputer is automatically part of this list. There is a bunch of work you have to do for benchmarking to be considered for the list.
This link is a couple years old, but the supercomputer at the University of Illinois [illinois.edu] did not bother wasting time with the benchmarking process, and decided to focus on actual work. I'm guessing they are not the only ones.
(Score: 1, Informative) by Anonymous Coward on Tuesday November 18 2014, @06:07PM
Another very good possibility is that those "in the know" realize that being in the top 500 on the LINPACK benchmark is a mere dick size competition. I work with the DoD supercomputers. While having a machine that can blow the LINPACK benchmark out of the water is a good thing, it is not the only thing. I would even argue that it is not the most important thing; I'm sure others would agree. There are several factors that go into making a good supercomputing infrastructure and the LINPACK benchmark is only one of them.
(Score: 2) by opinionated_science on Tuesday November 18 2014, @06:51PM
It has one massive advantage: it cuts through the marketing BS. It shows that at least ONE problem can be solved (a) correctly and (b) within certain parameters.
In this regard it is one of the better tests. John McCalpin's STREAM is useful too...
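[Ed's note: the core of STREAM is tiny; below is a stripped-down sketch of its 'triad' kernel in C. The real benchmark from McCalpin's site repeats the kernels and validates results. Note that it measures sustainable memory bandwidth, the bottleneck novak identified above, not FLOPS.]

    /* Stripped-down sketch of the STREAM "triad" kernel. It measures
     * sustainable memory bandwidth, not FLOPS; the real benchmark adds
     * repetition, validation, and the copy/scale/add kernels. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 20000000L /* large enough that the arrays defeat the caches */

    int main(void) {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        if (!a || !b || !c) return 1;
        for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i]; /* triad: two loads, one store */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("triad: %.2f GB/s (check: a[0] = %.1f)\n",
               3.0 * N * sizeof(double) / 1e9 / sec, a[0]);
        free(a); free(b); free(c);
        return 0;
    }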
(Score: 0) by Anonymous Coward on Wednesday November 19 2014, @06:44PM
By the way, Google and Facebook have huge numbers of computers that could run some of those benchmarks; I wonder how well they'd do :).
(Score: 3, Informative) by Leebert on Wednesday November 19 2014, @04:21AM
Totally with you on this. Having worked in HPC in a former life, my eyes would roll out of their sockets when a vendor would come in, assemble a new system in one configuration just to benchmark it, then rip it all apart to put it back together into an actually useful configuration. It was such a ridiculous waste of time.
(Score: 3, Interesting) by takyon on Tuesday November 18 2014, @10:36PM
CORAL Signals New Dawn for Exascale Ambitions [hpcwire.com]
NVIDIA performance/Watt roadmap [netdna-cdn.com]
Intel Announces Several New and Enhanced HPC Technologies [hpcwire.com]
Knights Corner performance was about 1 TFLOPS double precision, and about 4.5 GFLOPS/watt.
Knights Landing performance is expected to be 3 TFLOPS double precision, and about 14-16 GFLOPS/watt.
Knights Hill... 9-10 TFLOPS and 40-50 GFLOPS/watt?
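[Ed's note: dividing those figures out, using the midpoints of the quoted ranges as an assumption, suggests the implied chip power stays roughly flat around 200 W while performance roughly triples per generation:]

    /* Back-of-the-envelope: implied power = performance / efficiency.
     * TFLOPS and GFLOPS/W are the figures quoted in the comment above;
     * midpoints of the quoted ranges are an assumption. */
    #include <stdio.h>

    int main(void) {
        const char *chip[] = {"Knights Corner", "Knights Landing", "Knights Hill"};
        double tflops[]    = {1.0, 3.0, 9.5};   /* double precision */
        double gflops_w[]  = {4.5, 15.0, 45.0}; /* GFLOPS per watt */

        for (int i = 0; i < 3; i++)
            printf("%-16s ~%3.0f W\n", chip[i],
                   tflops[i] * 1000.0 / gflops_w[i]); /* TFLOPS -> GFLOPS */
        return 0;
    }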
If there aren't any big new chips coming out, of course the list is going to stall. The biggest supercomputers are usually announced in advance, so we know that 100-300 PFLOPS is coming around 2017, and 1 EFLOPS around 2020-2022. June 2015's list might show a performance upgrade for Tianhe-2, since that was rumored for this one.
Finally, China’s bevy of supercomputers goes unused [marketwatch.com]
China’s Supercomputing Strategy Called Out [hpcwire.com]
(Score: 2) by bootsy on Wednesday November 19 2014, @08:42AM
That link was good enough to make the front page, in my opinion. Thanks for posting. My mod points just timed out or I would have rated your post.
(Score: 0) by Anonymous Coward on Wednesday November 19 2014, @09:03AM
+1 this - the takyon comment should be an article submission by itself...