
Top-500 Supercomputer Race Goes Cold

posted by janrinok on Tuesday November 18 2014, @01:49PM   Printer-friendly
from the fast-enough-is-good-enough dept.

Once a seething cauldron of competition, the twice-yearly Top500 listing of the world’s most powerful supercomputers has grown nearly stagnant of late.

In the most recent Top500 compilation, released Monday ( http://top500.org/lists/2014/11/ ), the Chinese National University of Defense Technology’s Tianhe-2 has retained its position as the world’s fastest system, for the fourth time in a row.

Tianhe-2 is no more powerful than when it debuted atop the Top500 in June 2013 ( http://www.computerworld.com/article/2497811/computer-hardware/china-trounces-us-in-top500-supercomputer-race.html ): In a Linpack benchmark, it steamed along offering 33.86 petaflop/s of computing power. A petaflop is one quadrillion floating point operations.
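To put the Linpack number in context: HPL essentially times the solution of one very large dense system Ax = b and divides a nominal operation count, commonly taken as (2/3)n^3 + 2n^2, by the wall-clock time. Below is a minimal, illustrative sketch of that bookkeeping in Python; it is nothing like the tuned HPL code itself, and the problem size and resulting rate apply only to whatever machine runs it.

```python
# Illustrative only: time a dense solve and convert it to a FLOP/s figure the
# way HPL-style reporting does. Not Tianhe-2's benchmark run.
import time
import numpy as np

def linpack_rate(n: int = 2000, seed: int = 0) -> float:
    """Approximate FLOP/s for solving one dense n x n system Ax = b."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal(n)

    start = time.perf_counter()
    np.linalg.solve(a, b)                      # LU with partial pivoting underneath
    elapsed = time.perf_counter() - start

    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2    # nominal HPL operation count
    return flops / elapsed

if __name__ == "__main__":
    rate = linpack_rate()
    print(f"~{rate / 1e9:.1f} GFLOP/s on this machine")
    print(f"Tianhe-2's 33.86 PFLOP/s is roughly {33.86e15 / rate:.0f} times that")
```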

Only one new entrant debuted in the top 10 in this Top500—a system built by Cray for an unnamed U.S. government agency, which brought 3.57 petaflop/s to the table.

http://www.pcworld.com/article/2848812/top500-supercomputer-race-loses-momentum.html

Related Stories

Linux Owns Supercomputer Market

For years, Linux has dominated supercomputing. The November 2014 Top 500 supercomputer ranking found 485 out of the world's fastest 500 computers running Linux. That's 97 percent for those of you without a calculator at hand.

I became a little curious about what distro supercomputers run, and ran across a distro targeted directly at them: Rocks. The fastest supercomputer in the world today, Tianhe-2, runs a distro called Kylin which, interestingly, used to be based on FreeBSD but is now Linux-based.

[Ed's note: See our earlier story: Top-500 Supercomputer Race Goes Cold.]

  • (Score: 1, Interesting) by Anonymous Coward on Tuesday November 18 2014, @02:55PM

    by Anonymous Coward on Tuesday November 18 2014, @02:55PM (#117238)

    "...a system built by Cray for an unnamed U.S. government agency..."

    My curiosity about the race to the top was piqued by the unknown.

    Who is the unnamed agency?

    • (Score: 2, Funny) by Anonymous Coward on Tuesday November 18 2014, @02:57PM

      by Anonymous Coward on Tuesday November 18 2014, @02:57PM (#117239)

      It's probably the Postal Service. I said move along!

    • (Score: 1) by TheB on Tuesday November 18 2014, @03:30PM

      by TheB (1538) on Tuesday November 18 2014, @03:30PM (#117253)

      The agency is only unnamed because the U.S. government is running out of 3-letter acronyms. The shift to 4-letter words, while arguably more descriptive of the agencies' practices, is still meeting resistance from their PR departments. Adding to the problem, many inter-agency forms only have room for three characters, and the transition to ACRO4 will take some time to complete. This ACRO3 shortage is slowing the creation of new government agencies and poses a significant threat to the security of our nation.

  • (Score: 1) by PiMuNu on Tuesday November 18 2014, @03:34PM

    by PiMuNu (3823) on Tuesday November 18 2014, @03:34PM (#117256)

    Are distributed systems replacing big centralised clusters? Particle physics/LHC, for example, has opted for a big distributed computing system spread over many sites...

  • (Score: 2) by Rivenaleem on Tuesday November 18 2014, @03:37PM

    by Rivenaleem (3400) on Tuesday November 18 2014, @03:37PM (#117258)

    Every gamer knows that current processors are now 'good enough' for all top-end activities, and the real performance is to be found in the GPU. Making a faster computer is not really necessary anymore; they only need to make it more efficient for the activities it is intended to perform.

  • (Score: 2) by Covalent on Tuesday November 18 2014, @03:38PM

    by Covalent (43) on Tuesday November 18 2014, @03:38PM (#117259) Journal

    But is it possible that really powerful supercomputers aren't as valuable or necessary as they used to be? Perhaps our government and universities are writing more effective algorithms, or are networking these machines together. Perhaps they are doing distributed computing over a wide range of ordinary machines.

    --
    You can't rationally argue somebody out of a position they didn't rationally get into.
    • (Score: 2) by VLM on Tuesday November 18 2014, @04:45PM

      by VLM (445) on Tuesday November 18 2014, @04:45PM (#117292)

      Another interesting interpretation might be workload. You made millions off better CFD simulations in the 80s, but now a desktop is "good enough" and it's not like air has changed its properties much, so once you calculate a decent airfoil for 450 knots, you don't need to do it again. On the other hand, you can make money on HFT by lowering latency. So all the money and minds go into latency, while this list just reports long-term average throughput.

      The stereotypical "supercomputer takes a CPU bound problem and makes it IO bound" doesn't really sound like a winning HFT strategy.

      By careful pipelining you can always increase your average throughput by trading off latency, both at the ALU pipeline level and at the giant cluster level, but if average long-term throughput is no longer as financially valuable as latency...
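      To make the trade-off concrete, here is a toy model with made-up stage counts and timings: splitting the same work across more pipeline stages raises steady-state throughput, while the per-item latency only grows because of per-stage overhead.

```python
# Toy latency/throughput model for a pipelined operation. All numbers are
# invented for illustration; "stage_overhead_ns" stands in for latch/handoff
# costs that keep deeper pipelines from being free.
def pipeline_stats(total_work_ns: float, stages: int, stage_overhead_ns: float = 1.0):
    stage_time = total_work_ns / stages + stage_overhead_ns  # effective clock period
    latency = stage_time * stages                            # time for one item end to end
    throughput = 1e9 / stage_time                            # items per second, steady state
    return latency, throughput

for stages in (1, 4, 16, 64):
    lat, thr = pipeline_stats(total_work_ns=64.0, stages=stages)
    print(f"{stages:3d} stages: latency {lat:6.1f} ns, throughput {thr / 1e6:7.1f} M items/s")
```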

      • (Score: 3, Interesting) by novak on Tuesday November 18 2014, @05:52PM

        by novak (4683) on Tuesday November 18 2014, @05:52PM (#117322) Homepage

        You make millions off better CFD simulations today. No, air hasn't changed its properties much, but 'calculating a decent airfoil' is rarely even a CFD problem. Today, most CFD customers (at least the ones I've worked with) are interested in more complex questions. These include:

        -> 3D CFD. Airfoils are usually simple enough that you don't even run actual CFD, and even a 3D wing does not always use CFD in the preliminary design phase. In my limited experience, for wing design, something like vortex lattice methods is used, possibly with a reduced CFD model.
        -> Advanced turbulence models. LES turbulence models do not just introduce more simultaneous equations; they also require roughly two orders of magnitude more mesh, meaning that your problem needs hundreds of times the memory (a rough estimate is sketched just after this list).
        -> Time accurate CFD. Time accuracy lets you see eddies or variations in time which are just averaged out by steady state CFD. Running time accurate CFD is very disk space intensive, but also requires very long run times, because there is an upper bound on the time step you can take without degrading accuracy.
        -> Reacting flow. In high temperature gases, chemical reactions are a very important part of what's happening, even for solving the fluid dynamics, because it influences how much energy is in the fluid. Chemical reactions also take place on an even smaller timescale so usually you have to use different time steps for the fluid and chemical reactions.
        -> Optimization. Once you've done something once, maybe you would like to recalculate it hundreds or thousands of times for an optimization study.
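        As a back-of-the-envelope for the memory point in the turbulence bullet above, the sketch below just multiplies cell count by per-cell storage; the cell counts and variables-per-cell are assumed round numbers, not figures from any particular solver.

```python
# Rough lower bound on field storage for a CFD mesh: cells x variables x 8 bytes.
# Ignores mesh connectivity, ghost cells, and solver workspace, which only make
# the LES case worse. All inputs are assumptions for illustration.
def mesh_memory_gb(cells: float, variables_per_cell: int = 8, bytes_per_value: int = 8) -> float:
    return cells * variables_per_cell * bytes_per_value / 1e9

rans_cells = 10e6             # a moderate steady-state RANS mesh (assumed)
les_cells = 100 * rans_cells  # ~two orders of magnitude more cells for LES

print(f"RANS-ish mesh: ~{mesh_memory_gb(rans_cells):.1f} GB of field data")
print(f"LES-ish mesh:  ~{mesh_memory_gb(les_cells):.1f} GB of field data")
```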

        If you think I'm describing a hypothetical situation, I'm not. All of these are used in gas turbine design, and are becoming more common every day. No one who actually does CFD does it without a cluster if they are analyzing anything complex at all. Unlike in most fields, CFD algorithms are not necessarily getting more efficient, because people are instead spending the increased computing resources on making them more accurate.

        I can't speak for other kinds of HPC, but in an InfiniBand cluster (or even GigE for a very small cluster), the bottleneck on CFD is not I/O but memory bandwidth.
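        A roofline-style back-of-the-envelope for that claim; the peak FLOP rate, memory bandwidth, and flops-per-byte figures below are assumed round numbers for a generic node, not measurements.

```python
# Attainable performance is capped by min(compute peak, bandwidth x arithmetic
# intensity). Low-order stencil updates do very few flops per byte moved, so the
# bandwidth term wins long before the compute peak matters. Numbers are assumed.
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float, flops_per_byte: float) -> float:
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

peak = 500.0       # GFLOP/s per node (assumed)
bw = 60.0          # GB/s of memory bandwidth per node (assumed)
intensity = 0.25   # flops per byte, typical order of magnitude for stencil codes (assumed)

print(f"Attainable: {attainable_gflops(peak, bw, intensity):.0f} GFLOP/s "
      f"out of a {peak:.0f} GFLOP/s peak, so memory bandwidth is the ceiling")
```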

        If you think people aren't making millions off of better CFD, take a look at how much the licensing costs. For ANSYS/CFX, a popular industry software, in my experience it costs about as much to buy licenses for one year as it does to buy the cluster (that's going to depend on how many people are using the cluster for what, of course).

        --
        novak
        • (Score: 2) by VLM on Tuesday November 18 2014, @06:26PM

          by VLM (445) on Tuesday November 18 2014, @06:26PM (#117341)

          ANSYS/CFX

          I'm aware of their HFSS product, famous for the academic discount being like $1K and list price being about $80K.

          • (Score: 2) by novak on Tuesday November 18 2014, @06:38PM

            by novak (4683) on Tuesday November 18 2014, @06:38PM (#117347) Homepage

            Their CFD product is also incredibly pricey, but apparently it's one of the gas turbine industry favorites. You pay somewhere around $5k per preprocessor license, similar for a postprocessor license, and the solver licenses will cost you per core that you want to run on. The only way that isn't prohibitively expensive is that they sell license 'packs', which are basically a multiplier for the number of cores you can use.

            --
            novak
            • (Score: 0) by Anonymous Coward on Wednesday November 19 2014, @11:52AM

              by Anonymous Coward on Wednesday November 19 2014, @11:52AM (#117596)

              Why don't you pirate it and run it on a cluster in Russia?

  • (Score: 3, Funny) by marcello_dl on Tuesday November 18 2014, @03:44PM

    by marcello_dl (2685) on Tuesday November 18 2014, @03:44PM (#117264)

    Maybe they succeeded in running Crysis at max settings?

  • (Score: 3, Interesting) by SrLnclt on Tuesday November 18 2014, @05:01PM

    by SrLnclt (1473) on Tuesday November 18 2014, @05:01PM (#117299)

    Not every new supercomputer is automatically part of this list. There is a bunch of work you have to do for benchmarking to be considered for the list.

    This link is a couple years old, but the supercomputer at the University of Illinois [illinois.edu] did not bother wasting time with the benchmarking process, and decided to focus on actual work. I'm guessing they are not the only ones.

    • (Score: 1, Informative) by Anonymous Coward on Tuesday November 18 2014, @06:07PM

      by Anonymous Coward on Tuesday November 18 2014, @06:07PM (#117331)

      Not every new supercomputer is automatically part of this list. There is a bunch of work you have to do for benchmarking to be considered for the list.

      This link is a couple years old, but the supercomputer at the University of Illinois did not bother wasting time with the benchmarking process, and decided to focus on actual work. I'm guessing they are not the only ones.

      Another very good possibility is that those "in the know" realize that being in the top 500 on the LINPACK benchmark is a mere dick size competition. I work with the DoD supercomputers. While having a machine that can blow the LINPACK benchmark out of the water is a good thing, it is not the only thing. I would even argue that it is not the most important thing; I'm sure others would agree. There are several factors that go into making a good supercomputing infrastructure and the LINPACK benchmark is only one of them.

      • (Score: 2) by opinionated_science on Tuesday November 18 2014, @06:51PM

        by opinionated_science (4031) on Tuesday November 18 2014, @06:51PM (#117352)

        It has one massive advantage: it cuts through the marketing BS. It shows that at least ONE problem can be solved (a) correctly and (b) within certain parameters.

        In this regard it is one of the better tests. John McCalpin's STREAM is useful too...
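        For reference, STREAM boils down to a few simple kernels; a minimal numpy-flavoured sketch of the 'triad' (a = b + scalar*c) is below. The real STREAM is a carefully tuned C/Fortran code, so this will understate what it actually reports.

```python
# Sketch of the STREAM triad kernel: a = b + scalar * c, with bandwidth taken
# as the nominal 24 bytes moved per element (read b, read c, write a) over the
# elapsed time. numpy's temporary array adds extra traffic, so this is a
# conservative estimate compared to the tuned benchmark.
import time
import numpy as np

n = 20_000_000                 # large enough to fall well out of cache
b = np.random.rand(n)
c = np.random.rand(n)
a = np.empty_like(b)
scalar = 3.0

start = time.perf_counter()
a[:] = b + scalar * c          # the triad: two reads and one write per element
elapsed = time.perf_counter() - start

bytes_moved = 3 * 8 * n        # nominal traffic only
print(f"Triad-style bandwidth: {bytes_moved / elapsed / 1e9:.1f} GB/s")
```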

      • (Score: 0) by Anonymous Coward on Wednesday November 19 2014, @06:44PM

        by Anonymous Coward on Wednesday November 19 2014, @06:44PM (#117793)
        Or they don't want other people to know about it.

        By the way Google and Facebook have huge amounts of computers that can do some of those benchmarks, I wonder how well they'd do :).
    • (Score: 3, Informative) by Leebert on Wednesday November 19 2014, @04:21AM

      by Leebert (3511) on Wednesday November 19 2014, @04:21AM (#117524)

      Totally with you on this. Having worked in HPC in a former life, my eyes would roll out of their sockets when a vendor would come in, assemble a new system in one configuration to benchmark the system, then rip it all apart to put it back together into an actually useful configuration. It was such a ridiculous waste of time.

  • (Score: 3, Interesting) by takyon on Tuesday November 18 2014, @10:36PM

    by takyon (881) <takyonNO@SPAMsoylentnews.org> on Tuesday November 18 2014, @10:36PM (#117429) Journal

    CORAL Signals New Dawn for Exascale Ambitions [hpcwire.com]

    The result of all of this is two systems that will be installed in the 2017 time frame. Summit will be housed at Oak Ridge National Laboratory and will be dedicated to large-scale scientific endeavors ranging from climate modeling to other open science initiatives. The other, called Sierra, is set to be installed at Lawrence Livermore with an emphasis on security and weapons stockpile management.

    Both are GPU-accelerated systems that have fewer nodes for all the performance they're able to pack in, thanks to the collaboration with NVIDIA and its Volta architecture, which, for those who follow these generations, is two away from where we are now, with Pascal expected in 2016. The key here is the NVLink interconnect, which is set to push new limits in terms of making these the "data centric" supercomputers IBM is espousing as the next step beyond supercomputers that have traditionally been valued only according to their floating point capabilities.

    We will be exploring the technology in a companion piece that will immediately follow this one and offer a deeper sense of the projected architecture from chip to interconnect. However, to kick off this series, we wanted to provide a touchstone for these first inklings at what exascale-class systems might look like in the U.S. in the years to come.

    One thing is for sure, these are packing a lot of punch into a far smaller amount of space. The Summit system at Oak Ridge is expected to push the 150 to 300 peak petaflop barrier, but according to Jeff Nichols, one of the most remarkable aspects of the system is how they were able to work with partners IBM, NVIDIA, and Mellanox to create an architecture that can be boiled down to a much smaller number of nodes for far higher performance and a much larger shared memory footprint.

    At this stage, Summit will be 5x or more the performance of Titan at 1/5 the size—weighing in at just around 3400 nodes.
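    Quick arithmetic on those figures; Titan's public node count of roughly 18,688 is the only number not quoted above.

```python
# Per-node share of the quoted 150-300 peak petaflops across ~3,400 nodes, plus
# the node-count ratio against Titan. Titan's node count is from public specs.
summit_nodes = 3400
for peak_pflops in (150, 300):
    per_node_tflops = peak_pflops * 1000 / summit_nodes
    print(f"{peak_pflops} PFLOPS over {summit_nodes} nodes is about {per_node_tflops:.0f} TFLOPS per node")

titan_nodes = 18688
print(f"Node count vs Titan: {titan_nodes / summit_nodes:.1f}x fewer nodes")
```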

    NVIDIA performance/Watt roadmap [netdna-cdn.com]

    Intel Announces Several New and Enhanced HPC Technologies [hpcwire.com]

    Intel Corporation today announced several new and enhanced technologies bolstering its leadership in high-performance computing (HPC). These include disclosure of the future generation Intel Xeon Phi processor, code-named Knights Hill, and new architectural and performance details for Intel Omni-Path Architecture, a new high-speed interconnect technology optimized for HPC deployments.

    Intel disclosed that its future, third-generation Intel Xeon Phi product family, code-named Knights Hill, will be built using Intel’s 10nm process technology and integrate Intel Omni-Path Fabric technology. Knights Hill will follow the upcoming Knights Landing product, with first commercial systems based on Knights Landing expected to begin shipping next year.

    Industry investment in Intel Xeon Phi processors continues to grow with more than 50 providers expected to offer systems built using the new processor version of Knights Landing, with many more systems using the coprocessor PCIe card version of the product. To date, committed customer deals using the Knights Landing processor represent over 100 PFLOPS of system compute.

    Knights Corner performance was about 1 TFLOPS double precision, at about 4.5 GFLOPS/watt.
    Knights Landing performance is expected to be 3 TFLOPS double precision, at about 14-16 GFLOPS/watt.
    Knights Hill... 9-10 TFLOPS and 40-50 GFLOPS/watt?
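    That last line is a guess, not anything Intel announced; it just applies the roughly 3x per-generation jumps from Knights Corner to Knights Landing one more time, as sketched here.

```python
# Extrapolating the Knights Corner -> Knights Landing ratios one more generation.
# The Knights Hill numbers produced here are speculative.
knc = {"tflops": 1.0, "gflops_per_watt": 4.5}
knl = {"tflops": 3.0, "gflops_per_watt": 15.0}   # midpoint of the 14-16 estimate

flops_ratio = knl["tflops"] / knc["tflops"]
efficiency_ratio = knl["gflops_per_watt"] / knc["gflops_per_watt"]

print(f"Projected Knights Hill: ~{knl['tflops'] * flops_ratio:.0f} TFLOPS, "
      f"~{knl['gflops_per_watt'] * efficiency_ratio:.0f} GFLOPS/watt")
```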

    If there aren't any big new chips coming out, of course the list is going to stall. The biggest supercomputers are usually announced in advance, so we know that 100-300 PFLOPs is coming around 2017, and 1 EFLOPs around 2020-2022. June 2015's list might show a performance upgrade for Tianhe-2, since that was rumored for this one.

    Finally, China’s bevy of supercomputers goes unused [marketwatch.com]
    China’s Supercomputing Strategy Called Out [hpcwire.com]

    • (Score: 2) by bootsy on Wednesday November 19 2014, @08:42AM

      by bootsy (3440) on Wednesday November 19 2014, @08:42AM (#117564)

      That link was good enough to make the front page, in my opinion. Thanks for posting. I just timed out on the mod points or I would have rated your post.

      • (Score: 0) by Anonymous Coward on Wednesday November 19 2014, @09:03AM

        by Anonymous Coward on Wednesday November 19 2014, @09:03AM (#117570)

        +1 this - the takyon comment should be an article submission by itself...