
SoylentNews is people

posted by martyb on Thursday February 13 2020, @10:59AM
from the rainmakers dept.

Second GPU Cloudburst Experiment Yields New Findings

In late 2019, researchers at the San Diego Supercomputer Center (SDSC) and the Wisconsin IceCube Particle Astrophysics Center (WIPAC) caught the attention of the high-performance computing community and top commercial cloud providers by completing a bold experiment that marshalled all globally available-for-sale GPUs (graphics processing units) for a brief run. That run proved it is possible to burst elastically to very large numbers of GPUs using the cloud, even in this pre-exascale era of computing.

[...] Fast forward to February 4, 2020, when the same research team conducted a second experiment with a fraction of the funding left over from a modest National Science Foundation EAGER grant.

[...] "We drew several key conclusions from this second demonstration," said SDSC's Sfiligoi. "We showed that the cloudburst run can actually be sustained during an entire workday instead of just one or two hours, and have moreover measured the cost of using only the two most cost-effective cloud instances for each cloud provider."

The team managed to reach and sustain a plateau of about 15,000 GPUs, or 170 PFLOP32s (i.e. fp32 PFLOPS[*]) using the peak fp32 FLOPS provided by NVIDIA specs. The cloud instances were provisioned from all major geographical areas, and the total integrated compute time was just over one fp32 exaFLOP[*] hour. The total cost of the cloud run was roughly $60,000.

[*] fp32 (floating point, 32-bit operands aka single precision)
FLOPS: Floating-point OPerations per Second
PFLOPS: petaFLOPS, 10^15 (i.e. 1,000,000,000,000,000) floating-point operations per second.
exaFLOPS: 10^18 (i.e. 1,000,000,000,000,000,000) floating-point operations per second.
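
As a rough check on the figures above, the plateau and the GPU count together imply an average peak throughput per GPU, and the plateau alone fixes how long the run would need to hold it to integrate one exaFLOP hour. The sketch below is plain arithmetic on the article's rounded numbers; the variable names are illustrative, and treating the run as a constant plateau (no ramp-up or ramp-down) and as exactly one exaFLOP hour are simplifying assumptions.

```python
# Back-of-the-envelope check using the article's rounded figures.
PETA = 1e15
EXA = 1e18

plateau_flops = 170 * PETA      # ~170 PFLOP32s sustained (peak fp32 spec numbers)
gpu_count = 15_000              # ~15,000 GPUs at the plateau
total_cost_usd = 60_000         # roughly $60,000 for the whole cloud run
integrated_exaflop_hours = 1.0  # "just over one" exaFLOP hour; rounded down here

# Average peak fp32 throughput per GPU implied by the plateau, in TFLOPS.
avg_tflops_per_gpu = plateau_flops / gpu_count / 1e12

# Hours at a constant 170 PFLOPS needed to integrate one exaFLOP hour.
hours_at_plateau = integrated_exaflop_hours * EXA / plateau_flops

# Approximate cost per PFLOP hour (1 exaFLOP hour = 1000 PFLOP hours).
usd_per_pflop_hour = total_cost_usd / (integrated_exaflop_hours * 1000)

print(f"average per GPU: {avg_tflops_per_gpu:.2f} TFLOPS")    # 11.33
print(f"hours at plateau for 1 EFLOP-hour: {hours_at_plateau:.2f}")  # 5.88
print(f"cost: about ${usd_per_pflop_hour:.0f} per PFLOP hour")       # 60
```

Because the actual total was "just over" one exaFLOP hour, the cost per PFLOP hour computed this way is a slight overestimate.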

Disclaimer: The original story was apparently submitted by a participant in the research.



This discussion has been archived. No new comments can be posted.
  • (Score: 0) by Anonymous Coward on Thursday February 13 2020, @12:00PM (2 children)

    by Anonymous Coward on Thursday February 13 2020, @12:00PM (#957680)

    Just how fast can you watch porn?

    • (Score: 2) by takyon on Thursday February 13 2020, @12:12PM (1 child)

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Thursday February 13 2020, @12:12PM (#957684) Journal

      Concurrent streams. How many cores does your brain have?

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
      • (Score: 1, Touché) by Anonymous Coward on Thursday February 13 2020, @12:50PM

        by Anonymous Coward on Thursday February 13 2020, @12:50PM (#957688)

        So much pr0n, so little worth watching.

        Maybe these elastic cloud AIs can get to work creating pr0n worth watching.

  • (Score: 4, Interesting) by takyon on Thursday February 13 2020, @12:48PM

    by takyon (881) <takyonNO@SPAMsoylentnews.org> on Thursday February 13 2020, @12:48PM (#957687) Journal

    Most of the big discrete GPUs right now are (going to be) in the range of 5 to 20 TFLOPS FP32. For example, 8 TFLOPS for Radeon RX 5700 [techpowerup.com], 13.5 TFLOPS for RTX 2080 Ti [techpowerup.com], and 16.3 TFLOPS for Titan RTX [techpowerup.com]. AMD will have "Big Navi" this year and Nvidia should have "Ampere". One of them will probably make it to 20 TFLOPS.

    The enterprise/HPC cards have more memory, better FP16 performance, etc. Tesla T4 [techpowerup.com] mentioned in TFA is 8.141 TFLOPS FP32, so it should take almost 21,000 of those to reach 170 PFLOPS.
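
The "almost 21,000" figure is a single division, and the same division can be repeated for the other cards quoted in this comment. A minimal sketch, assuming the peak FP32 spec numbers given above and a hypothetical helper named `gpus_needed`; real sustained throughput would be lower than peak, so these are lower bounds on the card count.

```python
import math

TARGET_PFLOPS = 170  # plateau reported in the article

# Peak FP32 TFLOPS per card, as quoted in the comment above.
cards = {
    "Radeon RX 5700": 8.0,
    "Tesla T4": 8.141,
    "RTX 2080 Ti": 13.5,
    "Titan RTX": 16.3,
}

def gpus_needed(target_pflops, tflops_per_card):
    """Smallest whole number of cards whose combined peak reaches the target."""
    return math.ceil(target_pflops * 1000 / tflops_per_card)

for name, tflops in cards.items():
    print(f"{name}: {gpus_needed(TARGET_PFLOPS, tflops):,} cards")
# Tesla T4 comes out at 20,882 cards, i.e. "almost 21,000".
```

The article's plateau used about 15,000 GPUs, so the fleet must have included cards faster than the T4.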

    Intel is about to enter the market, and will be focusing on supercomputing:

    Intel’s Xe for HPC: Ponte Vecchio with Chiplets, EMIB, and Foveros on 7nm, Coming 2021 [anandtech.com]
    Monstrous 500W Intel Xe MCM Flagship GPU Leaked In Internal Documents – 4 Xe Tiles Stacked Using Foveros 3D Packaging [wccftech.com]

    The multi-chip module and HBM approach will enable massive performance in a single gigantic package. Although heat could be through the roof, it's worth it if it can be cooled effectively and there is a FLOPS/W improvement.

    We will see GPUs that can hit 100 TFLOPS, and hopefully 1 PFLOPS (or a lot more if a 3DSoC approach gets adopted). You'll be able to rent exascale performance soon. FP32 is one thing, but GPUs will have to compete with stuff like tensor processors and the giant Cerebras Wafer Scale Engine for machine learning. Are these levels of performance irrelevant for gaming? Triple-monitor/wide 8K seems like the absolute end of the road, and 16K VR shouldn't need all of the performance of regular 16K due to headset eye tracking + foveated rendering. I guess we could boost refresh rates to 1000 Hz.

    --
    [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
  • (Score: 0) by Anonymous Coward on Thursday February 13 2020, @01:53PM (7 children)

    by Anonymous Coward on Thursday February 13 2020, @01:53PM (#957702)

    did they just run a benchmark program to test or did they actually throw some observation data that needed to be crunched at it?
    how many kilowatt hours were sacrificed? how much energy was "abused" from cradle (design, manufacture, shipping) to grave (landfill)?
    there will never be enough processing power to figure out why the planet is not covered with silicon that spits out electrons ... eh?

    • (Score: 4, Informative) by takyon on Thursday February 13 2020, @02:18PM (6 children)

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Thursday February 13 2020, @02:18PM (#957713) Journal

      In the second experiment, which was about eight hours long versus less than two hours in the first experiment, the IceCube Neutrino Observatory processed some 151,000 jobs, up from about 101,000 in the first burst.

      “This means that the second IceCube cloud run produced 50% more science, even though the peak was significantly lower,” explained Sfiligoi, who also noted that the latter experiment added OSG, XSEDE, and PRP’s Kubernetes resources, effectively making it a hybrid-cloud setup unlike the first time, when it was purely cloud-based.

      No wasted electrons.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
      • (Score: 1) by fustakrakich on Thursday February 13 2020, @05:05PM (5 children)

        by fustakrakich (6150) on Thursday February 13 2020, @05:05PM (#957765) Journal

        How many watts were consumed in that eight hours?

        How many watts per FLOP?

        --
        La politica e i criminali sono la stessa cosa..
        • (Score: 2) by takyon on Thursday February 13 2020, @05:23PM (4 children)

          by takyon (881) <takyonNO@SPAMsoylentnews.org> on Thursday February 13 2020, @05:23PM (#957769) Journal

          I'm just gonna be lazy and say:

          1.46 megawatts (42 gigajoules consumed, or 10 tons of TNT)

          9 picowatts

          Are those right? Probably not. But they sound good.
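
The numbers above are at least consistent with each other, which a few lines of arithmetic can confirm. This sketch takes the 1.46 MW average draw as given (it is the commenter's guess, not a measured value), uses the eight-hour duration and the 170 PFLOPS plateau and roughly 15,000 GPUs from the story, and applies the conventional 4.184 GJ per ton of TNT. Note that "watts per FLOPS" is the same quantity as joules per floating-point operation.

```python
# Consistency check on the guessed power figure; 1.46 MW is an unverified estimate.
POWER_W = 1.46e6          # assumed average draw, in watts
HOURS = 8                 # approximate length of the second run
PLATEAU_FLOPS = 170e15    # 170 PFLOPS (peak fp32 spec numbers)
GPU_COUNT = 15_000        # approximate GPUs at the plateau
TNT_J_PER_TON = 4.184e9   # conventional energy of one ton of TNT, in joules

energy_j = POWER_W * HOURS * 3600        # watts * seconds = joules
energy_gj = energy_j / 1e9
energy_kwh = POWER_W / 1000 * HOURS
tnt_tons = energy_j / TNT_J_PER_TON
joules_per_flop = POWER_W / PLATEAU_FLOPS  # equivalently, watts per FLOPS
watts_per_gpu = POWER_W / GPU_COUNT

print(f"energy: {energy_gj:.1f} GJ = {energy_kwh:,.0f} kWh")    # 42.0 GJ = 11,680 kWh
print(f"TNT equivalent: {tnt_tons:.1f} tons")                  # 10.0
print(f"per operation: {joules_per_flop * 1e12:.1f} pJ/FLOP")  # 8.6
print(f"per GPU: {watts_per_gpu:.0f} W")                       # 97
```

So 42 GJ, 10 tons of TNT, and roughly 9 picowatts per FLOPS all follow from 1.46 MW; whether 1.46 MW itself is right is a separate question.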

          --
          [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
          • (Score: 1) by fustakrakich on Thursday February 13 2020, @05:32PM (3 children)

            by fustakrakich (6150) on Thursday February 13 2020, @05:32PM (#957771) Journal

            I am disappointed that this is not a subject of interest. A lot of power is being used to produce something hardly more useful than determining somebody's price/earnings ratios.

            --
            La politica e i criminali sono la stessa cosa..
            • (Score: 2) by takyon on Thursday February 13 2020, @05:34PM (1 child)

              by takyon (881) <takyonNO@SPAMsoylentnews.org> on Thursday February 13 2020, @05:34PM (#957772) Journal
              --
              [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
              • (Score: 1) by fustakrakich on Thursday February 13 2020, @06:21PM

                by fustakrakich (6150) on Thursday February 13 2020, @06:21PM (#957788) Journal

                Well, if it can cook minute rice in 15 seconds, I'm all for it.

                They're still using a hell of a lot of juice.

                --
                La politica e i criminali sono la stessa cosa..
            • (Score: 0) by Anonymous Coward on Friday February 14 2020, @07:58AM

              by Anonymous Coward on Friday February 14 2020, @07:58AM (#958117)
              Oh, I dunno, analysing the behaviour of neutrinos is, I think, a worthy use of that energy. Getting a small step closer to unlocking the mysteries of the universe is a wonderful goal in my book, and it's sad to see you value it on the level of analysing market valuations.