
posted by janrinok on Saturday May 11, @01:02AM   Printer-friendly
from the old-space-heater dept.

Someone purchased the eight-year-old Cheyenne supercomputer for $480k. Failing hardware. Leaking water system. What would it be good for? Selling it for parts would flood the market. Testing the parts would take forever. The buyer also has to pay for transport from its current location. Originally built by SGI.

https://gsaauctions.gov/auctions/preview/282996
https://www.popsci.com/technology/for-sale-government-supercomputer-heavily-used/
https://www.tomshardware.com/tech-industry/supercomputers/multi-million-dollar-cheyenne-supercomputer-auction-ends-with-480085-bid

Cheyenne Supercomputer - Water Cooling System

Components of the Cheyenne Supercomputer

Installed Configuration: SGI ICE™ XA.

E-Cells: 14 units weighing 1500 lbs. each.

E-Racks: 28 units, all water-cooled

Nodes: 4,032 dual socket units configured as quad-node blades

Processors: 8,064 units of E5-2697v4 (18-core, 2.3 GHz base frequency, Turbo up to 3.6GHz, 145W TDP)

Total Cores: 145,152

Memory: DDR4-2400 ECC single-rank, 64 GB per node, with 3 High Memory E-Cells having 128GB per node, totaling 313,344 GB

Topology: EDR Enhanced Hypercube

IB Switches: 224 units
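
For anyone checking, the headline figures in the list above are internally consistent. A quick sketch (the 288-nodes-per-E-Cell split is an inference from the totals, not stated in the listing):

```python
# Sanity-check the Cheyenne spec numbers quoted above.
nodes = 4032
sockets_per_node = 2
cores_per_cpu = 18          # E5-2697v4 is an 18-core part

cpus = nodes * sockets_per_node
total_cores = cpus * cores_per_cpu
print(cpus, total_cores)    # 8064 145152

# Memory: most nodes carry 64 GB; the 3 high-memory E-Cells use 128 GB/node.
# Assuming the 14 E-Cells hold equal node counts (4032 / 14 = 288 each):
nodes_per_ecell = nodes // 14
high_mem_nodes = 3 * nodes_per_ecell
std_nodes = nodes - high_mem_nodes
total_gb = std_nodes * 64 + high_mem_nodes * 128
print(total_gb)             # 313344
```

The memory split works out exactly, which suggests the 288-nodes-per-cell assumption is right.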

Moving this system necessitates the engagement of a professional moving company. Please note that the four (4) attached documents detailing the facility requirements and specifications will be provided. Due to their considerable weight, the racks require experienced movers equipped with proper Personal Protective Equipment (PPE) to ensure safe handling. The purchaser assumes responsibility for transferring the racks from the facility onto trucks using their own equipment.

Please note that fiber optic and CAT5/6 cabling are excluded from the resale package.

The internal DAC cables within each cell, although removed, will be meticulously labeled and packaged in boxes, facilitating potential future reinstallation.

Any ideas (serious or otherwise) of suitable uses for this hardware?


Original Submission

  • (Score: 0) by Anonymous Coward on Saturday May 11, @01:21AM

    by Anonymous Coward on Saturday May 11, @01:21AM (#1356509)

    No idea, but some speculation: Maybe a bitcoin bro bought it for the lulz. It's probably not a great mining system, but it looks cool...

  • (Score: 5, Interesting) by looorg on Saturday May 11, @02:20AM (3 children)

    by looorg (578) on Saturday May 11, @02:20AM (#1356513)

    Sure, the machine is old by supercomputer standards; it won't break into the top of the list and hasn't done so in years. But it's probably far from worthless.

    Considering that what they are replacing it with is going to cost about $25,000,000, getting this one for $480,000 might have been a steal of sorts. Even doubling or tripling that amount to cover a datacenter that can hold the weight, transporting it there, and getting it up and running again. The electricity bill is probably outrageous, though, but them's the breaks. A big cost, as noted, is probably moving it professionally from Cheyenne, Wyoming to wherever you want it to go. After all, it's basically out in the middle of nowhere now.

    There might be a computational use for it; not everything has to be super or at the top of the list to be useful. Perhaps a small private or public research institute? A small university getting it as a grant or gift from some donor? Those places can't afford to fork over $25M for something shiny and new. I'm certain it could still fill some kind of niche in the high-performance computing field, just not at the top.

    I don't know if it would be a cool bitcoin rig, but I doubt it.

    But I'm sure it could be used for something. After all, they did get it on the cheap. If that fails, it is probably one of those things where the parts are worth more than the whole, except you would probably have to trickle them out so as not to flood the market in one go. You'd also have to build some kind of testing rig and test all those modules to find which 1% of them have already failed. Unless you sell them as-is, in which case it's a big gamble whether the buyer gets one of the working bits.

    Still, I kind of hoped someone had come forward by now to claim the bid and tell us a bit more. There were, after all, 27 bidders, and the winning bid was only $1,915 more than the runner-up bid (ouch! for them). Some of them really tried to lowball it: the 10th place on the bidder list only bid $165k, so 17 people bid even less than that. Those are the scrappers, if I had to guess. The top 5 bids were all $400k+. If you put down that kind of money, I would think you would have a plan for it beyond just scrapping it for parts.

    • (Score: 5, Informative) by DrkShadow on Saturday May 11, @04:55AM (2 children)

      by DrkShadow (1404) on Saturday May 11, @04:55AM (#1356528)

      Some numbers to compare ...

      E5-2697v4 is about 2/3 the performance of a recent (2023) E-2468 (according to benchmarks, but it's hard to get "real numbers" here).

      The E-2468 is roughly 45% the power of the E5 chip.

      You can get 64GB ECC RDIMMs that work well with passive cooling; let's assume 3/4 the power (DRAM doesn't scale well for the capacitors involved) and +40% performance (but they probably have 4x16GB sticks now, which is EVEN MORE power, and maybe par performance for the separate memory channels, but apples and oranges...). 313,344GB amounts to about 4,900 DIMMs.

      Ignoring the disk, motherboard, and chassis. A non-zero cost.

      Whether air cooling (AC) or water cooling, energy is energy and it must be dissipated. At half the energy, cooling is half the cost. Then the hardware is half the cost (RAM, disk, CPU), and the energy is half the cost: 22.7 megawatts -> 11 megawatts.

      Wyoming's electricity averages 11c/kWh. That's not at megawatt scale, and not with the bulk discounts that high-capacity institutions get, but maybe enough for comparison. So, energy cost is $243,000/hr vs $121,000/hr.

      The difference in hardware: the stated CPU has an MSRP of $426, and I saw the Samsung DIMMs for $164. $426 × 8064 + $164 × 4900 = $4,238,864 to build a comparable supercomputer. The new computer costs $1.4mm/yr to operate vs $2.7mm/yr for the old one. Ignoring chassis cost and switch cost, building a new computer would resolve the ECC issues and the water cooling issues, take up 80% of the space (same number of CPUs/nodes), use half the energy, and pay for itself in under three years of operation.

      $5mm + $1.4mm = $6.4mm; divide by the old machine's operating cost, $2.7mm, and it takes about 2.5 years to break even. (And then you're saving $1.3mm in energy each year.)
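
      The arithmetic above, written out as a quick Python sketch (figures are the comment's own estimates, not audited numbers; the second calculation is the more standard payback formula, which lands a bit later):

```python
# Two ways to read the break-even point (figures from the parent comment, $M).
build = 5.0      # rough cost of a comparable new build
new_opex = 1.4   # yearly operating cost, new machine
old_opex = 2.7   # yearly operating cost, used Cheyenne

# Ratio used above: first-year total of the new route vs old yearly opex.
print((build + new_opex) / old_opex)   # ~2.4

# Differential view: years until the opex savings repay the build cost.
print(build / (old_opex - new_opex))   # ~3.8
```

      Either way, the new build pays for itself within a few years on these assumptions.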

      ---

      If you can pay $3 million per year in operating expense, it would seem like you could afford $5 million to build a newer device that will last longer, probably be more reliable, and come with a warranty. You could take out the $5 million as a loan at single-digit interest and save much more than that yearly (~20% of the up-front cost per year) on energy. You'd come out ahead within about 3 years.

      Unless these people need a supercomputer, *NOW*, and can't wait the waiting-list period to have something built for them, and probably plan to resell it in under 4 years -- it's not financially sound to purchase this supercomputer.

      • (Score: 4, Informative) by DrkShadow on Saturday May 11, @05:04AM

        by DrkShadow (1404) on Saturday May 11, @05:04AM (#1356529)

        Whoops, I didn't apply an industrial discount on the energy. Let's take that at 50%.

        Then, energy for the old machine is $1.3mm/yr and the new one is $600k/yr in operating expense.

        So about 6 years is the break-even point on a new computer + energy vs using the old computer.
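
        Rerunning the same payback arithmetic with the discounted rates (a sketch; it lands in the same ballpark as the figure above, depending on what non-energy costs you fold in):

```python
# Recompute break-even with the discounted energy figures above ($M/yr).
build = 5.0        # new-build hardware cost, as estimated earlier
old_energy = 1.3   # old machine's yearly energy at the 50% discount
new_energy = 0.6   # new build's yearly energy at the 50% discount

# Years for the energy savings alone to repay the new build:
print(build / (old_energy - new_energy))   # ~7
```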

        That's not entirely unreasonable. You still have to take into account the node memory errors, water cooling replacement, and cabling(!!?!?), but not entirely unreasonable. Especially if you don't want to have to think about it. I might see AWS doing it for their Tx instances, except for the water cooling problems. I'm unsure what kind of unreliability cloud vendors tolerate.

        --

        Interestingly, 4,900 memory DIMMs is fewer than the CPU count, and you need at *least* one each. So they almost certainly used 16GB DIMMs and put two per CPU. The new build would have to use 32GB DIMMs with one per CPU. You could mix things up, get bigger CPUs and fewer of them, but that's out of scope.

      • (Score: 4, Interesting) by looorg on Saturday May 11, @09:59AM

        by looorg (578) on Saturday May 11, @09:59AM (#1356536)

        Unless these people need a supercomputer, *NOW*, and can't wait the waiting-list period to have something built for them, and probably plan to resell it in under 4 years -- it's not financially sound to purchase this supercomputer.

        No doubt it would be cheaper to just build a new one with new hardware. But that might not be an option. Are there off-the-shelf supercomputers in some regard? The waiting list for getting one may be years long. There appear to be a limited set of companies that build them these days, and they might just not have the modules and components sitting around waiting to be used. So perhaps this is the reason: get one pre-owned and run it for a few years while you are on the list to get something new?

        Perhaps it's a customer close to Wyoming? Less transport, and similar or cheaper power costs? You probably don't want to move a thing like this to one of the more expensive power areas.

        Energy is, as noted, probably one of the big long-term costs; the other is probably the people to run it. You need people with the knowledge and skills to do so, and they are probably not entirely insignificant in cost per year either.

        Or perhaps it's just not supposed to make sense. As someone else here noted, perhaps some cryptobro just bought it for the lulz so they can say they own a supercomputer now. World's biggest and most pointless piece of furniture...

  • (Score: 2, Funny) by Anonymous Coward on Saturday May 11, @02:29AM

    by Anonymous Coward on Saturday May 11, @02:29AM (#1356516)

    I could get a DeLorean, AND a '75 Pacer!

  • (Score: 5, Interesting) by anubi on Saturday May 11, @02:43AM

    by anubi (2828) on Saturday May 11, @02:43AM (#1356517) Journal

    Irwin Allen used AN/FSQ-7 panels as movie props. You can see them in Time Tunnel, Lost in Space, Voyage to the Bottom of the Sea, and Land of the Giants, and I have seen them in a few other movies. Those panels sure look impressive.

    https://search.brave.com/search?q=an%2Ffsq-7 [brave.com]

    http://q7.neurotica.com/Q7/ [neurotica.com]

    --
    "Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]
  • (Score: 3, Funny) by Anonymous Coward on Saturday May 11, @06:47AM (1 child)

    by Anonymous Coward on Saturday May 11, @06:47AM (#1356531)

    I'd use it to compile Chromium for my Gentoo box, hopefully before the next Chromium version is released.
    It currently takes me 6-7 hours on an i7-10870H with 64 GB of RAM.

  • (Score: 5, Interesting) by VLM on Saturday May 11, @03:31PM (4 children)

    by VLM (445) on Saturday May 11, @03:31PM (#1356545)

    Most of what they bought is not obsolete or failing.

    Sure, the CPUs are not cutting edge; go buy some new MB/CPU combos and replace those leaky water cooling blocks while you're at it, and enjoy your partially new, partially old supercomputer.

    Some tech advances pretty slowly. Let's say they throw literally everything into the dumpster and pay to have it hauled away, but keep the 224 InfiniBand switches. I'm guessing eight-thousand-odd CPUs and 224 switches means these are the 48-port models (must be at least 36 ports, but probably more with 4-D interconnections and trunk and management lines). Nvidia will sell you a nice new MQM8700 for a mere $30K each. So if you were planning on building a supercomputer of your own and buying "a couple hundred InfiniBand switches", you could buy 224 new MQM8700s for a mere $6.7M, or buy 224 slightly used switches from this place for a mere $480K. So all you have to do is spend less than $6.2M on removal, shipping, and disposal of everything that's not an InfiniBand switch. Consider that even the scrap value of 8K InfiniBand cables is worth something. I just checked ConnectZone, and a 15-meter InfiniBand cable will set you back $400, so 8K of those cables would cost somewhat in excess of $3M even with some bulk discounts. Sure, the labor cost of safe, reusable removal is not $0, but the labor is still probably less than $3M minus the $480K used price, just for the cables alone.
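
    The switch-and-cable economics above, as a quick sketch (the $30K and $400 list prices are the comment's spot checks, not quotes; only the $480K auction price is from the listing):

```python
# Rough scrap-vs-new economics from the figures in the parent comment.
switches = 224
new_switch_price = 30_000       # estimated price of a new MQM8700-class switch
auction_price = 480_000         # winning bid for the whole machine

new_switch_total = switches * new_switch_price
print(f"new switches alone: ${new_switch_total:,}")        # $6,720,000

headroom = new_switch_total - auction_price
print(f"budget left for removal/shipping: ${headroom:,}")  # $6,240,000

# Cables as a floor on value: ~8,000 IB cables at ~$400 new each.
cables = 8_000
print(f"cable replacement cost: ${cables * 400:,}")        # $3,200,000
```

    On these assumptions the switches and cables alone cover the purchase price several times over.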

    I think it would be hard not to profit if you're in the business of running a supercomputer, or can middleman with peeps who do. I mean, it gets to the point where this is so cheap you could buy it just for spare parts.

    Some years ago I looked into IB "because I could", and it's barely within reach if I buy used gear, but it's still a bit too rich for my blood, so I installed 10G Ethernet and ran vSAN and later Ceph on it. Works pretty well. Figure you're gonna pay $100/port for each end of a 10G Ethernet link and more like $1000 for each end of an InfiniBand link (not too accurate, but close enough). A new 16-port Netgear XS716T used to set you back $1K, and some years later I could replace that with a new MS510TXM with only 10 ports for about $600, so prices are slowly dropping. Trendnet today sells a new TEG-7124, which is "pretty much" my several-years-old XS716T, for almost exactly half the price I paid for mine.

    Ironically, I'm old enough to remember when Ethernet was new and you'd spend like $500 for each end of 10M and later 100M FE. So networking gear prices have gone down both in absolute terms and relative to inflation.

    • (Score: 4, Insightful) by sgleysti on Saturday May 11, @04:31PM (1 child)

      by sgleysti (56) Subscriber Badge on Saturday May 11, @04:31PM (#1356555)

      I briefly contemplated making a small cluster. What seemed to give the best FLOPS/$ was the highest-end consumer-grade Ryzen CPUs. For the interconnect, I figured I could do a 5-node full mesh with 4-port 10G Ethernet cards, giving each node a direct connection to all the others. It didn't look too expensive because I wouldn't have needed a switch and could have used direct-attach cables. I would have used the 1G motherboard Ethernet and a switch for management.

      In the end, I figured that would have been a lot of money, time, and effort for something I didn't really need. Any modern processor these days is incredibly fast if you use a language that compiles to machine code and takes advantage of the vector units in the CPU.

      • (Score: 3, Interesting) by VLM on Saturday May 11, @05:07PM

        by VLM (445) on Saturday May 11, @05:07PM (#1356560)

        something I didn't really need

        Yeah I wanted the experience of messing with very interesting infrastructure software and I want reliable "production" storage (using proxmox/CEPH right now), I was not motivated by FLOPS. Some are, which is fine. It's a big hobby, plenty of space.

        4-port 10G ethernet cards

        The economics are constantly changing. The 10G copper switch market right now seems to run a "meh" $70/port (fiber usually more), and PCIe cards run $50 to $100 for single ports and $200 to $300 for multiport cards like your 4-port.

        So for five nodes, your 4-port mesh would cost about $1,250 in cards alone, whereas using 1-port cards and a switch would cost 5 × $50 plus "about" $500 for a very small switch, so figure maybe $750. Cable cost can add up too: 5 cables vs 10-ish cables.

        Strictly economically you'd do better with the switch design in 2024, although the economics have varied widely in the past.
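
        The mesh-vs-switch comparison above, sketched out (the per-card and per-switch prices are the ballpark 2024 figures from this thread, not quotes):

```python
# Cost comparison: full mesh vs. star-with-switch for a 5-node 10G cluster.
nodes = 5

# Full mesh: one 4-port card per node, direct-attach links everywhere.
quad_card = 250                          # ~price of a 4-port 10G card
mesh_cards = nodes * quad_card
mesh_links = nodes * (nodes - 1) // 2    # 10 point-to-point cables (20 ports)
print(mesh_cards, mesh_links)            # 1250 10

# Star: one single-port card per node plus a small switch, 5 cables.
single_card = 50
small_switch = 500
star_cost = nodes * single_card + small_switch
print(star_cost)                         # 750
```

        The star design wins on cards and cables at these prices, which matches the conclusion above.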

        I'm using dual 10G copper ports as a LAG to each node; there are load-balancing issues, but I "can" run up to 20G in theory.

        Of course, if you want the fun of running FRR or whatever routing platform on your partial mesh, then you kind of have to run multi-port if the point is having fun with routing protocols. Although with a full mesh, I guess you could just use static routing? I'd want to set up OSPF just for the LOLs, "because I can", but not everyone enjoys routing as a hobby...

        The main rule of home lab is have fun, so here I am looking at eBay search results and daydreaming while writing this. Today on eBay I can get all the used InfiniBand cards I want for about $50 each, and probably at least half of them actually work; switches that might or might not work run around $100 to $200. So for somewhat under a kilobuck, between the cost of your 10G multiport cards and my 10G switch design, I could play with InfiniBand at home. How much are fun and bragging rights worth? I guess I could start small with one switch and two cards and see what happens...

        Another fun feature of InfiniBand, IIRC, is that there's a wide range of semi-compatible port speeds, where the slowest is 10Gb/s and the highest is unaffordable but something like 400Gb/s. Secondly, tuning Linux to use all that bandwidth under TCP/IP is an art form, which makes it "fun". At least I think it would be fun. Very few people can push 400Gb/s in their basement; maybe I will be one of the first. Note that consumer SSDs crap out around "sixteen GB/s" or so because that's as fast as PCIe 4 can run, which means I will have to do weird things with RAID arrays.

        I've always wanted to have InfiniBand in my basement, so this might end up being my summer project; we'll see. I'll probably do it "someday" anyway.

        So many other things I could do to fill my infinite spare time. Such as run 10G fiber from my basement "datacenter" to my office and put 10G in my office so I would have very fast access to my storage...

    • (Score: 3, Interesting) by captain normal on Saturday May 11, @07:37PM (1 child)

      by captain normal (2205) on Saturday May 11, @07:37PM (#1356574)

      That's assuming someone is only after the hardware. Completely removing data from any type of drive is hard. The location of this, per the GSA Auctions site, is the NCAR-Wyoming Supercomputing Center in Cheyenne, WY. I could see some interested parties, looking to mine data from a large government supercomputer, buying it and dragging it back home to scan for data.
      https://en.wikipedia.org/wiki/NCAR-Wyoming_Supercomputing_Center [wikipedia.org]

      --
      Everyone is entitled to his own opinion, but not to his own facts"- --Daniel Patrick Moynihan--
      • (Score: 3, Informative) by looorg on Saturday May 11, @10:35PM

        by looorg (578) on Saturday May 11, @10:35PM (#1356589)

        Except, as far as I can tell, no storage was included or put up for auction. None. Beyond RAM, but that should be blank by now. So as far as I can tell, zero chance of actual data leakage here.
