Someone purchased the eight-year-old Cheyenne supercomputer for $480k. Failing hardware. Leaking water system. What would it be good for? Selling it for parts would flood the market. Testing the parts would take forever. They also have to pay for transport from its current location. Originally built by SGI.
https://gsaauctions.gov/auctions/preview/282996
https://www.popsci.com/technology/for-sale-government-supercomputer-heavily-used/
https://www.tomshardware.com/tech-industry/supercomputers/multi-million-dollar-cheyenne-supercomputer-auction-ends-with-480085-bid
Cheyenne Supercomputer - Water Cooling System
Components of the Cheyenne Supercomputer
Installed Configuration: SGI ICE™ XA.
E-Cells: 14 units weighing 1500 lbs. each.
E-Racks: 28 units, all water-cooled
Nodes: 4,032 dual socket units configured as quad-node blades
Processors: 8,064 units of E5-2697v4 (18-core, 2.3 GHz base frequency, Turbo up to 3.6GHz, 145W TDP)
Total Cores: 145,152
Memory: DDR4-2400 ECC single-rank, 64 GB per node, with 3 High Memory E-Cells having 128GB per node, totaling 313,344 GB
Topology: EDR Enhanced Hypercube
IB Switches: 224 units
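The totals in the listing can be cross-checked with a few lines of arithmetic. This is only a minimal sketch: the even split of nodes across E-Cells (4,032 / 14 = 288 per cell) is an assumption, not something the listing states.

```python
# Cross-check of the totals quoted in the auction listing.
# Assumption (not stated in the listing): nodes are spread evenly across E-Cells.
NODES = 4032
SOCKETS_PER_NODE = 2
CORES_PER_CPU = 18
E_CELLS = 14
HIGH_MEM_E_CELLS = 3

nodes_per_cell = NODES // E_CELLS            # 288 if evenly spread
cpus = NODES * SOCKETS_PER_NODE
cores = cpus * CORES_PER_CPU

high_mem_nodes = HIGH_MEM_E_CELLS * nodes_per_cell
std_nodes = NODES - high_mem_nodes
memory_gb = std_nodes * 64 + high_mem_nodes * 128

print(cpus, cores, memory_gb)
```

Under that assumption the processor, core, and memory totals all agree with the listed figures.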
Moving this system necessitates the engagement of a professional moving company. Please note the four (4) attached documents detailing the facility requirements and specifications will be provided. Due to their considerable weight, the racks require experienced movers equipped with proper Professional Protection Equipment (PPE) to ensure safe handling. The purchaser assumes responsibility for transferring the racks from the facility onto trucks using their equipment.
Please note that fiber optic and CAT5/6 cabling are excluded from the resale package.
The internal DAC cables within each cell, although removed, will be meticulously labeled, and packaged in boxes, facilitating potential future reinstallation.
Any ideas (serious or otherwise) of suitable uses for this hardware?
(Score: 5, Informative) by DrkShadow on Saturday May 11 2024, @04:55AM (2 children)
Numbers to ...
E5-2697v4 is about 2/3 the performance of a recent (2023) E-2468 (according to benchmarks, but it's hard to get "real numbers" here).
The E-2468 draws roughly 45% of the power of the E5 chip.
You can get 64GB ECC RDIMMs that work well with passive cooling, let's assume 3/4 the power (DRAM doesn't scale well for the capacitors involved) and +40% performance (but they probably have 4x16GB sticks now, EVEN MORE power, and maybe par-performance for the separate memory channels, but apples and oranges..) 313,344GB amounts to 4900 DIMMs.
Ignoring the disk, MB, chassis. A not-zero cost.
Whether air cooling (AC) or water cooling, energy is energy and it must be dissipated. At half the energy, AC is half the cost. Then, the PC is half the cost (ram, disk, cpu). The energy is half the cost: 22.7 megawatts -> 11 megawatts.
Wyoming's electricity averages 11c/kWh. Not a megawatt, and not with the bulk discounts that high-capacity institutions have, but enough for comparison, maybe. So, energy cost is $243 000/hr vs $121 000/hr.
As for the difference in hardware: the stated CPU has an MSRP of $426, and I saw the Samsung DIMMs for $164. $426 * 8064 + $164 * 4900 = $4,238,864 to build a comparable supercomputer. The new computer costs $1.4mm/yr to operate vs $2.7mm/yr for the old one. Ignoring chassis cost and switch cost, building a new computer would resolve the ECC issues and the water cooling issues, take up 80% of the space (same number of CPUs/nodes), use half the energy, and pay for itself in under three years of operation.
$5mm + $1.4mm = 6.4mm; divide by operating cost, $2.7mm, it takes about 2.5 years to break even. (And then you're saving $1.3mm in energy each year.)
---
If you can pay $3 million per year in operating expense, it would seem like you could afford $5 million to build a newer device that will last you longer, probably be more reliable, and come with a warranty. You could take out the $5 million as a loan at single-digit interest, and save much more than that yearly (~20% of the up-front cost, yearly) on energy. You'd come out ahead within about 3 years.
Unless these people need a supercomputer, *NOW*, and can't wait the waiting-list period to have something built for them, and probably plan to resell it in under 4 years -- it's not financially sound to purchase this supercomputer.
(Score: 4, Informative) by DrkShadow on Saturday May 11 2024, @05:04AM
Whoops, I didn't give industrial discount on the energy. Let's take that at 50%.
Then, energy for the old is $1.3mm/yr and new is $600k/yr for operating expense.
So 6 years is break-even point on a new computer + energy vs using the old computer.
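A simple payback calculation, sketched using only the figures quoted in this thread (CPU and DIMM prices from the parent comment, yearly energy figures with and without the assumed 50% discount). The function name `payback_years` is made up for illustration. It divides the up-front parts cost by the yearly energy saving, which is one way of defining break-even and not necessarily the one used in the comments.

```python
def payback_years(capital, old_opex, new_opex):
    """Years until the yearly energy saving repays the up-front cost."""
    saving = old_opex - new_opex
    if saving <= 0:
        raise ValueError("the new system must be cheaper to run")
    return capital / saving

# CPUs + DIMMs only; chassis, switches, disks are ignored as in the thread.
parts_cost = 426 * 8064 + 164 * 4900

# Yearly energy figures as quoted: undiscounted, then with the 50% discount.
undiscounted = payback_years(parts_cost, 2_700_000, 1_400_000)
discounted = payback_years(parts_cost, 1_300_000, 600_000)

print(parts_cost, round(undiscounted, 1), round(discounted, 1))
```

With these inputs the payback comes out at roughly 3.3 years without the discount and roughly 6.1 years with it.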
That's not entirely unreasonable. You still have to take into account the node memory errors, water cooling replacement, and cabling(!!?!?), but not entirely unreasonable. Especially if you don't want to have to think about it. I might see AWS doing it for their Tx instances, except for the water cooling problems. I'm unsure what kind of unreliability cloud vendors tolerate.
--
Interestingly, 4900 DIMMs is fewer than the CPU count, and you need at *least* one each. So they almost certainly used 16GB DIMMs and put two per CPU. The new build would have to use 32GB DIMMs and have one per CPU. You could mix things up, get bigger CPUs and fewer of them, but that's out of scope.
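The DIMM-count argument can be checked quickly. A rough sketch, assuming the whole 313,344 GB is built from a single DIMM size (the real system mixes 64 GB and 128 GB nodes, so the per-CPU averages are not whole numbers):

```python
TOTAL_GB = 313_344   # total memory from the listing
CPUS = 8064          # processor count from the listing

# DIMMs needed if the entire system used one DIMM size (an assumption).
dimm_counts = {size: TOTAL_GB // size for size in (16, 32, 64)}

for size, count in dimm_counts.items():
    print(f"{size} GB DIMMs: {count} total, {count / CPUS:.2f} per CPU")
```

Only the 64 GB case falls below one DIMM per CPU, which is the mismatch pointed out above.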
(Score: 4, Interesting) by looorg on Saturday May 11 2024, @09:59AM
No doubt it would be cheaper to just build a new one with new hardware. But it might not be an option. Are there over-the-counter supercomputers in some regard? The waiting list for getting one may be years long. There appears to be a limited set of companies that build them these days. They might just not have that many modules and components sitting around waiting to be used. So perhaps this is the reason: get one pre-owned and run it for a few years while you are on the list to get something new?
Perhaps it's a customer close to Wyoming? Less transport, similar or cheap(er) power costs? You probably don't want to move a thing like this out to one of the more expensive power areas.
Energy is as noted probably one of the big costs in the long term, the other one is probably people to run it. You need some people with knowledge and skills to do so, they probably are not entirely insignificant in cost per year either.
Or perhaps it's just not supposed to make sense. As someone else here noted, perhaps some Cryptobro just bought it for the lulz so they can say they own a supercomputer now. World's biggest and most pointless piece of furniture ...