University Loses Valuable Supercomputer Research After Backup Error Wipes 77 Terabytes of Data:
Kyoto University, a top research institute in Japan, recently lost a significant amount of research after its supercomputer system accidentally wiped out a whopping 77 terabytes of data during what was supposed to be a routine backup procedure.
That malfunction, which occurred sometime between Dec. 14 and Dec. 16, erased approximately 34 million files belonging to 14 different research groups that had been using the school's supercomputing system. The university operates Hewlett Packard Enterprise Cray computing systems and a DataDirect ExaScaler storage system, which research teams can use for various purposes.
It's unclear exactly what kind of files were deleted or what caused the malfunction, though the school has said that the work of at least four of the groups cannot be restored.
Also at BleepingComputer.
Original announcement from the university.
(Score: 1, Informative) by Anonymous Coward on Friday December 31 2021, @06:29AM (4 children)
The dataset is so huge that my normal advice, which is to keep immutable deduped backups over the near term, would be impractical.
(Score: 0) by Anonymous Coward on Friday December 31 2021, @07:27AM (2 children)
According to their website, they have 24 petabytes and enough grunt that 77 terabytes would be gone in less than 5 minutes. It also says that LARGE0 and LARGE1 are in a paired configuration. Sounds like someone ran a command and clobbered a part of LARGE0. A day later, something didn't look right, or someone complained about missing data, and the mistake was noticed before that level of backup had completely propagated.
(Score: 0) by Anonymous Coward on Friday December 31 2021, @07:33AM
with 24.923P of training data their CS boffins get the 77T back in no time.
(Score: 0) by Anonymous Coward on Friday December 31 2021, @08:32AM
That's enough for a really high resolution goatse.
(Score: 4, Interesting) by Brymouse on Friday December 31 2021, @01:22PM
Not really.
Let's say you want ZFS RAID-Z3, so you can lose 3 disks before you have data loss. Supermicro's 90-LFF-disk 4RU server filled with 16 TB SAS disks gives you six 15-disk raidz3 vdevs (12 data disks each). Add in 4 x 4 TB NVMe for L2ARC/cache and special devices in a mirrored config, plus 2 mirrored SAS boot disks in the rear. This nets you 1152 TB of usable raidz3 space with stupid fast IO ops in 4RU. 20 servers is 2 racks for roughly 24 PB. That's about $90k per server, or $1.8 million. Round up to $2M for the network gear to support this, plus racks/install.
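The arithmetic above checks out; here's a quick sketch using only the numbers from the comment (the per-server price is the commenter's estimate, not a vendor quote):

```python
# Back-of-the-envelope check of the ~24 PB raidz3 build described above.
# All figures come from the comment; the $90k/server and $200k
# network/rack costs are the commenter's estimates.

VDEVS_PER_SERVER = 6         # six raidz3 vdevs per 90-bay chassis
DATA_DISKS_PER_VDEV = 12     # 15 disks per vdev minus 3 parity disks
DISK_TB = 16                 # 16 TB SAS disks
SERVERS = 20
SERVER_COST = 90_000         # USD, commenter's estimate
NETWORK_AND_RACKS = 200_000  # rounding the total up to $2M

usable_per_server_tb = VDEVS_PER_SERVER * DATA_DISKS_PER_VDEV * DISK_TB
total_tb = usable_per_server_tb * SERVERS
total_cost = SERVERS * SERVER_COST + NETWORK_AND_RACKS

print(usable_per_server_tb)  # 1152 TB usable per 4RU server
print(total_tb)              # 23040 TB, i.e. ~23 PB (the comment rounds to 24)
print(total_cost)            # 2000000 -- $2M all-in
```

Note the raw capacity is 20 × 90 × 16 TB = 28.8 PB; the gap down to ~23 PB usable is the three parity disks per 15-disk vdev.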
It's expensive, but not out of reach by any means.