

posted by janrinok on Wednesday January 05 2022, @12:41PM
from the BIG-oops! dept.

This HPE software update accidentally wiped 77TB of data:

We covered this story here: University Loses Valuable Supercomputer Research After Backup Error Wipes 77 Terabytes of Data. I, like some others, suspected finger trouble on the part of those doing the backup, but the company that wrote the software has put its hands up and taken responsibility.

A flawed update sent out by Hewlett Packard Enterprise (HPE) resulted in the loss of 77TB of critical research data at Kyoto University, the company has admitted.

HPE recently issued a software update that broke a program deleting old log files, and instead of just deleting those (which would still have a backup copy stored in a high-capacity storage system), it deleted pretty much everything, including files in the backup system, Tom's Hardware reported.

As a result, some 34 million files, generated by 14 different research groups, from December 14 to December 16, were permanently lost.

In a press release, issued in Japanese, HPE took full responsibility for the disastrous mishap.
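
The press release itself is Japanese-only, but the failure mode described above, a cleanup job whose "old log files" filter suddenly matched everything, is easy to sketch. Below is a minimal Python illustration of that general mode, not HPE's actual code; the names LOG_ROOT and KEEP_DAYS and the fail-closed guards are assumptions of the sketch:

    import os
    import sys
    import time
    from pathlib import Path

    def purge_old_logs(root_raw: str, keep_days_raw: str) -> None:
        """Delete *.log files under root_raw older than keep_days_raw days."""
        # Fail closed: int("") raises ValueError, so a threshold that got
        # lost in a botched update stops the run instead of widening it.
        keep_days = int(keep_days_raw)
        # Refuse an empty or root-level target; an emptied path variable is
        # exactly how "delete old logs" becomes "delete the filesystem".
        if not root_raw or Path(root_raw).resolve() == Path("/"):
            sys.exit(f"refusing to purge under {root_raw!r}")
        cutoff = time.time() - keep_days * 86400
        for path in Path(root_raw).rglob("*.log"):
            if path.is_file() and path.stat().st_mtime < cutoff:
                path.unlink()

    if __name__ == "__main__":
        # LOG_ROOT and KEEP_DAYS are hypothetical names, not HPE's.
        purge_old_logs(os.environ.get("LOG_ROOT", ""),
                       os.environ.get("KEEP_DAYS", ""))

The point is only that a deletion threshold or target path which silently degrades to "everything" turns log rotation into a filesystem wipe; failing loudly on an empty variable is the cheap defense.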


Original Submission

 
  • (Score: 3, Insightful) by Runaway1956 (2926) on Wednesday January 05 2022, @03:29PM (#1210138)

I'm not sure about your math or your line of reasoning. It sounds like they not only lost all the work done on those particular days, but also all the backups of work leading up to those days? The article is not especially clear about exactly what was lost, but if all the backups were erased, it's probably a lot more work than just a couple of days.

So, I guess it all boils down to a rather simple question: how much time, money, and manpower will it take to replace and reproduce all the erased data? If they can "fix" the problem in less than a month, then it's not a really big deal. If the "fix" will take years, then it's a major problem. Let's remember that some research projects take years of planning, and funding from the university, from private enterprise, as well as from government grants.

  • (Score: 2) by krishnoid (1156) on Wednesday January 05 2022, @06:14PM (#1210196)

    You're way ahead of me on this. I'm still trying to puzzle together "backup" and "erased". Like, "whoops, we reversed a flag on rsync" plus "we totally (accidentally) called a separate routine to overwrite old backups, some of which required operator action to remount".
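
    For what it's worth, the rsync foot-gun here is real: --delete removes anything in the destination that is missing from the source, so a swapped pair of arguments (or a source that has suddenly gone empty) quietly erases the "backup". A minimal sketch of a defensive wrapper; mirror() and the example paths are hypothetical, not anything from the actual incident:

        import subprocess
        import sys

        def mirror(src: str, dest: str, *, dry_run: bool = True) -> None:
            # rsync -a --delete makes dest an exact mirror of src, removing
            # whatever dest has that src lacks; an empty src wipes dest.
            if not src.rstrip("/"):
                sys.exit("refusing to mirror from an empty source path")
            cmd = ["rsync", "-a", "--delete", src, dest]
            if dry_run:
                # --dry-run lists the would-be deletions without doing them
                cmd.insert(1, "--dry-run")
            subprocess.run(cmd, check=True)

        # Preview the deletion list first, then rerun with dry_run=False.
        mirror("/data/current/", "/backup/current/", dry_run=True)

    Eyeballing the dry-run output before letting --delete loose on a backup tree is the usual guard.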

  • (Score: 0) by Anonymous Coward on Wednesday January 05 2022, @09:08PM (#1210271)

Whatever, real science is always reproducible, so how can anything of value have been lost here?

    • (Score: 3, Insightful) by Kell (292) on Wednesday January 05 2022, @10:36PM (#1210320)

      Speaking as a research engineer: time. Time is valuable and you never get it back.

      --
      Scientists ask questions. Engineers solve problems.
      • (Score: 1) by Acabatag (2885) on Thursday January 06 2022, @02:59AM (#1210402)

What seems odd to me is that scientists let anybody from Eye-Tee with their crappy 'enterprise' systems near this much important scientific data. Weren't there toner cartridges to replace, or a secretary's keyboard to blow the crumbs out of?

        This was not ordinary data that the data janitors are charged with maintaining.

        • (Score: 3, Informative) by Kell (292) on Thursday January 06 2022, @06:23AM (#1210459)

This might surprise some, but we often don't have a choice in how our data is hosted at our institutions. With various data integrity and handling mandates from funding agencies, plus strictly limited grant budgets, it's almost impossible to self-manage your own IT unless you're somewhere like CERN. At my institution, even getting an Octoprint server that I can connect to from home to manage 3D prints is basically impossible, because the IT people are funded independently of the academics who rely on them and simply don't give a shit.

          --
          Scientists ask questions. Engineers solve problems.
          • (Score: 2) by PiMuNu (3823) on Thursday January 06 2022, @04:37PM (#1210553)

I think even at CERN they have IT guys running the firewall, etc. (Although, not as long ago as you might imagine, they had a bit of an "incident" when it turned out all of their controls software shared the same password.)

The folks running the cluster are probably more like research specialists in IT, but they are surely going to use some enterprise solution for managing many storage nodes rather than rolling their own hacky scripts.

          • (Score: 0) by Anonymous Coward on Saturday January 08 2022, @05:29AM (#1211018)

            My only catch on this is that the affected storage was for their 3 different supercomputers. At all the places I've worked at and with, the long-term storage is located separately from the cluster storage. Furthermore, there is absolutely the expectation everywhere I have been or worked with to use both for their intended purposes.