SoylentNews is people

posted by CoolHand on Tuesday February 09 2016, @02:27AM
from the why-oh-why dept.

A number of users have reported that running "rm --no-preserve-root -rf /" not only deletes all their files (as expected), but also permanently bricks their computers (which is not expected). Tracing the issue revealed that the ultimate cause was that systemd mounted the EFI pseudo-fs as read-write even when this FS was not listed in fstab, and deleting certain files in this pseudo-fs causes certain buggy, but very common, firmware not to POST anymore. A user reported this bug on systemd's GitHub issue tracker, asking that the FS be mounted read-only instead of read-write, and said bug was immediately closed as invalid. The comment thread for the bug was locked shortly after. Discuss.

Links:
https://github.com/systemd/systemd/issues/2402
http://thenextweb.com/insider/2016/02/01/running-a-single-delete-command-can-permanently-brick-laptops-from-inside-linux/



 
  • (Score: 5, Interesting) by rleigh (4887) on Tuesday February 09 2016, @09:52AM (#301305)

    Agreed. The way you handle problems like this is very telling.

    The worst I've seen myself was a bug in my schroot tool, very early in its development. A user customised the configuration, and encountered a problem which would leave /home bind mounted and then rm -rf the base directory, blowing away /home. If I was a systemd-style maintainer, I could have closed it as "not a bug", telling the user "you were using a non-approved configuration, sucks to be you". What I *actually* did was profusely apologise to the submitter that they had encountered such an awful bug in my software, checked that they had a backup copy of their data (they did, thankfully), and then I investigated how it happened, and once I found the problem I added extra checks and other logic to ensure this would never happen again, no matter how it was configured. (It turns out I found a bug/race in the procfs /proc/mounts code which would make it alter as you read it and skip over a mount; but I worked around it by adding a lock around mounts/umounts to mitigate it changing when the tool was run in parallel, reading /proc/mounts into a buffer rather than line-by-line, and by doing extra sanity checking to ensure nothing was mounted before removing the base directory.) This issue was never encountered again over the last decade. I was profoundly embarrassed by my software causing such unexpected damage, and took the steps to immediately rectify it and ensure a repeat would never be seen again. The changes I made were "unnecessary" since if the system behaved perfectly I wouldn't need them; but we don't live in a perfect world, and need to deal with that.
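
    The sanity check described above can be sketched roughly as follows. This is an illustrative Python sketch, not schroot's actual code: the function names and the example paths are hypothetical, and it only covers the "read /proc/mounts into a buffer, then refuse to delete if anything is still mounted underneath" part, not the locking around mounts/umounts.

```python
import os
import re


def read_mounts(path="/proc/mounts"):
    """Read the whole mount table into memory before parsing it,
    instead of iterating over the file line by line."""
    with open(path, "r") as f:
        return f.read()


def _decode(field):
    # /proc/mounts escapes characters such as spaces as octal (\040).
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def mounts_under(base_dir, mounts_text):
    """Return every mount point located at or below base_dir."""
    base = os.path.normpath(base_dir)
    prefix = base.rstrip("/") + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip malformed or empty lines
        mount_point = _decode(fields[1])
        if mount_point == base or mount_point.startswith(prefix):
            found.append(mount_point)
    return found


def safe_to_remove(base_dir, mounts_text):
    """Only allow a recursive delete if nothing is mounted in the tree."""
    return not mounts_under(base_dir, mounts_text)
```

    A caller would run safe_to_remove(base, read_mounts()) immediately before the recursive delete and abort with an error if it returns False. Comparing against the base path plus a trailing slash avoids treating a sibling directory with a similar name as part of the tree.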

    The systemd folks have an even more awful problem. It's not just irretrievable data loss, it's unrecoverable hardware damage. It might not be "their problem" either, but it *is* their responsibility since they are one of the factors enabling the damage. To treat this so flippantly, irrespective of whether they are right or wrong, is simply disrespectful to the reporter, and quite unprofessional. I have never ever treated my users this way, even when they are "wrong", either as an unpaid amateur in the above example, or as a paid professional. Mistakes can and do happen, no matter how hard we try and how many unit tests we write; none of us are perfect. But there's a right way to deal with a disaster when it occurs, and they have repeatedly failed to do this. Their behaviour is beyond arrogant, and exemplifies the worst stereotypes in the free software world. If they showed a little humility, and looked at the bigger picture of integrating well into the wider ecosystem rather than being "perfect" in isolation, they wouldn't find people like me so opposed to them.

  • (Score: 0) by Anonymous Coward on Tuesday February 09 2016, @04:09PM (#301476)

    "checked that they had a backup copy of their data (they did, thankfully)"

    Well, other than giving yourself a nice touchy-feely warm-and-snuggly feeling inside, of what relevance was this? If they didn't have a backup of their data, you had one to give them? Would you have sent them something to compensate for it, like money or a warm tray of brownies??

    • (Score: 2) by linuxrocks123 (2557) on Tuesday February 09 2016, @04:19PM (#301482)

      My guess is he would have walked them through the process necessary to recover the data, i.e. photorec etc.

    • (Score: 2) by rleigh (4887) on Wednesday February 10 2016, @04:01PM (#302209)

      What's the purpose in such a pointlessly sarcastic comment?

      Regarding the example, there's little I could have practically done in the face of irretrievable data loss. If it's gone, it's gone. Nothing can bring it back. Knowing they had a backup was good for my peace of mind that I wasn't responsible for ruining someone's system. The point is more about what you do /in response/ to the situation as the developer. In this case, it was fixed quickly and professionally, and the user was in fact pleased that I'd been attentive, responsive and sympathetic to their problem, and happy that it was resolved. They continued to use the tool and also contribute to its development by way of feature requests, testing and patches. I /could/ have been a horrible person, told them it was their own fault, not my responsibility, and closed it as not a bug. And it would have been "technically correct", just like the systemd case here. But it would have been the wrong thing to do. And I'd have most likely lost a user who was quite valuable to the project, as well as making them angry for no reason. Acting like that as a dismissive, disrespectful, arrogant douchebag is no way to run a project and have a good relationship with your users and the wider community. Only people with zero care about the quality of their software and zero empathy would do that, and I would question what they are getting out of developing and maintaining the software if they don't give a damn about the impact of bugs and unexpected behaviour on the people using it. If you were to find a serious problem and report it, I'm sure you'd want the pleasant and productive response, rather than the unnecessarily unpleasant and totally unproductive one.

      When I was the Debian sysvinit/initscripts maintainer, if I had ever received a bug report as serious as this one, I certainly would have treated it with the seriousness it deserves: i.e. the greatest. Assuming that we had mounted it r/w in mountkernfs, the default would have been switched and the fixed version would already be uploaded. No arguments about where the responsibility lies and fobbing off our responsibility; if it stops another system from being irreversibly damaged, then you make the change; I'd likely have even written the patches to make the other tools mount and/or remount the filesystem since it's such a severe problem. Sadly, I'm no longer involved in this stuff, as a result of the individuals responsible for the bug in this very article.
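
      As a rough illustration of the condition being discussed, the Python sketch below reports whether the EFI variable filesystem is currently mounted read-write. It assumes the "EFI pseudo-fs" is the kernel's efivarfs (normally mounted at /sys/firmware/efi/efivars); the function name is hypothetical and this is not code from systemd, sysvinit or mountkernfs.

```python
def efivarfs_writable_mounts(mounts_text):
    """Return mount points where efivarfs is mounted read-write.

    mounts_text is the content of /proc/mounts, where each line is
    'device mount_point fs_type options dump pass'.
    """
    writable = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # Options are comma-separated; 'rw' marks a writable mount.
        if fs_type == "efivarfs" and "rw" in options.split(","):
            writable.append(mount_point)
    return writable


def read_mounts(path="/proc/mounts"):
    """Read the current mount table as one string."""
    with open(path, "r") as f:
        return f.read()
```

      On a Linux system, efivarfs_writable_mounts(read_mounts()) returning a non-empty list means a recursive delete as root could reach the firmware variables. An administrator could then remount the path read-only, for example with "mount -o remount,ro" on that mount point, though tools that legitimately write EFI variables would need it remounted read-write first.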

      What's bizarre is that these people are paid "professionals" working for a major company, and they give worse service than I did as an unpaid but hardworking volunteer! If I gave their treatment to my users in my day job, I'd be disciplined and then fired. It makes me wonder if the management at Red Hat actually has any control over what their staff do, especially after previous incidents e.g. the RPM db locking/trashing bug and the maintainer who refused to consider it a bug and fix it, even for paying customers!