System administrator Chris Siebenmann has found that modern versions of systemd can cause an unmount storm during shutdowns:
One of my discoveries about Ubuntu 20.04 is that my test machine can trigger the kernel's out of memory killing during shutdown. My test virtual machine has 4 GB of RAM and 1 GB of swap, but it also has 347 NFS[*] mounts, and after some investigation, what appears to be happening is that in the 20.04 version of systemd (systemd 245 plus whatever changes Ubuntu has made), systemd now seems to try to run umount for all of those filesystems all at once (which also starts a umount.nfs process for each one). On 20.04, this is apparently enough to OOM[**] my test machine.
[...] Unfortunately, so far I haven't found a way to control this in systemd. There appears to be no way to set limits on how many unmounts systemd will try to do at once (or in general how many units it will try to stop at once, even if that requires running programs). Nor can we readily modify the mount units, because all of our NFS mounts are done through shell scripts by directly calling mount; they don't exist in /etc/fstab or as actual .mount units.
[*] NFS: Network File System
[**] OOM: Out of memory.
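For illustration only, here is a minimal Python sketch of what a bounded unmount pass could look like if it were done by a site-local shutdown script instead of by systemd. Nothing here comes from the original post: the function names (parse_nfs_mounts, run_umount, unmount_bounded), the default limit of 8, and the assumption that mount points contain no escaped characters are all hypothetical. The idea is simply that a fixed-size worker pool caps how many umount processes exist at any moment.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def parse_nfs_mounts(mounts_text):
    """Return mount points of NFS filesystems from /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fstype, options, dump, pass.
        # Assumption: mount points contain no escaped characters (e.g. \040).
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            points.append(fields[1])
    return points


def run_umount(mount_point):
    """Run umount for one mount point and return its exit status."""
    return subprocess.run(["umount", mount_point]).returncode


def unmount_bounded(mount_points, max_parallel=8, runner=run_umount):
    """Unmount every mount point, never running more than max_parallel at once.

    The pool size is the only limit; each worker handles one mount at a time,
    so at most max_parallel umount processes are alive together.
    """
    mount_points = list(mount_points)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        statuses = list(pool.map(runner, mount_points))
    return dict(zip(mount_points, statuses))
```

A script would typically feed it the contents of /proc/mounts, for example unmount_bounded(parse_nfs_mounts(open("/proc/mounts").read()), max_parallel=8), and would have to run before systemd starts stopping the mounts itself. Whether that ordering can be arranged on a given system is a separate question that this sketch does not answer.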
We've been here before and there is certainly more where that came from.
Previously:
(2020) Linux Home Directory Management is About to Undergo Major Change
(2019) System Down: A systemd-journald Exploit
(2017) Savaged by Systemd
(2017) Linux systemd Gives Root Privileges to Invalid Usernames
(2016) Systemd Crashing Bug
(2015) tmux Coders Asked to Add Special Code for systemd
(2016) SystemD Mounts EFI pseudo-fs RW, Facilitates Permanently Bricking Laptops, Closes Bug Invalid
(2015) A Technical Critique of Systemd
(2014) Devuan Developers Can Be Reached Via vua@debianfork.org
(2014) Systemd-resolved Subject to Cache Poisoning
(Score: 4, Insightful) by Subsentient on Sunday May 10 2020, @07:31PM (16 children)
The longtimers here know I'm no fan of systemd. Hell, I hate it so much I wrote my own init system [universe2.us].
But 347 NFS mounts... Jesus Christ. What the hell are you doing that you need that many filesystems? The output of "mount" must be a real eyerape.
Systemd needs a bit of logic to try to unmount sequentially over a threshold, but, I don't blame systemd too much for this, more like a bizarre use case. What the hell?
"It is no measure of health to be well adjusted to a profoundly sick society." -Jiddu Krishnamurti
(Score: 4, Insightful) by Anonymous Coward on Sunday May 10 2020, @08:00PM (3 children)
You are right that the story submitter had a crazy amount of NFS mounts.
However, I think you are wrong to excuse systemd for not handling this correctly. As an init system, its job is to handle daunting startup and shutdown scenarios. Systemd is supposed to be more robust than the average software we all bumble along with. It seems this attitude towards extreme reliability just doesn't exist with Poettering who has no one else to answer to. Linux isn't more reliable than Windows these days -- at least Windows Server. I remember when Linux got the basics right, but that was when Linux was still reimplementing a stable standard (Unix, POSIX) created by more competent people. Most Linux "innovations" are frankly not that good.
(Score: 0) by Anonymous Coward on Monday May 11 2020, @03:10PM (2 children)
Systemd can't handle situations that they never had a test case for. Why would they bother prioritizing this if a user never reported any problems with unmounting 347 NFS drives before now?
(Score: 2) by rleigh on Monday May 11 2020, @06:08PM (1 child)
Competent software developers ensure that such situations can't arise in the first place. As well as competent developers, you also need a good specification and a good design, which also take these situations into account, as well as unit and integration testing that tests the extremes. It should all be documented, up front. Do you think the systemd developers did this, or did they just bash out some code and hope for the best? The most critical part of the system should not be written by dangerously incompetent arseholes.
Testing parallelism is hard, but several of the projects I work on have threading tests which are designed to provoke edge cases with locking and resource limitations; it can be done.
This isn't just the systemd developers. I could previously cause a kernel panic with LVM just by creating and deleting a few snapshots in parallel. Some kernel locking defect I assume. Back when I was testing Debian by rebuilding the whole archive on a 24 CPU system, I thought I'd test out ~30 parallel builds, and it repeatedly panicked the system within 5 minutes every time. No idea if it's fixed, I gave up on it at this stage, but that usage had clearly not been tested at all.
(Score: 0) by Anonymous Coward on Wednesday May 13 2020, @02:42PM
>Competent software developers
This should be corrected to say "Software developers with an excessive amount of time and fungible resources on their hands". It's easy to ask for these things, a lot harder to swallow when you get stuck with the bill.
(Score: 1, Interesting) by Anonymous Coward on Sunday May 10 2020, @08:30PM (3 children)
As an ex-code-plodder-along who dealt more with databases than kernels, but now increasingly just an ageing, helpless passenger on the Linux train... what would it take to have a spin of Mint or Manjaro (forget Debian & Arch) that have your alternative init system? Or how would one go about replacing systemd with epoch on an existing system? Mint, years back used upstart, but then upstream debian and ubuntu switched to systemd and it was a choiceless done deal.
(Score: 2) by Reziac on Monday May 11 2020, @03:14AM
PCLinuxOS does not use systemd.
From here in userland I don't care one way or the other so long as the OS is stable, but this is my preferred distro, so... there it is. Or rather, isn't.
And there is no Alkibiades to come back and save us from ourselves.
(Score: 2) by janrinok on Monday May 11 2020, @08:10AM (1 child)
Ubuntu still let you install SysV-init on versions up to and including 19.10 [ubuntu.com], but I'm not going to re-install one to check which packages it used. I've just checked 20.04 and sysv-init-utils is still there, but there is also systemd-sysv there too which suggests it isn't the sysv-init that we all know. As I've never tried switching to it I do not know how well (or otherwise) it works.
A quick online search suggests that some have reverted to sysv-init successfully but not in huge numbers.
(Score: 2) by janrinok on Monday May 11 2020, @08:14AM
(Score: 5, Interesting) by bloodnok on Sunday May 10 2020, @08:47PM
This may be an unusual use-case and well outside of the comfort-zone of most of us, but it is not unreasonable. And, in any case, it is not the job of the OS to determine what is unreasonable. Its job is to deal with whatever is thrown at it with as much grace as possible. Spawning tasks without apparent limit is entirely graceless.
This identifies a pretty serious deficiency in the systemd implementation/design/philosophy. Any system that does things in parallel *must* provide a way to limit the number of tasks that are executed at once. That is a fundamental requirement. The limit to parallelism should not be the imagination, or cussedness, of the system administrator but something that systemd itself provides. I would hope that this is taken seriously by the systemd team, but as I am trying to rid myself of systemd I find myself caring only in a very abstract sense.
Oh and if you're not convinced by the need for limits, try building the kernel with -j 200.
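The general point about capping parallelism can be sketched in a few lines of Python. This is a hypothetical illustration, not anything systemd does: run_limited, fake_stop_unit, and demo are made-up names, and the sleeping coroutine only stands in for a job that would really spawn a process. A semaphore acts as the gate, so 200 jobs can be queued while only a handful are in flight, much as make -j N caps concurrent build jobs.

```python
import asyncio


async def run_limited(jobs, limit):
    """Run each coroutine function in jobs with at most `limit` in flight.

    Returns the list of results and the highest concurrency observed.
    """
    gate = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with gate:  # waits here while `limit` jobs are already running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_stop_unit():
    # Stand-in for stopping one unit; a real job would spawn a process here.
    await asyncio.sleep(0.01)
    return "stopped"


def demo(total=200, limit=4):
    """Queue `total` fake jobs but let only `limit` run at the same time."""
    jobs = [fake_stop_unit] * total
    return asyncio.run(run_limited(jobs, limit))
```

Calling demo(200, 4) queues all 200 jobs up front, yet the returned peak never exceeds 4. The limit is a property of the scheduler, not of the person submitting the work, which is the design choice the comment above is asking for.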
(Score: 4, Insightful) by Bot on Monday May 11 2020, @12:45AM (4 children)
indeed this is a corner case.
but Unix is the OS for corner cases, and this incident proves right those talking about the Unix philosophy and those (see my comment history) judging a DSL for booting as a mistake.
yes the solution is to complicate either the DSL or its umount routines. inconveniencing everybody for the need of the guy mounting 45739 nfs volumes.
so basically, what would be a user in Unix is an outlier in systemd land
and I bet switching to a systemdless Deb distro is faster than addressing the bug.
Account abandoned.
(Score: 2) by Thexalon on Monday May 11 2020, @12:49PM (3 children)
Pardon my ignorance, but WTF does "DSL" mean in this context?
The only thing that stops a bad guy with a compiler is a good guy with a compiler.
(Score: 2) by Bot on Monday May 11 2020, @03:51PM (1 child)
Sorry, I was referring to systemd units configuration needing to replace turing complete and infinitely interfaceable bash scripts, so dsl means domain specific language, good summary in https://yakking.branchable.com/posts/domain-specific-languages/ [branchable.com]
Account abandoned.
(Score: 2) by Thexalon on Monday May 11 2020, @05:03PM
Ah yes, makes perfect sense now.
The only thing that stops a bad guy with a compiler is a good guy with a compiler.
(Score: 0) by Anonymous Coward on Monday May 11 2020, @07:00PM
Diesel
(Score: 3, Interesting) by boltronics on Monday May 11 2020, @02:33AM
I run Debian Stretch to host a bunch of Xen VMs, and on each system I've done this (that's running systemd), the system always crashes on shutdown - unless I manually shutdown all Dom-U (guest) hosts first. I haven't spent a significant amount of time investigating, but it seems like systemd is doing something to the logical volumes (or the associated volume groups) prior to the guests finishing shutdown - maybe deactivating the volume group because it doesn't detect mounted volumes on the Dom-0 or something silly. I'm not sure. I just know it is not an issue under SysVinit.
Anyway, imagine an environment with a bunch of NAS boxes that host VM OSs, and somebody wanting to fire up a bunch of VMs on a single host, maybe to simulate a production environment for troubleshooting something, and the number of NFS mounts would quickly add up. It's certainly a valid use case, common or not.
It's GNU/Linux dammit!
(Score: 0) by Anonymous Coward on Monday May 11 2020, @04:04AM
An unusual use case to be sure. The question is if it is significant enough to cater for.