posted by janrinok on Friday October 16 2015, @08:02AM   Printer-friendly
from the what,-no-apocalypse? dept.

Structural and semantic deficiencies in the systemd architecture for real-world service management

This is an in-depth architectural critique of systemd. It claims to be the first purely technical review of systemd internals, and it provides a detailed analysis of several components. Its criticisms include ordering-related failures, a difficult-to-predict execution model, and non-determinism in boot order, among several other points.

Though many users would perceive the long processing pipeline to increase reliability and be more "correct" than the simpler case, there is little evidence to support this. For one thing, none of jobs, transactions, unit semantics or systemd-style dependencies map to the Unix process model; rather, they are complications made necessary by systemd being structured as an encapsulating object system for resources and processes (as opposed to a more well-defined process supervisor), and one accommodating massive parallelism. Reliability gains would be difficult to measure, and the fact that more primal toolkits like those of the daemontools family have been used in large-scale deployments for years serves as a counterexample deserving consideration.


Original Submission #1 | Original Submission #2

 
  • (Score: 2) by darkfeline on Friday October 16 2015, @11:27AM

    by darkfeline (1030) on Friday October 16 2015, @11:27AM (#250491) Homepage

    >non-determinism in boot-order

    How is this a bad thing, so long as dependencies are being started before their dependents? I consider the "old" way of potentially relying on race conditions, where process A must be started three daemons before process B and no one knows because the init script has kept working since it was written in 2001, to be significantly worse than "non-determinism in boot-order".

    In fact, forced randomness would be good for fishing out these kinds of problems early so they don't cause a huge headache later on.
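
    And once such a problem is fished out, declaring it is cheap. A minimal sketch of what that looks like as a unit file; the service names here are invented for illustration:

        # b.service (hypothetical): B needs A up first
        [Unit]
        Description=Process B (example)
        # Pull in a.service, and fail B's start if A fails to start
        Requires=a.service
        # Order B strictly after A has finished starting
        After=a.service

        [Service]
        ExecStart=/usr/local/bin/b-daemon

        [Install]
        WantedBy=multi-user.target

    Note that Requires= without After= would still allow the two to start in parallel; it takes both directives to get the ordering.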

    --
    Join the SDF Public Access UNIX System today!
  • (Score: 4, Insightful) by Anonymous Coward on Friday October 16 2015, @11:35AM

    by Anonymous Coward on Friday October 16 2015, @11:35AM (#250494)

    >I consider the "old" way of potentially relying on race conditions, where process A must be started three daemons before process B and no one knows because the init script has kept working since it was written in 2001, to be significantly worse than "non-determinism in boot-order".

    In theory, theory and practice are the same. In practice, they are not.
    And in practice, things which are working are *always* better than things which are not.

  • (Score: 5, Insightful) by fritsd on Friday October 16 2015, @12:27PM

    by fritsd (4586) on Friday October 16 2015, @12:27PM (#250511) Journal

    why is non-determinism bad?

    Well, I guess it's simpler to debug and solve a problem of the form: "my corporation's order processing system often crashes between database startup and load-leveling front-end webserver start up", rather than a problem of the form: "my corporation's order processing system crashes 17.7% of the time and everyone is yelling at me".

    Humans and computers create problems. Humans fix problems. Humans have only a limited capacity for understanding, so the init system must be structured in a "human-repairable" way; otherwise it becomes byzantine, cryptic and error-prone.

    Incidentally, about fast boot times: if you hear "my corporation's order processing system can restart from a crash several times per second", YOU'RE DOING IT WRONG.

    • (Score: 0) by Anonymous Coward on Thursday October 22 2015, @05:50PM

      by Anonymous Coward on Thursday October 22 2015, @05:50PM (#253311)

      The pro-systemd crowd will likely spin it to say that they can spin up 1000 instances of the order processing system in a second.

      See, for them only two things matter: desktop and containers/VMs. And in both cases fast startup and shutdown are preferred over long uptimes (or at least it seems so, given that this is what gets priority from the systemd devs).

  • (Score: 3, Informative) by Anonymous Coward on Friday October 16 2015, @01:41PM

    by Anonymous Coward on Friday October 16 2015, @01:41PM (#250528)

    For desktop and some server usage, non-determinism is not a bad thing per se. However, it makes things actually harder to debug, as you are not sure what order things happened in. Say you are debugging a bit of code. It blows up every 100th run, because some module you assume is 'there' just isn't: your code was assumed to run after this other code. So now you have a new issue that did not exist before: how to 'wait' or 'force load'. Then what if you can't force it to load? What if you double-load it? How long do you wait? Remember, it is non-deterministic and you want the OS CLI to show up someday, so you have to time out. It is not a good user experience when your computer randomly takes 5 minutes to spin up when it normally takes 30 seconds.

    For an RTOS sort of situation, determinism is the name of the game. These dudes figure out how many cycles something takes to run, and they make sure it takes exactly that: no more, no less. It is deterministic. They will balk at systemd because they have years of experience telling them otherwise. It may be 'ok', but they will not sit down while something takes 3-50 seconds to start up depending on what phase of the moon it is. They *need* it to start up in 3 seconds, and it needs to do that every single time, or something physical will literally blow up.

    The *idea* of systemd is a good one: a better init system, more like what OSX and Windows have. A nice single way to talk to the services. A nice clean way to jail them off from root. A nice clean way to handle the life cycle and dependencies between services (the network stack should be running before you spin up the proxy server, that sort of thing). The current init systems all do this reasonably well, but they all do it differently, each with some holes, and all with different commands that pretty much do the same thing. Windows, for example, has a nice 'are you alive' ping that you can send to services. You could build something like that into each app, but then it is custom for every single one; OSX and Windows have a nice predefined way of doing it.

    systemd went off the rails thinking it should be/contain all of those services, throwing out years of work on other services with the idea that 'it will be clean', when all you end up doing is creating a whole new beast with different bugs and missing features. Its author basically did the same thing he did with the audio stack: recreated the whole thing and then screwed it up *again*.
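
    To be fair, the "network before proxy" case above is one thing systemd does express cleanly. A rough sketch, assuming a hypothetical proxy daemon (the unit and binary names are invented):

        # myproxy.service (hypothetical)
        [Unit]
        Description=Example proxy server
        # Wait until the network is actually up, not merely configured
        Wants=network-online.target
        After=network-online.target

        [Service]
        ExecStart=/usr/sbin/myproxy
        # Crude lifecycle handling: restart the daemon if it dies
        Restart=on-failure

        [Install]
        WantedBy=multi-user.target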

  • (Score: 3, Informative) by jdccdevel on Friday October 16 2015, @07:55PM

    by jdccdevel (1329) on Friday October 16 2015, @07:55PM (#250806) Journal

    Non-determinism is ALWAYS bad in a computer's startup sequence; it should be removed whenever possible.

    - In a deterministic system, I (a system administrator) can always be certain that A will start before B, because services will always start in the same order.
    - In a non-deterministic system I cannot, unless there is a dependency between them. The dependencies are one way to try to bring a measure of determinism to the system.

    The problem is that dependencies are not always easily defined, nor is it obvious if a given configuration change will require some change in dependencies.

    How does a system administrator know whether the computer starting properly was a fluke or not? The admin cannot know! It may be that 90% of the time things will start correctly; it may be a one-off 10% fluke. There is no way to know! I have more important things to do with my time than spend hours troubleshooting something as basic as a computer not starting some random percentage of the time, just because someone else thinks I need to save 30 seconds on my boot time once a month (if that!).

    Even once the problem is isolated, sometimes it is NOT POSSIBLE to express the dependency as a simple A-before-B; sometimes dependencies are dynamic.

    For example, if I make a configuration change to a service, and suddenly my service depends on a library, which depends on a service that it didn't depend on before, how is that reflected in a systemd service file? An init script could account for that in its startup logic, but a systemd service file cannot.
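
    To illustrate the difference, here is the sort of decision an init script can make at start time that a static unit file can't; every name here is invented:

        # Excerpt from a hypothetical /etc/init.d/myservice "start" case:
        # only bring up the backend first if the current config actually uses it
        if grep -q '^use_backend=yes' /etc/myservice.conf; then
            /etc/init.d/backend start
        fi
        start-stop-daemon --start --exec /usr/sbin/myserviced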

    Here's a real-life example I ran into the other day.

    I have an Arch system that loads some firewall rules, including NAT. I have some custom NAT-related settings in my sysctl configuration. One day I rebooted, and some of my sysctl settings were gone. Not all of them, just the NAT ones.

    The problem was that _SOMETIMES_ the sysctl settings would be applied before the NAT module was loaded, so my sysctl settings for NAT were never applied.
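
    A fix along these lines presumably works; the module name is a guess on my part, so treat the whole thing as a sketch:

        # /etc/modules-load.d/nat.conf -- have systemd load the NAT module at boot
        iptable_nat

        # /etc/systemd/system/systemd-sysctl.service.d/order.conf
        # drop-in: only apply sysctls after boot-time modules are loaded
        [Unit]
        After=systemd-modules-load.service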
    Easy fix, you say, apply the settings afterwards. That isn't the point.

    Why should I be required to hunt down issues like that? I moved my systems to Linux for stability and predictability. Specifically so that I would not have to deal with issues like that.

    If my system starts correctly once, I want it to do so EVERY TIME. Not 90%, not even 99%. Nothing less than 100% of the time is good enough.

    That is simply not possible with a non-deterministic boot system.

    • (Score: 1) by shrewdsheep on Friday October 16 2015, @08:37PM

      by shrewdsheep (5215) on Friday October 16 2015, @08:37PM (#250835)

      This is a straw man argument.
      systemd could (and probably should) have an option to start sequentially.

      • (Score: 0) by Anonymous Coward on Thursday October 22 2015, @05:53PM

        by Anonymous Coward on Thursday October 22 2015, @05:53PM (#253314)

        Options bad, m'kay! All hail the one and only "init": systemd!

    • (Score: 2) by Justin Case on Saturday October 17 2015, @02:47PM

      by Justin Case (4239) on Saturday October 17 2015, @02:47PM (#251099) Journal

      Non-determinism is ALWAYS bad in a computer's startup sequence^W^Wbehavior

      FTFY

  • (Score: 4, Informative) by rleigh on Friday October 16 2015, @08:53PM

    by rleigh (4887) on Friday October 16 2015, @08:53PM (#250842) Homepage

    Former Debian sysvinit/initscripts maintainer here.

    I'd just like to correct a couple of points here. Firstly, the "old" way has never been to rely on race conditions of any sort. Neither the (old) by-hand numerical ordering nor the (new) LSB header dependencies ever *relied* on races of any sort. There may have been implicit ordering requirements not formally encoded in the dependencies, but that in no way means non-determinism, nor does it imply incorrect behaviour.
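
    For anyone who hasn't seen one, those LSB header dependencies live in a comment block at the top of each init script, along these lines (the script name here is invented):

        #!/bin/sh
        ### BEGIN INIT INFO
        # Provides:          mydaemon
        # Required-Start:    $local_fs $network
        # Required-Stop:     $local_fs $network
        # Default-Start:     2 3 4 5
        # Default-Stop:      0 1 6
        # Short-Description: Example daemon declaring its dependencies
        ### END INIT INFO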

    The second point is regarding determinism in systemd and sysvinit. With sysvinit LSB header dependencies, insserv constructs a dependency graph and serialises it as a make-style dependency list. startpar, when run serially, will effectively flatten the graph to a linear sequence; this is completely deterministic. startpar, when run with parallelisation enabled, will use the make-style rules to parallelise appropriate parts of the sequence. While this is less deterministic, the overall progression is clearly and explicitly defined, and the dependencies are hard requirements which are enforced: if a script depends upon three other scripts, those three are guaranteed to have run first. Note that this assumes the scripts behave properly, which is generally the case. As the article states, the dependencies are not enforced to quite the same level of strictness with systemd.

    With Ubuntu 15.04 on my work PC, the systemd boot is both slower than the sysvinit boot and less reliable. Most of the time it boots, but occasionally it hangs somewhere in the middle of booting and requires a hard reset, being in a state which precludes any interactive debugging of where and why it got stuck. This is vastly less deterministic than what we had with sysvinit, and a definite step backward; a computer should be able to consistently boot.

    The main wart in sysv-rc/initscripts is the hack to make network hotplug events from udev trigger init actions; in practice, while ugly and complex, I wouldn't say it was non-deterministic. You have a hotplug event leading to a defined set of actions. It could be improved for sure, but it's quite functional.

    This is in no way to imply that sysvinit/sysv-rc are perfect; they clearly have some problems. But much of the criticism made in favour of systemd is quite dishonest.

  • (Score: 2) by sjames on Monday October 19 2015, @07:01AM

    by sjames (2882) on Monday October 19 2015, @07:01AM (#251720) Journal

    "Service X fails to start 33% of the time" tells me nothing. "Service X fails to start on boot but starts just fine manually" = I missed a dependency.

    An indeterminate system creates heisenbugs.

  • (Score: 0) by Anonymous Coward on Wednesday November 04 2015, @04:58AM

    by Anonymous Coward on Wednesday November 04 2015, @04:58AM (#258282)

    Getting away from heisenbugs and other "spooky" behavior is why people moved from Windows to Linux in the first place.