SoylentNews is people

posted by janrinok on Friday October 16 2015, @08:02AM   Printer-friendly
from the what,-no-apocalypse? dept.

Structural and semantic deficiencies in the systemd architecture for real-world service management

This is an in-depth architectural critique of systemd. It claims to be the first purely technical review of systemd internals and provides a detailed analysis of several components. Its criticisms include ordering-related failures, a difficult-to-predict execution model, and non-determinism in boot order, among several other points.

Though many users would perceive the long processing pipeline to increase reliability and be more "correct" than the simpler case, there is little to acknowledge this. For one thing, none of jobs, transactions, unit semantics or systemd-style dependencies map to the Unix process model, but rather are necessary complications to address issues in systemd being structured as an encapsulating object system for resources and processes (as opposed to a more well-defined process supervisor) and one accommodating for massive parallelism. Reliability gains would be difficult to measure, and that more primal toolkits like those of the daemontools family have been used in large-scale deployments for years would serve as a counterexample needing overview.


Original Submission #1, Original Submission #2

 
  • (Score: 1) by isj on Friday October 16 2015, @09:05AM

    by isj (5249) on Friday October 16 2015, @09:05AM (#250464) Homepage

    I've written my share of init-scripts, and I have no love for traditional init.d stuff because of its shortcomings:
      - race conditions on pid-files
      - inability to specify fine-grained dependencies (e.g. an NFS mount depends on a specific network interface as opposed to "the network")
      - lots of boilerplate code in init scripts
      - runlevels are obsolete
      - differences between distributions (although that is not exclusively a problem for init.d)
      - sequential execution

    For a bog-standard daemon I find it unreasonable to force me to write more than 3-4 lines describing the service. It is the responsibility of the init system to keep track of pids, make sure daemons don't fork away and become untraceable, and provide a unified interface for start/stop/reconfigure/reload. Over the years lots of tools and bandaids have been added to init.d to alleviate its shortcomings, but they still feel like kludges and still require me to write larger scripts.

  • (Score: 5, Insightful) by Thexalon on Friday October 16 2015, @12:32PM

    by Thexalon (636) on Friday October 16 2015, @12:32PM (#250513)

    I've written init scripts as well, and while I don't like it, I also have seen nothing that convinces me that systemd solves the problems you listed any better than those init scripts.

    race conditions on pid-files

    Which, to the best of my knowledge, is not a problem systemd solves. If it does, you're probably using pid files for the wrong reasons, because pid files should not be used to determine whether a service is running; the right thing to check is whether it's responding as expected. The reason is that you aren't looking for, say, MySQL; you're looking for "Is there a relational database that speaks the MySQL dialect of SQL listening on port 3306?" That way, if the user switches to something that is different but compatible, you don't have to change what you're doing.
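
    A minimal sketch of that kind of check, in Python purely for illustration. The function name `service_is_answering` is made up for this example. Note that it only shows that something accepts TCP connections on the port; confirming that the listener actually speaks the MySQL protocol would need a real client handshake, which is not attempted here.

```python
import socket


def service_is_answering(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts a TCP connection on host:port.

    Hypothetical helper: it checks reachability only, not the protocol spoken.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False
```

    For the example in the text, the call would be `service_is_answering("127.0.0.1", 3306)`, and it gives the same answer whichever compatible database happens to be listening.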

    inability to specify fine-grained dependencies

    Since there's no rules about what you can't include in a shell script, what I would recommend doing is to look for that specific condition at the beginning of your shell script.
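
    As a rough illustration of such a check (sketched in Python rather than shell, with a made-up function name), on Linux the state of one specific interface can be read from sysfs. The assumption here is that only an `operstate` of "up" counts as usable, which may be too strict for some interface types.

```python
from pathlib import Path


def interface_is_up(name: str, sysfs_root: str = "/sys/class/net") -> bool:
    """Return True if the kernel reports the interface's operstate as "up".

    Hypothetical helper; sysfs_root is a parameter only so it can be tested.
    """
    state_file = Path(sysfs_root) / name / "operstate"
    try:
        return state_file.read_text().strip() == "up"
    except OSError:
        # No such interface (or no sysfs at all): treat it as not up.
        return False
```

    A start script for an NFS mount could then refuse to proceed unless, say, `interface_is_up("eth1")` holds, instead of depending on "the network" as a whole.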

    lots of boilerplate code in init scripts

    Can't you include a set of functions that eliminates that problem?

    runlevels are obsolete

    The "standard" runlevels all seem like pretty good concepts to me: shutdown, single-user recovery, no remote services, full access but no GUI, full access with a GUI, reboot. Which concept would you consider obsolete, and why?

    differences between distributions

    That is entirely to be expected when packaging software. As in, those kinds of system variations are a big part of what a distribution does, unless you are now demanding that everything in the universe not only conform to one of the big boys' standards (yum/rpm, apt/dpkg), but conform to a particular placement of all files beyond the FHS [pathname.com] (e.g. what goes in /usr versus /usr/local, how /etc is organized). And what about BSDs and other non-Linux POSIX-compliant systems which do things significantly differently from Linux but should still have access to all the same application software?

    sequential execution

    So? All software makes use of sequential execution. At a low level, it's what CPU cores do for their entire existence. It only matters if it's causing some kind of problem, and in this case it isn't, because (a) reboots and runlevel changes are relatively rare, and (b) they're fast enough that nobody really had a problem with it for a really long time. For what it's worth, on the init.d-based systems I've run, the time to go from starting to boot to fully operational was less than the time for the BIOS to finish its POST.

    For a bog-standard daemon I find it unreasonable to force me to write more than 3-4 lines describing the service.

    My understanding, from everything I've seen with systemd, is:
    1. There is no such thing as a bog-standard daemon. They all tend to have their quirks. (Otherwise, nobody would need to change or replace them to make them work on a systemd-based machine.)
    2. You can handle those quirks with shell scripts, or with C. Of the two, shell scripts are far less error-prone and less catastrophic when things go wrong.

    Basically, what I see from systemd is taking something that is flexible and easily changed by a system administrator to something that is rigid and only can be changed in the source code of a wide variety of packages.

    --
    The only thing that stops a bad guy with a compiler is a good guy with a compiler.
    • (Score: 2, Interesting) by isj on Friday October 16 2015, @08:13PM

      by isj (5249) on Friday October 16 2015, @08:13PM (#250821) Homepage

      My post wasn't advocacy of systemd. It was a critique of init.d-style systems.

      The problem with pid files is that they need to exist in such systems. Some daemons write their own pid-file. Some assume you do it. Some fork away, some don't, etc. And startdaemon (or is it daemonstart?) punts and leaves that to the daemon, which requires more scripting/code.
      Why should each daemon or init-script have code to write that pid file? I expect a daemon/service manager to keep track of that for me. But init.d-style systems don't.
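
      To make the pid-file problem concrete, here is a sketch (in Python, with a hypothetical function name) of the check that pid-file based tooling typically ends up doing. It is inherently racy: between reading the file and probing the pid, the daemon may exit and the pid may be handed to an unrelated process, and nothing in the file says which program the pid belonged to.

```python
import os


def pid_file_says_running(pid_file: str) -> bool:
    """Best-effort "does the process exist" check based on a pid file.

    Illustrative only: a True result can refer to an unrelated process
    that has since been given the same pid.
    """
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or half-written pid file.
        return False
    if pid <= 0:
        # Guard: 0 and negative values address process groups, not a process.
        return False
    try:
        os.kill(pid, 0)  # Signal 0 performs error checking only.
    except ProcessLookupError:
        return False  # Stale pid file: no such process.
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True
```

      A service manager that starts the daemon itself already holds the child's pid, so it does not need to trust a file at all.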

      I agree that checking if a service is functioning/available shouldn't be done with pid-files. However, the does-process-exist check is used for answering the question "is service X running?", not the question "is service X functioning?".
      As for the clients of the service they should of course only use the service API (sockets, files, whatever).

      Since there's no rules about what you can't include in a shell script,...
      More scripting...

      Can't you include a set of functions that eliminates that problem?
      More scripting...

      Please stop asking me to write more scripting.

      init.d-style systems generally assume that all daemons are special snowflakes and therefore don't provide reasonable built-in functionality for well-behaved daemons.

      Let's assume that I have a well-behaved daemon, say, soylentd. In the service description file I expect that I have to specify:
          - Which services/resources the daemon relies on.
          - Which program/script to run to start the service.
          - Whether the service forks away or not.
      and then the service/daemon manager provides suitable defaults for the rest:
          - internal service name: derive that from the service description filename
          - how to stop: assume SIGTERM, and if that doesn't work SIGKILL
          - how to reconfigure: SIGHUP is usually supported
          - keep track of pid/process-group/control-group
      I don't expect to include /etc/init.d/functions, write a case statement, use arcane echo options, or even have a #!/bin/sh line.
      I do expect the ability to override the built-in defaults for those special snowflakes/daemons.
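
      The wish list above could be sketched roughly like this, in Python for illustration only. Everything here (the `ServiceDescription` class, its field names, the 5-second grace period, the example path) is an assumption made up for the sketch and not any real init system's format.

```python
import signal
import subprocess
from dataclasses import dataclass, field
from pathlib import Path
from typing import List


@dataclass
class ServiceDescription:
    # The three things the description file has to state.
    description_file: str
    command: List[str]
    requires: List[str] = field(default_factory=list)
    forks: bool = False
    # Defaults supplied by the manager; overridable for special snowflakes.
    stop_signal: int = signal.SIGTERM
    reload_signal: int = signal.SIGHUP
    stop_grace_seconds: float = 5.0

    @property
    def name(self) -> str:
        """Internal service name, derived from the description filename."""
        return Path(self.description_file).stem


def stop(service: ServiceDescription, process: subprocess.Popen) -> str:
    """Send the stop signal; fall back to SIGKILL if the grace period expires."""
    process.send_signal(service.stop_signal)
    try:
        process.wait(timeout=service.stop_grace_seconds)
        return "stopped"
    except subprocess.TimeoutExpired:
        process.kill()  # SIGKILL on POSIX systems.
        process.wait()
        return "killed"
```

      With a hypothetical description file named `soylentd.service`, the `name` property yields `soylentd`, and a daemon that ignores SIGTERM is killed once `stop_grace_seconds` has passed. The manager holds the `Popen` handle itself, so no pid file is involved.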

      >runlevels are obsolete
      The "standard" runlevels all seem like pretty good concepts to me: shutdown, single-user recovery, no remote services, full access but no GUI, full access with a GUI, reboot. Which concept would you consider obsolete, and why?

      Perhaps I was a bit hasty. I do find runlevel 1 useful when debugging boot/initialization problems. But the last time I found the distinction between runlevel 2/3/4/5 useful was back in the previous millennium. I would find 3 states more useful:
          1: off
          2: under service/maintenance (minimal services are running; more can be started manually)
          3: fully running

      > sequential execution
      ...

      My point is not to use multiple CPU cores to initialize - that is usually not the bottleneck. My point is that when a service is a bit slow to start, e.g. the network due to a DHCP timeout, there is no reason to delay the start of X/Windows, the initialization of USB devices, etc. In fact, if I need to debug that DHCP problem I need X/Windows and my USB mouse. So starting the services one by one is counter-productive.
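
      A small sketch of the idea, in Python for illustration: services are started as soon as their own dependencies are up, so a slow one only holds back what actually depends on it. The function `start_all` and the shape of the `depends_on` mapping are assumptions of this example; it relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def start_all(depends_on, start, max_workers=4):
    """Start services in dependency order, running independent ones in parallel.

    depends_on maps each service name to the set of services it needs first.
    start is a callable that blocks until the named service is up.
    Returns the service names in the order they finished starting.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # Raises graphlib.CycleError on circular dependencies.
    finished = []
    pending = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Launch everything whose dependencies are already satisfied.
            for name in sorter.get_ready():
                pending[pool.submit(start, name)] = name
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                name = pending.pop(future)
                future.result()  # Re-raise if the service failed to start.
                finished.append(name)
                sorter.done(name)  # Unblocks services waiting on this one.
    return finished
```

      With `{"network": set(), "display": set(), "nfs": {"network"}}`, a DHCP timeout inside `start("network")` delays only `nfs`; `display` is submitted right away and does not wait.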

      • (Score: 2) by sjames on Monday October 19 2015, @06:51AM

        by sjames (2882) on Monday October 19 2015, @06:51AM (#251716) Journal

        Some of the daemons I write have NO scripting other than for convention. You can directly link the executable into rc?.d if you like. Too bad systemd can't handle that.

        In other cases, my script is just a few lines, one for each of start, stop, status, and reload (sometimes not even status and reload). It doesn't get much simpler.

        It's really up to you how simple or complicated you want to make it.

      • (Score: 2) by sjames on Monday October 19 2015, @06:54AM

        by sjames (2882) on Monday October 19 2015, @06:54AM (#251717) Journal

        There are several parallel start mechanisms already out there that will keep things like DHCP from hanging everything else up.

  • (Score: 2, Insightful) by Anonymous Coward on Friday October 16 2015, @01:42PM

    by Anonymous Coward on Friday October 16 2015, @01:42PM (#250530)

    - race conditions on pid-files

    How is this a problem?

    - inability to specify fine-grained dependencies (e.g. an NFS mount depends on a specific network interface as opposed to "the network")

    You have access to /proc

    - lots of boilerplate code in init scripts

    Write an include file

    - runlevels are obsolete

    For you, perhaps

    - differences between distributions (although that is not exclusively a problem for init.d)

    It's not between distributions, it's between systems and it is the job of the system administrator to customize init scripts for their specific requirements.

    - sequential execution

    This could have been fixed in init.d without reinventing the wheel.

    • (Score: 1, Insightful) by Anonymous Coward on Friday October 16 2015, @02:46PM

      by Anonymous Coward on Friday October 16 2015, @02:46PM (#250568)

      For you [...]

      Was getting caught part of your plan?

      • (Score: 2, Funny) by Anonymous Coward on Friday October 16 2015, @05:05PM

        by Anonymous Coward on Friday October 16 2015, @05:05PM (#250656)

        For you [...]

        Was getting caught part of your plan?

        I'd love to reply, unfortunately your ill-conceived snark has just crashed the discussion. Please choose an advanced option to continue.

        1. Safe Mode
        2. Safe Mode with Networking
        3. Safe Mode with Command Prompt

        .

        HTH, HAND!

  • (Score: 3, Interesting) by Whoever on Friday October 16 2015, @03:30PM

    by Whoever (4524) on Friday October 16 2015, @03:30PM (#250599) Journal

    Take a look at OpenRC some time. Most of your criticisms are solved by it. For example, with OpenRC under Gentoo, I could specify a dependency on "net.eth1" (i.e., the eth1 interface). OpenRC supports dependencies, so the order in which services are started is not manually specified. It doesn't have the traditional 1-6 run levels. Instead, most services are in the "default" run level. It supports parallel service startup. The dependency information is also used when restarting a service: dependent services are also restarted.
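
    The restart behaviour can be pictured with a short sketch (Python, for illustration only). This is not OpenRC's implementation; `restart_set` and the service names other than "net.eth1" are invented for the example. It just walks the dependency information backwards to find everything that has to be restarted along with a given service.

```python
def restart_set(service, depends_on):
    """Return the service plus everything that directly or indirectly needs it.

    depends_on maps each service name to the services it depends on.
    """
    # Invert the mapping: for each service, who depends on it?
    dependents = {}
    for name, needs in depends_on.items():
        for need in needs:
            dependents.setdefault(need, set()).add(name)

    affected = {service}
    stack = [service]
    while stack:
        current = stack.pop()
        for child in dependents.get(current, ()):
            if child not in affected:
                affected.add(child)
                stack.append(child)
    return affected
```

    Given a hypothetical "nfsmount" that needs "net.eth1" and a web server that needs "nfsmount", restarting "net.eth1" pulls in both of them, while an unrelated service is left alone.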