systemd: RequiredBy versus WantedBy

Introduction

Debamax helps several customers build and maintain system images based on Debian: those are deployed on target devices (regular servers, workstations, embedded devices), and then upgraded to new versions of the custom operating system using secure channels and validated upgrade paths.

Such systems usually require some system-level integration, making sure all required packages work well when installed together, along with some configuration daemon to ensure those systems can be tweaked as required. Such configuration daemons can be controlled remotely through a so-called “business application” that takes care of establishing a link to some remote backoffice, or exposed on a local console for operators to configure the system in a controlled manner (as opposed to exposing a full root shell).

This article focuses on the implementation details of such a configuration daemon, which is split into two parts: the actual daemon exposes a REST interface, which is used by the business application to trigger configuration updates, and another part handles upgrades. To keep internal names out of this article, those two parts are called debamax-daemon and debamax-upgrade respectively. The following graph highlights the possible interactions.

[Figure: Interactions between units (daemon and upgrade components)]

Each part is managed by a systemd service unit:

  • debamax-daemon spawns a daemon which manages REST requests, so it uses Type=simple in the [Service] section.
  • debamax-upgrade spawns a shell script which determines whether there's something to do on the upgrade side, so it uses Type=oneshot in the [Service] section. It is started once at boot-up, triggering an upgrade if required (maybe resuming some interrupted download), and it can be started again at any later point in time, when the configuration daemon requests an upgrade (blue, left-to-right arrow in the graph above). In turn, during an upgrade, the debamax-daemon package might get upgraded, in which case its service gets restarted (magenta, right-to-left arrow in the graph above), as is usual in Debian environments.

Through some heavily simplified code, this article highlights the need for accurate metadata in those two systemd units, namely RequiredBy= versus WantedBy= in the [Install] section.

The playground is a minimal Debian 10 (Buster) virtual machine. To simplify tests, no actual debamax-* Debian packages are involved (even though that’s the case for customer systems), and directories for local administrators are used instead of system-wide directories (which are for proper packages). This means:

  • Using /etc/systemd/system/<unit>.service instead of /lib/systemd/system/<unit>.service for systemd service unit definitions.
  • Using /usr/local/sbin/<daemon-or-script> instead of /usr/sbin/<daemon-or-script> for the configuration daemon and upgrade script.

Important: In the whole article, the following convention will be used: when quoting commands being run, a line with 4 hyphens (----) will separate the output of the command from the system logs (as seen through journalctl). Letting a journalctl -f run in the background or in a different console makes it easy to follow what’s happening.

Looking at the upgrade script

Naive attempt

When starting from scratch, rather than patching an existing systemd unit, it might seem appealing to look around in all /lib/systemd/system/*.service, find something that resembles what we’d like to achieve, and adapt it. A minimal systemd service unit (/etc/systemd/system/debamax-upgrade.service) might look like this:

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/debamax-upgrade

[Install]
RequiredBy=multi-user.target

and a minimal upgrade script (/usr/local/sbin/debamax-upgrade, don’t forget the +x flag) might be:

#!/bin/sh
echo "starting debamax-upgrade"
echo "stopping debamax-upgrade"

Let’s enable it now, meaning enable and start in a single command:

root@demo:~# systemctl enable --now debamax-upgrade
Created symlink /etc/systemd/system/multi-user.target.requires/debamax-upgrade.service \\
  → /etc/systemd/system/debamax-upgrade.service.
----
Jun 13 17:01:16 demo systemd[1]: Reloading.
Jun 13 17:01:16 demo systemd[1]: Starting debamax-upgrade.service...
Jun 13 17:01:16 demo debamax-upgrade[881]: starting debamax-upgrade
Jun 13 17:01:16 demo debamax-upgrade[881]: stopping debamax-upgrade
Jun 13 17:01:16 demo systemd[1]: debamax-upgrade.service: Succeeded.
Jun 13 17:01:16 demo systemd[1]: Started debamax-upgrade.service.

The RequiredBy=multi-user.target is translated to a symlink pointing to the appropriate systemd service unit, systemd reloads its configuration on its own (no need for a separate systemctl daemon-reload), and the unit is started.
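
To make that translation more tangible, here is a minimal sketch of a helper that predicts those symlinks from the [Install] section alone. The install_links name is hypothetical (it is not a systemd tool), and the sketch assumes units living under /etc/systemd/system while ignoring other [Install] settings such as Alias= or Also=:

```shell
#!/bin/sh
# Hypothetical helper, not part of systemd: print the symlinks that
# "systemctl enable" is expected to create for a unit file, looking only
# at the RequiredBy= and WantedBy= lines of its [Install] section.
# Assumption: symlinks are created under /etc/systemd/system.
install_links() {
    unit_file=$1
    unit_name=$(basename "$unit_file")
    awk -v unit="$unit_name" '
        # Track whether the current section is [Install].
        /^\[/ { in_install = ($0 == "[Install]"); next }
        in_install && /^(RequiredBy|WantedBy)=/ {
            # RequiredBy= maps to <target>.requires/, WantedBy= to <target>.wants/
            suffix = ($0 ~ /^RequiredBy=/) ? "requires" : "wants"
            sub(/^[^=]*=/, "")
            # Several targets can be listed on one line, separated by spaces.
            n = split($0, targets, " ")
            for (i = 1; i <= n; i++)
                printf "/etc/systemd/system/%s.%s/%s\n", targets[i], suffix, unit
        }
    ' "$unit_file"
}
```

Calling install_links /etc/systemd/system/debamax-upgrade.service on the unit above should print the same path as the “Created symlink” line; with WantedBy=, the directory suffix would be .wants instead.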

All good? Not quite!

Actual testing

Now we have a debamax-upgrade script that was started once, and that will start at boot-up, which is rather good. But what if the daemon needs to request an upgrade? Restarting the debamax-upgrade.service unit seems the right thing to do:

root@demo:~# systemctl restart debamax-upgrade
----
Jun 13 17:04:05 demo systemd[1]: Stopped target Graphical Interface.
Jun 13 17:04:05 demo systemd[1]: Stopping Graphical Interface.
Jun 13 17:04:05 demo systemd[1]: Stopped target Multi-User System.
Jun 13 17:04:05 demo systemd[1]: Stopping Multi-User System.
Jun 13 17:04:05 demo systemd[1]: Starting debamax-upgrade.service...
Jun 13 17:04:05 demo debamax-upgrade[892]: starting debamax-upgrade
Jun 13 17:04:05 demo debamax-upgrade[892]: stopping debamax-upgrade
Jun 13 17:04:05 demo systemd[1]: debamax-upgrade.service: Succeeded.
Jun 13 17:04:05 demo systemd[1]: Started debamax-upgrade.service.
Jun 13 17:04:05 demo systemd[1]: Reached target Multi-User System.
Jun 13 17:04:05 demo systemd[1]: Reached target Graphical Interface.
Jun 13 17:04:05 demo systemd[1]: Starting Update UTMP about System Runlevel Changes...
Jun 13 17:04:05 demo systemd[1]: systemd-update-utmp-runlevel.service: Succeeded.
Jun 13 17:04:05 demo systemd[1]: Started Update UTMP about System Runlevel Changes.

Wait a minute! What’s with those Graphical and Multi-User targets? Even if there are no display managers, graphical.target is the default target, which depends on multi-user.target, as can be seen in the bootup manpage. But why are those two targets being stopped?

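The request was to restart a service unit that is RequiredBy multi-user.target. With such a hard requirement, explicitly stopping or restarting the service propagates to the units that require it: multi-user.target is stopped, and so is graphical.target, which depends on it. Once the service unit has started again, those targets can be re-entered, and some side effects can be seen (UTMP, System Runlevel Changes).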
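
This propagation can be pictured with a toy model. It is only an illustration, not systemd’s actual algorithm: the DEPS table and the restart_propagates_to function are made up for this article, the table only lists the two dependencies discussed here, and the sketch assumes there are no dependency cycles.

```shell
#!/bin/sh
# Toy model (not systemd's implementation): each line of DEPS reads
# "<unit> <kind> <dependency>", where kind is "requires" or "wants".
DEPS='graphical.target requires multi-user.target
multi-user.target requires debamax-upgrade.service'

# Print every unit that directly or transitively requires unit $1,
# i.e. every unit a restart of $1 would propagate to in this model.
# "wants" edges are deliberately ignored: they do not propagate restarts.
restart_propagates_to() {
    printf '%s\n' "$DEPS" | while read -r unit kind dep; do
        if [ "$kind" = requires ] && [ "$dep" = "$1" ]; then
            echo "$unit"
            restart_propagates_to "$unit"
        fi
    done
}
```

With the table above, restart_propagates_to debamax-upgrade.service lists multi-user.target and then graphical.target. Changing the second line to “multi-user.target wants debamax-upgrade.service” makes the list empty.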

Since this is about a shell script that runs once and exits, as defined through Type=oneshot, one might wonder what happens if only a new start is requested:

root@demo:~# systemctl start debamax-upgrade
----
Jun 13 17:05:09 demo systemd[1]: Starting debamax-upgrade.service...
Jun 13 17:05:09 demo debamax-upgrade[899]: starting debamax-upgrade
Jun 13 17:05:09 demo debamax-upgrade[899]: stopping debamax-upgrade
Jun 13 17:05:09 demo systemd[1]: debamax-upgrade.service: Succeeded.
Jun 13 17:05:09 demo systemd[1]: Started debamax-upgrade.service.

That looks way better, and closer to what was intended. To be entirely honest, an early “fix” was to just spawn systemctl start debamax-upgrade from the debamax-daemon code when requesting an upgrade, but that looked fishy. Can we fix the restart case that looked awkward?

Towards a real fix

Let’s switch the systemd service unit from RequiredBy= to WantedBy=. Beware, while doing so, it’s best to first disable the current unit, edit it, and then enable it again, so that the proper symlinks can be removed and created:

root@demo:~# systemctl disable --now debamax-upgrade
Removed /etc/systemd/system/multi-user.target.requires/debamax-upgrade.service.

root@demo:~# sed 's/RequiredBy=/WantedBy=/' -i /etc/systemd/system/debamax-upgrade.service

root@demo:~# systemctl enable --now debamax-upgrade
Created symlink /etc/systemd/system/multi-user.target.wants/debamax-upgrade.service \\
  → /etc/systemd/system/debamax-upgrade.service.
----
Jun 13 17:06:12 demo systemd[1]: Reloading.
…
Jun 13 17:06:42 demo systemd[1]: Reloading.
Jun 13 17:06:43 demo systemd[1]: Starting debamax-upgrade.service...
Jun 13 17:06:43 demo debamax-upgrade[936]: starting debamax-upgrade
Jun 13 17:06:43 demo debamax-upgrade[936]: stopping debamax-upgrade
Jun 13 17:06:43 demo systemd[1]: debamax-upgrade.service: Succeeded.
Jun 13 17:06:43 demo systemd[1]: Started debamax-upgrade.service.
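
If a unit gets edited and enabled again without being disabled first, the old symlink may be left behind in the .requires directory. A small check along these lines can spot it. The stale_requires_links name is hypothetical, and the default search directory assumes the local administrator layout used in this article:

```shell
#!/bin/sh
# Hypothetical helper: list leftover "requires" symlinks for a unit.
# $1: unit name (e.g. debamax-upgrade.service)
# $2: directory to search, defaulting to /etc/systemd/system
stale_requires_links() {
    # -path matches against the whole path, so this selects symlinks
    # named after the unit inside any "<something>.requires" directory.
    find "${2:-/etc/systemd/system}" -path "*.requires/$1" -type l
}
```

After switching a unit to WantedBy=, stale_requires_links debamax-upgrade.service should print nothing. Any path it prints points at a symlink that still enforces the old requirement.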

For the avoidance of doubt, here’s the new version of the systemd service unit for the upgrade component (/etc/systemd/system/debamax-upgrade.service):

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/debamax-upgrade

[Install]
WantedBy=multi-user.target

Let’s compare starting:

root@demo:~# systemctl start debamax-upgrade
----
Jun 13 17:07:48 demo systemd[1]: Starting debamax-upgrade.service...
Jun 13 17:07:48 demo debamax-upgrade[941]: starting debamax-upgrade
Jun 13 17:07:48 demo debamax-upgrade[941]: stopping debamax-upgrade
Jun 13 17:07:48 demo systemd[1]: debamax-upgrade.service: Succeeded.
Jun 13 17:07:48 demo systemd[1]: Started debamax-upgrade.service.

with restarting:

root@demo:~# systemctl restart debamax-upgrade
----
Jun 13 17:08:16 demo systemd[1]: Starting debamax-upgrade.service...
Jun 13 17:08:16 demo debamax-upgrade[944]: starting debamax-upgrade
Jun 13 17:08:16 demo debamax-upgrade[944]: stopping debamax-upgrade
Jun 13 17:08:16 demo systemd[1]: debamax-upgrade.service: Succeeded.
Jun 13 17:08:16 demo systemd[1]: Started debamax-upgrade.service.

OK, this is way better!

Looking at the daemon

More context

Let’s backpedal a bit: it was confessed earlier that an early “fix” was just to switch from restart to start for the debamax-upgrade.service unit. It was noted in a ticket “for later” that some strange interactions across targets would otherwise be happening, but an easy fix was available. Why was it so important to get back to this topic to find a proper fix?

The problem was spotted again in a different area… This custom operating system features 3 critical components:

  • the configuration daemon (including upgrade management);
  • a “business application” which is provided by the customer, talks to remote servers, talks REST with the configuration daemon to deploy configuration updates, and talks with other applications;
  • a “support application” which is also provided by the customer, which has an important but less central role than the “business application”, and reports back to the “business application”.

The internal names for the business and support applications have been replaced with demo-service-a and demo-service-b in this article.

To free up some resources during upgrades, and to avoid possible ups and downs as those applications get restarted during upgrades, it was decided to just shut them down prior to upgrades. A successful upgrade would trigger a reboot and let the usual boot-up sequence start everything again, while a failed upgrade would restart the services manually.

Naive attempt

Let’s see what minimal systemd unit configuration could look like for the configuration daemon and for the two demo services:

  • /etc/systemd/system/debamax-daemon.service (note RequiredBy= is still used at this point):
    [Service]
    Type=simple
    ExecStart=/usr/local/sbin/debamax-daemon
    
    [Install]
    RequiredBy=multi-user.target
  • /etc/systemd/system/demo-service-a.service:
    [Service]
    Type=simple
    ExecStart=/usr/local/sbin/demo-service-a
    
    [Install]
    WantedBy=multi-user.target
  • /etc/systemd/system/demo-service-b.service:
    [Service]
    Type=simple
    ExecStart=/usr/local/sbin/demo-service-b
    
    [Install]
    WantedBy=multi-user.target

Let’s keep all three daemons to something very minimal, waiting a full day for something to happen, ensuring they keep running for a while. The aim is simulating long-running daemons (Type=simple) with trivial code, instead of just having a single shell script exit early (which would usually be marked as Type=oneshot):

  • /usr/local/sbin/debamax-daemon:
    #!/bin/sh
    trap 'echo stopping debamax-daemon' TERM
    echo "starting debamax-daemon"
    sleep 24h || true
  • /usr/local/sbin/demo-service-a:
    #!/bin/sh
    trap 'echo stopping service A' TERM
    echo "starting service A"
    sleep 24h || true
  • /usr/local/sbin/demo-service-b:
    #!/bin/sh
    trap 'echo stopping service B' TERM
    echo "starting service B"
    sleep 24h || true

Now, let’s check a new version of the upgrade script, which implements the policy described above: stop demo services first, then trigger an upgrade. To simplify things further, the upgrade is simulated as well. What would normally happen when a package containing a daemon is upgraded would be putting new files in place, and restarting the daemon. Here, the upgrade simulation is only about restarting the debamax-daemon service unit.

  • /usr/local/sbin/debamax-upgrade:
    #!/bin/sh
    echo "starting debamax-upgrade"
    
    echo "1. stopping some services during the upgrade"
    systemctl stop demo-service-a
    systemctl stop demo-service-b
    sleep 1
    
    echo "2. simulating the upgrade: new package gets installed, daemon gets restarted"
    systemctl restart debamax-daemon
    sleep 1
    
    echo "stopping debamax-upgrade"

Note: At this stage, both versions of the debamax-upgrade.service unit would give the same results. The debamax-upgrade script could even be started manually from the shell (without involving systemctl at all). For simplicity’s sake, the systemctl start debamax-upgrade call, which works with both versions, was chosen.
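
The simplified script leaves out the failure path of the policy described earlier (a failed upgrade restarts the services manually). A minimal sketch of that path could look like the following. The run_upgrade function and its messages are assumptions made for illustration, and the real upgrade command is left to the caller:

```shell
#!/bin/sh
# Sketch under assumptions: stop the demo services, run an upgrade
# command, and start the services again by hand if that command fails.
# "$@" is the (hypothetical) upgrade command to run.
run_upgrade() {
    systemctl stop demo-service-a demo-service-b
    if "$@"; then
        # On success, the reboot lets the usual boot-up sequence
        # start the services again, so nothing is started here.
        echo "upgrade succeeded, a reboot would follow"
    else
        echo "upgrade failed, restarting services manually"
        systemctl start demo-service-a demo-service-b
        return 1
    fi
}
```

For instance, run_upgrade false exercises the failure path: both services are stopped, then started again, and the function returns a non-zero status.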

Actual testing

What happens when we start the upgrade process?

root@demo:~# systemctl start debamax-upgrade
----
Jun 13 17:28:47 demo systemd[1]: Starting debamax-upgrade.service...
Jun 13 17:28:47 demo debamax-upgrade[1272]: starting debamax-upgrade
Jun 13 17:28:47 demo debamax-upgrade[1272]: 1. stopping some services during the upgrade
Jun 13 17:28:47 demo demo-service-a[1263]: Terminated
Jun 13 17:28:47 demo demo-service-a[1263]: stopping service A
Jun 13 17:28:47 demo systemd[1]: Stopping demo-service-a.service...
Jun 13 17:28:47 demo systemd[1]: demo-service-a.service: Succeeded.
Jun 13 17:28:47 demo systemd[1]: Stopped demo-service-a.service.
Jun 13 17:28:47 demo demo-service-b[1262]: Terminated
Jun 13 17:28:47 demo demo-service-b[1262]: stopping service B
Jun 13 17:28:47 demo systemd[1]: Stopping demo-service-b.service...
Jun 13 17:28:47 demo systemd[1]: demo-service-b.service: Succeeded.
Jun 13 17:28:47 demo systemd[1]: Stopped demo-service-b.service.
Jun 13 17:28:48 demo debamax-upgrade[1272]: 2. simulating the upgrade: new package gets installed, daemon gets restarted
Jun 13 17:28:48 demo systemd[1]: Stopped target Graphical Interface.
Jun 13 17:28:48 demo systemd[1]: Stopping Graphical Interface.
Jun 13 17:28:48 demo systemd[1]: Stopped target Multi-User System.
Jun 13 17:28:48 demo systemd[1]: Stopping Multi-User System.
Jun 13 17:28:48 demo systemd[1]: Started demo-service-b.service.
Jun 13 17:28:48 demo debamax-daemon[1266]: Terminated
Jun 13 17:28:48 demo debamax-daemon[1266]: stopping debamax-daemon
Jun 13 17:28:48 demo systemd[1]: Stopping debamax-daemon.service...
Jun 13 17:28:48 demo systemd[1]: Started demo-service-a.service.
Jun 13 17:28:48 demo demo-service-b[1277]: starting service B
Jun 13 17:28:48 demo systemd[1]: debamax-daemon.service: Succeeded.
Jun 13 17:28:48 demo systemd[1]: Stopped debamax-daemon.service.
Jun 13 17:28:48 demo demo-service-a[1278]: starting service A
Jun 13 17:28:48 demo systemd[1]: Started debamax-daemon.service.
Jun 13 17:28:48 demo debamax-daemon[1280]: starting debamax-daemon
Jun 13 17:28:49 demo debamax-upgrade[1272]: stopping debamax-upgrade
Jun 13 17:28:49 demo systemd[1]: debamax-upgrade.service: Succeeded.
Jun 13 17:28:49 demo systemd[1]: Started debamax-upgrade.service.
Jun 13 17:28:49 demo systemd[1]: Reached target Multi-User System.
Jun 13 17:28:49 demo systemd[1]: Reached target Graphical Interface.
Jun 13 17:28:49 demo systemd[1]: Starting Update UTMP about System Runlevel Changes...
Jun 13 17:28:49 demo systemd[1]: systemd-update-utmp-runlevel.service: Succeeded.
Jun 13 17:28:49 demo systemd[1]: Started Update UTMP about System Runlevel Changes.

Basically, both demo services are stopped as expected in the first place, but when the debamax-daemon unit is restarted, the same target dance (as seen in the previous section) happens. As a side effect of exiting and re-entering targets, services that were purposefully stopped are started again!

Spotting this while testing the upgrade process was the final incentive to move from the early “let’s use start instead of restart” workaround mentioned in the previous section… to a real fix, and that’s the point where the differences between RequiredBy= and WantedBy= were analyzed.
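
This kind of regression is easy to miss when only skimming logs, so a check at the end of an upgrade test can help. The sketch below relies on systemctl is-active --quiet, which exits with status 0 when a unit is active. The unexpectedly_active name is hypothetical:

```shell
#!/bin/sh
# Hypothetical test helper: print the units from the argument list that
# are active although they were supposed to stay stopped.
unexpectedly_active() {
    for unit in "$@"; do
        # "is-active --quiet" prints nothing and exits 0 for active units.
        if systemctl is-active --quiet "$unit"; then
            echo "$unit"
        fi
    done
}
```

With the RequiredBy= version of debamax-daemon.service, running unexpectedly_active demo-service-a demo-service-b right after the upgrade would be expected to list both services.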

Fixing metadata

Let’s see what happens with proper metadata, remembering to first disable the service unit (to get the “wrong” symlink removed), before switching from RequiredBy= to WantedBy=, and enabling the service unit again:

root@demo:~# systemctl disable --now debamax-daemon
Removed /etc/systemd/system/multi-user.target.requires/debamax-daemon.service.

root@demo:~# sed 's/RequiredBy=/WantedBy=/' -i /etc/systemd/system/debamax-daemon.service

root@demo:~# systemctl enable --now debamax-daemon
Created symlink /etc/systemd/system/multi-user.target.wants/debamax-daemon.service \\
  → /etc/systemd/system/debamax-daemon.service.
----
Jun 13 17:30:59 demo systemd[1]: Reloading.
Jun 13 17:30:59 demo debamax-daemon[1280]: Terminated
Jun 13 17:30:59 demo debamax-daemon[1280]: stopping debamax-daemon
Jun 13 17:30:59 demo systemd[1]: Stopping debamax-daemon.service...
Jun 13 17:30:59 demo systemd[1]: debamax-daemon.service: Succeeded.
Jun 13 17:30:59 demo systemd[1]: Stopped debamax-daemon.service.
…
Jun 13 17:32:02 demo systemd[1]: Reloading.
Jun 13 17:32:02 demo systemd[1]: Started debamax-daemon.service.
Jun 13 17:32:02 demo debamax-daemon[1317]: starting debamax-daemon

For the avoidance of doubt, here’s the new version of the systemd service unit for the daemon (/etc/systemd/system/debamax-daemon.service):

[Service]
Type=simple
ExecStart=/usr/local/sbin/debamax-daemon

[Install]
WantedBy=multi-user.target

Let’s run the upgrade scenario again:

root@demo:~# systemctl start debamax-upgrade
----
Jun 13 17:32:37 demo systemd[1]: Starting debamax-upgrade.service...
Jun 13 17:32:37 demo debamax-upgrade[1321]: starting debamax-upgrade
Jun 13 17:32:37 demo debamax-upgrade[1321]: 1. stopping some services during the upgrade
Jun 13 17:32:37 demo demo-service-a[1278]: Terminated
Jun 13 17:32:37 demo demo-service-a[1278]: stopping service A
Jun 13 17:32:37 demo systemd[1]: Stopping demo-service-a.service...
Jun 13 17:32:37 demo systemd[1]: demo-service-a.service: Succeeded.
Jun 13 17:32:37 demo systemd[1]: Stopped demo-service-a.service.
Jun 13 17:32:37 demo demo-service-b[1277]: Terminated
Jun 13 17:32:37 demo demo-service-b[1277]: stopping service B
Jun 13 17:32:37 demo systemd[1]: Stopping demo-service-b.service...
Jun 13 17:32:37 demo systemd[1]: demo-service-b.service: Succeeded.
Jun 13 17:32:37 demo systemd[1]: Stopped demo-service-b.service.
Jun 13 17:32:38 demo debamax-upgrade[1321]: 2. simulating the upgrade: new package gets installed, daemon gets restarted
Jun 13 17:32:38 demo debamax-daemon[1317]: Terminated
Jun 13 17:32:38 demo debamax-daemon[1317]: stopping debamax-daemon
Jun 13 17:32:38 demo systemd[1]: Stopping debamax-daemon.service...
Jun 13 17:32:38 demo systemd[1]: debamax-daemon.service: Succeeded.
Jun 13 17:32:38 demo systemd[1]: Stopped debamax-daemon.service.
Jun 13 17:32:38 demo systemd[1]: Started debamax-daemon.service.
Jun 13 17:32:38 demo debamax-daemon[1326]: starting debamax-daemon
Jun 13 17:32:39 demo debamax-upgrade[1321]: stopping debamax-upgrade
Jun 13 17:32:39 demo systemd[1]: debamax-upgrade.service: Succeeded.
Jun 13 17:32:39 demo systemd[1]: Started debamax-upgrade.service.

This time, demo services are stopped as previously, but there’s no more target dance, which means that demo services are not started again. \o/

Conclusion

Our main takeaway would be: be extra careful before thinking about using the RequiredBy= keyword!

Of course, checking the documentation (systemd.unit) when in doubt might have saved some trouble. Reading the RequiredBy= and WantedBy= sections, and following their references to Requires= and Wants=, would likely have led to the same outcome, although the documentation is not entirely clear about the differences between both approaches beyond this advice:

Often, it is a better choice to use Wants= instead of Requires= in order to achieve a system that is more robust when dealing with failing services.
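
To act on that takeaway, existing unit files can be audited for RequiredBy=. Here is a minimal sketch, where units_using_requiredby is a hypothetical helper name and the directories to scan are left to the caller:

```shell
#!/bin/sh
# Hypothetical audit helper: list the service unit files under the given
# directories that declare RequiredBy= in their definition.
units_using_requiredby() {
    # "-type f" skips the symlinks found in .wants/ and .requires/
    # directories, so each unit file is reported only once.
    find "$@" -name '*.service' -type f -exec grep -l '^RequiredBy=' {} +
}
```

On a system laid out as in this article, units_using_requiredby /etc/systemd/system would be a reasonable starting point; each reported unit deserves a second look before it gets restarted in production.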