Debugging black screen in Debian Installer

What was noticed?

A few weeks ago, other developers reported on the #debian-boot IRC channel that they were seeing regressions while testing daily builds of the graphical version of the Debian Installer. This could easily be reproduced by anyone, fetching the latest netboot/gtk/mini.iso file for the amd64 architecture, which can be found in the directory of daily builds for amd64.

Timing-wise, it might appear that this coincided with the move of all our git repositories from the good, old alioth.debian.org hosting service to the brand new salsa.debian.org, the GitLab instance replacing the previous FusionForge one. But that transition had nothing to do with our issue, which is instead related to our move from the 4.15 kernel series to the 4.16 one.

First investigation

Behind that black screen, it looked like the X server was starting just fine, but nothing was being drawn, and input devices (keyboard, mouse) appeared inactive. Thankfully, the QEMU/KVM monitor makes it possible to trigger a switch to a different TTY, to access log files and run a few commands. It was possible to extract kernel logs and X logs to compare what was happening with different kernel versions. The big surprise was: the logs were identical, except for the uname output and the timestamps embedded in the logs. Switching back to the TTY where X was expected to show up, there it was, properly started and displaying the usual language selection screen! Moving from X logs to system logs (including kernel logs):

May 12 02:25:02 kernel: [    3.302164] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:02 kernel: [    3.302291] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:02 kernel: [    3.302351] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:02 kernel: [    3.302450] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:02 kernel: [    3.302530] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:02 kernel: [    3.302682] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:02 kernel: [    3.302804] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:02 kernel: [    3.302963] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:02 kernel: [    3.303009] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:02 kernel: [    3.303078] random: debconf: uninitialized urandom read (8 bytes read)
May 12 02:25:19 init: starting pid 170, tty '/dev/tty2': '-/bin/sh'
May 12 02:25:21 debconf: Setting debconf/language to en
May 12 02:25:21 main-menu[229]: DEBUG: resolver (libgcc1): package doesn't exist (ignored)
May 12 02:25:21 main-menu[229]: INFO: Falling back to the package description for brltty-udeb
May 12 02:25:21 main-menu[229]: INFO: Menu item 'localechooser' selected
May 12 02:25:22 debconf: Setting debconf/language to en
May 12 02:25:22 gtk-set-font: Switching to font 'DejaVu Sans' for 'en'
May 12 02:25:24 kernel: [   25.303708] random: crng init done
  

So it appears there were randomness/entropy-related issues very soon after boot-up; then, roughly 20 seconds later, debconf (which drives the installation process, asking questions and waiting for answers) was properly started, closely followed by the kernel reporting that some “random” initialization had finished. This rang a bell, so I reported my findings as #898468, copying the Debian kernel team.

Ben Hutchings swiftly replied with some more information about the possible culprit. Traditionally, X clients and X servers share an authentication cookie (see the XAUTHORITY environment variable, which can point to e.g. ~/.Xauthority, or the xauth command line tool), called MIT-MAGIC-COOKIE-1, which requires some bits of randomness. It had been reported that some desktop session programs rely on libICE to handle that cookie, which uses a function from the libbsd library, arc4random_buf(), which in turn uses the getrandom() system call exposed by the kernel.

That getrandom() syscall was recently modified as part of a security fix, and this made callers block more easily when not enough entropy has been gathered. A flag (GRND_NONBLOCK) can be specified to make this call non-blocking, in which case it returns an error immediately instead of blocking. More information about this can be found in the Fixing Linux getrandom() in stable thread on the debian-release@ mailing list, but basically the idea was to get libbsd fixed in unstable/testing rather than simply reverting the security fix (the latter was preferred as a follow-up to the security updates for stable which were triggering severe regressions).
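The difference between the blocking and non-blocking behaviour can be observed from user space. The following is a minimal sketch, assuming a Linux system and Python 3.6 or later, where os.getrandom() wraps the syscall; the helper name try_getrandom is made up for this illustration.

```python
import os


def try_getrandom(nbytes):
    """Return nbytes of kernel randomness, or None if the pool is not ready.

    try_getrandom is a hypothetical helper, not part of any library.
    """
    try:
        # With GRND_NONBLOCK, "not enough entropy yet" is reported as EAGAIN
        # instead of blocking; Python surfaces that as BlockingIOError.
        return os.getrandom(nbytes, os.GRND_NONBLOCK)
    except BlockingIOError:
        return None


data = try_getrandom(8)
print("pool not initialized yet" if data is None else data.hex())
```

On a system that has been up for a while the pool is initialized and the call succeeds; the None branch only matters very early after boot, which is exactly where the installer runs.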

The installer team hadn’t checked that libICE was actually used by the installer, but waiting for the libbsd fix, then rebuilding and rechecking, was deemed a sensible course of action. Unfortunately, right after a first bug fix upload was accepted into unstable, no changes could be noticed in a freshly rebuilt installer…

Deeper investigations

A quick reality check revealed that the absence of changes was to be expected, as the graphical installer doesn’t use the libICE library, which doesn’t even ship any udebs…

Back to square one/screen black: debconf was being reported as the process affected by the blocking behaviour of the getrandom() call so the first step was to chase down randomness-related calls in its source code. Unfortunately nothing turned up…

The next try was diving into the gtk+2.0 source code, since that’s the graphical toolkit used by the graphical installer. That looked promising at first since there were a few calls to g_random_int() to initialize the internal stamp field for some data structures. That function is defined in the glib2.0 package (see glib/grand.c), and trying to get rid of the calls from gtk+2.0 made no difference at all…

At this point, it wasn’t clear which part was responsible for the calls to getrandom(), so Cyril went for a brute force approach. A new custom build of the graphical installer was prepared, including the strace tool (which traces syscalls), with the start-up scripts modified to wrap debconf’s execution with strace. The rootskel package ships this line in /lib/debian-installer/menu:

exec debconf -o d-i $MENU

which was modified by a sed call in the main Makefile of the debian-installer package (build/Makefile):

sed 's,exec debconf,exec strace -v -f -s 400 -o /tmp/debconf.strace debconf,' -i "/lib/debian-installer/menu"

This resulted in strace writing a /tmp/debconf.strace file which could then be used to debug this further. Since the Debian Installer comes with few debugging tools, this file can be extracted using a simple network pipe, with nc/netcat.
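Once the trace has been copied off the installer, looking for getrandom() calls and for the file each process opened just before them can be automated. The sketch below assumes the usual "PID syscall(args) = result" layout that strace produces with -f and -o; the two sample lines are made up for illustration and are not taken from the real trace.

```python
import re

# One strace line: "<pid>  <syscall>(<args>) = <result>"
LINE = re.compile(r'^(?P<pid>\d+)\s+(?P<call>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>.*)$')
QUOTED = re.compile(r'"([^"]*)"')


def getrandom_context(lines):
    """Yield (pid, last opened path, result) for every getrandom() call."""
    last_open = {}
    for line in lines:
        m = LINE.match(line.rstrip("\n"))
        if not m:
            continue  # signals, unfinished/resumed calls, exit lines, etc.
        pid, call = m["pid"], m["call"]
        if call in ("open", "openat"):
            path = QUOTED.search(m["args"])
            if path:
                last_open[pid] = path.group(1)
        elif call == "getrandom":
            yield pid, last_open.get(pid), m["ret"]


# Hypothetical sample lines, only meant to show the expected layout.
sample = [
    '229   open("/etc/fonts/fonts.conf", O_RDONLY|O_CLOEXEC) = 4',
    '229   getrandom(0x7ffd5c0f1a60, 16, GRND_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)',
]
for hit in getrandom_context(sample):
    print(hit)
```

Against a real file, one would pass open("/tmp/debconf.strace") instead of the sample list.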

A new hope

Searching for getrandom() calls, they were apparently appearing right after opening the /etc/fonts/fonts.conf file, but also while opening other files under the /etc/fonts/fonts.d/ directory, which contains configuration snippets for fontconfig. Searching its source code, it appeared two files could be relevant:

Adding a few fprintf(stderr, "...\n"); statements around all suspicious calls made it clear: the uuid_generate_random() calls (implemented in libuuid1/libuuid1-udeb, packages built from the util-linux source) are responsible for the delays. Let’s look at the implementation of random_get_bytes() in lib/randutils.c, which the UUID generation relies on:

/*
 * Generate a stream of random nbytes into buf.
 * Use /dev/urandom if possible, and if not,
 * use glibc pseudo-random functions.
 */
#define UL_RAND_READ_ATTEMPTS   8
#define UL_RAND_READ_DELAY      125000  /* microseconds */
void random_get_bytes(void *buf, size_t nbytes)
{
        unsigned char *cp = (unsigned char *)buf;
        size_t i, n = nbytes;
        int lose_counter = 0;
#ifdef HAVE_GETRANDOM
        while (n > 0) {
                int x;
                errno = 0;
                x = getrandom(cp, n, GRND_NONBLOCK);
                if (x > 0) {                    /* success */
                        n -= x;
                        cp += x;
                        lose_counter = 0;
                } else if (errno == ENOSYS) {   /* kernel without getrandom() */
                        break;
                } else if (errno == EAGAIN && lose_counter < UL_RAND_READ_ATTEMPTS) {
                        xusleep(UL_RAND_READ_DELAY);    /* no entropy, wait and try again */
                        lose_counter++;
                } else
                        break;
        }
        if (errno == ENOSYS)
#endif
        /*
         * We've been built against headers that support getrandom, but the
         * running kernel does not.  Fallback to reading from /dev/{u,}random
         * as before
         */
        {
                int fd = random_get_fd();
[…]

So if the HAVE_GETRANDOM macro was defined at compile-time, a getrandom() call is attempted, in non-blocking mode. If the ENOSYS error is triggered, the break; gets us out of the loop, and the code carries on with a fallback to reading from /dev/urandom or /dev/random (one gets the relevant file descriptor with random_get_fd(), as seen on the last line).

But if the error is EAGAIN instead, that means not enough entropy was available. In this case, a short delay is added through xusleep(), and further attempts are made until a given limit (UL_RAND_READ_ATTEMPTS) is reached: 8 attempts of 125000 microseconds each, i.e. up to one second of waiting per call. At this point, the final break; triggers an exit of the loop and the fallback described above is reached.
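To make the cost of that retry loop concrete, here is a small Python model of its control flow. It is only a sketch of the quoted loop, not util-linux code: the getrandom and sleep_us parameters are stand-ins injected for the demonstration, and whatever happens after the loop is left out.

```python
import errno

UL_RAND_READ_ATTEMPTS = 8
UL_RAND_READ_DELAY = 125000  # microseconds


def getrandom_loop(getrandom, nbytes, sleep_us):
    """Model of the quoted while loop.

    getrandom(n) returns bytes or raises OSError; sleep_us(us) stands in
    for xusleep(). Returns (bytes collected, reason the loop ended).
    """
    buf = b""
    lose_counter = 0
    while len(buf) < nbytes:
        try:
            chunk = getrandom(nbytes - len(buf))
        except OSError as exc:
            if exc.errno == errno.ENOSYS:
                return buf, "ENOSYS"       # kernel without getrandom()
            if exc.errno == errno.EAGAIN and lose_counter < UL_RAND_READ_ATTEMPTS:
                sleep_us(UL_RAND_READ_DELAY)  # no entropy, wait and try again
                lose_counter += 1
                continue
            return buf, "gave up"          # final "else break;"
        if not chunk:
            return buf, "gave up"          # x == 0 also hits the final break
        buf += chunk
        lose_counter = 0
    return buf, "done"


def always_eagain(n):
    """Fake syscall for a kernel whose entropy pool never becomes ready."""
    raise OSError(errno.EAGAIN, "Resource temporarily unavailable")


slept = []  # record requested delays instead of actually sleeping
result = getrandom_loop(always_eagain, 16, slept.append)
print(result, len(slept), "sleeps,", sum(slept) / 1_000_000, "second(s) in total")
```

With an entropy pool that stays empty, a single call sleeps 8 times for a total of one second before giving up, so a series of such calls during start-up can plausibly add up to the kind of delay seen in the logs.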

Where to go from here?

To make extra sure diagnostics were correct, a little patch was developed as a proof of concept, disabling the HAVE_GETRANDOM macro, so that the while loop quoted above isn’t even tried, and so that one jumps directly to the fallback reading from /dev/*random. It was a bit tricky to get this part right since there are several places where it can be enabled: a configure.ac check, another check on the __NR_getrandom macro, and a final check on the SYS_getrandom macro.

--- a/configure.ac
+++ b/configure.ac
@@ -462,7 +462,6 @@ AC_CHECK_FUNCS([ \
        getdtablesize \
        getexecname \
        getmntinfo \
-       getrandom \
        getrlimit \
        getsgnam \
        inotify_init \
--- a/lib/randutils.c
+++ b/lib/randutils.c
@@ -27,14 +27,14 @@
 
-#elif defined (__linux__)
+#elif 0
    /* usable kernel-headers, but old glibc-headers */
 
-#if !defined(HAVE_GETRANDOM) && defined(SYS_getrandom)
+#if 0
 /* libc without function, but we have syscal */

This patch ensures a direct read from /dev/*random, which means delays go away, and that Debian Installer starts properly. Hurray?

Unfortunately, it doesn’t seem reasonable to degrade the quality of randomness used by a function called uuid_generate_random(), since it might be used by something a little more critical than setting UUIDs for fonts in fontconfig; a quick search on sources.debian.org returns at least 58 packages with that name in their source code. Increasing this function’s complexity to add an “I don’t care about entropy quality” option doesn’t seem too good either. So it might make sense to see whether fontconfig could be tweaked to use something that doesn’t rely on getrandom().

Time to report these findings to the Debian Installer team as a whole, adding fontconfig and util-linux developers to the loop. Let’s see what consensus we can reach.


Published: Tue, 05 Jun 2018 18:45:00 +0200

Tails: early work on reproducibility

Quick introduction about Tails and build reproducibility

Tails is a live operating system, booting from USB or from DVD, aiming at preserving users’ privacy and anonymity. It is Free Software and based on the Debian GNU/Linux distribution. The Tails website contains a more complete overview of the project.

Over the last few years, an increasing number of software developers and security-focused people have been looking into build reproducibility: when a build system is deterministic, building a given component from a given source should always lead to the exact same binary result (byte-for-byte). The Tor project and the Debian distribution were among the first teams to work towards this goal. More information can be found on the reproducible-builds.org website, whose motto is “Provide a verifiable path from source code to binary”.

What does it mean in the Tails context? The main “product” of the Tails project is a bootable ISO image which contains the live system, with tools designed and preconfigured to preserve privacy and anonymity. Compromising this image would defeat the whole point of the project and could even endanger the lives of journalists or whistleblowers relying on it. Making the image build process reproducible means developers and even users can reproduce it on their own hardware and make sure the ISO image published by the Tails project matches the one which was built locally, or which was verified by others.
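In practice, "matches" means byte-for-byte identical, which is usually checked by comparing cryptographic digests. Here is a minimal sketch, assuming SHA-256 as the digest; the function names and the tiny stand-in files are hypothetical and only there to make the example runnable.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large ISO images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_build(path_a, path_b):
    """Two builds are considered reproduced if their digests are identical."""
    return sha256_of(path_a) == sha256_of(path_b)


# Tiny stand-in files instead of real ISO images.
workdir = tempfile.mkdtemp()
published = os.path.join(workdir, "published.iso")
local = os.path.join(workdir, "local.iso")
for path in (published, local):
    with open(path, "wb") as fh:
        fh.write(b"same bytes")
print(same_build(published, local))
```

A single differing byte (a timestamp, a file ordering change) is enough to change the digest, which is why every source of non-determinism in the build has to be tracked down.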

How Debamax became involved

Flashback: October 2015.

Cyril had already been working on Debian derivatives for various customers and had been identified by some Tails developers as a potential asset to work on the first steps towards reproducibility. A sprint approach was chosen to tackle the freezable APT repository topic: meet, discuss, design, code; repeat a few times.

What follows is an overview of the results, with a few pointers to code and documentation. They are presented sequentially but all those topics are closely intertwined, and that had to be taken into account during the design phase.

Keeping track of packages in archives

The first objective was to imagine a workflow which would make it possible to build a given ISO image with the exact same set of Debian packages. An interesting data point is that 4 separate archives are used during a Tails build:

Of course, those archives aren’t static: the Debian archive is updated up to 4 times a day, the Debian Security archive is updated whenever a new security update is published, etc. So we needed a way to keep track of all packages used during the build but also of the state of each archive at any point where an image was being built.

It was decided to use reprepro, which is designed to produce custom Debian repositories, while also making it possible to mirror upstream repositories. It also allows creating snapshots, which exactly fits the need to keep packages around! Packages which would normally be deleted or replaced by a new version (when a synchronization happens) are kept as long as there’s at least one snapshot that depends on them.

First results:

Keeping track of packages used during the build

While working for other customers, Cyril already had to keep track of packages used to build Debian images: the idea was to list all packages and versions used for a given build, making it possible to generate changelog-like summaries of changes between two builds.

A similar approach was used here, where triplets are gathered with: package, version, URI. Here’s what the implementation looks like:

That means three files are generated with those triplets: one with binary packages from the bootstrap phase, one with binary packages downloaded through apt-get, and one with source packages downloaded through apt-get as well.

Another script was developed to aggregate those results into what we call a build-manifest; it gathers all origins (the archives mentioned in the previous section), their references (the snapshot used during the build), and all packages along with their versions. Example for the 3.2 release: tails-amd64-3.2.build-manifest.
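The changelog-like summaries mentioned above boil down to comparing two package-to-version mappings. The following is a minimal sketch of that comparison, not the actual Tails tooling; the package names and versions are invented, and versions are compared as plain strings rather than with Debian's version ordering.

```python
def summarize_changes(old, new):
    """Compare two {package: version} mappings taken from two builds."""
    added = {pkg: ver for pkg, ver in new.items() if pkg not in old}
    removed = {pkg: ver for pkg, ver in old.items() if pkg not in new}
    changed = {
        pkg: (old[pkg], new[pkg])
        for pkg in old.keys() & new.keys()
        if old[pkg] != new[pkg]
    }
    return added, removed, changed


# Invented data, standing in for the package lists of two build manifests.
previous_build = {"tor": "1.0-1", "gnupg": "2.0-1", "oldlib": "0.1-1"}
current_build = {"tor": "1.0-2", "gnupg": "2.0-1", "newlib": "0.2-1"}

added, removed, changed = summarize_changes(previous_build, current_build)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```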

Keeping track of packages in the long term

At this point we have the following results:

Keeping all packages forever wouldn’t be reasonable, so snapshots are expired after a few days. Since storing packages actually used for releases is the whole point of mastering repositories in the first place, an extra tool on the infrastructure side was developed to generate tagged snapshots from the time-based ones, thanks to references and packages listed in the build-manifest for the release.
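The interplay between expiring time-based snapshots and long-lived tagged ones can be pictured with a toy model. This is not reprepro's interface, only an illustration of the retention rule; the snapshot names, package versions and the 10-day limit are all assumptions made for the example.

```python
from datetime import date, timedelta


def expire_snapshots(snapshots, today, max_age_days):
    """Return the snapshots that survive expiry.

    snapshots maps a name to (created, packages); created is a date for
    time-based snapshots and None for tagged ones, which never expire here.
    """
    limit = today - timedelta(days=max_age_days)
    return {
        name: (created, packages)
        for name, (created, packages) in snapshots.items()
        if created is None or created >= limit
    }


def retained_packages(snapshots):
    """A package is kept while at least one snapshot still references it."""
    kept = set()
    for _created, packages in snapshots.values():
        kept |= packages
    return kept


# Invented snapshots: two time-based ones and one tagged for a release.
snapshots = {
    "2017110101": (date(2017, 11, 1), {("tor", "1.0-1")}),
    "2017111401": (date(2017, 11, 14), {("tor", "1.0-2")}),
    "3.3": (None, {("tor", "1.0-1")}),
}
remaining = expire_snapshots(snapshots, today=date(2017, 11, 15), max_age_days=10)
print(sorted(remaining))
print(sorted(retained_packages(remaining)))
```

In this model the oldest time-based snapshot expires, yet the package version it referenced survives because the tagged snapshot for the release still points at it.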

This leads to these results:

Putting all the pieces together

Fast forward: November 2017.

Large parts of this initial freezable APT repository sprint were spent designing what the new workflow would look like during development phases, and during freeze periods. Of course, adjustments were made during the following releases, and the current status is documented on the APT repository page. Details can be found there about the custom APT repository (for Tails), about the time-based snapshots, and about the tagged snapshots.

This was only preliminary work, as there are many reasons which can trigger differences in the resulting ISO image. Details can be found in the reproducible builds blueprint. Many issues have been tackled by the Tails developers since then, and that’s how the 3.3 release came to be announced as the first reproducible ISO image! (Of course, this is still rather new, and bug #14933 has been filed already, but the current results are amazing!)

Congratulations to the Tails developers for reaching this milestone, and many thanks for this cooperation opportunity!


Published: Fri, 08 Dec 2017 10:15:00 +0100

Debian Installer: Stretch released

Foreword

Since the previous post, several Debian Installer release candidates were published, and this post sums up everything that happened between the Debian Installer Stretch RC 2 release and the final Stretch release.

Stretch RC 3

Cyril published the Debian Installer Stretch RC 3 release on 2017-04-10, roughly two months after Stretch RC 2.

Improvements

A number of fixes piled up since then, including the following important changes:

Other changes can be found in the release announcement.

A few screenshots illustrating the Korean rendering issues follow, so that one can visualize the impact of a font issue:

Stretch D-I RC 2 (broken) | Stretch D-I RC 3 (fixed)
Broken Korean rendering in language selection screen | Fixed Korean rendering in language selection screen
Entirely broken Korean screen | Fixed Korean screen

Hardware support

The full list of hardware-related changes is reproduced below:

Stretch RC 4

With the amount of changes in Stretch RC 3, and the Stretch release date approaching (2017-06-17), a smaller number of changes was expected in the Stretch RC 4 release. Thankfully, that's what happened by the end of May (2017-05-27), with most changes being translation updates: the number of full translations saw a bump from 15 to 21.

Improvements

Hardware support

Stretch RC 5

This Debian Installer release happened on 2017-06-13, only a few days before the final Debian release, planned on 2017-06-17. There were still quite a number of changes to merge, and only those with the most visible impact are listed below.

Improvements

Hardware support

Stretch final

Usually, one would use the same debian-installer upload for the last release candidate and for the final release, but Cyril asked the linux maintainers to merge a last change before the release: it started to become clear in early June that the missing i2c-modules udeb on the armhf platform was the likely cause for several issues (#864536, #864457, #856111).

Performing uploads, builds, and unblocks of linux, debian-installer, and debian-installer-netboot-images before the final release wasn't entirely stress-free, but it seemed worth trying. Adding new binary packages in point releases is a very rare event, and going through the NEW queue via unstable looked like the right thing to do, even if the timing was very tight!

What's next?

The next Debian Installer report will likely feature a summary of installer-related changes merged into the 8.9 and 9.1 point releases (for Jessie and Stretch respectively).

Also, Cyril will be giving a talk titled “News from the Debian Installer” during DebConf17. This year, the annual Debian Conference takes place in Montreal, Canada (more info is available on the DebConf17 schedule). See you there?


Published: Sat, 05 Aug 2017 12:00:00 -0400

Debian Installer: Stretch RC 2 released

Foreword

Since the previous blog post, two Debian Installer release candidates were published, so both will be mentioned in this blog post.

Stretch RC 1

As mentioned in the Plans section of the Stretch Alpha 8 summary: with the full freeze coming up, it made sense to switch from the Alpha numbering to the Release Candidate one. That’s why Cyril published the Debian Installer Stretch RC 1 release on 2017-01-15.

Unfortunately, some blockers were found with merged-/usr setups, so the new debootstrap default was reverted. Even if some of these bugs were fixed in the meantime, it seemed unreasonable to enable the new code again near the end of the stretch release cycle, so it’s going to be postponed until after the buster release cycle has started.

Here is a list of other changes:

Stretch RC 2

Since the Linux kernel team was finally moving towards the target kernel version for Stretch (4.9, even if earlier discussions mentioned 4.10), it seemed like a good idea to get a new Debian Installer released as soon as possible, which explains why Debian Installer Stretch RC 2 was released on 2017-02-02, only a few weeks after Stretch RC 1.

Another significant change happened besides the Linux kernel update, with the os-prober component receiving major changes. Let’s have a look at its description:

    Package: os-prober
    Description: utility to detect other OSes on a set of drives
     This package detects other OSes available on a system and outputs the
     results in a generic machine-readable format.
  

This component is used to determine which other operating systems might be hanging around on various partitions and disks, and it’s used e.g. by update-grub to include menu entries for other Linux distributions, Windows, etc. Unfortunately, its historical operating system detection code has been triggering issues in some environments involving virtualization, which resulted in data loss in some cases.

The relevant code was heavily overhauled, and one might hit some regressions with new versions of this component (1.72 and later). Details about these significant changes can be found in the changelog entry for the 1.72 upload, and one might notice that the preparations for this new release candidate resulted in a last-minute regression fix in the 1.74 upload. Similar issues have been reported with the dmsetup create command hanging, leading to a frozen progress indicator when grub is being set up (see bug report #853927). This can be worked around by switching to a console and killing the dmsetup process (see this message for more details), until this issue is fully diagnosed and fixed.

Next release candidate

With the full freeze in effect, Debamax is trying to make sure Cyril can spend as much time as possible on two complementary tasks:

More to come in our next Debian Installer report!


Published: Mon, 13 Feb 2017 13:30:00 +0100

Debian Installer: Stretch Alpha 8 released

Release process

It took a few months after Stretch Alpha 7 (published 2016-07-04), but the Stretch Alpha 8 release of the Debian Installer happened a few days ago. Release preparations had to be delayed a bit because a fix was needed in the linux packaging (see bug report #839552) so that mounting FAT partitions worked again, since this is needed for EFI support.

As a release manager, Cyril has to make sure things look good enough for a release. This usually involves freezing udeb-producing packages for a while, so that the main set of packages used to build the Debian Installer doesn’t get any last minute changes that might bring some regressions while stabilization is in progress.

The debian-installer package got uploaded on 2016-10-27 but two major issues popped up:

The first issue was due to the rather recent linux/linux-signed split. The idea behind this move is preparing for Secure Boot support, with linux being used to build the Linux kernel and modules as usual, and linux-signed holding extra signatures for them, so that they can be verified cryptographically. This exposed an awful and old bug which hadn’t been detected until now. Then an extra commit got added as a workaround for the linux/linux-signed specific situation: code comes from the linux source package, so that’s what needs to be listed in Built-Using. Further improvements are planned (see bug report #842719), by checking for a possible Built-Using field in each udeb, so that this workaround can be replaced by some more generic code.

The second issue was due to the reintroduction of InRelease support. There are two ways of validating the contents of a given distribution on a Debian mirror: checking the Release file against its detached signature (Release.gpg), or checking the InRelease file alone, as it contains an inline signature. Since only gpgv is available in a Debian Installer environment, the idea was to split the InRelease file into two files: the Release file and its signature. The tricky part is that the final newline is dropped by GnuPG, so a little tr … | sed … | tr … dance was added to do the same. Unfortunately, while it works fine with usual implementations of those commands, that’s not the case with the busybox implementation used in Debian Installer, leading to a bad signature result during the installation process (see bug report #842591). Thankfully Ansgar Burchardt had a proof of concept ready with a simple state machine in POSIX shell, which Cyril could merge and upload to fix debootstrap-udeb, fixing this showstopper.
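The splitting step lends itself to a small state machine. The sketch below is written in Python for illustration and is not the POSIX shell code that was merged into debootstrap; it follows the OpenPGP cleartext signature layout (marker line, armor headers, empty line, dash-escaped text, armored signature), ignores details such as trailing whitespace handling, and uses a made-up sample with a placeholder instead of a real signature.

```python
def split_clearsigned(text):
    """Split a clearsigned document into (signed text, armored signature).

    The signed text is returned without a final newline, mirroring the
    newline that is not covered by the signature.
    """
    state = "start"
    body, signature = [], []
    for line in text.splitlines():
        if state == "start":
            if line == "-----BEGIN PGP SIGNED MESSAGE-----":
                state = "headers"
        elif state == "headers":
            if line == "":  # the empty line ends the armor headers
                state = "body"
        elif state == "body":
            if line == "-----BEGIN PGP SIGNATURE-----":
                signature.append(line)
                state = "signature"
            elif line.startswith("- "):
                body.append(line[2:])  # undo dash-escaping
            else:
                body.append(line)
        elif state == "signature":
            signature.append(line)
            if line == "-----END PGP SIGNATURE-----":
                state = "done"
    if state != "done":
        raise ValueError("not a complete clearsigned document")
    return "\n".join(body), "\n".join(signature) + "\n"


# Made-up sample; the base64 line is a placeholder, not a valid signature.
sample = "\n".join([
    "-----BEGIN PGP SIGNED MESSAGE-----",
    "Hash: SHA256",
    "",
    "Origin: Debian",
    "- -dash-escaped line",
    "-----BEGIN PGP SIGNATURE-----",
    "",
    "bm90IGEgcmVhbCBzaWduYXR1cmU=",
    "-----END PGP SIGNATURE-----",
    "",
])
release, release_gpg = split_clearsigned(sample)
print(repr(release))
```

Processing the input line by line, with explicit states, avoids relying on how a particular tr or sed implementation treats the last newline.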

Major update: debootstrap and merged-/usr

As mentioned above, debootstrap was updated, but not only for InRelease support. It received a number of fixes and improvements (see the release announcement for the details), but the biggest change deserves a longer explanation: debootstrap now defaults to merged-/usr.

Once upon a time, UNIX systems were booted from a floppy disk, and once the boot sequence had finished, one would mount extra resources onto the /usr directory: programs, libraries, home directories, etc. Nowadays, it makes little sense to keep the distinction between boot-time and non-boot-time tools, and it was proposed to get rid of this distinction entirely. One way to achieve this is as simple as setting up symlinks for a number of directories: bin, sbin, lib, and other libXX (one can find lib32, lib64, etc. depending on the architecture), respectively pointing at usr/bin, usr/sbin, usr/lib, etc. This approach means there’s no need to change a single package: it’s just about using a specific directories+symlinks setup at installation time.
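The resulting layout is easy to reproduce in a scratch directory. This is a minimal sketch of the directories+symlinks idea, not what debootstrap runs; it assumes a POSIX system, works in a temporary directory, and only handles bin, sbin and lib (the libXX variants depend on the architecture).

```python
import os
import tempfile

MERGED_DIRS = ("bin", "sbin", "lib")  # libXX variants left out on purpose


def make_merged_usr(root, dirs=MERGED_DIRS):
    """Create usr/<dir> under root, plus a relative <dir> -> usr/<dir> symlink."""
    for name in dirs:
        os.makedirs(os.path.join(root, "usr", name), exist_ok=True)
        os.symlink(os.path.join("usr", name), os.path.join(root, name))


root = tempfile.mkdtemp()
make_merged_usr(root)

# A file installed under usr/bin is visible through bin/ as well.
with open(os.path.join(root, "usr", "bin", "hello"), "w") as fh:
    fh.write("#!/bin/sh\necho hello\n")
print(os.readlink(os.path.join(root, "bin")))
print(os.path.exists(os.path.join(root, "bin", "hello")))
```

Using relative link targets keeps the tree valid whether it is inspected from the host or used as the root of the installed system.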

The options to enable or disable this feature are --merged-usr and --no-merged-usr respectively. The Debian script (shared across many versions) was updated to default to merged-/usr for stretch and later, which explains why this Debian Installer Stretch Alpha 8 release now defaults to a merged-/usr setup.

Credits: This change was driven by both Marco d’Itri and Ansgar Burchardt, while Julien Cristau worked on most other changes. Thanks!

Some final notes:

Next release: Stretch Alpha 9

A few things are planned for the next release:


Published: Tue, 22 Nov 2016 00:08:00 +0100

Hello, World!

Debamax SAS has been successfully registered with the Trade and Company Register in Rennes, and has officially started operating in October! The legal notices page has further information regarding this registration and identification numbers.

A Twitter account (@DEBAMAX) is going to be set up to complement this website and its RSS feed.


Published: Wed, 7 Oct 2015 12:00:00 +0200