Quick introduction to Tails and build reproducibility
Tails is a live operating system, booting from USB or from DVD, that aims to preserve its users’ privacy and anonymity. It is Free Software and based on the Debian GNU/Linux distribution. The Tails website contains a more complete overview of the project.
Over the last few years, an increasing number of software developers and
security-focused people have been looking into build
reproducibility: when a build system is deterministic, building a
given component from a given source should always lead to the
exact same binary result (byte-for-byte).
The Tor project and
distribution were among the first teams to work towards this goal.
More information can be found on
website, whose motto is “Provide a verifiable path from source
code to binary.”
What does it mean in the Tails context? The main “product” of the Tails project is a bootable ISO image containing a live system with tools designed and preconfigured to preserve privacy and anonymity. Compromising this image would defeat the whole point of the project and could even endanger the lives of journalists or whistleblowers relying on it. Making the image build process reproducible means developers and even users can rebuild the image on their own hardware and make sure the ISO image published by the Tails project matches the one which was built locally, or which was verified by others.
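As an illustration of that last check, here is a minimal sketch of how a locally built image could be compared with a published one. It assumes both files are available on disk; the function names are made up for this example and are not part of any Tails tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_image(local_iso: Path, published_iso: Path) -> bool:
    """Hypothetical helper: True when both images have the same digest."""
    return sha256_of(local_iso) == sha256_of(published_iso)
```

With a reproducible build, `same_image()` is expected to return `True` for a local rebuild and the published image of the same release; any difference points at either a reproducibility bug or a tampered image.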
How Debamax became involved
Flashback: October 2015.
Cyril had already been working on Debian derivatives for various
customers and had been identified by some Tails developers as
a potential asset to work on the first steps towards
reproducibility. A sprint approach was chosen to tackle the
freezable APT repository topic: meet, discuss, design,
code; repeat a few times.
What follows is an overview of the results, with a few pointers to code and documentation. They are presented sequentially but all those topics are closely intertwined, and that had to be taken into account during the design phase.
Keeping track of packages in archives
The first objective was to design a workflow which would make it possible to build a given ISO image with the exact same set of Debian packages. An interesting data point is that four separate archives are used during a Tails build:
- the regular Debian archive;
- the Debian Security archive;
- the Tor project archive;
- the Tails archive.
Of course, those archives aren’t static: the Debian archive is updated up to 4 times a day, the Debian Security archive is updated whenever a new security update is published, etc. So we needed a way to keep track of all packages used during the build but also of the state of each archive at any point where an image was being built.
It was decided to use reprepro, which is designed to produce custom Debian repositories, while also making it possible to mirror upstream repositories. It can also create snapshots, which exactly fits the need to keep packages around: packages which would normally be deleted or replaced by a new version (when a synchronization happens) are kept as long as there’s at least one snapshot that depends on them.
Pointers to the implementation and its results:
- commits in the puppet-tails.git repository, which is used to maintain the Tails infrastructure; in particular: the files/reprepro/snapshots/time_based and manifests/reprepro/snapshots directories;
- time-based.snapshots.deb.tails.boum.org is where the snapshots are published, to be used during the build process instead of the upstream archives.
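The retention rule described above (a package file is kept as long as at least one snapshot still depends on it) can be sketched as follows. This only illustrates the rule; it is not how reprepro is implemented, and the function and parameter names are assumptions made for this example.

```python
def removable_packages(
    pool: set[str],
    current: set[str],
    snapshots: dict[str, set[str]],
) -> set[str]:
    """Return the package files that nothing references anymore.

    pool      -- every package file currently stored on disk
    current   -- files referenced by the live (mirrored) distribution
    snapshots -- hypothetical mapping: snapshot name -> files it references
    """
    referenced = set(current)
    for content in snapshots.values():
        referenced |= content
    return pool - referenced
```

An old version of a package that has been replaced in the live distribution stays out of the returned set until the last snapshot mentioning it is gone.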
Keeping track of packages used during the build
While working for other customers, Cyril already had to keep track of packages used to build Debian images: the idea was to list all packages and versions used for a given build, making it possible to generate changelog-like summaries of changes between two builds.
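A changelog-like summary of that kind can be sketched as below, assuming each build is described by a plain mapping from package name to version. The function name and the output format are made up for this example; versions are only tested for equality, since ordering Debian versions properly requires dedicated comparison rules.

```python
def summarize_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List packages added, removed, or changed between two builds."""
    lines = []
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before is None:
            lines.append(f"added: {name} {after}")
        elif after is None:
            lines.append(f"removed: {name} {before}")
        elif before != after:
            lines.append(f"changed: {name} {before} -> {after}")
    return lines
```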
A similar approach was used here: (package, version, URI) triplets are gathered. Here’s what the implementation looks like:
- an apt-get wrapper was developed to track all downloaded packages (binary and source);
- a patch was added to the debootstrap script so that it would install that wrapper automatically;
- debootstrap itself was patched to store the triplets during the bootstrap phase as well.
That means three files are generated with those triplets: one with
binary packages from the bootstrap phase, one with binary packages
downloaded through apt-get, and one with source
packages downloaded through apt-get as well.
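To give an idea of what such a triplet looks like, here is a small sketch that derives one from the URI of a binary package. It assumes the conventional `name_version_arch.deb` file naming and is not the actual wrapper; note that the version found in a file name may differ from the full package version (for example, it may lack an epoch), so a real implementation would rather get it from APT itself. `Triplet` and `triplet_from_uri` are names invented for this example.

```python
import posixpath
from typing import NamedTuple
from urllib.parse import unquote, urlsplit


class Triplet(NamedTuple):
    package: str
    version: str
    uri: str


def triplet_from_uri(uri: str) -> Triplet:
    """Build a (package, version, URI) triplet from a .deb download URI."""
    filename = unquote(posixpath.basename(urlsplit(uri).path))
    stem, _, extension = filename.rpartition(".")
    if extension != "deb":
        raise ValueError(f"not a binary package: {filename!r}")
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected file name: {filename!r}")
    package, version, _architecture = parts
    return Triplet(package, version, uri)
```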
Those results are then aggregated into what we call a
build-manifest; it gathers the archives (those mentioned in the
previous section), references (the snapshot used during the build), and
all packages along with their versions.
Example for the 3.2
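The aggregation step can be sketched as follows. Everything about the formats here is an assumption made for illustration: the triplet files are taken to contain one whitespace-separated `package version uri` triplet per line, and the resulting dictionary layout (keys such as `origin_references` and `packages`) is hypothetical rather than the actual build-manifest format.

```python
def parse_triplets(text: str) -> set[tuple[str, str, str]]:
    """Parse lines of 'package version uri' into a set of triplets."""
    triplets = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        package, version, uri = line.split()
        triplets.add((package, version, uri))
    return triplets


def build_manifest(
    bootstrap: str,
    apt_binary: str,
    apt_source: str,
    references: dict[str, str],
) -> dict:
    """Merge the three triplet files and the snapshot references."""
    binary = parse_triplets(bootstrap) | parse_triplets(apt_binary)
    source = parse_triplets(apt_source)

    def entries(triplets: set[tuple[str, str, str]]) -> list[dict[str, str]]:
        # Drop the URI and de-duplicate on (package, version).
        pairs = sorted({(package, version) for package, version, _ in triplets})
        return [{"package": p, "version": v} for p, v in pairs]

    return {
        "origin_references": dict(sorted(references.items())),
        "packages": {"binary": entries(binary), "source": entries(source)},
    }
```

Keeping the snapshot references next to the package list is what makes a later rebuild possible: the references say where to fetch from, the list says what is expected to be fetched.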
Keeping track of packages in the long term
At this point we have the following results:
- time-based snapshots of entire archives: for amd64 and i386 at first, and for amd64 only starting with Tails 3.0;
- build-manifest files containing references to those snapshots and lists of packages.
Keeping all packages forever wouldn’t be reasonable, so snapshots
are expired after a few days. Since storing packages actually used
for releases is the whole point of mastering repositories in the
first place, a tool on the infrastructure side was developed to generate
tagged snapshots from the time-based ones, thanks to the
references and packages listed in the corresponding build-manifest.
This leads to these results:
- tagged.snapshots.deb.tails.boum.org is where the tagged snapshots are published. Those are used when building or rebuilding a stable release.
- These snapshots only include packages needed to rebuild releases, and of course their sources for license compliance reasons. Since they’re rather small compared to the full time-based snapshots, those can be kept forever.
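The selection performed when deriving a tagged snapshot from a time-based one can be sketched as below. The data structures are assumptions made for this example (a time-based snapshot is modelled as a mapping from `(package, version)` to a file path); the actual infrastructure tool works on reprepro repositories.

```python
def tagged_snapshot(
    time_based: dict[tuple[str, str], str],
    manifest_packages: list[tuple[str, str]],
) -> dict[tuple[str, str], str]:
    """Keep only the packages listed in a build-manifest.

    Raises LookupError if a listed package is missing, which would mean
    the time-based snapshot has already expired or is the wrong one.
    """
    missing = [key for key in manifest_packages if key not in time_based]
    if missing:
        raise LookupError(f"not in time-based snapshot: {missing}")
    return {key: time_based[key] for key in manifest_packages}
```

Because the result only holds what a release actually used, it stays small compared to the time-based snapshot it was derived from.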
Putting all the pieces together
Fast forward: November 2017.
Large parts of this initial
freezable APT repository
sprint were spent designing what the new workflow would look like
during development phases, and during freeze periods. Of course,
adjustments were made during the following releases, and the current
status is documented on the
repository page. Details can be found there about the custom APT
repository (for Tails), about the time-based snapshots, and about
the tagged snapshots.
This was only preliminary work, as many other factors can cause differences in the resulting ISO image. Details can be found in the reproducible builds blueprint. Many issues have been tackled by the Tails developers since then, and that’s how the 3.3 release came to be announced as the first reproducible ISO image! (Of course, this is still rather new, and bug #14933 has been filed already, but the current results are amazing!)
Congratulations to the Tails developers for reaching this milestone, and many thanks for this cooperation opportunity!