Undoing merged /usr on Debian

As of Debian Buster, debootstrap (the tool for bootstrapping new Debian installations) has been updated to perform a merged-/usr installation by default. For those who aren’t familiar with the idea, a “merged-/usr” is a system configuration where, instead of splitting system binaries four ways between /bin, /sbin, /usr/bin, and /usr/sbin (which has historically been the convention in Linux distributions), /bin is symlinked to /usr/bin and /sbin is symlinked to /usr/sbin. (Similarly, /lib is symlinked to /usr/lib to pull all shared libraries into a single directory.)

As far as I know, this idea originates (at least within the Linux ecosystem) from freedesktop.org and the systemd people (Debian’s wiki page on the matter links to the systemd wiki page on the same). It’s another idea to come out of FD.o and cause a bit of a political stir, but the point here is that Debian has adopted this new configuration as the default for new installs starting with Buster. I then discovered last week that my laptop is using the merged-/usr configuration, much to my surprise.

This was surprising because I’ve been running Debian on this laptop for nearly three years, and I hadn’t knowingly taken any action to switch to a merged configuration. I originally installed Stretch in the middle of 2017 and then did a dist-upgrade to Buster when that came along last year, and the Stretch-to-Buster upgrade doesn’t merge /usr at all.

However, back in December, I had a rather spectacular tab completion failure in a root shell, which had the side effect of permanently bricking dpkg on the machine and hence necessitating a reinstall. As I was due to leave for Germany for a two-week holiday the following day, and really didn’t want to come home to a broken laptop, I stayed up until around 3am reinstalling Debian – except this time it was a fresh Buster install. Hello to a new merged-/usr machine.

Merged-/usr is currently optional on Debian, which means that essential packages such as coreutils or libc still install binaries into /bin and /sbin, and on merged systems this gets indirected via symlinks into /usr. The package files themselves still contain paths under /bin, /sbin, and /lib, which means that dpkg is not aware that a system is running a merged configuration. This is one of the reasons that I didn’t realise my laptop was configured this way for six months, because whenever I asked dpkg about packages and the filesystem paths they owned, it would tell me the split-/usr paths contained in the package database metadata, and not the actual on-disk locations.
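To make the mismatch between dpkg's view and the disk concrete, here is a minimal sketch that resolves where a dpkg-recorded path really lives. The function name on_disk_location is made up for illustration, and it only resolves symlinked parent directories, not the final path component.

```shell
# Print where a path recorded by dpkg really lives, resolving symlinked
# parent directories (such as /bin -> usr/bin) but not the final component.
# on_disk_location is a hypothetical helper name, not a dpkg tool.
on_disk_location() {
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    # pwd -P prints the physical directory with all symlinks resolved
    real=$(cd -- "$dir" && pwd -P) || return 1
    printf '%s/%s\n' "${real%/}" "$base"
}
```

On a merged system, on_disk_location /bin/ls should print /usr/bin/ls, while dpkg -S /bin/ls still reports the /bin path from the package metadata.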

This is, however, still useful, as it means it’s possible to obtain a list of files which would otherwise live under /bin, /sbin, and /lib etc., if the system were using a split configuration. In theory, this means that one can manually re-split a merged-/usr.

But why?

Why not? Split-/usr worked for me right up until I accidentally got rid of it, and it’s still a supported configuration. Maybe I’m just nostalgic for “traditional Unix” (for whatever value that phrase has these days), but I put a lot of effort into restoring my laptop’s configuration as closely to its pre-reinstall state as I could. In any case – it’s free software; the whole point is that I can use my computer in the way I want to.

Measure twice, cut once

The general procedure is to identify which paths under the filesystem root have been merged into /usr, and then determine which packages own files under those paths and which files need to be moved. There were six symlinks of interest in my laptop’s root directory: /bin, /sbin, /libx32, /lib64, /lib32, and /lib. For each of these, I then asked dpkg which packages own files under those paths using dpkg -S.
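The first step, finding the merged paths, could be sketched like this. The function name list_merged_links is an assumption for illustration; it simply reports top-level symlinks whose target is inside usr/, and accepts an alternative root so it can be tried on a scratch tree first.

```shell
# List top-level entries under a root directory that are symlinks into usr/.
# The root defaults to /; pass another directory to test on a scratch tree.
# list_merged_links is a hypothetical helper name.
list_merged_links() {
    root=${1:-/}
    prefix=${root%/}
    for entry in "$prefix"/*; do
        [ -L "$entry" ] || continue
        target=$(readlink "$entry")
        case "$target" in
            usr/*|/usr/*) printf '%s -> %s\n' "${entry#"$prefix"}" "$target" ;;
        esac
    done
}
```

Running list_merged_links with no arguments on a merged system should print one line per merged directory, such as /bin -> usr/bin.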

Making this kind of change carries a high risk of badly bricking a system. So being careful is the first and most important order of the day here. This means having some way to recover in case things go awry, and making sure that nothing potentially disruptive is running while the change is happening.

In my case, my laptop’s root filesystem is ZFS, so I took a snapshot of the datasets containing / and /usr before I started. I also put the system into what passes for single-user mode on this machine, so there weren’t any user programs running around which might fall over if, for example, the system’s dynamic loader goes missing.

One other reasonably important thing to mention here is that my approach was to copy files which were in the wrong directory, and not move them. This meant (especially later on, when dealing with /lib) I was able to try a couple of different strategies for copying files, using a temporary directory as a target, so that I could check whether the commands I was writing were doing what I expected them to or not.

Measuring

Straight out of the gate, I determined that /libx32 was completely superfluous: no packages had files under that path, because I didn’t have any packages using the x32 ABI installed. The /usr/libx32 directory which the symlink pointed to was also empty and unused, so rm /libx32; rmdir /usr/libx32 cleaned that up.

The remaining paths were all owned by at least one package, so I went through them in ascending order of size and complexity.
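When several packages own a path, dpkg -S prints them as a comma-separated list followed by a colon and the path. A minimal sketch for turning that output into one package name per line follows; the name owning_packages is made up here, and diversion lines that dpkg -S can print are not handled.

```shell
# Read `dpkg -S /some/path` output on stdin and print one owning package
# per line. Input lines look like "bash, coreutils: /bin".
# owning_packages is a hypothetical helper; diversion lines are not handled.
owning_packages() {
    sed -e 's/: [^:]*$//' |     # drop the trailing ": /path"
        tr ',' '\n' |           # one package per line
        sed -e 's/^ *//' -e '/^$/d' |
        sort -u
}
```

Something like dpkg -S /bin | owning_packages > binpkgs would then give a package list that can be fed to dpkg -L one name at a time.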

First of all was /lib64. The only thing in this directory according to dpkg was /lib64/ld-linux-x86-64.so.2, which was in reality located under /usr/lib64. This is the system’s dynamic loader, which is invoked by the kernel whenever a dynamically linked binary is executed; the loader is responsible for resolving and mapping all of the shared libraries required by that binary to run. (In reality, this file is a symbolic link to /lib/x86_64-linux-gnu/ld-2.28.so, which is the versioned glibc dynamic loader on the system, but the path to the interpreter under /lib64 is the one which is set by the linker during compilation.)

The consequences of moving the dynamic loader (so that it’s not present at the expected path) are that all dynamically linked programs using that loader (i.e. almost all of the binaries on the system) will fail to start. As it happens, moving the existing symlink out of the way and moving a directory in to replace it isn’t an atomic action, so I used the statically linked busybox package in Debian (which I had already installed beforehand) for moving things around, something like this:

# mkdir /lib64.new
# cp -P /lib64/* /lib64.new   # -P to make sure symlinks are copied properly
# busybox mv /lib64 /lib64.old
# busybox mv /lib64.new /lib64
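The same swap can be wrapped in a small function with a few guard rails. This is only a sketch: swap_into_place and the MV variable are names invented for illustration, and it assumes the replacement directory has already been populated as PATH.new. The mover defaults to busybox mv so that it keeps working while the dynamic loader is briefly missing.

```shell
# Replace the symlink at $1 with the prepared directory $1.new, keeping the
# old symlink as $1.old. MV selects the mover; a statically linked one is
# what matters for /lib64 and /lib. Both names are hypothetical.
swap_into_place() {
    path=$1
    mover=${MV:-busybox mv}
    if [ ! -L "$path" ] || [ ! -d "$path.new" ] ||
       [ -e "$path.old" ] || [ -L "$path.old" ]; then
        echo "refusing to swap $path" >&2
        return 1
    fi
    # Two renames: there is still a short window where $path does not exist.
    $mover "$path" "$path.old" && $mover "$path.new" "$path"
}
```

The guard refuses to run if the path is not a symlink or if a .old entry already exists, since mv would otherwise move the symlink into that directory rather than rename it.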

The next item on the list was /lib32, which contains a number of 32-bit libraries. I’ll admit that I’m not totally sure why these are installed, but some clang-related development package in Debian seems to require them. My laptop is otherwise a 64-bit system, so these were easy enough to move without disruption. For each package which owns paths under /lib32, I listed them with dpkg -L $pkgname | grep ^/lib32, and then copied these files into a new directory, which was moved into place after moving the existing symlink out of the way (similar to above).

/bin and /sbin were a little bit easier to handle – as I was only copying files (and not moving them, etc), I had the original copies of all the files in /usr/bin and /usr/sbin in my PATH to fall back on if anything untoward happened. I used a bit of grep and sed magic to process and generate a list of paths, and rsync to perform the actual copying. The following is a tidied-up extract out of root’s .bash_history on my laptop from when I was doing this for /bin:

# (for i in $(tr -d , < binpkgs); do dpkg -L $i; done) | sort | uniq > binfiles
# grep ^/bin binfiles > binbinaries
# less binbinaries 
# vim binbinaries   # remove /bin from the top of the list of files
# less binbinaries 
# mkdir /bin.new
# rsync -avP $(sed -e 's,^,/usr,' binbinaries) /bin.new
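Before moving the new directory into place, it seems worth confirming that everything in the list actually arrived. The following is a minimal sketch of such a check, not part of the original shell history; missing_after_copy is a hypothetical name, and it assumes the files were copied flat into the new directory, as the rsync invocation above does.

```shell
# Read paths such as /bin/ls on stdin and print those that have no
# counterpart (by file name) in the directory given as $1.
# missing_after_copy is a hypothetical helper name.
missing_after_copy() {
    newdir=$1
    while IFS= read -r p; do
        name=${p##*/}
        # -L catches dangling symlinks, which -e reports as missing
        [ -e "$newdir/$name" ] || [ -L "$newdir/$name" ] || printf '%s\n' "$p"
    done
    return 0
}
```

An invocation like missing_after_copy /bin.new < binbinaries printing nothing would suggest the copy is complete.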

/lib was quite a bit more complicated, as there are quite a lot of files which need to live under /lib, and more still which need to live under /usr/lib, and they were all living together under /usr/lib. Unfortunately by this point I was focusing too intensely to take any useful notes for later reference, but piecing things together from the shell history and files left under /root afterwards, I created a list of all the files which should live under /lib expressed as relative pathnames (e.g. /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 is listed as x86_64-linux-gnu/ld-linux-x86-64.so.2), and then ran the following couple of commands:

# mkdir /lib.new
# (cd /usr/lib; rsync -RPHlptgoD $(cat /root/rejig/newlibs) /lib.new)

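First we change into /usr/lib, and then rsync a number of files into /lib.new. The rsync flags were very carefully chosen here. Going by the rsync manual: -R (--relative) keeps the relative path of each source argument, so x86_64-linux-gnu/ld-linux-x86-64.so.2 lands at /lib.new/x86_64-linux-gnu/ld-linux-x86-64.so.2; -H preserves hard links; -P shows progress and keeps partially transferred files; and -lptgoD is what -a expands to with the -r removed, so symlinks, permissions, timestamps, group, owner, and device and special files are all preserved, but directories named in the list are not recursed into.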
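The relative list fed to rsync can be derived mechanically from the absolute paths that dpkg reports. A minimal sketch, assuming the input is a list of absolute paths on stdin; relative_to is a name invented for illustration, and the entry for the prefix directory itself is dropped.

```shell
# Read absolute paths on stdin and print those strictly below the prefix $1,
# with the prefix removed. Paths elsewhere (and the prefix itself) are dropped.
# relative_to is a hypothetical helper name.
relative_to() {
    awk -v p="$1/" 'index($0, p) == 1 { print substr($0, length(p) + 1) }'
}
```

For example, piping a combined dpkg -L listing through relative_to /lib would produce entries such as x86_64-linux-gnu/ld-linux-x86-64.so.2, while anything recorded under /usr/lib is left out.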

As it happens, there were some files under /lib which weren’t managed by the package manager – kernel modules for third-party kernels (for slightly hysterical raisins, my laptop is running a custom-built 5.4-series kernel instead of the Debian stock 4.19). These were easy enough to rsync across manually, but it’s nonetheless important that they weren’t forgotten.
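One way to spot such unmanaged files is to compare a listing of what is on disk against a listing of what dpkg knows about. The sketch below only does the comparison; how the two lists are produced is left open, and unowned_paths is a hypothetical name.

```shell
# Print paths that appear in file $1 (what is on disk) but not in file $2
# (what dpkg knows about). Both files hold one path per line.
# unowned_paths is a hypothetical helper name.
unowned_paths() {
    a=$(mktemp) || return 1
    b=$(mktemp) || return 1
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    comm -23 "$a" "$b"      # lines unique to the first list
    rm -f "$a" "$b"
}
```

Both lists need to use the same spelling of each path (for instance, everything expressed as /lib/... rather than /usr/lib/...) for the comparison to be meaningful.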

At this point I then moved the new /bin, /sbin and /lib directories into place, similar to what I initially did with /lib64. With /lib in particular, the static busybox trick is necessary again, because moving the existing symlink for /lib breaks the symlink in /lib64 to the system dynamic loader.

I then made another ZFS snapshot as a checkpoint and rebooted my laptop, as a sanity check. Everything came back okay, so it was then time to perform the destructive part of the operation.

Cutting

Now that all the files were copied into the right place, I could then delete the old copies which were left in /usr. This is a tricky part, because the idea is to delete only the files which shouldn’t be in /usr without accidentally deleting any of the others.

In the case of /usr/lib64, the only item in this directory prior to the unmerge was the dynamic loader symlink, and the directory wasn’t in use by any other package, so it was easy to remove it.

/usr/lib32 was also reasonably straightforward, as the list of files which should and shouldn’t be in /usr/lib32 was small enough that it was feasible to process it manually.

For /usr/sbin and /usr/bin, it was a matter of taking the list of binaries which had been copied, prepending “/usr” to each name, and then carefully removing them.

# sed -e 's,^,/usr,' binbinaries > binrm
# less binrm          # check that the contents of the file are as expected.
# rm -v $(cat binrm)  # explicitly verbose
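An extra safety net before the rm would be to only list originals whose unmerged copy is verifiably in place. This is a sketch rather than what was actually run; removable_originals is a made-up name, the optional argument is an alternative root for testing, and the -ef test is a common shell extension rather than strict POSIX.

```shell
# For each path on stdin (e.g. /bin/ls), print ROOT/usr/PATH only when the
# unmerged copy at ROOT/PATH exists, is a different file on disk, and has
# identical contents. ROOT ($1) defaults to empty, meaning the real root.
# removable_originals is a hypothetical helper name.
removable_originals() {
    root=${1:-}
    while IFS= read -r p; do
        new="$root$p"
        old="$root/usr$p"
        [ -f "$new" ] && [ -f "$old" ] || continue
        # Same inode means the directory is still a symlink: do not delete.
        [ "$new" -ef "$old" ] && continue
        cmp -s -- "$new" "$old" && printf '%s\n' "$old"
    done
    return 0
}
```

With this, removable_originals < binbinaries > binrm would produce a removal list that leaves out anything which was not copied intact.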

As before, /usr/lib is the one which requires some delicacy, as there are a lot of things which live under /lib and a lot of things which live under /usr/lib. It took me several attempts to come up with a way of cleanly removing files without any unintended side-effects. The idea I had was to generate separate lists of files for separate types of filesystem entities (i.e. one for regular files, one for directories, one for symlinks etc.), which would mean that stray files and links could easily be removed automatically, and then directories (which are more complex due to /lib and /usr/lib having common subdirectories) could be handled manually later.

I came up with the following one-liners to generate these lists using find; find is purposely prevented from recursing here with the -mindepth and -maxdepth parameters, and is instead used more as a means of filtering files by type.

# find $(cat libfiles) -mindepth 0 -maxdepth 0 -type f | sed -e 's,^,/usr,' > things-to-del
# find $(cat libfiles) -mindepth 0 -maxdepth 0 -type l | sed -e 's,^,/usr,' > things-to-del2
# find $(cat libfiles) -mindepth 0 -maxdepth 0 -type d | sed -e 's,^,/usr,' > things-to-del3

It’s then easy enough to remove the regular files and the symlinks, after performing some manual sanity checks:

# rm -v $(cat things-to-del)
# rm -v $(cat things-to-del2)

In the case of the directories left over, I processed those all manually. /usr/lib/modules and /usr/lib/firmware are both superfluous and can be removed completely, but other directories need quite a bit more care, with some consultation of dpkg -S to check whether the paths under /usr are actually required by anything.
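To narrow down the manual work, the directory list can be filtered to those which are already empty and could go with a plain rmdir. A minimal sketch, with empty_dirs as a hypothetical name; anything it does not print still needs checking by hand.

```shell
# Read directory paths on stdin and print those that are real directories
# (not symlinks) with nothing left inside them.
# empty_dirs is a hypothetical helper name.
empty_dirs() {
    while IFS= read -r d; do
        [ -d "$d" ] && [ ! -L "$d" ] || continue
        [ -z "$(ls -A -- "$d")" ] && printf '%s\n' "$d"
    done
    return 0
}
```

Feeding it the directory list, as in empty_dirs < things-to-del3, shows which entries are trivially safe; the remaining ones are the directories shared between /lib and /usr/lib.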

Final remarks

It turns out that there are quite a lot of moving parts on Debian systems which do things without the package manager really being aware of the fact.

On some level this makes me a bit uncomfortable, as I generally hold that the package manager should be the authority on what the state of a system should be. This is why, for example, I spin my own Debian packages for almost all of the software I run on the routers for AS207480 instead of copying random binaries around the place, as this lets maintenance and updates fit into my existing package update tooling, and also means that the presence or absence of certain packages is indicative of a machine’s role. Taken to the logical extreme, these ideas lead to things like NixOS, where the package manager takes the desired state of a machine as input, and then configures the machine appropriately to match.

On the other hand, Debian is a very old project with a long history, so there’s some amount of technical and social debt to be expected. Switching to merged-/usr while continuing to support split-/usr systems necessitates some sleight of hand in order to pull the wool over dpkg’s eyes. In the grand scheme of things, for most people it doesn’t make any difference whether the system has a split or merged /usr; but the system is constructed loosely enough that it’s possible to poke and tweak at it until it feels just right.