Undoing merged /usr on Debian
As of Debian Buster, debootstrap (the tool for bootstrapping new Debian
installations) has been updated to perform a merged-/usr installation by
default. For those who aren’t familiar with the idea, a “merged-/usr” is
a system configuration where, instead of splitting system binaries four ways
between /bin, /sbin, /usr/bin, and /usr/sbin (which has historically
been the convention in Linux distributions), /bin is symlinked to /usr/bin
and /sbin is symlinked to /usr/sbin. (Similarly, /lib is symlinked to
/usr/lib to pull all shared libraries into a single directory.)
As far as I know, this idea originates (at least within the Linux ecosystem)
from freedesktop.org and the systemd
people (Debian’s wiki page
on the matter links to the systemd wiki
page
on the same). It’s another idea to come out of FD.o and cause a bit of
a political stir, but the point here is that Debian has adopted this new
configuration as the default for new installs starting with Buster. I then
discovered last week that my laptop is using the merged-/usr
configuration,
much to my surprise.
This was surprising because I’ve been running Debian on this laptop for
approaching three years, and I hadn’t knowingly taken any action to switch to
a merged configuration. I originally installed Stretch in the middle of 2017,
and then did a dist-upgrade
to Buster when that came along last year, and the
Stretch-to-Buster upgrade doesn’t do anything involving merging /usr
at all.
However, back in December, I had a rather spectacular tab completion failure in
a root shell, which had the side-effect of permanently bricking dpkg
on the
machine, and hence necessitating a reinstall. As I was due to leave for Germany
for a two-week holiday the following day, and really didn’t want to come home
to a broken laptop, I stayed up until around 3am reinstalling Debian – except
this time it was a fresh Buster install. Hello to a new merged-/usr machine.
Merged-/usr
is currently optional on Debian, which means that essential
packages such as coreutils
or libc
still install binaries into /bin
and /sbin
, and on merged systems this gets indirected via symlinks into
/usr
. The package files themselves still contain paths under /bin
,
/sbin
, and /lib
, which means that dpkg
is not aware that a system
is running a merged configuration. This is one of the reasons that I didn’t
realise my laptop was configured this way for six months, because whenever I
asked dpkg
about packages and the filesystem paths they owned, it would tell
me the split-/usr
paths contained in the package database metadata, and not
the actual on-disk locations.
This is, however, still useful, as it means it’s possible to obtain a list of
files which would otherwise live under /bin
, /sbin
, and /lib
etc., if
the system were using a split configuration. In theory, this means that one can
manually re-split a merged-/usr
.
But why?
Why not? Split-/usr
worked for me right up until I accidentally got rid of
it, and it’s still a supported configuration. Maybe I’m just nostalgic for
“traditional Unix” (for whatever value that phrase has these days), but I put
a lot of effort into restoring my laptop’s configuration as closely to its
pre-reinstall state as I could. In any case – it’s free software; the whole
point is that I can use my computer in the way I want to.
Measure twice, cut once
The general procedure is to identify which paths under the filesystem root have
been merged into /usr, and then determine which packages own files under those
paths and which files need to be moved. There were six symlinks of interest in
my laptop’s root directory: /bin, /sbin, /libx32, /lib64, /lib32, and /lib.
For each of these, I then asked dpkg which packages own files under those
paths using dpkg -S.
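As a rough illustration of the first step, the symlinks of interest can be found mechanically rather than by eye. The sketch below is not taken from my shell history: list_merged_links is a hypothetical helper name, and it assumes that a merged path is simply a top-level symlink whose target points into usr/.

```shell
# Hypothetical helper (not from the original session): list the top-level
# symlinks under the root given as $1 (default: /) whose target points into
# usr/, printed as "link -> target".
list_merged_links() {
    root="${1:-/}"
    root="${root%/}"
    for entry in "$root"/*; do
        # Only symlinks are interesting; real directories are already split.
        [ -L "$entry" ] || continue
        target=$(readlink "$entry")
        case "$target" in
            usr/*|/usr/*) printf '%s -> %s\n' "/${entry##*/}" "$target" ;;
        esac
    done
}
```

Running list_merged_links with no arguments inspects the live root; each path it prints is a candidate to hand to dpkg -S.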
Making this kind of change carries a high risk of badly bricking a system. So being careful is the first and most important order of the day here. This means having some way to recover in case things go awry, and making sure that nothing potentially disruptive is running while the change is happening.
In my case, my laptop’s root filesystem is ZFS, so I took a snapshot of the
datasets containing /
and /usr
before I started. I also put the system into
what passes for single-user mode on this machine, so there weren’t any user
programs running around which might fall over if, for example, the system’s
dynamic loader goes missing.
One other reasonably important thing to mention here is that my approach was to
copy files which were in the wrong directory, and not move them. This meant
(especially later on, when dealing with /lib
) I was able to try a couple
of different strategies for copying files, using a temporary directory as a
target, so that I could check whether the commands I was writing were doing
what I expected them to or not.
Measuring
Straight out of the gate, I determined that /libx32 was completely
superfluous: no packages had files under that path, because I didn’t have
any packages using the x32 ABI installed. The /usr/libx32 directory
which the symlink pointed to was also empty and unused, so
rm /libx32; rmdir /usr/libx32 cleaned that up.
The remaining paths were all owned by at least one package, so I went through them in ascending order of size and complexity.
First of all was /lib64
. The only thing in this directory according to
dpkg
was /lib64/ld-linux-x86-64.so.2
, which was in reality located under
/usr/lib64
. This is the system’s dynamic loader, which is invoked by
the kernel whenever a dynamically linked binary is executed; the loader
is responsible for resolving and mapping all of the shared libraries
required by that binary to run. (In reality, this file is a symbolic link to
/lib/x86_64-linux-gnu/ld-2.28.so
, which is the versioned glibc dynamic loader
on the system, but the path to the interpreter under /lib64
is the one which
is set by the linker during compilation.)
The consequences of moving the dynamic loader (so that it’s not present at the expected path) are that all dynamically linked programs using that loader (i.e. almost all of the binaries on the system) will fail to start. As it happens, moving the existing symlink out of the way and moving a directory in to replace it isn’t an atomic action, so I used the statically linked busybox package in Debian (which I had already installed beforehand) for moving things around, something like this:
# mkdir /lib64.new
# cp -P /lib64/* /lib64.new # -P to make sure symlinks are copied properly
# busybox mv /lib64 /lib64.old
# busybox mv /lib64.new /lib64
The next item on the list was /lib32, which contains a number of 32-bit
libraries. I’ll admit that I’m not totally sure why these are installed,
but some clang-related development package in Debian seems to require them.
My laptop is otherwise a 64-bit system, so these were easy enough to
move without disruption. For each package which owns paths under /lib32,
I listed these with dpkg -L $pkgname | grep ^/lib32, and then copied these
files into a new directory which was moved into place after moving the existing
symlink (similar to above).
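I didn’t keep the exact commands for this step, so the following is only a sketch of the idea, under the assumption that the input is a dpkg -L style list of absolute paths. copy_listed is a hypothetical helper, and its three arguments (source root, destination directory, path prefix) are my own invention for illustration.

```shell
# Hypothetical helper: copy_listed SRC_ROOT DEST PREFIX
# Reads absolute paths (for example from `dpkg -L pkg`) on stdin, keeps those
# strictly below PREFIX, and copies the files and symlinks found at
# SRC_ROOT/usr/<path> into DEST, keeping the layout below PREFIX.
copy_listed() {
    src_root="${1%/}"
    dest="$2"
    prefix="${3%/}"
    while IFS= read -r entry; do
        case "$entry" in
            "$prefix"/*) ;;      # below the prefix: handle it
            *) continue ;;       # the prefix itself, or unrelated paths
        esac
        src="$src_root/usr$entry"
        rel="${entry#"$prefix"/}"
        # Real directories are created on demand below, never copied whole.
        if [ -d "$src" ] && [ ! -L "$src" ]; then
            continue
        fi
        # Skip list entries that do not exist on disk.
        [ -e "$src" ] || [ -L "$src" ] || continue
        mkdir -p "$dest/$(dirname "$rel")"
        # -P copies symlinks as symlinks; -p keeps mode and timestamps.
        cp -P -p "$src" "$dest/$rel"
    done
}
```

Something like dpkg -L $pkgname | copy_listed "" /lib32.new /lib32 would then populate the staging directory without touching the originals under /usr/lib32.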
/bin and /sbin were a little bit easier to handle – as I was only copying
files (and not moving them), I had the original copies of all the files in
/usr/bin and /usr/sbin in my PATH to fall back on if anything untoward
happened. I used a bit of grep and sed magic to process and generate a list
of paths, and rsync to perform the actual copies. The following is a tidied-up
extract out of root’s .bash_history on my laptop from when I was performing
the copy for /bin:
# (for i in $(tr -d , < binpkgs); do dpkg -L $i; done) | sort | uniq > binfiles
# grep ^/bin binfiles > binbinaries
# less binbinaries
# vim binbinaries # remove /bin from the top of the list of files
# less binbinaries
# mkdir /bin.new
# rsync -avP $(sed -e 's,^,/usr,' binbinaries) /bin.new
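The manual vim step above exists because the pattern ^/bin also matches the bare /bin directory entry in the dpkg -L output. A filter that only keeps paths strictly below the prefix would make that edit unnecessary. This is a sketch rather than what I actually ran, and paths_under is a hypothetical name.

```shell
# Hypothetical helper: paths_under PREFIX
# Filters stdin down to the paths strictly below PREFIX (so "/bin/ls" is
# kept, but "/bin" itself and "/binfoo/x" are not) and de-duplicates them.
paths_under() {
    prefix="${1%/}"
    # index() == 1 is a literal prefix match, so no regex escaping is needed.
    awk -v p="$prefix/" 'index($0, p) == 1' | sort -u
}
```

For example, paths_under /bin < binfiles > binbinaries would produce the same list as the grep, minus the directory entry itself.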
/lib
was quite a bit more complicated, as there are quite a lot
of files which need to live under /lib
, and more still which need
to live under /usr/lib
, and they were all living together under
/usr/lib
. Unfortunately by this point I was focusing too intensely to
take any useful notes for later reference, but piecing things together
from the shell history and files left under /root
afterwards, I created a
list of all the files which should live under /lib
expressed as relative
pathnames (e.g. /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
is listed as
x86_64-linux-gnu/ld-linux-x86-64.so.2
), and then ran the following couple of
commands:
# mkdir /lib.new
# (cd /usr/lib; rsync -RPHlptgoD $(cat /root/rejig/newlibs) /lib.new)
First we change into /usr/lib, and then rsync a number of files into
/lib.new. The rsync flags were very carefully chosen here:
-R instructs rsync to use full relative paths. For example, the command line rsync one/foo two/bar dest will result in the dest directory containing the files foo and bar, while rsync -R one/foo two/bar dest will result in the dest directory containing two subdirectories one and two, which contain the files foo and bar, respectively.
-P instructs rsync to display per-file transfer progress, and to keep partially transferred files. This option is only here due to muscle memory, as I regularly invoke rsync with this flag otherwise.
-lptgoD are usually supplied via the -a flag, which is shorthand for these flags plus -r; -H (preserve hard links) is not part of -a and is given explicitly. However, -r is the recursive flag, which would recurse into directories and copy their entire contents, which is not desired here, as /lib and /usr/lib share common subdirectories with different contents. (For example, on a split-/usr Debian machine, /lib/x86_64-linux-gnu and /usr/lib/x86_64-linux-gnu are distinct, and contain different things.)
As it happens, there were some files under /lib
which weren’t managed by
the package manager – kernel modules for third-party kernels (for slightly
hysterical raisins, my laptop is running a custom-built 5.4-series kernel
instead of the Debian stock 4.19). These were easy enough to rsync
across
manually, but it’s nonetheless important that they weren’t forgotten.
At this point I then moved the new /bin
, /sbin
and /lib
directories
into place, similar to what I initially did with /lib64
. With /lib
in
particular, the static busybox trick is necessary again, because moving the
existing symlink for /lib
breaks the symlink in /lib64
to the system
dynamic loader.
I then made another ZFS snapshot as a checkpoint and rebooted my laptop, as a sanity check. Everything came back okay, so it was then time to perform the destructive part of the operation.
Cutting
Now that all the files were copied into the right place, I could then delete
the old copies which were left in /usr
. This is a tricky part, because
the idea is to delete only the files which shouldn’t be in /usr
without
accidentally deleting any of the others.
In the case of /usr/lib64
, the only item in this directory prior to the
unmerge was the dynamic loader symlink, and the directory wasn’t in use by any
other package, so it was easy to remove it.
/usr/lib32
was also reasonably straightforward, as the list of files which
should and shouldn’t be in /usr/lib32
was small enough that it was feasible
to process it manually.
For /sbin and /bin, this meant taking the list of binaries which had been
copied, prepending “/usr” to each name, and then carefully removing them.
# sed -e 's,^,/usr,' binbinaries > binrm
# less binrm # check that the contents of the file are as expected.
# rm -v $(cat binrm) # explicitly verbose
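Before running an rm like the one above, it may be worth confirming that every file about to be deleted really does have an identical copy in its new home. I did this by inspection; the following is a sketch of how it could be automated, with verify_copies being a hypothetical helper and the optional second argument (an alternative root) existing only so that it can be tried out on a scratch tree.

```shell
# Hypothetical helper: verify_copies LIST [ROOT]
# LIST holds split-layout paths such as /bin/ls, one per line. For each one,
# compare ROOT/<path> with ROOT/usr/<path>: symlinks by target, everything
# else byte for byte. Mismatches are printed and make the function return 1.
verify_copies() {
    list="$1"
    root="${2:-}"
    root="${root%/}"
    status=0
    while IFS= read -r item; do
        if [ -L "$root/usr$item" ]; then
            # Compare link targets rather than following the links.
            [ "$(readlink "$root$item")" = "$(readlink "$root/usr$item")" ] && continue
        elif cmp -s "$root$item" "$root/usr$item"; then
            continue
        fi
        printf 'MISMATCH %s\n' "$item"
        status=1
    done < "$list"
    return "$status"
}
```

With the files from the extract above, verify_copies binbinaries && rm -v $(cat binrm) would only proceed to the deletion if every copy checked out.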
As before, /usr/lib
is the one which requires some delicacy, as there are
a lot of things which live under /lib
and a lot of things which live under
/usr/lib
. It took me several attempts to come up with a way of cleanly
removing files without any unintended side-effects. The idea I had was to
generate separate lists of files for separate types of filesystem entities
(i.e. one for regular files, one for directories, one for symlinks etc.), which
would mean that stray files and links could easily be removed automatically,
and then directories (which are more complex due to /lib
and /usr/lib
having common subdirectories) could be handled manually later.
I came up with the following one-liners to generate these lists using find
;
find
is purposely prevented from recursing here with the -mindepth
and
-maxdepth
parameters, and is instead used more as a means of filtering files
by type.
# find $(cat libfiles) -mindepth 0 -maxdepth 0 -type f | sed -e 's,^,/usr,' > things-to-del
# find $(cat libfiles) -mindepth 0 -maxdepth 0 -type l | sed -e 's,^,/usr,' > things-to-del2
# find $(cat libfiles) -mindepth 0 -maxdepth 0 -type d | sed -e 's,^,/usr,' > things-to-del3
It’s then easy enough to remove the regular files and the symlinks, after performing some manual sanity checks:
# rm -v $(cat things-to-del)
# rm -v $(cat things-to-del2)
In the case of the directories left over, I processed those all
manually. /usr/lib/modules
and /usr/lib/firmware
are both superfluous and
can be removed completely, but other directories need quite a bit more care,
with some consultation of dpkg -S
to check whether the paths under /usr
are
actually required by anything.
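One low-risk way to shorten the manual pass is to let rmdir do the filtering, since it refuses to remove anything that still contains files. This is a sketch under that assumption rather than what I actually did, and prune_empty_dirs is a hypothetical helper. An empty directory can still be owned by a package, so the dpkg -S check remains worthwhile for anything it reports as removed.

```shell
# Hypothetical helper: prune_empty_dirs LIST
# Attempts to rmdir every directory named in LIST, deepest paths first.
# rmdir only succeeds on empty directories, so anything still holding files
# is left in place and reported as kept.
prune_empty_dirs() {
    # Reverse byte-order sort puts a child path ahead of its parent.
    LC_ALL=C sort -r "$1" | while IFS= read -r dir; do
        if rmdir "$dir" 2>/dev/null; then
            printf 'removed %s\n' "$dir"
        else
            printf 'kept %s\n' "$dir"
        fi
    done
}
```

Fed the directory list generated earlier (things-to-del3), this leaves behind only the directories that still need a human decision.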
Final remarks
It turns out that there are quite a lot of moving parts on Debian systems which do things without the package manager really being aware of the fact.
On some level this makes me a bit uncomfortable, as I generally hold that the package manager should be the authority on what the state of a system should be. This is why, for example, I spin my own Debian packages for almost all of the software I run on the routers for AS207480 instead of copying random binaries around the place: it means that maintaining and updating things fits into my existing package update tooling, and that the presence or absence of certain packages is indicative of a machine’s role. Taken to the logical extreme, these ideas lead to things like NixOS, where the package manager takes the desired state of a machine as input, and then configures the machine appropriately to match.
On the other hand, Debian is a very old project with a long history, so
there’s some amount of technical and social debt to be expected. Switching to
merged-/usr
while continuing to support split-/usr
systems necessitates
some sleight of hand in order to pull the wool over dpkg
’s eyes. In the grand
scheme of things, for most people it doesn’t make any difference whether the
system has a split or merged /usr
; but the system is constructed loosely
enough that it’s possible to poke and tweak at it until it feels just right.