Server updates and build system misuse

If you have a sufficient number of computers, you're eventually going to want some means of automating updates, or you'll inevitably forget one or more of them (with potentially catastrophic consequences, if you're unlucky). I reached that point in life some time at the beginning of this year when I started deploying routers for my network, and I soon after started noticing that I was forgetting some infrequently used machines in the periodic rounds of apt updates.

There are a variety of existing tools for dealing with the task of logging into a bunch of machines remotely and running a series of commands, ranging from glorified parallel SSH (optionally implemented in Perl for extra hack value) to slightly more refined tools such as Ansible.

The "correct" way to solve this problem would be to learn how to use something like Ansible (if anything it's probably useful to know, and is something to put on my CV). However, I was lazy enough to find running apt-get update && apt-get upgrade on ten machines by hand annoying, yet too lazy to bother actually learning how to use Ansible.

The problem statement I formulated was to be able to run a number of commands on remote machines in order to achieve some goal, with some constraints about the order in which commands are executed. This sounds suspiciously similar to the sort of thing a software build system is intended to do, so of course I started writing a Makefile for doing apt updates.

[Disclaimer: I run Debian (or something sufficiently close to Debian to be largely indistinguishable) on nearly all my machines. The following sections contain Debian user ramblings and assume the use of GNU Make.]

make and ssh

The Makefile

The goal I have is to check for and install software updates on a number of servers. Checking for and installing updates on a machine requires updating the package lists (i.e. run apt-get update as root on the machine) and then installing any available new package versions (run apt-get upgrade on the machine). I'll explain the Makefile literately below.

I have my list of target machines where I want to perform the updates.

HOSTS = proton neutron electron

The default target of the Makefile is to perform updates on all of the target machines:

all: $(HOSTS)
	@

Note that this target has an empty recipe (hence the @). This is required here because I'm about to use a catch-all pattern rule; without a recipe defined, make would process the all target as declaring a dependency on the expansion of the $(HOSTS) macro, but then examine other rules in order to find a recipe, which would inadvertently cause it to be matched by the catch-all pattern.

Next, I have a catch-all pattern rule. Note that this one has an empty recipe as well. GNU Make treats a pattern rule written without any recipe as a request to cancel an implicit rule rather than to define one, so the empty recipe is what makes this rule take effect.

%: %-apt-update %-apt-upgrade
	@

The pattern expansion is the magic here, as it means that the target "electron" will automagically depend on the "electron-apt-update" and "electron-apt-upgrade" targets.

We can now tell make how to apt-get update a machine:

%-apt-update:
	ssh -l root $(@:%-apt-update=%) -- apt-get update
	touch $(@)

This uses a substitution reference, a shorthand form of the GNU Make patsubst function, which allows me to take the name of the target (from the $@ automatic variable) and remove the suffix "-apt-update" from it. The result of the sigil soup is that invoking the target electron-apt-update will cause the command ssh -l root electron -- apt-get update to be executed. By default, make assumes that targets correspond to file system entities, and determines whether a target is up to date by comparing the modification time of that file with the modification times of its dependencies. So I create (or update the modification time of) a file with the same name as the target in order to tell make that the target was (re)built correctly.
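For readers more at home in shell than in make, the substitution reference behaves much like POSIX suffix removal. The following is a rough sketch for comparison only and is not part of the Makefile; host_from_target is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: a shell counterpart to $(@:%-apt-update=%).
# host_from_target is a hypothetical helper, not part of the Makefile.
host_from_target() {
    target=$1
    suffix=$2
    # ${var%pattern} removes the suffix if it is present and leaves the
    # value unchanged otherwise, as the substitution reference does.
    printf '%s\n' "${target%"$suffix"}"
}

host_from_target electron-apt-update -apt-update    # prints "electron"
host_from_target neutron-apt-upgrade -apt-upgrade   # prints "neutron"
```

The redo scripts later in this post use the same kind of expansion directly.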

Running apt-get upgrade is similar to the above:

%-apt-upgrade: %-apt-update
	ssh -l root $(@:%-apt-upgrade=%) -- apt-get upgrade
	touch $(@)

Note the dependency on the corresponding rule for updating the given system.

For the sake of convenience, I've also defined a rule to remove the placeholder files, so that in the future I can re-run the upgrade process from scratch.

clean:
	rm -f *-apt-update *-apt-upgrade

There's one more piece of non-obvious magic required here, which is due to how GNU Make handles chains of implicit rules. Without the .SECONDARY line below, make treats the *-apt-update files as intermediate files and removes them after the corresponding *-apt-upgrade rule has been brought up to date. (I've added the *-apt-upgrade targets here too in case any higher level dependencies are added to the Makefile in the future.)

.SECONDARY: $(HOSTS:%=%-apt-update) $(HOSTS:%=%-apt-upgrade)

The results

This Makefile should now run the appropriate series of apt-get commands in the correct order on all of the configured hosts simply by running make, or on a single host individually using a command such as make neutron. If the upgrade step fails on one host for some reason, the make target can be re-run once the failure has been resolved, and will resume from the apt-get upgrade step without needing to re-run the apt-get update.
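The resume behaviour comes entirely from the placeholder files. As a rough illustration of the idea outside of make, here is a minimal shell sketch; run_step and STAMP_DIR are hypothetical names, and unlike make the sketch only checks whether a stamp exists rather than comparing modification times.

```shell
#!/bin/sh
# Sketch of the stamp-file idea: run a step only if its stamp is missing,
# and create the stamp only when the step succeeds.
# run_step and STAMP_DIR are hypothetical names used for illustration.
STAMP_DIR=${STAMP_DIR:-.}

run_step() {
    stamp=$STAMP_DIR/$1
    shift
    if [ -e "$stamp" ]; then
        echo "skip: $stamp already exists"
        return 0
    fi
    # "$@" is the command for this step. A failure leaves no stamp behind,
    # so the step is attempted again on the next run.
    "$@" && touch "$stamp"
}

# Example use (commented out, since it needs a reachable host):
# run_step electron-apt-update  ssh -l root electron -- apt-get update &&
# run_step electron-apt-upgrade ssh -l root electron -- apt-get upgrade
```

If the upgrade command fails, only the update stamp exists afterwards, so a second run skips the update and goes straight to the upgrade.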

There are a couple of tweaks one can apply to the Makefile. If the defaults aren't adequate, common options for ssh and apt-get can be put into make variables and added to the command line (e.g. ssh $(SSH_OPTIONS) -l root hostname -- apt-get $(APT_OPTIONS) update). To cut down the amount of time I spend sitting at a terminal periodically pressing enter at apt-get upgrade prompts, I've changed the all target to look like this instead:

all: all-apt-update all-apt-upgrade
	@

all-apt-update: $(HOSTS:%=%-apt-update)
	@

all-apt-upgrade: $(HOSTS:%=%-apt-upgrade)
	@

This means that apt-get update is run on every host before apt-get upgrade gets run anywhere. (As a side note, I could run apt-get upgrade with the -y option, but I consider this unsafe, as I prefer to manually oversee what changes are actually being made when I run updates like this.)
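For comparison, the same two-phase ordering can be written as a plain shell loop. This is only a sketch: RUN is a hypothetical knob that defaults to echo so that the script prints the commands instead of running them, and there are no placeholder files, so nothing is remembered between runs.

```shell
#!/bin/sh
# Sketch: update every host first, then upgrade every host.
# RUN is a hypothetical knob: it defaults to "echo" so the commands are only
# printed; set RUN= (empty) in the environment to actually execute them.
HOSTS="proton neutron electron"
RUN=${RUN-echo}

update_then_upgrade() {
    # Phase 1: refresh the package lists everywhere.
    for host in $HOSTS; do
        $RUN ssh -l root "$host" -- apt-get update || return 1
    done
    # Phase 2: only then start the interactive upgrades.
    for host in $HOSTS; do
        $RUN ssh -l root "$host" -- apt-get upgrade || return 1
    done
}

update_then_upgrade
```

The loop stops at the first failure and starts over from the beginning next time, which is exactly the bookkeeping the Makefile saves me from.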

Interlude

There's more than one build system in life. redo is a build system designed for incremental builds, originally formulated by D. J. Bernstein (and described in some depth here). Every target is a file, and each target has an associated script which is run to generate it (a so-called "do" script, whose name is derived by appending .do to the target name). In contrast to make, where dependencies are declared up front in the Makefile, redo records dependency information in an external database of some description. Dependencies between targets are declared by calling an external redo program in a target's "do" script, which checks if the named dependencies are in the database and whether they are up to date. It results in a lot of small files and a fair amount of scripting machinery, which is pretty typical of Bernstein's software.

As an exercise, I translated the Makefile I presented above into a redo build. As noted in the linked description, there are multiple implementations of the redo concept. For reference, I've been using Leah Neukirchen's rather concise C implementation.

redo and ssh

Build files everywhere

First of all, I need a list of the hosts on which I want to run updates.

$ cat > hosts <<EOF
> proton
> neutron
> electron
> EOF

I started out with the all.do file, as the default target in redo is called "all". (Note that I'm omitting the shebang lines of the "do" scripts here, which is due to an implementation detail of Leah Neukirchen's redo where non-executable "do" scripts are executed as an argument to /bin/sh -e.)

# all.do
redo-ifchange $(sed -e 's/$/.host/' hosts)

The redo-ifchange program is (one of) the mechanism(s) through which dependencies are declared. In this example, it declares that the "all" target is dependent on the targets named on redo-ifchange's arguments, and that if any of them change then "all" should also be considered out of date. The sed invocation generates target names based on the hostnames in the hosts file created above.
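To make the command substitution concrete, here is a small sketch of what it expands to. host_targets is a hypothetical wrapper around the same sed command, and the demonstration uses a temporary file rather than the real hosts file.

```shell
#!/bin/sh
# Sketch: the target list that all.do hands to redo-ifchange.
# host_targets is a hypothetical helper wrapping the sed command from all.do.
host_targets() {
    # $1: file with one host name per line; ".host" is appended to each line.
    sed -e 's/$/.host/' "$1"
}

# Demonstrate with a throwaway copy of the hosts file.
tmp=$(mktemp)
printf '%s\n' proton neutron electron > "$tmp"
host_targets "$tmp"   # prints proton.host, neutron.host, electron.host, one per line
rm -f "$tmp"
```

One thing to watch for: a blank line in the hosts file would turn into a target called ".host", so the file is best kept free of empty lines.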

redo also supports pattern-matching target rules like make does. If redo can't find a "do" script for a target file.ext, then it will fall back to looking for the "do" script default.ext.do. The "all" target above depends on a number of targets which end in ".host", so we can create a default.host.do script which handles all targets with this name.

# default.host.do
host="${1%%.host}"
redo-ifchange "$host".update "$host".upgrade

"do" scripts are passed the name of the target they are expected to build as their first argument (along with other information such as the target's base name and the name of a temporary file into which the "do" script should write its output). In this case I'm stripping the ".host" suffix, and declaring a dependency on targets for updating and upgrading the host in question.

Correspondingly, there are "do" scripts called default.update.do and default.upgrade.do pattern matching on the suffixes used in the "do" script above.

# default.update.do
host="${1%%.update}"
ssh -l root "$host" -- apt-get update

# default.upgrade.do
host="${1%%.upgrade}"
redo-ifchange "$host".update
ssh -l root "$host" -- apt-get upgrade

These work much the same as the above pattern-matching "do" script, with the addition of default.upgrade.do depending on the output of default.update.do. Note that the "do" scripts here create no output files themselves. As well as storing dependency information externally from the source files, redo takes care of renaming the temporary file to which a script wrote its output to the name of the target itself.

Results

Running redo should have the same end result as the Makefile above, but with a different implementation. As with the Makefile version, you can add some additional targets to change the order in which the "do" scripts are run. For example, I've changed the all.do script and added "update-all" and "upgrade-all" targets to match the amended Makefile behaviour I described above.

# all.do
redo-ifchange update-all upgrade-all $(sed -e 's/$/.host/' hosts)

# update-all.do
redo-ifchange $(sed -e 's/$/.update/' hosts)

# upgrade-all.do
redo-ifchange $(sed -e 's/$/.upgrade/' hosts)

The less time I spend having to wait for apt-get upgrade to ask me yes or no the better.

Final thoughts

There's more than one way to perform this task. While software build tools probably aren't really the most appropriate way to get the job done here, it's interesting to see to what kind of ends they can be (mis)used. Occasionally, using a tool you already know beats using one you don't.