Building Pretty Tiny Docker Images

2015-04-13

I don’t really understand Docker, or at least not yet. Running half of an operating system - usually Ubuntu, too - on top of a whole operating system, perhaps on top of a hypervisor, just seems like too many turtles on the way down.

The whole idea behind Dockerized applications seems to be that you just ship all of your application’s dependencies, down to OS userland and even stuff like database servers, along with the app itself. Even though there’s some witchcraft and trickery going on with unification filesystems, and there’s a lot of caching involved, all that bundling has a cost: if your app is a few megabytes, downloading a hundred megs worth of operating system just to run it doesn’t exactly scream ‘elegance’ to me.

There’s nothing inherent in Docker that says you have to use one of the bulky base images, though. The Docker Hub carries a perfectly good busybox base image, which clocks in at around 2 MB and will give you a workable busybox shell, complete with slimmed-down versions of all the utilities you know and love, and even a lightweight vi-lookalike.

However, even the busybox image looks bloated compared to scratch, a Docker base image that contains…

Nothing.

The scratch image is literally just an empty filesystem with a bit of metadata. The only thing that it will contain is stuff that you add to it. This also means that the only things that will run on it, period, are binaries that are completely statically linked. Any executable that depends on shared libraries will fail: the libraries will simply not be there.
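To make that concrete, here is a minimal sketch of a Dockerfile that builds on scratch. The binary name hello is made up for illustration; the sketch assumes a fully statically linked executable by that name already sits next to the Dockerfile.

```dockerfile
# Start from the empty base image: no shell, no libraries, nothing.
FROM scratch

# "hello" is a hypothetical, fully statically linked binary that already
# exists in the build context next to this Dockerfile.
COPY hello /hello

# Use the exec (JSON array) form: scratch has no /bin/sh, so the shell
# form of ENTRYPOINT or CMD would have nothing to run it.
ENTRYPOINT ["/hello"]
```

The resulting image should be roughly the size of the binary itself, since there is nothing else in it.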

This poses a bit of a problem. Pretty much all executables depend on shared libraries, even if it is only the C runtime library. We could try bundling the libraries by linking the executable statically, but the de facto standard C runtime library on Linux systems, glibc, only half-heartedly supports static linking. There’s a -static flag for the GNU C compiler, but the binaries it generates are not always static enough for our purposes: some functionality, such as name service lookups (resolving host or user names), still loads shared parts of glibc at runtime, so those parts need to be present.

However, there are standard C libraries that do support static linking. One of those is called musl and it’s pretty neat. Not all software will work on musl without modification: musl does not support all the extensions glibc does, and being a C library it does not include a C++ standard library, so C++ programs need a C++ runtime built against musl. There’s a huge table listing which NetBSD packages do and do not build with musl, which should give you an idea. As an added bonus, musl is pretty small.

If you were following along at home, you should now have a way to start from an empty Docker image and a way of building static binaries - all the ingredients you need to build extremely lightweight Docker images. As a proof of concept, I used musl to Dockerize D. J. Bernstein’s tinydns authoritative name server, part of the djbdns set of tools. If you want to give the Dockerized tinydns a spin, check out the repository on GitHub.
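For a rough idea of how the two ingredients can fit together, here is a sketch of a multi-stage Dockerfile. It is not how the tinydns image is built; it makes several assumptions of its own: a Docker version with multi-stage build support (17.05 or later), Alpine Linux as the build environment (chosen only because its system C library is musl), and a hypothetical single-file program called hello.c.

```dockerfile
# Build stage. Assumption: Alpine Linux is used only because its system C
# library is musl, so its stock gcc links against musl by default.
FROM alpine AS build
RUN apk add --no-cache gcc musl-dev

# hello.c is a hypothetical single-file C program in the build context.
COPY hello.c /src/hello.c

# -static asks the linker for a binary with no shared library dependencies.
RUN gcc -static -Os -o /hello /src/hello.c

# Final stage: the empty image plus the one static binary.
FROM scratch
COPY --from=build /hello /hello
ENTRYPOINT ["/hello"]
```

Building with something like docker build -t hello-scratch . (the tag is arbitrary) and then running docker run --rm hello-scratch should be enough to try it out. Only the final stage ends up in the image; the compiler and everything else from the build stage are left behind.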

Thanks for reading! If you have any questions, comments or corrections, feel free to shoot me an email.