Docker.io is taking the world by storm, but a day at the docks is not without its perils. Here I hope to inspire you to try out docker by showing you how to avoid its pitfalls.
In the days of yore
As the FreeBSD jailers and Solaris zoners will attest to, containerizing your services is a great boon, saving space and resources and providing easy management akin to chroots and potential security benefits, without the overheads of full-blown virtual machines.
Linux has had containers for the longest time, in the ancient form of User Mode Linux, which actually ran a kernel in userland, and more recently OpenVZ, which was more like jails.
The former didn’t lend itself to production deployments and the latter never made it into the linux mainline, coming at a time when people were more interested in virtualization than containment. In recent years, a kernel facility named Cgroups has made LinuX Containers (LXC) possible, which as afforded the management, if not security, of bsd jails.
what can be gained
The biggest potential benefit from containers is that CPU, memory and disk resources are 100% shared at native speeds, so no libraries and no data need ever be duplicated on disk nor in memory.
In FreeBSD jails, this was achieved by providing most of the system read-only like
/bin, and sharing it amongst jails. This worked quite well, but was surprisingly tricky to update.
You can do similar stuff with LXC, just as long as you understand that if it breaks, you get to keep all the pieces. This gives you full control, and means that I for one have LXC instances in production with uptimes of 1200 days and counting.
Taking the approach of single-container-single-responsibility further, you could instead of deploying whole system containers create image filesystems that contained only the bare necessities. For instance, your python application would have apart from its code,just the python runtime,
libc and other dependant libraries, and naught much else.
Inspired by the “leaner is better” philosophy backed by the experience of running LXC in production, we built this minimal deployment framework complete with a tool to magically find all the required libraries.
Awesomely small images come from this approach, where the “contact surface” of the application has shrank to nothing but the app itself. It was far from perfect, serving to make the images awesomely less debuggable and managable, and never made it into production proper.
layer upon layer is two steps further
In comes Docker, and its concept of filesystem image layers based on AUFS. The approach isn’t novel itself, having been used by live-CD distributions for the longest time, but it’s the first that provides tools to manage the layers effortlessly for containers. So you can now have 100 servers with 100 application layers, and all your Ruby applications share one runtime layer and your Python applications share another, and they all run on the same base image of Ubuntu, and they do all that transparently, without you having to consciously think about which bit goes where.
Docker takes another step further, borrowing heavily from distributed social source control ala github, allowing you to clone, build, push, pull, commit, share and remix images as easy as that.
This is the type of thing that blows disk-image-based virtualization straight out of the water.
Perils and rough starts
The Docker docs are well written and will get you spawning containers and dockerizing applications in no time at all. What they will not tell you is how to run containers in production for all values of production.
In particular, the following things require special attention:
- changing ips
- service discovery
- dns issues
- cache clash
.. and that is precisely what we will talk about next time.