the right way to use disk space? virtually, of course!


I might have mentioned agedu before, a nice tool to find your least useful and ready-to-be-deleted files real quick.

Sucks when the only files you have are rather large ones that you can’t throw out, like virtual system images which can easily become more than a few gigs heavy.

Disk is cheap, you say (again), and I will protest loudly: disk is not cheap for your laptop, it is not cheap for your high-performance platter server, and it is not cheap for the environment. It’s ridiculous what kind of wasteful behavior the “hey, it’s cheap” mentality promotes, not all of which relates to computers (think garbage, cars, food, wars, lives…).

Regardless, if you are using KVM there is a way to save disk space, speed up disk accesses and maybe even save the environment a little: kvm ships with a little tool called kvm-img (if you’re using QEMU then it’s qemu-img), and support for a copy-on-write storage format called QCOW2.

The qcow2 format is cool because it supports compression and encryption.

Compress your images

If you cared about disk before, you could untick the “allocate all space now” option and save a couple of gigs on a 10G disk image. That wouldn’t last long, though, and you’d hear people grumble about disk corruption and such (corruption that I have never ever seen, I might interject). Now you can compress and rebase your image instead. Here’s how I saved 20G on my disk:

To convert your raw image to qcow2 you would do:

kvm-img convert -c -f raw -O qcow2 $IN ${IN%.img}_base.qcow2

where $IN is your existing image and ${IN%.img}_base.qcow2 is going to be the name of your new qcow2 image. If you have NADA space left, convert into tmpfs (make sure tmpfs is mounted with sufficient size), remove the raw image and copy the new image out of tmpfs. That’ll free up some space.
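
The ${IN%.img} part is ordinary shell parameter expansion: it strips a trailing ".img" from the name before "_base.qcow2" is appended. Here is a rough sketch of how the naming and the conversion could be wrapped into two small functions. The names base_name, convert_to_base and QEMU_IMG are made up for this illustration and are not part of kvm-img.

```shell
#!/bin/sh
# Sketch only: base_name, convert_to_base and QEMU_IMG are hypothetical helpers.
# Set QEMU_IMG=qemu-img if you are on plain QEMU.

# Strip a trailing ".img" (if any) and append "_base.qcow2".
base_name() {
    printf '%s\n' "${1%.img}_base.qcow2"
}

# Convert a raw image into a compressed qcow2 image next to it
# and print the name of the new image on success.
convert_to_base() {
    src=$1
    dst=$(base_name "$src")
    "${QEMU_IMG:-kvm-img}" convert -c -f raw -O qcow2 "$src" "$dst" &&
        printf '%s\n' "$dst"
}

# Example: base_name debian_squeeze.img prints debian_squeeze_base.qcow2
```

Nothing is converted until you call convert_to_base yourself, so it is safe to paste this into a shell and play with base_name first.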


But why stop there? I mentioned rebasing, and rebase we shall.
The qcow2 format is a little less cool for introducing really sucky snapshotting support, as applying and creating snapshots with kvm-img takes hours and is likely to fail! I don’t recommend trying kvm-img snapshot -c foo.qcow2.
However, the copy-on-write functionality of qcow2 lets us implement functional faux snapshotting with little effort.

Copy-on-write means we can create an image sliver that only stores the changes from some read-only base image. Even better, we can layer these slivers! So, with the script I’ll introduce in a second, we can:

  1. Create or convert into a compressed base image. Name it foo_base.qcow2, e.g. “debian_squeeze_base.qcow2”. This is the master base, ideally made right after installing the operating system or whatever.
  2. Create a usable sliver to store new data into: kvm-img create -f qcow2 -b debian_squeeze_base.qcow2 squeeze_today.qcow2
  3. If you are using libvirt, update your /etc/libvirt/qemu/.xml disk source file to point to the ‘today’ image, and restart the libvirt daemon and virt-manager, to catch on to the changes
  4. To create a faux snapshot, just move the today image and rebase it like in step 2.
  5. To revert a faux snapshot, just replace today’s image with the snapshot.
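
Steps 2 and 5 above can be sketched as two tiny shell functions. Treat this as an illustration under assumptions rather than a finished tool: new_sliver, revert_to and QEMU_IMG are hypothetical names, and the revert assumes the guest is shut down and that you are fine with the snapshot file being used up by the move.

```shell
#!/bin/sh
# Sketch only: new_sliver, revert_to and QEMU_IMG are hypothetical helpers.
# Set QEMU_IMG=qemu-img if you are on plain QEMU.

# Step 2: create an empty copy-on-write sliver on top of a read-only base.
new_sliver() {
    base=$1
    sliver=$2
    "${QEMU_IMG:-kvm-img}" create -f qcow2 -b "$base" "$sliver"
}

# Step 5: throw away today's changes by putting the snapshot back in place.
# The snapshot file is consumed by the move; shut the guest down first.
revert_to() {
    snapshot=$1
    today=$2
    [ -f "$snapshot" ] || { echo "No snapshot $snapshot" >&2; return 1; }
    mv -f "$snapshot" "$today"
}

# Example:
#   new_sliver debian_squeeze_base.qcow2 squeeze_today.qcow2
#   revert_to squeeze_snapshot.qcow2 squeeze_today.qcow2
```

As far as I know, recent qemu-img releases also want the format of the backing file spelled out with -F qcow2 when creating the sliver; the kvm-img used here does not ask for it.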

And here is my rebase script:

kwy@amaeth:/var/lib/libvirt/images$ cat 

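#!/bin/sh
# Reconstructed: BASE=, then/fi, the $2 branch and the mv were missing from the listing.
BASE=$1
if [ ! -f "$BASE" ]; then
   echo "No base image $BASE"
   exit 1
fi
REBASE=${BASE%.qcow2}_`date +%F`.qcow2
# Assumption: an optional second argument overrides the snapshot name.
if [ -n "$2" ]; then REBASE=$2; fi
# Move the current image aside, then create a fresh sliver backed by it.
mv "$BASE" "$REBASE" || exit 1
kvm-img create -f qcow2 -b "$REBASE" "$BASE"
kvm-img info "$BASE"
kvm-img info "$REBASE"

echo "$BASE -> $REBASE"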
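
If you would rather see what a rebase is going to do before it touches anything, a dry run is cheap to write. The sketch below only prints the two commands involved. rebase_plan is a hypothetical helper name, and the optional second argument for naming the snapshot is an assumption of this sketch.

```shell
#!/bin/sh
# Sketch only: rebase_plan is a hypothetical dry-run helper, not part of kvm-img.
# It prints the commands a rebase would run instead of running them.
rebase_plan() {
    image=$1
    # Default snapshot name: image name plus today's date (YYYY-MM-DD).
    # Assumption: an optional second argument overrides that name.
    snapshot=${2:-${image%.qcow2}_$(date +%F).qcow2}
    printf 'mv %s %s\n' "$image" "$snapshot"
    printf 'kvm-img create -f qcow2 -b %s %s\n' "$snapshot" "$image"
}

# Example with an explicit snapshot name:
rebase_plan squeeze_today.qcow2 squeeze_clean.qcow2
# prints:
#   mv squeeze_today.qcow2 squeeze_clean.qcow2
#   kvm-img create -f qcow2 -b squeeze_clean.qcow2 squeeze_today.qcow2
```

Without the second argument the snapshot name ends in the current date, so two rebases on the same day would collide; pass a name explicitly in that case.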


Why bother? Here is what you get:

  • It takes 2 seconds to rebase and restore, as opposed to 1 minute for a VMware snapshot or 4 hours to snapshot with qcow2
  • You don’t need fancy RAID or LVM tricks
  • You save space compared to shitty qcow2 snapshots and raw image copies
  • You can keep several versions or patchlevels of an operating system, and several application groups on the same operating system, without having to reinstall the system. You already have a base image you can use!


The experience should be pretty stable, but there is always room to shoot yourself in the foot. Here are a couple of rules that keep you from making it hard for yourself:

  • don’t run out of disk space – it will corrupt your open images, regardless of format
  • don’t modify a base image that another image depends upon.
    Your base image knows nothing about its children (newer snapshots and ‘today’ images), so modifying the base image will cause all its children to corrupt into weirdness. That’s why the base image is “read only” and should be named appropriately.
  • don’t go down under the stairs!
  • don’t do stuff you don’t understand!
  • don’t tell me this ain’t new, cause I know!
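
The first two rules can be backed up with a little shell. This is a minimal sketch under assumptions: protect_base and free_kb are hypothetical helper names, and the 1G threshold in the example is an arbitrary number picked for illustration.

```shell
#!/bin/sh
# Sketch only: protect_base and free_kb are hypothetical helper names.

# Drop write permission from a base image so stray writes fail loudly.
# (root can still write to it, so this is a seat belt, not a lock.)
protect_base() {
    chmod a-w "$1"
}

# Print the free space, in kilobytes, on the filesystem holding a directory.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example:
#   protect_base debian_squeeze_base.qcow2
#   [ "$(free_kb /var/lib/libvirt/images)" -gt 1048576 ] || echo "less than 1G free"
```

Running the free space check before starting a guest, or from cron, gives you a warning while there is still room to act.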



2 Responses to “the right way to use disk space? virtually, of course!”

  1. fist-time-poster says:

    Nice writeup.

    What do you do, though, when you want to update the base image so that all your dependent images benefit from the updates? Won’t that be a problem? Maybe you can use this (re)basing feature mainly in situations where the dependent images aren’t meant to endure. For instance, if you want to recreate testbed images periodically, off of your base image.

  2. kacper says:

    Good point, you can’t update the base image after rebasing, ever! You should actually never boot the base image again either; as soon as you’ve done a rebase the original should be kept read-only for the duration of its lifetime. It’s still possible to rebuild a “complete” image with kvm-img if you want to update that one.

    I’m unfamiliar with any technique that would allow you to update a base image so that all dependent images will benefit, well maybe except for BSD jails where most of the filesystem is read-only mounted and shared between jails.