Because I like camping and want my family to be free of the tether to my work area, I'd like to make my virtualization guests easier to recover in the event of a failure. Backups are important, but so are the method used to take them and the process required to restore them. Who wants to restore a full 300GB compressed RAW image for an issue that altered only a few MB but brought the server down? My hope is that ZFS together with QCOW2 will allow enough flexibility that I can back up and recover using whatever method is most appropriate at the time (restoring snapshots, zfs-receive, or full image restores from cold backups; of course guest OS level backups will be available too).
I've been using RAW on XFS on MDRAID for the last 8 years, had a brief stint with LVM volumes, and steered clear of QCOW2 for performance reasons. Now QCOW2 is apparently pretty darn good, and since SSDs are cheap (time isn't as cheap as SSDs are now), the speed of virtual guests running QCOW2 might warrant another look. I'm copying and pasting from the work of others here, and I very much appreciate all the hard work and good documentation they've created for the sake of others.
https://www.jamescoyle.net/how-to/1810-qcow2-disk-images-and-performance
https://jrs-s.net/2018/03/13/zvol-vs-qcow2-with-kvm/
For performance on ZFS this is what I'm starting with:
qemu-img create -f qcow2 -o cluster_size=8k,preallocation=metadata,compat=1.1,lazy_refcounts=on debian9.qcow2 50G
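To make it easier to repeat the command above with a different cluster size, here is a minimal sketch that wraps it in two shell functions. The function names (qcow2_opts, create_guest_image) and the DRY_RUN variable are hypothetical helpers made up for illustration; only the qemu-img options themselves come from the command above.

```shell
#!/bin/sh
# Build the -o option string from the command above.
# The first argument overrides the cluster size (defaults to 8k).
qcow2_opts() {
    cluster_size=${1:-8k}
    echo "cluster_size=${cluster_size},preallocation=metadata,compat=1.1,lazy_refcounts=on"
}

# Create a qcow2 image: create_guest_image FILE SIZE [CLUSTER_SIZE]
# When DRY_RUN is set, the command is printed instead of executed.
create_guest_image() {
    image=$1
    size=$2
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty
    ${DRY_RUN:+echo} qemu-img create -f qcow2 -o "$(qcow2_opts "$3")" "$image" "$size"
}

# Print the command that would be run for the image used in this post:
DRY_RUN=1 create_guest_image debian9.qcow2 50G

# After a real run, the applied options can be inspected with:
#   qemu-img info debian9.qcow2
```

Running without DRY_RUN actually creates the image, so it is worth checking the printed command first.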
From the man page:
lazy_refcounts: If this option is set to "on", reference count updates are postponed with the goal of avoiding metadata I/O and improving performance. This is particularly interesting with cache=writethrough which doesn't batch metadata updates. The tradeoff is that after a host crash, the reference count tables must be rebuilt, i.e. on the next open an (automatic) "qemu-img check -r all" is required, which may take some time. This option can only be enabled if "compat=1.1" is specified.
cluster_size: Changes the qcow2 cluster size (must be between 512 and 2M). Smaller cluster sizes can improve the image file size whereas larger cluster sizes generally provide better performance.
(Note: the "falloc" preallocation mode seems to have an issue on ZFS.)
preallocation: Preallocation mode (allowed values: "off", "metadata", "falloc", "full"). An image with preallocated metadata is initially larger but can improve performance when the image needs to grow. "falloc" and "full" preallocations are like the same options of "raw" format, but sets up metadata also.
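To get a feel for how much "initially larger" means with preallocation=metadata, here is a rough back-of-the-envelope sketch. It assumes 8-byte L2 table entries and the default 16-bit refcount entries, and it ignores the header, the L1 table, and rounding to whole clusters, so treat the numbers as an estimate rather than what qemu-img will report. The function name qcow2_metadata_estimate is made up for this example.

```shell
#!/bin/sh
# Estimate qcow2 metadata size: qcow2_metadata_estimate VIRTUAL_BYTES CLUSTER_BYTES
# Assumption: one 8-byte L2 entry and one 2-byte refcount entry per cluster.
qcow2_metadata_estimate() {
    virtual_bytes=$1
    cluster_bytes=$2
    clusters=$(( virtual_bytes / cluster_bytes ))
    l2_bytes=$(( clusters * 8 ))
    refcount_bytes=$(( clusters * 2 ))
    echo "clusters=$clusters l2_bytes=$l2_bytes refcount_bytes=$refcount_bytes"
}

# 50G image with the 8k clusters used in this post
qcow2_metadata_estimate $(( 50 * 1024 * 1024 * 1024 )) 8192
# Same image with 64k clusters, for comparison
qcow2_metadata_estimate $(( 50 * 1024 * 1024 * 1024 )) 65536
```

Under these assumptions the 8k image carries roughly 50 MiB of L2 tables plus about 12.5 MiB of refcounts, eight times what a 64k cluster size would need. That is the metadata side of the cluster size tradeoff.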