Libvirt external snapshot with GlusterFS

With recent versions of Libvirt and Qemu it is possible to create external disk snapshots while both the original image and the newly created overlay are attached natively to the GlusterFS cluster via gfapi (not via FUSE).

For this test, the following versions have been used:

  • GlusterFS: 3.4.2
  • Qemu: 2.0.0
  • Libvirt: 1.2.3
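
For reference, a guest disk that is attached natively via gfapi looks roughly like this in the domain XML (a sketch built from the host and image path used below; driver and target attributes may differ in your setup):

<disk type='network' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source protocol='gluster' name='virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.qcow2'>
    <host name='10.1.120.11'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>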

Part 1: Preparation

VM name: d4572522-eae4-4e3e-ab36-618ae4a91fb4
Original GlusterFS image path: virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.qcow2

Prepare an XML file, here named snap.xml:
<domainsnapshot>
  <name>snap</name>
  <disks>
    <disk name='vda' type='network'>
      <driver type='qcow2'/>
      <source protocol='gluster' name='virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.snap01.qcow2'>
        <host name='10.1.120.11'/>
      </source>
    </disk>
  </disks>
</domainsnapshot>
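
To look up the disk target name (vda here) and the image path the guest currently uses when building this XML, virsh domblklist comes in handy (for natively attached gluster disks the Source column shows the gluster volume/path rather than a local file):

virsh domblklist d4572522-eae4-4e3e-ab36-618ae4a91fb4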

Part 2: Create the snapshot using virsh

Note: passing --quiesce in addition to the options above makes libvirt contact the qemu-guest-agent it expects to be running inside the VM and call freeze/thaw as appropriate, yielding a more-than-crash-consistent image (on Windows via VSS).

virsh snapshot-create d4572522-eae4-4e3e-ab36-618ae4a91fb4 snap.xml --disk-only --atomic

Examples:

~ # virsh snapshot-create d4572522-eae4-4e3e-ab36-618ae4a91fb4 snap.xml --disk-only --atomic
Domain snapshot snap created from 'snap.xml'
~ # virsh snapshot-create 8196bb77-7478-4bfb-a6ea-52b3a2c65eba kvm-0022-snapshot-01.xml --disk-only --atomic --quiesce
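
If --quiesce fails, check first that the qemu-guest-agent inside the VM is actually reachable. A minimal check, assuming the agent channel is configured in the domain XML (an empty return object means the agent answered):

virsh qemu-agent-command d4572522-eae4-4e3e-ab36-618ae4a91fb4 '{"execute":"guest-ping"}'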

Part 3: Check

Libvirt

vm-test-02 ~ # virsh snapshot-list d4572522-eae4-4e3e-ab36-618ae4a91fb4
 Name                 Creation Time             State
------------------------------------------------------------
 snap                 2014-05-02 15:16:09 +0200 disk-snapshot

vm-test-02 ~ # virsh snapshot-info d4572522-eae4-4e3e-ab36-618ae4a91fb4 --current
Name:           snap
Domain:         d4572522-eae4-4e3e-ab36-618ae4a91fb4
Current:        yes
State:          disk-snapshot
Location:       external
Parent:         -
Children:       0
Descendants:    0
Metadata:       yes

VM Image Path

vm-test-02 ~ # virsh dumpxml d4572522-eae4-4e3e-ab36-618ae4a91fb4 | grep "\.qcow2"
      <source protocol='gluster' name='virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.snap01.qcow2'>

Backing File Path

Unfortunately, the domblkinfo command still does not understand natively attached glusterfs volumes, but qemu-img does:

vm-test-02 ~ # qemu-img info gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.snap01.qcow2
image: gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.snap01.qcow2
file format: qcow2
virtual size: 200G (214748364800 bytes)
disk size: 2.6M
cluster_size: 65536
backing file: gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false

Cleanup/Commit (Offline)

Unfortunately, libvirt does not know how to remove an external snapshot, and it still has some issues when trying to blockcommit a file with a native-gluster backing file:

vm-test-02 ~ # virsh blockcommit d4572522-eae4-4e3e-ab36-618ae4a91fb4 vda
error: invalid argument: top 'virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.snap01.qcow2' in chain for 'vda' has no backing file

vm-test-02 ~ # qemu-img info /var/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.snap01.qcow2 
image: /var/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.snap01.qcow2
file format: qcow2
virtual size: 200G (214748364800 bytes)
disk size: 45G
cluster_size: 65536
backing file: gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false

We therefore have to do it via Qemu. Below is the offline example; the same should be possible online via qemu-monitor commands (since qemu 2.0 can block-commit from the top-level image):

virsh shutdown d4572522-eae4-4e3e-ab36-618ae4a91fb4
qemu-img commit gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.snap01.qcow2
 
# edit the XML to point to the original image again
virsh edit d4572522-eae4-4e3e-ab36-618ae4a91fb4
 
# remove the snapshot
rm /var/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.snap01.qcow2
 
# unregister the snapshot in libvirt
virsh snapshot-delete d4572522-eae4-4e3e-ab36-618ae4a91fb4 --current --metadata
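
Afterwards it is worth verifying that the domain points at the original image again and that libvirt no longer lists the snapshot (a quick sanity check using the commands already shown above):

virsh domblklist d4572522-eae4-4e3e-ab36-618ae4a91fb4
virsh snapshot-list d4572522-eae4-4e3e-ab36-618ae4a91fb4
qemu-img info gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/de7c7c4e-8664-4ac7-b559-53fd52ec461c.qcow2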

Cleanup/Commit (Online)

Patches for traversing a block chain with a direct-gluster backing file are already on the libvirt mailing list, as are patches fixing the GlusterFS storage pool; they will probably become part of libvirt 1.2.5. Until then, one has to bypass libvirt to execute a block commit.
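
The block-commit monitor command needs the QEMU device name (here drive-virtio-disk0) rather than the libvirt target name (vda); it can be looked up via the human monitor, as also done further below:

virsh qemu-monitor-command --hmp e43a2954-3914-465f-9391-9e63b52ec2f5 "info block"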

~ # virsh qemu-monitor-command e43a2954-3914-465f-9391-9e63b52ec2f5 --pretty '{"execute":"block-commit", "arguments": { "device":"drive-virtio-disk0", "base": "gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/bbf7796f-90bf-45ab-8645-2894f5dae727.qcow2", "top": "gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/bbf7796f-90bf-45ab-8645-2894f5dae727.snap01.qcow2" } }'
{
    "return": {

    },
    "id": "libvirt-28"
}

Unfortunately, libvirt does not handle block jobs it did not start itself: they appear to hang forever while Qemu waits for block-job-complete:

~ # virsh blockjob e43a2954-3914-465f-9391-9e63b52ec2f5 vda
Block Commit: [100 %]
~ # virsh qemu-monitor-command e43a2954-3914-465f-9391-9e63b52ec2f5 --pretty '{"execute": "query-block-jobs"}'
{
    "return": [
        {
            "io-status": "ok",
            "device": "drive-virtio-disk0",
            "busy": false,
            "len": 214748364800,
            "offset": 214748364800,
            "paused": false,
            "speed": 0,
            "type": "commit"
        }
    ],
    "id": "libvirt-88"
}

Once offset equals len the commit has copied all data, but one still has to issue block-job-complete manually:

~ # virsh qemu-monitor-command e43a2954-3914-465f-9391-9e63b52ec2f5 --pretty '{"execute": "block-job-complete", "arguments": { "device": "drive-virtio-disk0"} }'
{
    "return": {

    },
    "id": "libvirt-93"
}

~ # virsh blockjob e43a2954-3914-465f-9391-9e63b52ec2f5 vda

~ #

Now check the images again:

~ # virsh qemu-monitor-command --hmp e43a2954-3914-465f-9391-9e63b52ec2f5 "info block"
drive-ide0-0-1: /var/virtualization/iso/a473e8ec-3327-4e24-a19b-e1e7a5a791e0.iso (raw, read-only)
    Removable device: locked, tray closed

drive-virtio-disk0: gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/bbf7796f-90bf-45ab-8645-2894f5dae727.qcow2 (qcow2)

Qemu now has only the original image open again.
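
Before deleting the overlay it does not hurt to double-check the original image once more from the storage side (the same qemu-img check as in Part 3):

qemu-img info gluster://10.1.120.11/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/bbf7796f-90bf-45ab-8645-2894f5dae727.qcow2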

Have libvirt forget about the snapshot:

virsh snapshot-delete e43a2954-3914-465f-9391-9e63b52ec2f5 --current --metadata
 
# update the XML to have it point to the original file again
virsh edit e43a2954-3914-465f-9391-9e63b52ec2f5
 
# remove the snapshot file:
rm /var/virtualization/vm-templates/5b77d2f6-061f-410c-8ee7-9e61da6f1927/bbf7796f-90bf-45ab-8645-2894f5dae727.snap01.qcow2

Command overview

# create a snapshot for all disks
virsh snapshot-create 8196bb77-7478-4bfb-a6ea-52b3a2c65eba kvm-0022-snapshot-01.xml --disk-only --atomic --quiesce
 
# TODO: copy images away at this point
 
# initiate the block commit for the first disk
# TODO: do this for all disks
virsh qemu-monitor-command 8196bb77-7478-4bfb-a6ea-52b3a2c65eba --pretty '{"execute":"block-commit", "arguments": { "device":"drive-virtio-disk0", "base": "gluster://10.122.0.11/virtualization/vm-persistent/0f83f084-8080-413e-b558-b678e504836e/711b08f2-7c26-4ac3-bb46-66176523d752.qcow2", "top": "gluster://10.122.0.11/virtualization/vm-persistent/0f83f084-8080-413e-b558-b678e504836e/711b08f2-7c26-4ac3-bb46-66176523d752.snapshot-01.qcow2" } }'
 
# monitor the progress
virsh blockjob 8196bb77-7478-4bfb-a6ea-52b3a2c65eba vda
virsh qemu-monitor-command 8196bb77-7478-4bfb-a6ea-52b3a2c65eba --pretty '{"execute": "query-block-jobs"}'
 
# when finished, have Qemu finish it (TODO: check whether we would have to install a dirty block tracer at some point):
virsh qemu-monitor-command 8196bb77-7478-4bfb-a6ea-52b3a2c65eba --pretty '{"execute": "block-job-complete", "arguments": { "device": "drive-virtio-disk0"} }'
 
# remove the snapshot information in libvirt. As an alternative, one could tell libvirt when creating the snapshot to not record any information about it (--no-metadata)
virsh snapshot-delete 8196bb77-7478-4bfb-a6ea-52b3a2c65eba --current --metadata
 
# remove the snapshot
rm /var/virtualization/vm-persistent/0f83f084-8080-413e-b558-b678e504836e/711b08f2-7c26-4ac3-bb46-66176523d752.snapshot-01.qcow2
 
# update the domain XML to point to the original image again
virsh edit 8196bb77-7478-4bfb-a6ea-52b3a2c65eba

VSS & Qemu-GA

TODO: at the moment, the following error appears in the Windows Application Log:

Volume Shadow Copy Service error: Unexpected error querying for the IVssWriterCallback interface, hr = 0x80070005, Access is denied.

This error can be solved by executing the following steps (a verification example follows the list):

  • From the Start Menu select Run, enter dcomcnfg in the Open field and click OK (alternatively, run dcomcnfg.exe from a PowerShell or command prompt).
  • Expand Component Services, Computers, and My Computer.
  • Right-click My Computer and click Properties in the pop-up menu.
  • Click the COM Security tab.
  • Under Access Permissions, click Edit Default.
  • In the Access Permission dialog, add the "Network Service" account and allow Local Access.
  • Close all open dialogs.
  • Restart the computer.
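
After the restart, a quick way to verify from inside the guest that VSS is healthy again is to list the registered providers and writers in an elevated prompt (the QEMU guest agent's VSS provider, if installed, should show up alongside the Microsoft software provider):

vssadmin list providers
vssadmin list writers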