Changes - stoney cloud

stoney conductor: VM Backup

6,813 bytes added, 14:43, 27 June 2014


= Overview =

This page describes how the VMs and VM-Templates are backed-up and restored inside the [http://www.stoney-cloud.org stoney cloud].

= Requirements =

* sstBackupRootDirectory: file:///var/backup/virtualization

** This directory might be a single partition which needs to have the same size as your partition for the live images (it's a "copy" of the live partition)

* sstBackupRetainDirectory: file:///var/virtualization/retain

** This directory must be on the same partition as your live images

* A working stoney cloud, installed according to [[stoney cloud: Single-Node Installation]] or [[stoney cloud: Multi-Node Installation]].

* The backup configuration must be set: [[stoney_conductor:_OpenLDAP_directory_data_organisation#Backup | stoney conductor: OpenLDAP directory data organisation]].

= Backup =

== Basic idea ==

The main idea for backing up a VM or a VM-Template is to divide the task into three subtasks:

* createSnapshot: Create a disk-only snapshot. A new overlay file is created and all write operations are performed on this file. The underlying disk image is now read-only.
* exportSnapshot: Copy the read-only disk image to the backup location.
* commitSnapshot: Commit the write operations performed on the overlay back to the underlying (original) disk image. The underlying image is now read-write again and the overlay image can be deleted.

A more detailed and technical description for these three sub-processes can be found [[#Sub-Processes | here]].
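The three sub-processes can be sketched as a dry run that only prints the commands. This is a minimal sketch under assumptions, not the actual stoney cloud implementation: the function name <code>backup_commands</code>, the disk target <code>vda</code>, the snapshot name <code>backup</code> and the overlay file name are made up for illustration. The <code>virsh</code> options shown are standard libvirt options, but the commands actually used are the ones on the pages linked from the sub-process descriptions.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands for the three sub-processes instead of
# running them. All names (vm-001, vda, paths) are illustrative assumptions.

backup_commands() {
    local vm="$1" disk="$2" image="$3" backup_dir="$4"
    # Hypothetical overlay file name, placed next to the original image
    local overlay="${image%.qcow2}.overlay.qcow2"

    # createSnapshot: disk-only external snapshot, writes now go to the overlay
    echo "virsh snapshot-create-as ${vm} backup --disk-only --atomic --no-metadata --diskspec ${disk},file=${overlay}"
    # exportSnapshot: the underlying image is read-only now and can be copied
    echo "cp -p ${image} ${backup_dir}/"
    # commitSnapshot: merge the overlay back and pivot to the original image
    echo "virsh blockcommit ${vm} ${disk} --active --pivot"
    # After the pivot the overlay is no longer referenced and can be deleted
    echo "rm ${overlay}"
}

backup_commands vm-001 vda /path/to/images/vm-001.qcow2 /path/to/backup
```

Printing the commands instead of executing them keeps the sketch safe to try on any host; the order of the four lines is the important part.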

Furthermore, there is a control instance which can independently call these three sub-processes for a given machine. This way, the stoney cloud is able to handle different cases:

=== Backup a single machine ===

The procedure for backing up a single machine is very simple: just call the three sub-processes (createSnapshot, exportSnapshot and commitSnapshot) one after the other. So the control instance would do some very basic stuff:

 object machine = args[0];
 # Each sub-process is only called if the previous one succeeded
 if ( createSnapshot( machine ) )
 {
     if ( exportSnapshot( machine ) )
     {
         if ( commitSnapshot( machine ) )
         {
             printf("Successfully backed up machine %s\n", machine);
         } else
         {
             printf("Error while committing snapshot for machine %s: %s\n", machine, error);
         }
     } else
     {
         printf("Error while exporting snapshot for machine %s: %s\n", machine, error);
     }
 } else
 {
     printf("Error while creating snapshot for machine %s: %s\n", machine, error);
 }

=== Backup multiple machines at the same time ===

When backing up multiple machines at the same time, we need to make sure that the snapshots for the machines are taken as close together as possible. Therefore the control instance should first call the createSnapshot process for all machines. After every machine has been snapshotted, the control instance can call the exportSnapshot and commitSnapshot processes for every machine. The most important part here is that the control instance remembers whether the snapshot for a given machine was successful or not: if the snapshot failed, it must not call the exportSnapshot and commitSnapshot processes for that machine. So the control instance needs a little bit more logic:

 object machines[] = args;
 object successful_snapshots[];
 # Snapshot all machines first, so the snapshots are close together
 for ( int i = 0; i < sizeof(machines) / sizeof(object); i++ )
 {
     # If the snapshot was successful, put the machine into the
     # successful_snapshots array
     if ( createSnapshot( machines[i] ) )
     {
         successful_snapshots[i] = machines[i];
     }
 }
 # Export and commit all successful_snapshots machines
 for ( int i = 0; i < sizeof(successful_snapshots) / sizeof(object); i++ )
 {
     if ( successful_snapshots[i] )
     {
         if ( exportSnapshot( successful_snapshots[i] ) )
         {
             if ( commitSnapshot( successful_snapshots[i] ) )
             {
                 printf("Successfully backed up machine %s\n", successful_snapshots[i]);
             } else
             {
                 printf("Error while committing snapshot for machine %s: %s\n", successful_snapshots[i], error);
             }
         } else
         {
             printf("Error while exporting snapshot for machine %s: %s\n", successful_snapshots[i], error);
         }
     }
 }
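The two-phase logic for multiple machines can also be written as a small runnable shell sketch. The three sub-process functions below are stubs (assumptions for illustration, not the real implementation), and the stub for <code>createSnapshot</code> pretends that the machine <code>vm-002</code> fails, to show that a failed snapshot excludes a machine from the export and commit phase.

```shell
#!/usr/bin/env bash
# Stubs standing in for the real sub-processes (hypothetical behaviour).
createSnapshot() { [ "$1" != "vm-002" ]; }   # pretend the snapshot of vm-002 fails
exportSnapshot() { true; }
commitSnapshot() { true; }

backup_machines() {
    local machine
    local -a snapshotted=()

    # Phase 1: snapshot all machines first, so the snapshots are close together
    for machine in "$@"; do
        if createSnapshot "$machine"; then
            snapshotted+=("$machine")
        else
            echo "Error while creating snapshot for machine $machine"
        fi
    done

    # Phase 2: export and commit only the successfully snapshotted machines
    for machine in "${snapshotted[@]}"; do
        if ! exportSnapshot "$machine"; then
            echo "Error while exporting snapshot for machine $machine"
        elif ! commitSnapshot "$machine"; then
            echo "Error while committing snapshot for machine $machine"
        else
            echo "Successfully backed up machine $machine"
        fi
    done
}

backup_machines vm-001 vm-002 vm-003
```

Collecting the successful machines in a separate array keeps the second loop free of any knowledge about what went wrong during the first one.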

=== Sub-Processes ===

==== createSnapshot ====

For the commands see [[Libvirt_external_snapshot_with_GlusterFS#Part_2:_Create_the_snapshot_using_virsh]]

For the workflow see [[stoney_conductor:_prov-backup-kvm#createSnapshot]]

==== exportSnapshot ====

# Simply copy the underlying image to the backup location

#* <source lang="bash">cp -p /<path>/<to>/<image>.qcow2 /<path>/<to>/<backup>/<location>/.</source>

For the workflow see [[stoney_conductor:_prov-backup-kvm#exportSnapshot]]

==== commitSnapshot ====

For the commands see [[Libvirt_external_snapshot_with_GlusterFS#Cleanup.2FCommit_.28Online.29]]

'''Please note:''' After these steps, the live-disk-image <code>/path/to/image/vm-001.qcow2</code> no longer contains a reference to the image at the retain location (<code>/path/to/retain/vm-001.qcow2</code>).

For the workflow see [[stoney_conductor:_prov-backup-kvm#commitSnapshot]]

== Communication through backend ==

Since the stoney cloud is (as the name already says) a cloud solution, it makes sense to have a backend (in our case OpenLDAP) involved in the whole process. This makes it possible to run the backup jobs decentralized on every vm-node. The control instance can then modify the backend, and these changes are seen by the different backup daemons on the vm-nodes. The communication could look as shown in the following picture (Figure 1):

[[File:Daemon-communication.png|800px|thumbnail|none|Figure 1: Communication between the control instance and the prov-backup-kvm daemon through the LDAP backend]]

You can modify/update this workflow by editing [[File:Daemon-communication.xmi]] (you may need [http://uml.sourceforge.net/ Umbrello UML Modeller] diagram programme for KDE to display the content properly).

=== Control-Instance Daemon Interaction for creating a Backup with LDIF Examples ===

</pre>

==== Step 06: Start the export Process (Control instance daemon) ====

With the setting of the '''sstProvisioningMode''' to '''export''', the Control instance daemon tells the Provisioning-Backup-KVM daemon to export the disk image to the backup location.

<pre>

# The attribute sstProvisioningState is set to zero by the fc-brokerd, when sstProvisioningMode is modified to

# export (this way the Provisioning-Backup-KVM daemon knows, that it must start the export process).

dn: ou=20121002T010000Z,ou=backup,sstVirtualMachine=kvm-005,ou=virtual machines,ou=virtualization,ou=services,o=stepping-stone,c=ch

changetype: modify

replace: sstProvisioningMode

sstProvisioningMode: export

</pre>

==== Step 07: Starting the export Process (Provisioning-Backup-KVM daemon) ====

As soon as the Provisioning-Backup-KVM daemon receives the export command, it sets the '''sstProvisioningMode''' to '''exporting''' to tell the Control instance daemon and other interested parties, that it is exporting the virtual machine or virtual machine template disk images.

<pre>

# The attribute sstProvisioningMode is set to exporting by the Provisioning-Backup-KVM daemon.

dn: ou=20121002T010000Z,ou=backup,sstVirtualMachine=kvm-005,ou=virtual machines,ou=virtualization,ou=services,o=stepping-stone,c=ch

changetype: modify

replace: sstProvisioningMode

sstProvisioningMode: exporting

</pre>

==== Step 08: Finalizing the export Process (Provisioning-Backup-KVM daemon) ====

As soon as the Provisioning-Backup-KVM daemon has executed the export command, it sets the '''sstProvisioningMode''' to '''exported''', the '''sstProvisioningState''' to the current timestamp (UTC) and '''sstProvisioningReturnValue''' to zero to tell the Control instance daemon and other interested parties, that the export of the virtual machine or virtual machine template disk images is finished.

<pre>

# The attribute sstProvisioningState is set with the current timestamp by the Provisioning-Backup-VKM daemon, when

replace: sstProvisioningMode

sstProvisioningMode: exported

</pre>

==== Step 09: Start the commit Process (Control instance daemon) ====

With the setting of the '''sstProvisioningMode''' to '''commit''', the Control instance daemon tells the Provisioning-Backup-KVM daemon to commit the changes from the overlay file to the underlying disk image.

<pre>

# The attribute sstProvisioningState is set to zero by the fc-brokerd, when sstProvisioningMode is modified to

# commit (this way the Provisioning-Backup-KVM daemon knows, that it must start the commit process).

dn: ou=20121002T010000Z,ou=backup,sstVirtualMachine=kvm-005,ou=virtual machines,ou=virtualization,ou=services,o=stepping-stone,c=ch

changetype: modify

replace: sstProvisioningMode

sstProvisioningMode: commit

</pre>

==== Step 10: Starting the commit Process (Provisioning-Backup-KVM daemon) ====

As soon as the Provisioning-Backup-KVM daemon receives the commit command, it sets the '''sstProvisioningMode''' to '''committing''' to tell the Control instance daemon and other interested parties, that it is committing changes from the overlay disk images back to the underlying ones.

<pre>

# The attribute sstProvisioningMode is set to committing by the Provisioning-Backup-KVM daemon.

dn: ou=20121002T010000Z,ou=backup,sstVirtualMachine=kvm-005,ou=virtual machines,ou=virtualization,ou=services,o=stepping-stone,c=ch

changetype: modify

replace: sstProvisioningMode

sstProvisioningMode: committing

</pre>

==== Step 11: Finalizing the commit Process (Provisioning-Backup-KVM daemon) ====

As soon as the Provisioning-Backup-KVM daemon has executed the commit command, it sets the '''sstProvisioningMode''' to '''committed''', the '''sstProvisioningState''' to the current timestamp (UTC) and '''sstProvisioningReturnValue''' to zero to tell the Control instance daemon and other interested parties, that the committing of the changes from the overlay disk images back to the underlying ones is done.

<pre>

# The attribute sstProvisioningState is set with the current timestamp by the Provisioning-Backup-VKM daemon, when

replace: sstProvisioningMode

sstProvisioningMode: committed

</pre>

==== Step 12: Finalizing the Backup Process (Control instance daemon) ====

As soon as the Control instance daemon notices that the attribute '''sstProvisioningMode''' is set to '''committed''', it sets the '''sstProvisioningMode''' to '''finished''' and the '''sstProvisioningState''' to the current timestamp (UTC). All interested parties now know that the backup process is finished, and therefore a new backup process could be started.

<pre>

# The attribute sstProvisioningState is updated with current time by the fc-brokerd, when sstProvisioningMode is

</pre>
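The sequence of '''sstProvisioningMode''' values described in steps 06 to 12 can be summarised as a small state machine. The following is a sketch only: the function names <code>next_mode</code> and <code>mode_ldif</code> are hypothetical helpers, the modes used before step 06 are not covered, and nothing is written to an LDAP directory; the LDIF is merely printed in the same shape as the examples above.

```shell
#!/usr/bin/env bash
# Sketch of the sstProvisioningMode sequence from steps 06 to 12.
# export, commit and finished are set by the Control instance daemon,
# the other values by the Provisioning-Backup-KVM daemon.

next_mode() {
    case "$1" in
        export)     echo exporting ;;
        exporting)  echo exported ;;
        exported)   echo commit ;;
        commit)     echo committing ;;
        committing) echo committed ;;
        committed)  echo finished ;;
        *)          return 1 ;;      # finished (or unknown): no further transition
    esac
}

# Print an LDIF modification that replaces sstProvisioningMode on the given DN
mode_ldif() {
    printf 'dn: %s\nchangetype: modify\nreplace: sstProvisioningMode\nsstProvisioningMode: %s\n' "$1" "$2"
}

dn="ou=20121002T010000Z,ou=backup,sstVirtualMachine=kvm-005,ou=virtual machines,ou=virtualization,ou=services,o=stepping-stone,c=ch"

# Walk through all transitions, starting at step 06
mode=export
while next="$(next_mode "$mode")"; do
    echo "${mode} -> ${next}"
    mode="$next"
done

# LDIF for the last step (step 12)
mode_ldif "$dn" "$mode"
```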

== Current Implementation (Backup) ==

Since we do not have a working control instance, we need to have a workaround for backing up the machines:

* We do already have a BackupKVMWrapper.pl script (File-Backend) which executes the three [[#Sub-Processes | sub-processes ]] in the correct order for a given list of machines (see [[#Backup multiple machines at the same_time]]).

* We do already have the implementation for the whole backup with the LDAP-Backend (see [[ stoney conductor: prov backup kvm ]]).

* We can now combine these two existing scripts and create a wrapper (let's call it LDAPKVMWrapper) which adds some logic to the BackupKVMWrapper.pl. In fact, the LDAPKVMWrapper will generate the list of machines which need a backup.

The behaviour on our servers is as follows (c.f. Figure 2):

# The (decentralized) LDAPKVMWrapper (which is executed every day via cronjob) generates a list of all machines running on the current host.
#* Currently the cronjob on the hosts looks like: <code>00 01 * * * /usr/bin/LDAPKVMWrapper.pl | logger -t Backup-KVM</code>

#* For each of these machines:

#** Check if the machine is excluded from the backup, if yes, remove the machine from the list

#* Remove the old backup leaf (the "yesterday-leaf"), and add a new one (the "today-leaf")

#* After this step, the machines are ready to be backed up

# Call the KVMBackupWrapper.pl script with the machines list as a parameter
# Wait for the KVMBackupWrapper.pl script to finish

# Go again through all machines and update the backup subtree a last time

#* Check if the backup was successful, if yes, set sstProvisioningMode = finished (see also TBD)
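The exclusion check in the first step of the list above can be sketched as follows. This is an illustration under assumptions, not the code of the wrapper: <code>fetch_backup_entry</code> is a hypothetical stand-in for a directory lookup of the machine's backup subtree (here it just prints a shortened example entry and pretends that <code>vm-002</code> is excluded), and the function names are made up.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a lookup of the machine's backup entry in the
# LDAP directory; vm-002 is pretended to be excluded from the backup.
fetch_backup_entry() {
    case "$1" in
        vm-002) printf 'dn: ou=backup,sstVirtualMachine=%s\nou: backup\nsstbackupexcludefrombackup: TRUE\n' "$1" ;;
        *)      printf 'dn: ou=backup,sstVirtualMachine=%s\nou: backup\n' "$1" ;;
    esac
}

# Succeeds if the LDIF on stdin marks the machine as excluded.
# LDAP attribute names are case-insensitive, hence the -i.
is_excluded() {
    grep -qiE '^sstBackupExcludeFromBackup:[[:space:]]*TRUE[[:space:]]*$'
}

# Print only the machines that are not excluded from the backup
machines_to_backup() {
    local machine
    for machine in "$@"; do
        if ! fetch_backup_entry "$machine" | is_excluded; then
            echo "$machine"
        fi
    done
}

machines_to_backup vm-001 vm-002 vm-003
```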


[[File:wrapper-interaction.png|650px|thumbnail|none|Figure 2: How the two wrappers interact with the LDAP backend]]

You can modify/update this workflow by editing [[File:wrapper-interaction.xmi]] (you may need [http://uml.sourceforge.net/ Umbrello UML Modeller] diagram programme for KDE to display the content properly).

* If for some reason something does not work at all, the whole backup process can be deactivated by simply disabling the LDAPKVMWrapper cronjob:
** <code>crontab -e</code>
** Comment out the LDAPKVMWrapper cronjob line: <code>#00 01 * * * /usr/bin/LDAPKVMWrapper.pl | logger -t Backup-KVM</code>

=== How to exclude a machine from the backup ===

If you want to exclude a machine from the backup run, log in to one of the [[VM-Node | vm-nodes]] and add the following entry to your LDAP directory:

<source lang="bash">
machineuuid="<UUID OF THE MACHINE-NAME>" # e.g.: b9d13dbc-9ab7-4948-9daa-a5709de83dc2
cat << EOF | ldapadd -D cn=Manager,o=stepping-stone,c=ch -H ldaps://ldapm.stepping-stone.ch/ -W -x
dn: ou=backup,sstVirtualMachine=${machineuuid},ou=virtual machines,ou=virtualization,ou=services,o=stepping-stone,c=ch
objectclass: top
objectclass: organizationalUnit
objectclass: sstVirtualizationBackupObjectClass
ou: backup
sstbackupexcludefrombackup: TRUE
EOF
</source>

If the backup subtree in the LDAP directory already exists, you need to add the sstbackupexcludefrombackup attribute:

<source lang="bash">
machineuuid="<UUID OF THE MACHINE-NAME>" # e.g.: b9d13dbc-9ab7-4948-9daa-a5709de83dc2
cat << EOF | ldapadd -D cn=Manager,o=stepping-stone,c=ch -H ldaps://ldapm.stepping-stone.ch/ -W -x
dn: ou=backup,sstVirtualMachine=${machineuuid},ou=virtual machines,ou=virtualization,ou=services,o=stepping-stone,c=ch
changetype: modify
add: objectClass
objectClass: sstVirtualizationBackupObjectClass
-
add: sstbackupexcludefrombackup
sstbackupexcludefrombackup: TRUE
EOF
</source>

==== Re-include the machine to the backup ====

If you want to re-include a machine, simply delete the machine's whole backup subtree. It will be recreated during the next backup run.

== Next steps ==

= Restore =

'''Attention:''' The restore process is not yet defined / nor implemented. The following documentation is about the old restore process.

== Basic idea ==

The restore process, similar to the backup process, can be divided into three sub-processes:

** The user can also abort the restore process up to this point. After that the restore can not be aborted or undone!

* Non-User-Interaction phase: The daemons communicate through the backend between each other and the restore process continues without further user input (c.f. [[#Communication_through_backend_2 | Communication through backend]])

=== Sub Processes ===

==== Unretain small files ====

This workflow assumes that the backup directory is on the same physical server as the retain directory (protocol is file://)

# Copy the backend-entry file from the backup directory to the retain directory:

#* <source lang="bash">cp -p /path/to/backup/vm-001.backend /path/to/retain/vm-001.backend</source>

# Copy the XML description from the backup directory to the retain directory:

#* <source lang="bash">cp -p /path/to/backup/vm-001.xml /path/to/retain/vm-001.xml</source>

# Compare the backend-entry file (the one in the retain directory) with the live-backend entry

#* Resolve all conflicts between these two backend entries

#** Modify the backend entry at the retain location accordingly

# Apply the same changes for the XML description at the retain location (backend entry and XML description need to be consistent).

==== Unretain large files ====

# Copy the state file from the backup directory to the retain directory:

#* <source lang="bash">cp -p /path/to/backup/vm-001.state /path/to/retain/vm-001.state</source>

# Copy the disk image(s) from the backup directory to the retain directory:

#* <source lang="bash">cp -p /path/to/backup/vm-001.qcow2 /path/to/retain/vm-001.qcow2</source>

#** '''Important:''' If a VM has more than just one disk image, repeat this step for every disk image

==== Restore the VM ====

# Shutdown the VM if it is running:

#* <source lang="bash">virsh shutdown vm-001</source>

# Undefine the VM if it is still defined:

#* <source lang="bash">virsh undefine vm-001</source>

# Overwrite the original disk image:

#* <source lang="bash">mv /path/to/retain/vm-001.qcow2 /path/to/images/vm-001.qcow2</source>

#** '''Important:''' If a VM has more than just one disk image, repeat this step for every disk image

# Restore the VMs backend entry:

#* Write the backend entry from the retain location (<code>/path/to/retain/vm-001.backend</code>) to the backend

# Overwrite the VMs XML description with the one from the retain location

#* <source lang="bash">cp -p /path/to/retain/vm-001.xml /path/to/xmls/vm-001.xml</source>

# Restore the VM from the state file with the corrected XML

#* <source lang="bash">virsh restore /path/to/retain/vm-001.state --xml /path/to/xmls/vm-001.xml</source>

== Communication through backend ==

The actual KVM-Restore process is controlled completely by the Control instance daemon via the OpenLDAP directory. See [[#OpenLDAP Directory Integration|OpenLDAP Directory Integration]] for the involved attributes and possible values.

[[File:Daemon-interaction-restore.png|thumb|650px|none|Figure 3: Communication between all involved parties during the restore process]]

You can modify/update these interactions by editing [[File:Restore-Interaction.xmi]] (you may need [http://uml.sourceforge.net/ Umbrello UML Modeller] diagram programme for KDE to display the content properly).

=== Control instance Daemon Interaction for restoring a Backup with LDIF Examples ===

</pre>

== Current Implementation (Restore) ==

'''Attention''': The restore process is not yet defined / nor implemented. The following documentation is about the old restore process.

* Since the prov-backup-kvm daemon is not running on the vm-nodes (c.f. [[stoney_conductor:_Backup#Current_Implementation_.28Backup.29]]), the restore process does not work when clicking the icon in the webinterface.

=== How to manually restore a machine from backup ===

'''Important''': Before you continue with this guide, make sure that you have no other possibility to restore the machine. It might be easier and safer to get lost files from the online backup if the machine has one set up.

If you really have to restore the machine from the backup:
# Stop the machine via the [https://cloud.stepping-stone.ch/vm-manager/ web interface]
# Login (as root) on the [[VM-Node]] the machine was running on

As a first step, set some useful bash variables to be able to copy and paste the commands of the following guide: '''Double check all variables you are setting here. If one is not correct, you will restore a running machine or overwrite a live-disk image!'''
<source lang='bash'>
machinename="<MACHINE-NAME>" # For example: machinename="b6dc3d27-5981-4b18-8f3f-31ed3d21a3c6"
vmpool="<VM-POOL>" # For example vmpool="0f83f084-8080-413e-b558-b678e504836e"
vmtype="<VM-TYPE>" # For example vmtype="vm-persistent"
</source>

Change to the backup directory for the given machine and check the iterations:
<source lang='bash'>
cd /var/backup/virtualization/${vmtype}/${vmpool}/${machinename}
ls -al
</source>

Change into the most recent iteration:
<source lang='bash'>
cd 2014...
ls -al
</source>

In there you should have:
* The state file <MACHINE-NAME>.state.<BACKUP-DATE> (for example b6dc3d27-5981-4b18-8f3f-31ed3d21a3c6.state.20140109T134445Z)
* The XML description <MACHINE-NAME>.xml.<BACKUP-DATE> (for example b6dc3d27-5981-4b18-8f3f-31ed3d21a3c6.xml.20140109T134445Z)
* The ldif file <MACHINE-NAME>.ldif.<BACKUP-DATE> (for example b6dc3d27-5981-4b18-8f3f-31ed3d21a3c6.ldif.20140109T134445Z)
* And at least one disk image <DISK-IMAGE>.qcow2.<BACKUP-DATE> (for example 8798561b-d5de-471b-a6fc-ec2b4831ed12.qcow2.20140109T134445Z)

Now save the backup date and the disk image(s) in variables:
<source lang='bash'>
backupdate="<BACKUP-DATE>" # For example: backupdate="20140109T134445Z"
diskimage1="<DISK-IMAGE-1>.qcow2" # For example: diskimage1="8798561b-d5de-471b-a6fc-ec2b4831ed12.qcow2"
diskimage2="<DISK-IMAGE-2>.qcow2" # For example: diskimage2="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee.qcow2"
...
</source>

Have a look at the different variables again and '''double check them again''':
<source lang='bash'>
echo "Machine Name = ${machinename}"
echo "VM Pool = ${vmpool}"
echo "VM Type = ${vmtype}"
echo "Backup date = ${backupdate}"
echo "Disk Image 1 = ${diskimage1}"
echo "Disk Image 2 = ${diskimage2}"
...
</source>

Copy all these files to the retain location:
<source lang='bash'>
currentdate=`date --utc +'%Y%m%dT%H%M%SZ'`
mkdir -p /var/virtualization/retain/${vmtype}/${vmpool}/${machinename}/${currentdate}
cp -p /var/backup/virtualization/${vmtype}/${vmpool}/${machinename}/${backupdate}/${machinename}.ldif.${backupdate} /var/virtualization/retain/${vmtype}/${vmpool}/${machinename}/${currentdate}/
</source>

'''Now you are entering the critical part. You won't be able to undo the following steps.'''

Check if there is a difference between the current LDAP entry and the one from the backup
<source lang='bash'>
domain="<DOMAIN>" # For example domain="stoney-cloud.org"
ldapbase="<LDAPBASE>" # For example ldapbase="dc=stoney-cloud,dc=org"
ldapsearch -H ldaps://ldapm.${domain} -b "sstVirtualMachine=${machinename},ou=virtual machines,ou=virtualization,ou=services,${ldapbase}" -s sub -x -LLL -o ldif-wrap=no -D "cn=Manager,${ldapbase}" -W "(objectclass=*)" > /tmp/${machinename}.ldif
diff -Naur /tmp/${machinename}.ldif /var/virtualization/retain/${vmtype}/${vmpool}/${machinename}/${currentdate}/${machinename}.ldif.${backupdate}
</source>
and '''edit the file at the retain location''' according to your needs.

If there are no differences (or the differences are not important) you can skip the following step. Otherwise use the [https://cloud.stepping-stone.ch/phpldapadmin PhpLdapAdmin] to delete the machine from the LDAP directory (do not forget to delete the dhcp entry <code>dn: cn=<MACHINE-NAME>,ou=virtual machines,cn=192.168.140.0,cn=config-01,ou=dhcp,ou=networks,ou=virtualization,ou=services,dc=stoney-cloud,dc=org</code>). Then add the LDIF (the one you just edited) to the LDAP (first do some general replacement):
<source lang='bash'>
sed -i\
 -e 's/snapshotting/finished/'\
 -e '/member.*/d'\
 /var/virtualization/retain/${vmtype}/${vmpool}/${machinename}/${currentdate}/${machinename}.ldif.${backupdate}
/usr/bin/ldapadd -H "ldaps://ldapm.${domain}" -x -D "cn=Manager,${ldapbase}" -W -f /var/virtualization/retain/${vmtype}/${vmpool}/${machinename}/${currentdate}/${machinename}.ldif.${backupdate}
</source>

Undefine the machine:
<source lang='bash'>
virsh undefine ${machinename}
</source>

Copy all the disk images from the backup location back to their original location:
<source lang='bash'>
cp -p /var/backup/virtualization/${vmtype}/${vmpool}/${machinename}/${backupdate}/${diskimage1}.${backupdate} /var/virtualization/${vmtype}/${vmpool}/${diskimage1}
cp -p /var/backup/virtualization/${vmtype}/${vmpool}/${machinename}/${backupdate}/${diskimage2}.${backupdate} /var/virtualization/${vmtype}/${vmpool}/${diskimage2}
...
</source>

And restore the domain from the state file from the backup location with the XML from the retain location (the one you might have edited):
<source lang='bash'>
virsh restore /var/backup/virtualization/${vmtype}/${vmpool}/${machinename}/${backupdate}/${machinename}.state.${backupdate}
</source>

Now the machine should be up and running again, continuing where it was stopped when taking the backup. If everything is OK, you can clean up the created files and directories:
<source lang='bash'>
rm -rf /var/virtualization/retain/${vmtype}/${vmpool}/${machinename}/${currentdate}
rm /tmp/${machinename}.ldif
</source>
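Because the manual restore cannot be undone once the critical part is entered, it may help to verify beforehand that all expected files of the chosen iteration are present. The following is a minimal sketch, not part of the original guide: <code>check_backup_files</code> is a hypothetical helper name, and it only relies on the file naming scheme listed above (<code><NAME>.<BACKUP-DATE></code>).

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks that the state, XML and ldif files and all
# given disk images exist in a backup iteration directory.
# Usage: check_backup_files <directory> <machinename> <backupdate> [diskimage ...]
check_backup_files() {
    local dir="$1" machinename="$2" backupdate="$3"
    shift 3
    local missing=0 file
    for file in "${machinename}.state" "${machinename}.xml" "${machinename}.ldif" "$@"; do
        if [ ! -f "${dir}/${file}.${backupdate}" ]; then
            echo "Missing: ${dir}/${file}.${backupdate}" >&2
            missing=1
        fi
    done
    # Returns 0 if everything is present, 1 if at least one file is missing
    return "$missing"
}
```

With the variables of the guide set, a call could look like <code>check_backup_files "/var/backup/virtualization/${vmtype}/${vmpool}/${machinename}/${backupdate}" "${machinename}" "${backupdate}" "${diskimage1}" "${diskimage2}"</code>; a non-zero return value means that at least one file is missing and the restore should not be started.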

== Next steps ==

[[Category: stoney conductor]]

Michael
