Virtual machine protection is one of the most challenging tasks in cloud computing. Backing up the data is only half the job; it is recovering the data that needs the most attention. Defining a few metrics upfront will help organisations back up and recover their data efficiently.
The most important metric in virtual machine recovery is the Recovery Time Objective (RTO): how quickly must the virtual machine be recovered and made operational after a disaster or human error? This metric has to be defined by business managers, not by IT personnel. The answer will vary with the kind of VM that needs to be recovered. If the VM is mission-critical, the RTO will be very tight and the organisation may need to switch to a hot site or disaster recovery site at the point of failure. It will have to select a continuous backup option, with continuous mirroring or replication of data to an alternate site for high availability. If the VM is non-critical, the organisation can afford to wait a specified time for the primary server to be recovered.
Closely linked with the RTO is the Recovery Point Objective (RPO), which specifies the acceptable data loss window. Can the organisation afford to lose a few minutes or a few hours of data input? The question arises because data entered after the most recent backup may not have been saved at the point of failure, and so may not be available for recovery. If the organisation cannot afford to lose even a few minutes of data entry, it should select a continuous data backup system. If a few hours of data loss does not make much of a difference, scheduled backups will suffice.
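The way RTO and RPO drive the choice of protection strategy can be sketched in a few lines of Python. This is only an illustration: the thresholds `TIGHT_RTO` and `TIGHT_RPO`, the class `RecoveryObjectives` and the function `choose_strategy` are hypothetical names and values, not taken from any product, and each business would set its own limits.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjectives:
    """Business-defined recovery targets for a single VM."""
    rto: timedelta  # maximum tolerable time to restore service
    rpo: timedelta  # maximum tolerable window of lost data


# Hypothetical thresholds, chosen only for illustration.
TIGHT_RTO = timedelta(minutes=15)
TIGHT_RPO = timedelta(minutes=5)


def choose_strategy(objectives: RecoveryObjectives) -> str:
    """Map RTO/RPO targets to a coarse protection strategy."""
    if objectives.rto <= TIGHT_RTO:
        # Mission-critical VM: replicate continuously and fail over to a hot site.
        return "continuous replication with hot-site failover"
    if objectives.rpo <= TIGHT_RPO:
        # Slow recovery is acceptable, but almost no data may be lost.
        return "continuous data backup"
    # Both targets are relaxed, so periodic backups are enough.
    return "scheduled backup"


if __name__ == "__main__":
    billing = RecoveryObjectives(rto=timedelta(minutes=5), rpo=timedelta(minutes=1))
    wiki = RecoveryObjectives(rto=timedelta(hours=24), rpo=timedelta(hours=12))
    print(choose_strategy(billing))
    print(choose_strategy(wiki))
```

The RTO is checked first because a tight RTO already implies continuous replication in the scheme described above; the RPO only decides between continuous and scheduled backup once a slower recovery is acceptable.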
An unfortunate aspect of virtual machine deployment is the potential for virtual machine sprawl. Because virtual machines can be created on the fly, with no additional hardware or software to requisition, they tend to multiply unchecked. The resource impact of these rogue virtual machines can be immense, and organisations with this problem will have to either rationalise their VM deployments or provision for the time required to back up and recover these machines.
Note that several adjunct systems may need to be recovered along with the VM. The Recovery Time Objective and Recovery Point Objective defined by the business must take into account the time required to recover these systems in addition to the VM itself.
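As a rough sketch of this point, the check below compares an RTO against the combined restore time of a VM and its adjunct systems. The function names, the system names and all durations are made up for illustration, and the model assumes restores run either strictly one after another or fully in parallel, which real environments rarely match exactly.

```python
from datetime import timedelta


def total_recovery_time(vm_restore, dependencies, parallel=False):
    """Estimate the time until the VM and all adjunct systems are restored.

    dependencies maps a system name to its estimated restore time.
    """
    dep_times = list(dependencies.values())
    if parallel:
        # Everything is restored at once, so the slowest item sets the pace.
        return max([vm_restore] + dep_times)
    # Restores run one after another, so the times add up.
    return vm_restore + sum(dep_times, timedelta())


def meets_rto(rto, vm_restore, dependencies, parallel=False):
    """Return True if the estimated total recovery time fits within the RTO."""
    return total_recovery_time(vm_restore, dependencies, parallel) <= rto


if __name__ == "__main__":
    # Illustrative figures only.
    rto = timedelta(minutes=60)
    vm_restore = timedelta(minutes=30)
    adjunct = {
        "database": timedelta(minutes=45),
        "directory service": timedelta(minutes=10),
    }
    print(meets_rto(rto, vm_restore, adjunct))                 # sequential restore
    print(meets_rto(rto, vm_restore, adjunct, parallel=True))  # parallel restore
```

With these sample figures the VM alone restores well within the hour, yet a sequential restore of the VM plus its adjunct systems takes 85 minutes and misses the RTO. An objective set with only the VM in mind can therefore be unachievable in practice.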