Mis-saves happen. It is a fact of the computing world. The intelligent recognise this and provide for it. Cloud service vendors live with this reality and understand its implications. Consequently, they make extraordinary efforts to ensure that mis-saves do not cause data loss for their customers.
Most cloud vendors use versioning and time stamps to distinguish between backups and protect customer data against mis-saves.
Each file that is saved into the system is tagged with a unique identifier and a time stamp. The original, or first, copy of the file saved into the system is called the primary file, and any copies of the file saved from the same node or from different nodes are identified as replicas of the original. These replicas are then deleted and only one copy of the file is retained.
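The tagging and replica handling described above can be sketched in a few lines of Python. This is a minimal illustration rather than any vendor's implementation: the `Repository` and `StoredFile` names are hypothetical, and the use of a SHA-256 content hash to recognise replicas is an assumption made for the example.

```python
import hashlib
import time
import uuid
from dataclasses import dataclass


@dataclass
class StoredFile:
    file_id: str     # unique identifier assigned on first save
    saved_at: float  # time stamp of the first save
    digest: str      # content hash used to recognise replicas (assumption)
    data: bytes


class Repository:
    """Hypothetical store that retains one primary copy per distinct content."""

    def __init__(self):
        self._by_digest = {}

    def save(self, data, now=None):
        digest = hashlib.sha256(data).hexdigest()
        existing = self._by_digest.get(digest)
        if existing is not None:
            # A replica of a file already held: keep the primary, drop the copy.
            return existing
        stamp = time.time() if now is None else now
        primary = StoredFile(uuid.uuid4().hex, stamp, digest, data)
        self._by_digest[digest] = primary
        return primary

    def __len__(self):
        return len(self._by_digest)


repo = Repository()
first = repo.save(b"quarterly report", now=1.0)
replica = repo.save(b"quarterly report", now=2.0)  # same content, another node
other = repo.save(b"quarterly report v2", now=3.0)
```

Here `replica` is the very same record as `first`, so the repository holds two files rather than three, and the primary keeps its original time stamp.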
If changes are made to the file and it is saved into the repository, the new version of the file is compared with the existing version and tagged as a new version with a version identifier. The backup algorithm identifies the changes in the file and saves only those changes as the new version, with references to the unchanged content in the original file. Users, however, will still see all the content of whichever version they call up, because data from the original file replaces the referenced sections when the file is displayed.
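As a rough sketch of this idea, the Python below stores a new version as a list of entries, each of which is either a reference to an unchanged block of the original or the literal changed data. The fixed four-byte block size and the `make_delta` and `rebuild` helpers are assumptions chosen for readability; real backup products use their own, more elaborate comparison methods.

```python
BLOCK = 4  # tiny block size so the example is easy to follow (assumption)


def split(data):
    """Cut the data into fixed-size blocks."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def make_delta(base, new):
    """Describe `new` as references to unchanged base blocks plus changed data."""
    base_blocks = split(base)
    delta = []
    for i, block in enumerate(split(new)):
        if i < len(base_blocks) and base_blocks[i] == block:
            delta.append(("ref", i))       # unchanged: point at the original
        else:
            delta.append(("data", block))  # changed or new: store the bytes
    return delta


def rebuild(base, delta):
    """Recreate the full version by resolving references against the original."""
    base_blocks = split(base)
    return b"".join(
        base_blocks[value] if kind == "ref" else value for kind, value in delta
    )


original = b"AAAABBBBCCCCDDDD"
edited = b"AAAAXXXXCCCCDDDDEE"
delta = make_delta(original, edited)
restored = rebuild(original, delta)
```

Only the changed block `XXXX` and the appended `EE` are stored in the new version; the other three entries are references, yet `restored` is identical to `edited`.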
It follows that mis-saved files will not result in complete loss of data. Only the changes made to the new file will be lost, and users can recall the original version of the file from the backup repository and redo the required changes. They can even use the latest version of the file available on the system to rebuild the lost version, since most cloud vendors permit users to save and retain many versions of a file in the backup repository.
We, at Backup Technology, are powered by Asigra, a robust agentless cloud backup system. Our continuous backup system constantly monitors changes to files and saves each changed file as a new version. Since new versions are created by extracting the changes and creating pointers to unchanged content, they are smaller and occupy less disk space. Users can access all stored versions of a file. Additionally, the open file driver that comes with our software automatically backs up files that have been left open for long periods of time in applications such as Outlook, QuickBooks and Simply Accounting. It takes scheduled backup snapshots of the data and saves them to the servers. So, we invite you to try our software and experience first-hand the power of always having your files saved for you automatically and constantly, without conscious effort on your part!
Everyone who has worked with someone else will understand and appreciate the need for versioning. One member of the team may create a document and others may critically evaluate and modify its contents. Without versioning, each modification results in the original contents of the document being lost forever. If, at some point, the team would like to go back to the original version of the document, it may not be available to them, because all the changes have been made to the original document and saved. Versioning helps teams save the first document as the original version and every modified document as a modified version of the original. If the original document then has to be revisited or restored, users simply have to call up the document saved as the original.
Versioning technology is often packaged as part of a Document Management System (DMS). Each vendor of the technology may apply a different versioning algorithm and logic for identifying and numbering the versions of a document. Fundamentally, however, versions are registered sequentially. For instance, a versioning algorithm may number a document as A v.1 and all subsequent versions as A v.1.1, A v.1.2, and so on until the ceiling on the number of versions that can be stored is reached. The date of creation of the document version (generally the system date) may also be used as the version numbering mechanism.
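The two numbering schemes mentioned above might look like this in Python. The function names, the ceiling of four versions and the date used are hypothetical, picked only to make the example concrete; each vendor's actual format will differ.

```python
from datetime import date


def sequential_labels(name, ceiling):
    """Return labels 'A v.1', 'A v.1.1', 'A v.1.2', ... up to `ceiling` labels."""
    labels = []
    for n in range(ceiling):
        labels.append(f"{name} v.1" if n == 0 else f"{name} v.1.{n}")
    return labels


def dated_label(name, created):
    """Return a label that uses the creation date as the version identifier."""
    return f"{name} {created.isoformat()}"


labels = sequential_labels("A", 4)
stamp = dated_label("A", date(2024, 1, 5))  # arbitrary example date
```

With a ceiling of four, `labels` runs from `A v.1` to `A v.1.3`, after which no further labels are issued.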
Most backup and recovery systems limit the number of versions of a document that can be saved on the DMS. Some vendors allow users to save only a few tens of versions, while others may permit storage of a hundred versions of the document. As new versions of the document are added to the database, older versions may be archived, deleted or removed automatically from the storage repository. Users who wish to store more than the stipulated number of versions of a document may have to rename the document and store versions of the new original.
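A version limit of this kind can be sketched as follows. The `VersionHistory` class and the limit of three versions are hypothetical, and the choice to archive the oldest versions (rather than delete them) is just one of the options mentioned above.

```python
from collections import deque


class VersionHistory:
    """Hypothetical history that keeps at most `max_versions` live versions."""

    def __init__(self, max_versions):
        self.max_versions = max_versions
        self.live = deque()   # versions still available for restore
        self.archived = []    # oldest versions moved out of the live store

    def add(self, version):
        self.live.append(version)
        # Once the limit is exceeded, the oldest versions are archived first.
        while len(self.live) > self.max_versions:
            self.archived.append(self.live.popleft())


history = VersionHistory(max_versions=3)
for n in range(1, 6):
    history.add(f"v{n}")
```

After five saves against a limit of three, `v3`, `v4` and `v5` remain live while `v1` and `v2` have been archived.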
Versioning technology may be linked with de-duplication technology. Since de-duplication technology looks for exact duplicates to eliminate, all versions of a file that are duplicates of the original will be eliminated automatically. Versioning may also be linked with incremental and differential backup to ensure that only the modified portions of the file are stored in the new version of the document, with references to the original portions of the file so that the file can be rebuilt during recovery.
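The difference between incremental and differential backup can be shown with a small Python sketch. It assumes the usual definitions (an incremental backup holds what changed since the most recent backup of any kind, a differential backup holds what changed since the last full backup), uses plain dictionaries as stand-ins for file sets, and ignores deleted files for brevity. The `changed` helper is hypothetical.

```python
def changed(previous, current):
    """Return the entries of `current` that are new or differ from `previous`."""
    return {
        path: data for path, data in current.items() if previous.get(path) != data
    }


full = {"a.txt": "v1", "b.txt": "v1"}   # state captured by the full backup
day1 = {"a.txt": "v2", "b.txt": "v1"}   # state at the first later backup
day2 = {"a.txt": "v2", "b.txt": "v2"}   # state at the second later backup

incremental_day2 = changed(day1, day2)   # compared with the most recent backup
differential_day2 = changed(full, day2)  # compared with the last full backup
```

On the second day the incremental backup contains only `b.txt`, while the differential backup contains both files, because both have changed since the full backup.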
With the advent of cloud backup and cloud computing, versioning technologies have become immensely sophisticated, and document management has been fine-tuned to accommodate the varied needs of its users.
Versioning, as mentioned in the previous part of this article, Versioning for Cloud Computing - Part I, is the process of assigning numbers, with or without date stamps, to identify versions of a document or piece of data. Versioning at the backup level may create identities for the backup versions that are stored on the server. At the file level, each file may be assigned a version number to distinguish it from other versions of the file after modifications have been made. A few storage providers may treat a set of backups, documents, files or folders as objects and perform object versioning.
File versioning is the most commonly used versioning system in cloud computing. The first version of the file (available in the seeded backup or a subsequent full backup) is generally given the first number in the numbering scheme adopted, and every new version of the file is compared with the original version or the full backup version and numbered sequentially. The comparison process also enables the storage provider to initiate incremental backup processes, so that only the modified sections of the file are backed up and the unchanged portions link back to the original file. This saves bandwidth and backup time. If time stamps are available and the management has pre-set archival policies on the system via the agent interface, the files will be archived automatically.
Some vendors, such as Google, use object versioning systems. Objects are stored in buckets. All modifications to an object are part of the bucket, including archived versions of the object. Objects can be restored to an earlier state, overwritten, deleted or modified as required. Numeric object properties allow users to identify the different versions of an object.
Versioning can be switched on or off in both file and object versioning systems. Switching versioning off does not remove the identifying characteristics of files or objects already stored under the versioning system. In file versioning, the original version of a file can be restored without disturbing the current version. In object versioning, restoring an earlier version of the object results in the current version being overwritten.
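The behaviour of an object versioning system can be modelled with a toy in-memory bucket. This is not any vendor's API: the `Bucket` class, its `put` and `restore` methods and the numeric `generation` counter are all hypothetical, and the model simply assumes that a restore writes the archived data back as the live object.

```python
import itertools


class Bucket:
    """Toy in-memory bucket used only to illustrate object versioning."""

    def __init__(self, versioning=True):
        self.versioning = versioning
        self._generations = itertools.count(1)  # numeric version identifiers
        self.live = {}      # name -> (generation, data)
        self.archived = {}  # name -> {generation: data}

    def put(self, name, data):
        if name in self.live and self.versioning:
            # With versioning on, the replaced object is archived in the bucket.
            generation, old = self.live[name]
            self.archived.setdefault(name, {})[generation] = old
        generation = next(self._generations)
        self.live[name] = (generation, data)
        return generation

    def restore(self, name, generation):
        # The archived data becomes the live object, replacing the current one.
        return self.put(name, self.archived[name][generation])


bucket = Bucket()
g1 = bucket.put("report", "draft")
g2 = bucket.put("report", "final")    # "draft" is archived as generation g1
bucket.versioning = False             # switch versioning off
g3 = bucket.put("report", "oops")     # "final" is overwritten, not archived
g4 = bucket.restore("report", g1)     # "draft" becomes the live object again
```

Switching versioning off leaves the archived `draft` and its generation number untouched, while the later restore overwrites the live `oops` object.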
Versioning for the cloud is becoming more and more sophisticated as cloud vendors strive to differentiate themselves from the competition. This is especially true of cloud service vendors who want to offer their customers state-of-the-art collaboration tools and provide support for mobile and remote computing.
Many users within an enterprise often share data. The data may be modified, appended to or changed in some manner by users who are authorised to access the information. This creates a new version of the information. But what if the enterprise wants to undo the changes made to the data by a particular user? If the change affects a single record, it can be reversed manually. If multiple records have been changed, changing them back to the earlier version can be cumbersome and time-consuming. Versioning is the process of saving versions of documents before changes are made to them. If a change is not desirable, the enterprise simply has to restore the previous version of the document.
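As an illustration of undoing one user's changes across many records, the sketch below keeps every saved version together with its author. The `VersionedRecords` class, the record keys and the user names are hypothetical. A rollback is written as a new version, so the history itself is never erased.

```python
class VersionedRecords:
    """Hypothetical record store that keeps every version with its author."""

    def __init__(self):
        self._history = {}  # key -> list of (author, value), oldest first

    def save(self, key, value, author):
        self._history.setdefault(key, []).append((author, value))

    def current(self, key):
        return self._history[key][-1][1]

    def undo_user(self, user):
        """Roll back every record whose latest version was written by `user`."""
        reverted = []
        for key, versions in self._history.items():
            earlier = [value for author, value in versions if author != user]
            if versions[-1][0] == user and earlier:
                # Re-save the most recent version written by someone else.
                versions.append(("restore", earlier[-1]))
                reverted.append(key)
        return reverted


records = VersionedRecords()
records.save("price:1", 10, "alice")
records.save("price:1", 99, "bob")
records.save("price:2", 20, "alice")
records.save("price:2", 98, "bob")
records.save("price:2", 97, "bob")
records.save("price:3", 30, "alice")
reverted = records.undo_user("bob")
```

Both records last edited by `bob` return to the values saved by `alice`, and the untouched third record is left alone.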
How many versions of a document can be stored? Any limits pre-set by the service provider will restrict the number of versions of a document that can be created. Users may be able to customise the number within those pre-defined limits.
What benefits will the organisation derive from versioning? Versioning is really a tool for the management. Apart from making it possible to track and restore versions of documents, managerial version control enables the management to time stamp information and to weed out or archive versions of documents that are no longer relevant to the day-to-day activities of the business. Archiving and deleting release precious storage space that can be used for current business-critical documents.
Versioning is also a necessary adjunct to disaster recovery. Managers can quickly and efficiently identify the latest version of a document for restoration in the event of a natural or man-made disaster, so that the restored system can restart from the point at which disaster struck the digital repositories and caused the outage. Furthermore, document search is simplified if versioning is automated for the storage system.
There are many different types of versioning technologies used by different types of cloud storage providers. We shall discuss more about these different versioning technologies in the next part of this article, Versioning for Cloud Computing — Part II.