Information availability and IT operations require Data Backup. Legal and Compliance requirements dictate Data Archival. But many organizations make the mistake of equating Archive with Backup, which can lead to the wrong choice of backup or archival media, very poor restore times and even loss of information.
As part of an audit, an auditor reviewed the backup and archival system of a company. The company presented their backup systems, access controls and audit. When asked about archived data, they again pointed to the tapes containing their backup. But their backup tapes were rotated every 6 months, so the company did not have any archive older than 6 months.
The company failed the legal Archival requirement.
In order to properly design and architect a backup or archive system, one must clearly understand the differences between backup and archive:
The key reason for the existence of backup is to provide an alternative data source in case the primary data source is corrupted or destroyed. A backup process creates a copy of the current state of data. It is understood and accepted that the state of the backed-up data will change in the future under controlled circumstances. At that point the old backup will become irrelevant for operational purposes and the data will need to be backed up again.
Criteria for selecting a backup solution
- The backup needs to be quickly accessible
- The media should be reusable for maximum cost efficiency
- The media should survive transport in less than ideal conditions (trunk of a car)
- The backed up information should survive with full integrity and availability for several months on the backup media.
- The backup should be able to span multiple media (if the backup set is larger than the media capacity).
- The solution should be intelligent enough to enable different backup sets (full backup, incremental backup, differential backup etc.)
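The difference between the backup set types in the last criterion can be sketched in a few lines of Python. This is only an illustration under the usual definitions (a differential set holds everything changed since the last full backup, an incremental set holds everything changed since the last backup of any kind). The names FileInfo and select_files are hypothetical and do not come from any particular backup product.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    mtime: float  # last-modified time, seconds since the epoch


def select_files(files, mode, last_full, last_backup):
    """Return the paths that a backup run of the given mode would copy.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if mode == "full":
        # A full backup copies everything, changed or not.
        return [f.path for f in files]
    if mode == "differential":
        # Everything changed since the last FULL backup.
        return [f.path for f in files if f.mtime > last_full]
    if mode == "incremental":
        # Everything changed since the last backup of ANY kind.
        return [f.path for f in files if f.mtime > last_backup]
    raise ValueError(f"unknown backup mode: {mode}")
```

The trade-off this shows: incremental sets are the smallest, but a restore needs the last full set plus every incremental set made since then, while a differential restore needs only the last full set and the latest differential set.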
The key reason for the existence of archive is to provide historical reference of information. The final product of the archive process is a long-term, non-changeable copy of data or information. It is understood and accepted that the archive media must be resilient, capable of surviving over long periods of time (years) and must guarantee that the archived data remain unchanged during the entire archive lifespan.
Criteria for selecting archive solution
- The archive media needs to be able to operate with different data collections while treating them at the same level of integrity - individual data records from a database as well as entire documents
- The access speed to an archive can be slow, but archive media should have an extremely high level of reliability (remember, archives can span several decades)
- When creating an archive, always plan the lifetime of the archive, and make sure that the manufacturer will provide systems that can retrieve the stored data - having an archive that is unreadable because there is nothing to read it on is a terrible idea.
- Data integrity must be maintained over the entire period of the archive existence - there is no point in having an archive if you can't trust that it's the same as it was when archived.
- There should be an index of archive media to retrieve relevant information from the archive
Backup and archive solutions may be part of an integral system, but they perform different functions, so the actual media and individual systems will most likely vary.
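One common way to meet the integrity and index criteria together is to store a checksum manifest alongside the archive and re-verify it periodically. The sketch below is a minimal illustration, assuming archived items are available as bytes keyed by name. The function names are hypothetical, and SHA-256 is an assumed choice of hash rather than a requirement stated above.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(items: dict[str, bytes]) -> dict[str, str]:
    """Index every archived item by name and record its digest.

    Database records and whole documents are treated the same way:
    each is just a named sequence of bytes.
    """
    return {name: sha256_of(data) for name, data in items.items()}


def verify(items: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return the names of items that are missing or no longer match."""
    damaged = []
    for name, digest in manifest.items():
        data = items.get(name)
        if data is None or sha256_of(data) != digest:
            damaged.append(name)
    return damaged
```

An empty list from verify means every indexed item is present and unchanged since the manifest was built. The manifest itself should be kept on separate media, otherwise damage to the archive could take the index with it.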
While backup is still performed mostly on magnetic tapes, archive is usually performed on optical disks or microfilm. You may choose magnetic media for archive, but if you do, you need to plan for shielding your archive tapes from long-term adverse influences, and you must maintain a functional reader for the tapes over the entire lifespan of the archive.