Last month, we hosted a special conference about disaster recovery, featuring George Crump, founder of Storage Switzerland, and Éric Poulin from EMC.
George’s Texas-based company is all about data protection and recovery solutions using the latest IT technologies. George takes a hands-on analytical approach, drawing on his many years of experience working in data centers, on the front line of data protection and disaster recovery. His goal is to prepare companies for worst-case scenarios, which unfortunately tend to happen more often than business owners think.
In his words, backup and recovery is a process. Too often the technology is blamed when things go wrong, when the real failure lies in how that technology is used.
In this article, we share George’s take on the major problems of disaster recovery plans and how to resolve them by following the 3 R’s.
Major problems of DRPs (Disaster Recovery Plans)
Most existing DR plans have the same problems: they are out of date and unrealistic, created by users for IT but not for the business. Old-school SLAs (Service Level Agreements) no longer match today’s demands, because they assumed that the people behind them were reasonable in their expectations. Nowadays, users want zero downtime, zero data loss, and zero cost…
Too often the DR site is too close to the production site, IT staff are not on hand when disaster strikes, and the plan is too complex and unrehearsed to put into action in time. Technology moves very fast, and DRPs need to be able to adapt to these changes.
Forward thinking: Introducing SLOs (Service Level Objectives)
SLOs give the people with the know-how control of the process. IT understands the possibilities, capabilities, and budget and can dictate to the users what can be delivered.
These SLOs and their components are feasible and realistic, with room for maneuver. The most important applications in your data center come first. SLOs are simple to implement, able to keep up with the fast pace of IT, and can be set out in a simple email to top-level management.
The 3 R’s of Disaster Recovery Plans: Readiness, Recovery and Return
1. Readiness – applying your SLO to Disaster Recovery
IT disasters come in many forms, from the extremes of a man-made or natural disaster to disasters related directly to technology: the loss of a server, the failure of a storage system, and the ever-increasing threat of cyber-attack.
A well-constructed and well-tested DRP is the first step to recovery. Depending on your needs and what you can afford to lose (downtime, measured by the RTO or Recovery Time Objective, and lost data, measured by the RPO or Recovery Point Objective), there are 3 broad scenarios:
• Standard backups using disks or tape drives
Requires 4 hours or more
Similar to archiving, this scenario is the slowest, because the media is disconnected and stored off-site. Since most of the data will not be recovered after a disaster and only the most recent backup will be restored, it is important to consider the VRO (Version Retention Objective).
• Backup device/software in place for frequent protection
Requires 45 minutes to 4 hours
Disk backup with recovery enhancements shortens the recovery time, although performance can be an issue with VMs, compression, and high-capacity drives. Ideally, little or no file transfer is involved.
• Secondary system in place for data and key applications
Requires 5 to 45 minutes
Although more expensive, this scenario is by far the quickest and the easiest, if it is done well. Protected with snapshots, it eliminates all data conversion and data movement. It uses both on-site and off-site systems, making it easy to get to data and providing full coverage for all disasters.
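One way to read the three scenarios is as a lookup from a target RTO to the least demanding scenario whose quoted recovery window covers it. The Python sketch below is only an illustration under stated assumptions: the labels, the function name `pick_scenario`, and the "cheapest first" ordering are hypothetical (the talk only says the secondary system is the more expensive option), and the minute values are simply the lower bounds of the time ranges listed above.

```python
from typing import Optional

# (label, lower bound of the quoted recovery window, in minutes).
# Ordering from cheapest to most expensive is an assumption for illustration.
SCENARIOS = [
    ("standard backup (disk or tape)", 240),              # 4 hours or more
    ("frequent-protection backup device/software", 45),   # 45 minutes to 4 hours
    ("secondary system", 5),                               # 5 to 45 minutes
]


def pick_scenario(rto_minutes: float) -> Optional[str]:
    """Return the first scenario whose quoted window starts at or below the RTO."""
    for label, min_recovery_minutes in SCENARIOS:
        if rto_minutes >= min_recovery_minutes:
            return label
    # An RTO under 5 minutes is not covered by any of the three scenarios.
    return None


if __name__ == "__main__":
    for rto in (480, 60, 10, 2):
        print(f"RTO of {rto} min -> {pick_scenario(rto)}")
```

Where two windows meet (at 45 minutes or at 4 hours), the sketch picks the less demanding scenario. A real plan would also weigh the RPO and the budget before settling on a tier.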
With a solid DRP in place, practice is the key. With the technology available today, there should be no excuses: test and test again all the elements of your SLO, so that everyone is familiar with its operation.
The eventual return to the production site requires a plan for transferring large volumes of data in a short time. Since the DR site may be your data’s home for a while, it needs to remain fully functional until the return.