Beyond the Disaster Recovery plan

Disaster recovery is a subset of business continuity. Businesses typically handle business continuity well, but their disaster recovery is often hollow. Let me explain why.

Writing a disaster recovery plan is straightforward: figure out what’s critical to your business and back it up.

  1. Define all of your assets; not just hardware but all software, configurations, etc.

  2. Determine the risk of each listed asset going down and the overall impact of such an outage.

  3. Determine the business risk of those impacts.

  4. Define your recovery time objective (RTO), which is how quickly you need to recover from those outages.

  5. Define your recovery point objective (RPO), which is how much data you can afford to lose, usually expressed as a span of time.

  6. Adjust infrastructure to improve these metrics as necessary.
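Steps 4 through 6 can be made concrete with a small sketch. Everything below is hypothetical (the asset names, the numbers, and the field names), and it assumes two simplifications: worst-case data loss is roughly the interval between backups, and recovery time is whatever your last test restore measured. It only compares each asset's targets with what the current setup delivers.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    rto_hours: float               # target: maximum tolerable downtime
    rpo_hours: float               # target: maximum tolerable data loss, as time
    backup_interval_hours: float   # how often backups actually run
    measured_restore_hours: float  # how long the last test restore took


def find_gaps(assets):
    """Return (asset name, missed target) pairs for assets that miss a target."""
    gaps = []
    for a in assets:
        # Worst case, everything since the last backup is lost.
        if a.backup_interval_hours > a.rpo_hours:
            gaps.append((a.name, "RPO"))
        # A restore that takes longer than the RTO cannot meet it.
        if a.measured_restore_hours > a.rto_hours:
            gaps.append((a.name, "RTO"))
    return gaps


# Hypothetical inventory, for illustration only.
inventory = [
    Asset("accounting-db", rto_hours=4, rpo_hours=1,
          backup_interval_hours=24, measured_restore_hours=3),
    Asset("file-server", rto_hours=8, rpo_hours=24,
          backup_interval_hours=24, measured_restore_hours=12),
    Asset("intranet-wiki", rto_hours=48, rpo_hours=24,
          backup_interval_hours=24, measured_restore_hours=2),
]

for name, target in find_gaps(inventory):
    print(f"{name}: misses its {target} target")
```

Every asset this reports is a candidate for Step 6: either the infrastructure changes (more frequent backups, faster restores) or the business agrees to a looser target.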

How many organizations make it to Step 6? And Step 6 is not even close to the end of disaster recovery! The steps above will get you started with a baseline plan, but you’ll likely have to cycle through them several times to reach an acceptable plan with the necessary coverage and metrics.

If the plan relies on certain tasks, you need to test them. Think of how often you test your fire alarms; it’s probably a good idea to test your DR plan just as often.

Ask yourself when you last restored your server backups. Are you 100% certain you have backups of everything? When did you make your DR plan? Has anything in your environment changed since?
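One way to keep those questions from going unanswered is to record when each system last had a restore tested and flag anything overdue. The sketch below is only an illustration: the system names and dates are made up, and the 90-day threshold is an assumption, not a recommendation from any standard.

```python
from datetime import date, timedelta


def overdue_restore_tests(last_tested, today, max_age_days=90):
    """Return systems never restore-tested, or last tested over max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        name
        for name, tested in last_tested.items()
        if tested is None or tested < cutoff
    )


# Hypothetical records; None means a restore has never been tested.
last_tested = {
    "accounting-db": date(2024, 1, 15),
    "file-server": date(2024, 5, 20),
    "mail-server": None,
}

print(overdue_restore_tests(last_tested, today=date(2024, 6, 1)))
```

Anything on that list is a backup you are trusting without evidence.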

Even if you have gotten this far, you’re still only a fraction of the way through.

You must have disaster recovery exercises.

Everyone has a plan until they get punched in the face.
— Mike Tyson

Disaster Recovery Exercises

Disaster recovery is a team effort that requires regular practice. You don’t want your first practice to be during your first disaster!

First, the DR Master calls a meeting where minutes are taken. Documentation is critical to this process!

The DR Master proposes a disaster and the team works together to explain its impact and how to resolve it. Your team should include people from every department, NOT just IT. While you may not need your accountant the entire time, their input on various scenarios, like how a cyber attack destroying or taking control of accounting would affect the business, is essential. All departments will have important input into the scenario, not to mention an interest in how it’s resolved.

The DR Master’s job is to ensure strong answers are provided. Weak answers like “we’ll just restore, no big deal” could lead to “restores have failed to boot”. Any weak answers must be documented and followed up on to ensure they become strong answers at the next meeting.

The minutes of your meetings should be transformed into incident response procedures. If you create and continually optimize a step-by-step disaster recovery procedure covering all the major types of incidents, you’ll always be prepared, regardless of the day or time a disaster occurs or who is or isn’t available to respond immediately.

These exercises will give your incident response team practice in these scenarios, inform your response documentation, and, above all, provide invaluable team building.