5 Steps to Create a Disaster Recovery Plan

By: Internetwork Engineering on September 2nd, 2016

In today’s age of digital business, protecting your data in the event of a disaster is critical for business continuity. And with the combination of natural threats, malware, software/hardware malfunctions, and the biggest culprit of all, human error, a disaster is often inevitable.

So no matter how safe you think your data is, you need to have a recovery plan in place for how to react in the face of disaster. And while there’s no be-all-end-all solution for creating the perfect plan (it really all depends on the type of business you have), we’ve got five steps to help you create your own.

Step 1: Risk Analysis

The first step to creating your disaster recovery (DR) plan is to identify your key applications and assets, and the business impact of each. You need to know what you’re protecting, how it should be protected and its value to your business.

You’ll also need to evaluate threats against your business, and the threats to each particular asset you defined.

Each potential scenario that could occur should have preset actions that provide guidance on whether a disaster should be declared and whether the disaster recovery plan should be activated.
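
One way to make these preset actions concrete is to write them down as a simple lookup table. The sketch below is only an illustration: the scenario names, the `Playbook` fields, and the actions are hypothetical placeholders, and your own risk analysis should supply the real entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Playbook:
    declare_disaster: bool  # should a disaster be declared?
    activate_dr_plan: bool  # should the DR plan be activated?
    first_action: str       # first step for whoever responds


# Hypothetical scenarios and responses; replace with your own.
PLAYBOOKS = {
    "single_server_failure": Playbook(False, False, "Restore from the latest backup"),
    "ransomware_outbreak": Playbook(True, True, "Isolate affected systems"),
    "data_center_flood": Playbook(True, True, "Fail over to the recovery site"),
}


def preset_action(scenario: str) -> Playbook:
    """Return the preset response; unknown scenarios are escalated to a person."""
    return PLAYBOOKS.get(
        scenario,
        Playbook(False, False, "Escalate to the DR coordinator for a decision"),
    )
```

Keeping an explicit default for scenarios nobody anticipated means the table never leaves a responder without guidance.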

Step 2: Asset Recovery

After you’ve defined your assets and assessed the potential threats against them, you need to determine how long you can go without access to each one, known simply as your recovery window.

And be sure that you’re as realistic as possible with this. Saying “All of our systems need to be back up and operational in 10 minutes or less” is pretty unrealistic, and it doesn’t tell your IT team which issues to manage and fix first.

After you’ve established your recovery window for each application, you can start focusing on recovery solutions. This can include recovering data from tape or disk backup, or data replication to an offsite location.

Figuring out the right type and level of protection should tie directly to the business value of the asset and how long you think you can manage without it. That way, each asset has its own recovery solution with a defined budget based on its business value and impact.
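
As a rough illustration of how recovery windows turn into priorities, the sketch below sorts a made-up asset list so that the shortest recovery window comes first, with business value breaking ties. The asset names, the hours, and the 1-to-5 value scale are all assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    recovery_window_hours: float  # longest outage the business can tolerate
    business_value: int           # hypothetical scale: 1 (low) to 5 (critical)


def recovery_order(assets):
    """Shortest recovery window first; ties go to the higher business value."""
    return sorted(assets, key=lambda a: (a.recovery_window_hours, -a.business_value))


# Made-up inventory for illustration only.
assets = [
    Asset("internal wiki", 72, 1),
    Asset("order database", 1, 5),
    Asset("email", 4, 3),
    Asset("payment gateway", 1, 4),
]

for asset in recovery_order(assets):
    print(f"{asset.name}: restore within {asset.recovery_window_hours} h")
```

A list like this gives the IT team the ordering that a blanket “everything in 10 minutes” target cannot.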

Step 3: Communications Planning and Training

This is one of the biggest and most essential pieces of a disaster recovery plan, and also typically the most forgotten. You need to make sure that there is a specific communication plan in place and that everyone with a role knows exactly what it is.

Your plan should include up-to-date contact information for each role defined in your plan, and how the chain of communications will be organized from a hierarchical perspective.

You’ll also need to make sure that your employees go through training for this plan, so that they each know exactly what their function is and how to execute it. If everyone isn’t crystal clear on their role and responsibilities in the case of a disaster, chances are it’s not going to go well.

Another thing most people and organizations fail to think about is that many disasters, particularly natural ones, may impact an entire region, preventing the employees you trained from making it to the office or participating in the disaster plan. This is why it’s essential to always keep the human aspect of the organization in mind, and to cross-train multiple employees so that no matter what happens, you always have someone available to help execute the disaster recovery plan.
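
The cross-training idea can be sketched as a roster that lists a primary contact and backups for each role. Everything here (the role names, the people, and the `first_available` helper) is hypothetical and only meant to show the shape of the data.

```python
def first_available(role, roster, reachable):
    """Return the first reachable person for a role, or None if nobody is.

    roster maps each role to people in priority order (primary first,
    then cross-trained backups); reachable is the set of people who
    have confirmed they can participate.
    """
    for person in roster.get(role, []):
        if person in reachable:
            return person
    return None


# Hypothetical roster: every role has at least one cross-trained backup.
roster = {
    "dr_coordinator": ["Alice", "Bob"],
    "network_failover": ["Carol", "Dave", "Alice"],
}

# Roles with fewer than two names are a gap in cross-training.
single_points = [role for role, people in roster.items() if len(people) < 2]
```

Checking `single_points` during each plan review is a cheap way to spot roles that depend on one person.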

Step 4: Recovery Site Access

The next step is to actually implement the systems required to deliver the disaster recovery plan. Typically, this involves some type of disaster recovery site to help recover and restore data when your main facility or data center is unavailable. Many organizations opt to implement one of these four types of sites in order to accomplish this step, and you’ll need to decide which site best fits the needs of your business.

Hot Site- With a hot site, your organization will be able to access a fully-functional data center with software and hardware, as well as employee and customer data. These are usually operated around the clock so they can be ready at any time in the case of a disaster.
Warm Site- A warm site typically contains most or all of the software, hardware, personnel, and network services necessary to run a working data center; however, it does not have any customer data. When a disaster does occur, your organization can install additional equipment as needed and restore data.
Cold Site- A cold site has IT infrastructure to support systems and data, but doesn’t have any hardware, software, or data until an organization activates its DR plan and installs the necessary equipment. This is really only an option for your organization if your business systems will be OK being down for an extended period, or if you want to use it to supplement hot and warm sites in the event of a long-lasting disaster.
Cloud-Based Recovery Site- Another option is a cloud-based recovery site, which reduces the need for infrastructure, resources, and data center space. These sites are also often much cheaper than the types of sites mentioned above, and are a great option for smaller businesses. However, when using a cloud-based DR site, both bandwidth and security can be an issue.
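
To tie the site choice back to the recovery windows from Step 2, a first-pass rule of thumb might look like the sketch below. The hour thresholds are placeholders invented for this example, not industry figures, and the function ignores cost, bandwidth, and security, so a cloud-based site would have to be weighed separately.

```python
def suggest_site(recovery_window_hours, hot_max_hours=4, warm_max_hours=48):
    """Map a recovery window to a site type using placeholder thresholds.

    The default thresholds are assumptions for illustration; set them
    from your own risk analysis and budget.
    """
    if recovery_window_hours <= hot_max_hours:
        return "hot"
    if recovery_window_hours <= warm_max_hours:
        return "warm"
    return "cold"
```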

The failover of inbound communications also must be planned out carefully and in detail, meaning you need to know how all inbound traffic will be rerouted to the DR site. You should also define how this will be initiated and by whom, and decide if you want to have a separate access method to the DR site.

Step 5: Test, Test, and Test Again

Once you’ve got everything together, it’s officially time to test your plan. This first test is the most important part of the planning process, and when you do run your test, make sure you do it with a complete failover and failback of all of your systems.

Ensuring that you find any issues or flaws with your plan before you actually need to implement it is essential. You definitely don’t want an actual disaster to be the first time you put it into action.

Once you’ve done your testing and the walkthrough of your plan, you’ll be able to go back through, highlight all the weaknesses, and adjust the plan accordingly. After addressing the weaknesses, document your revisions and test again.

Each subsequent test should run much smoother than the first, and should get you and your organization ready to execute your plan in the case of a real disaster recovery situation.

And now that you have everything you need to create a disaster recovery plan, we’ve just got one parting tip: be sure to update your plan whenever changes are introduced to the production environment. Most organizations test quarterly, because small changes to systems are made continuously. You want to make sure your DR plan is up to date and has been tested recently to ensure your organization is prepared to recover operations quickly in the event of a disaster!
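
A simple check along these lines can flag when the plan is due for another test. This is a minimal sketch: the function name is hypothetical, and the 90-day default is just an approximation of “quarterly”.

```python
from datetime import date, timedelta


def plan_needs_retest(last_test, last_production_change, today, max_age_days=90):
    """True if production changed since the last test or the test is too old."""
    changed_since_test = last_production_change > last_test
    too_old = (today - last_test) > timedelta(days=max_age_days)
    return changed_since_test or too_old


# Example: production changed a month after the last test, so retest.
print(plan_needs_retest(date(2016, 6, 1), date(2016, 7, 1), date(2016, 7, 2)))
```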