Create a Disaster Recovery Plan

“The first thing you do is to take all the plans off the top shelf and throw them out the window and start once more.” Dwight D. Eisenhower said those words in 1957 about carrying out a plan during an actual emergency, and he was a man who had his share of experience creating and executing plans. President Eisenhower was illustrating the point that, regardless of how much planning is done to prepare for an emergency, it is impossible to predict the actual course of events as the situation unfolds (Woolley & Peters). However, he also said that planning is necessary. Planning gives us knowledge of the problem, familiarity with the situation, and a starting point; planning is therefore an essential component of recovering effectively from a disaster.

A business disaster recovery plan is a flexible document that centralizes information that will likely be needed to recover from a disruption to business operations. When creating a disaster recovery plan, you do not know what events will cause its use, but it is possible to reduce the time required to recover by thinking ahead. This paper will review research in the area of disaster recovery and business continuity planning and share what I have learned through interviews with experts who have real-world experience creating, testing, and using disaster recovery plans. I will also discuss my experiences applying this research to the creation of a disaster recovery plan for the Financial Managers Society, Inc. (FMS).

Disaster Recovery Plan – Introduction and Context

An effective and tested disaster recovery plan gives units confidence in advance of any potential disaster and lets them react more quickly, accurately, and professionally should a disaster of any kind affect their IT operations. Use this as an opportunity to think strategically about your operations: what could you do to leverage central resources and simplify your disaster recovery responsibilities, so that you can focus more on where you add value for your unit?

If you found your way to this document because of an IT audit finding, the first thing to remember is: don’t panic! Keep in mind that Internal Audit is a unit, like yours, that has a function at the organization. The staff members are your colleagues, and they can be of major assistance to you. Overall, their purpose and function is to assure that we are all addressing risk (for us, the risk in IT) in an appropriate way. If you have never created a disaster recovery plan before, understand that it is not very difficult and that you probably already have most of the things you will need.

This document provides a brief introduction to IT disaster recovery planning and to the tools and options available to IT managers within the organization or company who need to engage in such planning, and it indicates where additional information and help can be obtained.

A disaster recovery plan will benefit five distinct audiences:

Audience – Purpose
  • Users – Assurance that a plan is in place for the systems that they depend on (in the event of a disaster).
  • Department/Collegiate Leadership – Assurance that preparation for IT recovery in the event of a disaster is happening proactively.
  • Department/Collegiate IT Personnel – A good disaster recovery plan can serve as an operational reference and collection point for documentation. It also helps staff know what their roles will be in a disaster situation.
  • OIT – Assurance that collegiate IT units and OIT are aligned.
  • Office of Internal Audit – Assurance that the IT department has a documented plan for dealing with a potential disaster and returning the department or college to normal IT operation quickly, accurately, and professionally.

 

Disaster Recovery Plan Resources and Common Practices

There are a number of ways to make the process of creating an IT disaster recovery plan easier. The following common practices and resources are ones you can choose to use as you think about and create your own disaster recovery plan.

Heat Maps

Heat maps can be a handy tool for prioritizing your disaster recovery planning efforts. These slides are a short primer on how you might go about creating them for your unit. Feel free to use different descriptors for the x and y axes, and your own assessment of low, medium, and high. An Excel tool for generating these graphs should be available alongside the slides.
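As a rough illustration of the idea, the sketch below places a few services on a 3×3 grid and ranks them by how "hot" their cell is. Everything in it is an assumption made for the example: the axis choices (business impact and likelihood of disruption), the service names, the ratings, and the additive scoring rule. It is not the Excel tool mentioned above, only a minimal way to see how such a map can drive prioritization.

```python
# Hypothetical heat map sketch. Axes, services, and ratings are illustrative
# assumptions; substitute your own descriptors and assessments.
LEVELS = ["low", "medium", "high"]

# (service, business impact if lost, likelihood of disruption)
services = [
    ("file server", "high", "medium"),
    ("database server", "medium", "low"),
    ("web server", "high", "high"),
    ("desktop support", "low", "medium"),
]


def build_heat_map(items):
    """Place each service in a 3x3 grid keyed by (impact, likelihood)."""
    grid = {(impact, likelihood): [] for impact in LEVELS for likelihood in LEVELS}
    for name, impact, likelihood in items:
        grid[(impact, likelihood)].append(name)
    return grid


def priority(impact, likelihood):
    """Score a cell from 0 (low/low) to 4 (high/high). The additive rule is an assumption."""
    return LEVELS.index(impact) + LEVELS.index(likelihood)


def ranked(items):
    """Order services so that those in the hottest cells come first."""
    return sorted(items, key=lambda s: priority(s[1], s[2]), reverse=True)


def render(grid):
    """Return the grid as plain text, with the highest impact on the top row."""
    width = 30
    header = "impact / likelihood".ljust(22) + "".join(level.ljust(width) for level in LEVELS)
    lines = [header]
    for impact in reversed(LEVELS):
        cells = [", ".join(grid[(impact, level)]) or "-" for level in LEVELS]
        lines.append(impact.ljust(22) + "".join(cell.ljust(width) for cell in cells))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(build_heat_map(services)))
    for name, impact, likelihood in ranked(services):
        print(f"{name}: impact={impact}, likelihood={likelihood}")
```

Services that land in the top-right cells are the ones to plan for first; a multiplicative score or different axis labels would work equally well if they fit your unit better.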

The Office of Information Technology

One major resource available to you is the Office of Information Technology (OIT). In addition to developing a number of tools for disaster recovery planning (detailed below), OIT must also engage in this activity for its own operations. Additionally, most department and collegiate IT operations (and therefore disaster recovery plans) are at least partially dependent on OIT services. To the extent that your unit relies on these services, you can reference their service statements for your DR planning.

The OIT Disaster Recovery Services (DRS) website (coming soon) is a good first stop for information. DRS has developed a process, training, and a set of tools for creating and testing disaster recovery plans for OIT’s internal use. They are continuously modifying the tools to make them more broadly applicable for collegiate and departmental use.

  • OIT-DRS DRplan process/templates – a tool developed by OIT to help automate some aspects of creating a disaster recovery plan.
  • OIT-DRS tabletop exercise document – a formalized process for helping to test your disaster recovery plan.
  • List of guidelines around audit expectations – an informal list of the types of things auditors specifically look for, covering topic areas such as file backup methods, testing, offsite storage, redundancy, and overall disaster recovery plan expectations.

Note that there is no requirement that you use OIT-DRS to create your disaster recovery plan, but why wouldn’t you?  They will help you create and test your plan, and your plan will have commonalities with other plans developed this way (think cross-training) – this should be considered a “best practice.”

IT Directors Group

The University IT Directors group is an excellent place to find mentors who can help you with the process of creating and testing a disaster recovery plan.

Office of Internal Audit

Auditors have the job of helping you. In fact, bringing a fresh perspective to your operation and making suggestions is a service they perform for which you might otherwise want to pay a consultant. In particular, when certain aspects of the IT environment are out of your control, they can help you see the big picture and can also help effect changes in the environment to assist your situation, if warranted.

Next Steps: Create a Disaster Recovery Plan

The first thing that you may want to do is to create an action plan.  This step is just good practice in general, but will also help you stay organized and focused on the right thing at the right time even in the face of daily distractions or other interruptions.  A simple and straightforward plan will suffice.  Below is an example of a possible plan of action for creating a DR plan.

Sample Action Plan

  1. Step One – Make a list of items related to operational scope. Start with broad areas and then add detail for each area. For example, if you run servers, perform desktop support, run networks, etc., you would want to note that. Within the server area, you would also note that you have file servers, database servers, and a web server.
  2. Step Two – For each area, describe, in “business terms”, why you are engaged in activities in these areas and sub-areas. What is the usage? And what is the tolerance for downtime? For example, perhaps the database server can be down for a full day, but the website needs to be up as close to 24×7 as possible, or vice versa.
    1. Tip: A useful tool for focusing discussion on the relative importance and risk tolerance for these items is a heat map. See Appendix A.
  3. Step Three – Identify the risks associated with a disaster: what happens when things go wrong? Risk assessment is pretty standard stuff in IT and could include security breaches, hardware issues, network issues, etc. The key here is to talk about the place where IT and business meet and to focus on full or partial disasters (rather than other types of risk like a network attack). What happens to overall operations, what does IT do to address the problem, and in what timeframe? (Think who, what, when, where, and why here.)
    1. Hint: Risk assessment can cover a wide scope, so while the topics of disaster recovery and risk assessment are closely related, disaster recovery is not intended to be a full-blown risk assessment exercise. However, risk assessment is a key first step in developing a DR plan. Risk assessments lead to risk mitigation and risk management opportunities. For more information, see the work of the IT Directors “Audit Guidelines / Risk Assessment” subgroup, available on the IT Directors wiki.
  4. Step Four – Review the risks and vulnerabilities that you have identified. Determine which risks can be eliminated and which ones can be mitigated. Put a plan in place to implement your findings.
  5. Step Five – Determine recovery strategies for every risk that you cannot eliminate. Document everything you have from the first four steps, perhaps using the tools identified earlier. If you feel uncomfortable about how well you can answer some of the questions in the context of your own unit, take additional steps to become comfortable.
    1. Tip: If you feel overwhelmed, this is a great time to get additional help from the IT Directors group, OIT, or Internal Audit. It may be useful to work with a mentor from the IT Directors group or OIT to help keep things clear and get more ideas on the table. As with many other areas, a fresh perspective on a problem may be all it takes to see new solutions!
  6. Step Six – Document recovery strategies and key information needed for recovery, e.g., contact lists, communications requirements, and hardware and software inventories. In other words, create your disaster recovery plan.
  7. Step Seven – Test the plan. Document your results. Update your plan with what you learned. Remember, it is a lot better to be able to tell an auditor where you feel the plan may be somewhat weak, and the steps you would like to take to improve things, than simply to let them find the holes. Even having such a plan goes a long way toward assuring that you are working to manage disaster risks in an appropriate way.
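One way to keep the output of Steps One through Five organized is as a small structured inventory. The sketch below is only an illustration under stated assumptions: the `Service` and `Risk` records, their field names, the three disposition labels, and all sample data are hypothetical and do not come from the OIT-DRS templates. It records each service with its business purpose and downtime tolerance, attaches risks with a disposition, and then reports a recovery order and any risk that still lacks a documented recovery strategy.

```python
from dataclasses import dataclass, field


@dataclass
class Risk:
    # Hypothetical record for Steps Three to Five.
    description: str
    disposition: str             # assumed labels: "eliminate", "mitigate", "recover"
    recovery_strategy: str = ""  # empty means not yet documented


@dataclass
class Service:
    # Hypothetical record for Steps One and Two.
    area: str
    name: str
    business_purpose: str
    max_downtime_hours: float
    risks: list = field(default_factory=list)


def recovery_order(services):
    """Services with the lowest downtime tolerance are recovered first."""
    return sorted(services, key=lambda s: s.max_downtime_hours)


def missing_strategies(services):
    """Step Five check: every risk that is not eliminated needs a recovery strategy."""
    gaps = []
    for service in services:
        for risk in service.risks:
            if risk.disposition != "eliminate" and not risk.recovery_strategy:
                gaps.append((service.name, risk.description))
    return gaps


# Illustrative sample data only.
inventory = [
    Service("servers", "database server", "stores membership records", 24,
            [Risk("hardware failure", "recover", "restore latest backup to spare host")]),
    Service("servers", "web server", "public website", 1,
            [Risk("power outage", "mitigate", "fail over to secondary site")]),
    Service("servers", "file server", "shared documents", 8,
            [Risk("disk failure", "mitigate"),
             Risk("unpatched software", "eliminate")]),
]

if __name__ == "__main__":
    for service in recovery_order(inventory):
        print(f"{service.name}: up to {service.max_downtime_hours} h of downtime tolerated")
    for name, description in missing_strategies(inventory):
        print(f"No recovery strategy yet for '{description}' on {name}")
```

A spreadsheet serves the same purpose; the point is that each service carries a downtime tolerance and each remaining risk carries a strategy, so gaps are easy to spot before an auditor or a real disaster finds them.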

Obviously the plan can be more complex for larger operations, but if it is, make sure you have a team of people with clear assignments to facilitate completion. This process benefits from having multiple perspectives and, as with many other aspects of this planning, documenting that you have a team of people familiar with the DR process for your unit also shows how you are working to address risk.

Help Someone Else

Finally, as you complete this process, hopefully having learned through each step more about your operation and gained confidence with DR, consider sharing your knowledge with others.  By making yourself available to others who need to learn, you have the opportunity to continue to learn by seeing others’ situations from the perspective of the observer – sometimes one of the best ways to continue to generate new ideas!

Emerging Issues

Outsourcing

If you’ve outsourced some part of your operation, think about what this means for your disaster recovery planning.  How do you assess the vendor’s disaster recovery capability, and integration with your operations?  Do you have a plan for what to do should the vendor go out of business?

Outsourcing is likely to become more common, and audit standards will certainly evolve. Be sure to consult Internal Audit.