Understanding the Business Continuity and IT Disaster Recovery Gap
Many business continuity professionals can attest to the tension that often occurs between the business and IT when it comes to recovery capabilities. For example, Company X recently implemented a business continuity program, including determining recovery time objectives (RTOs) for key business processes. As in all well-established business continuity programs, the business impact analysis (BIA) considered the loss of technology and helped the company develop recommended recovery time (and recovery point) objectives for technology resources. The business documented and presented these RTOs to management following the initial BIA, but never followed up with IT to confirm that IT could actually meet them.
Meanwhile, IT leveraged its own application/system list and related recovery information to prioritize applications for recovery and drive the implementation of a disaster recovery solution that was cost-effective and aligned with IT’s own conclusions about business requirements for recovery (drawn from data outside the BIA). Both the business and IT feel confident in their work, yet neither has communicated with the other. Because the groups have not undergone a joint exercise (or an actual disruption), neither is aware of the underlying gap: Recovery priorities and strategies are misaligned between the business and IT.
This perspective analyzes the symptoms and root causes of the business continuity and IT disaster recovery gap and proposes solutions to close it.
SYMPTOMS & ROOT CAUSE ANALYSIS
In order to address this gap, organizations must first be able to identify it. Logically, a lack of communication between the groups is often the biggest indicator of this problem. In some organizational cultures, IT and the business are even told by management that they are not allowed to contact each other – a huge red flag! More often than not, if the business and IT are not directly communicating, expectations and capabilities are not adequately understood.
Another symptom is a lack of integrated testing. Both groups typically perform testing at some level, but do so independently. Test results are not shared across groups or escalated to a collective management body, so gaps between expectations and capabilities go unidentified.
On the surface, this gap may seem easy to address (just communicate, right?). However, when analyzed further, the root causes point to issues that may be more difficult to tackle.
Data is collected separately by IT and the business. During the business impact analysis, business process owners determine recovery time objectives for applications that are critical for their department to function. IT management often selects which applications are the most important using other methods and criteria, such as the number of users accessing the system or type of information supported by the application. These methods often do not consider the impact to business process areas or internal department dependencies. Furthermore, each group’s data is typically stored in separate repositories and not communicated between groups, widening the gap.
Management expectations are not aligned and/or miscommunicated. Misalignment typically stems from management determining recovery expectations in silos with little input from other groups, while miscommunication typically results from a central body determining expectations but delivering the message in silos. Additionally, the two groups often rely on separate program performance metrics, which makes it extremely difficult to align expectations and identify gaps.
Groups adopt different languages when talking about the same topic. The business and IT may use different names for applications and systems or have different definitions of recovery time objectives, which also makes it difficult to align expectations. Many times, the business believes it is clearly communicating objectives and IT believes it is communicating recovery capabilities effectively. More often than not, both groups leave the meeting thinking the other understands the message, while no actual consensus was reached.
Groups become defensive in fear of appearing to have made the wrong strategy decisions. Recovery capabilities are often moving targets; as the organization changes, business requirements change. Some organizations do not view recovery requirements this way, and instead assign blame for implementing the “wrong” strategies or communicating the “wrong” requirements. In some organizational cultures, these misconceived “wrong” strategies or requirements have severe consequences for responsible employees, which can lead groups to defend their original strategies or requirements, even if they are no longer the best fit for the current business structure.
The impact of a gap between the business and IT is simple: Recovery objectives will not be met!
A lack of communication leads to a lack of preparedness. When IT and the business do not share requirements and capabilities, applications may not be prioritized in a way that meets business expectations, which could lead to missed recovery objectives. Business process areas may rely on systems that are not even in scope for IT DR and lack adequate workarounds that could have been developed if capabilities were clearly communicated.
Organizations make investments in the wrong strategy solution. If business requirements are not effectively communicated to IT, organizations may invest in the wrong level of IT disaster recovery strategy, at significant cost. With misaligned recovery objectives, an organization may invest in a cold site, for example, that won’t meet business requirements during an actual disruption. Alternatively, organizations may spend a tremendous amount of money on a hot site, thinking business processes require short recovery timeframes, when in reality a less expensive solution with a longer recovery timeframe would still meet business requirements. An IT DR strategy developed around incorrect information can also lead to unacceptable data loss, which can cause serious regulatory, legal, and/or contractual consequences and further financial losses.
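The two investment failure modes described above can be made concrete with a simple comparison of required and achievable recovery figures. The following is a minimal sketch under stated assumptions, not a prescribed method: the field names, the sample figures, and the `overspend_factor` threshold are all hypothetical and would need to be replaced with an organization's own BIA and DR data.

```python
from dataclasses import dataclass


@dataclass
class RecoveryProfile:
    """Business requirement vs. IT capability for one application, in hours."""
    application: str
    required_rto: float  # recovery time objective from the BIA
    required_rpo: float  # recovery point objective from the BIA
    capable_rto: float   # recovery time the current DR strategy can deliver
    capable_rpo: float   # data loss window the current DR strategy implies


def classify_investment(p: RecoveryProfile, overspend_factor: float = 4.0) -> str:
    """Label a DR strategy as under-, possibly over-, or reasonably provisioned.

    overspend_factor is an illustrative assumption: a capability more than
    this many times faster than the requirement is flagged for cost review.
    """
    if p.capable_rto > p.required_rto or p.capable_rpo > p.required_rpo:
        return "under-provisioned"
    if p.capable_rto * overspend_factor < p.required_rto:
        return "possibly over-provisioned"
    return "aligned"


# Hypothetical sample data, for illustration only.
profiles = [
    RecoveryProfile("Order Entry", 4, 1, 72, 24),  # cold site, short RTO needed
    RecoveryProfile("Reporting", 72, 24, 2, 1),    # hot site, long RTO acceptable
    RecoveryProfile("Payroll", 24, 12, 12, 4),
]

for profile in profiles:
    print(profile.application, "->", classify_investment(profile))
```

A check like this only surfaces candidates for discussion; whether a flagged strategy is actually wrong remains a joint decision between the business and IT.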
Technology service providers commit to unacceptable SLAs. Due to the advancement of technology, many companies rely on outsourced technology service providers to serve as an extension to internal capabilities or support unique platforms. When the business and IT are operating under different assumptions, technology service providers receive mixed messages or inaccurate information regarding recovery requirements, which could lead to contracts with unacceptable SLAs. Furthermore, departments often onboard technology service providers without consulting IT to review the contract and validate the technology service provider’s capabilities, resulting in a supplier risk that can lead to operational impact to the organization.
Internal frustration leads to unhealthy departmental relationships. A lasting implication of the lack of communication is escalated internal tension and frustration, which can have long-term effects on the success of the recovery programs. When gaps are identified, management can become frustrated and feel misled as to the organization’s actual recovery capabilities; this frustration dramatically decreases the likelihood of the two groups working together to form a cohesive strategy.
In today’s world, the vast majority of business processes depend on some form of technology or data. Technology has become so advanced that manual workarounds are almost nonexistent for critical applications, which leads to single points of failure for business functions. Because business processes are so closely integrated with technology, the only effective way to ensure continuity is to integrate technology and business recovery efforts and mitigation strategies. In doing so, organizations have the insight to prioritize correctly and recover in the most time-efficient manner.
Integration allows companies to identify recovery gaps before they become an issue. When the business and IT are in sync on capabilities and requirements, disruptive incidents can be accurately planned for and downtime strategies can be developed. Where possible, manual workarounds can be developed or alternate technology service providers contracted for use during a disruption. Additionally, if the collective management (IT and the business) accepts the gaps and the associated risks, the business and IT will understand the implications of the disruption when it happens, so the teams can work together to solve the problem without worrying about being blamed for the issue.
After organizations have identified and solved the underlying issues causing the gap, the business and IT can successfully integrate strategies to enable recovery during a disruption. To do so, groups should integrate efforts in the following areas:
- Governance – IT disaster recovery and business continuity programs should have coordinated governance structures with documentation that mandates regular meetings. Groups should also use the same metrics to align efforts.
- Business Impact Analysis – Data collected during the BIA should be shared with IT. The business should work with IT to identify recovery gaps in current strategies. IT should also consider the BIA data when assigning application recovery tiers and determining disaster recovery strategies and implementations.
- Testing and Exercising – Combined testing and exercising with the business and IT allows organizations to identify recovery gaps; exercises are the best way to align expectations and capabilities between the business and IT.
- Software – All data should be stored in one central location, such as Castellan, where gaps can be easily identified and all parties have access to the same information.
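Once BIA data and IT recovery data sit side by side, the gap check itself is straightforward. The sketch below is illustrative only: the application names, the hour values, and the alias table that reconciles business and IT naming are all hypothetical, and it is not tied to any particular software product.

```python
# Hypothetical BIA output: business application name -> required RTO in hours.
bia_rtos = {"Order Entry": 4, "Payroll": 24, "CRM": 8}

# Hypothetical IT DR inventory: IT system name -> achievable recovery time in hours.
it_capabilities = {"ORDENT-PROD": 12, "payroll": 24}

# Hypothetical alias table mapping IT system names to business names,
# since the two groups often label the same application differently.
aliases = {"ORDENT-PROD": "Order Entry", "payroll": "Payroll"}


def find_recovery_gaps(bia, it, alias_map):
    """Return (out_of_scope, rto_gaps) after normalising application names."""
    # Translate IT names into business names; unknown names pass through unchanged.
    normalised = {alias_map.get(name, name): hours for name, hours in it.items()}
    # Applications the business depends on that IT DR does not cover at all.
    out_of_scope = sorted(app for app in bia if app not in normalised)
    # Applications where the achievable recovery time exceeds the business RTO.
    rto_gaps = {
        app: (bia[app], normalised[app])
        for app in bia
        if app in normalised and normalised[app] > bia[app]
    }
    return out_of_scope, rto_gaps


out_of_scope, rto_gaps = find_recovery_gaps(bia_rtos, it_capabilities, aliases)
print("Not covered by IT DR:", out_of_scope)
for app, (required, achievable) in rto_gaps.items():
    print(f"{app}: business needs {required}h, IT can deliver {achievable}h")
```

The alias table is the part most likely to need joint effort, because neither group can build it alone.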
For more information on how to integrate the business and IT, check out Bridging the Business Continuity and IT Disaster Recovery Gap.
In many organizations, recovery strategies are miscommunicated and misaligned between the business and IT, ultimately preventing the business from meeting recovery time objectives. Some organizations may not even be aware of these gaps and the impact they have on the ability of the organization to effectively recover. By identifying the issue and analyzing root causes, organizations can implement strategies to ensure these groups are working together to identify recovery gaps, effectively prioritize recovery, and develop strategies to enable the continuity of operations.