Using Lessons Learned in the Evaluation of Business Continuity Procedures

The management system approach to business continuity requires a culture of continual improvement in business continuity programs.  One of the key steps in facilitating continual improvement is to regularly evaluate existing business continuity procedures.  This perspective takes a closer look at Clause 9.1.2, ISO 22301’s requirement for evaluation of business continuity procedures. 

ISO 22301 is the first standard to employ the new ISO format for management systems standards, which involves a considerable amount of “templatized” management system content across ten clauses. Because this format, language, and many of the requirements are new to most business continuity professionals, it’s important to review and consider the intent associated with some of the content and concepts.

This perspective is the fifth in a series to discuss key elements of the ISO 22301 business continuity management system, including value-adding elements of the standard or requirements that could “trip up” an organization during the certification process.

Today we’re going to take a look at Clause 9.1.2, the standard’s requirement for evaluation of business continuity procedures.

Clause 9.1.2 – Evaluation of Business Continuity Procedures
The organization shall conduct evaluations of its business continuity procedures and capabilities in order to ensure their continuing suitability, adequacy and effectiveness. These evaluations shall be undertaken through periodic reviews, exercising, testing, post-incident reporting, and performance evaluations. Significant changes arising shall be reflected in the procedure(s) in a timely manner. The organization shall periodically evaluate compliance with applicable legal and regulatory requirements, industry best practices, and conformance to its own business continuity policy and objectives. The organization shall also conduct evaluations at planned intervals and when significant changes occur. When a disruptive incident occurs and results in the activation of its business continuity procedures, the organization shall undertake a post-incident review and record the results.

Identifying Lessons Learned to Drive Continual Improvement
Every day, disasters across the country force organizations to activate their business continuity plans. After the emergency subsides, a key role for business continuity professionals is documenting, reporting, and acting on lessons learned. Castellan has worked with numerous clients that have faced disasters and disruptive incidents (both large and small), and when it comes to lessons learned, the most common questions we get are:

  • What do good lessons learned look like?
  • How should lessons learned be documented and tracked?
  • How should we approach looking for root causes for the problems we’ve identified?
  • Which lessons learned should we include in management reviews?

Lessons learned are more than just corrective actions as described in ISO 22301. Lessons learned should answer “What did we learn from this event?” and “How can we improve?” – with the resulting information then used to drive continual improvement. It’s also important to note that lessons learned can originate from an actual event (during the event or in post-incident reporting), an exercise or test, or periodic reviews.

The following graphic depicts the relationship between non-conformities, lessons learned, root cause analysis, and corrective actions, which all drive continual improvement for organizations.

Once corrective actions are implemented, the result should be an improvement in the performance of the business continuity procedures.

What Good Lessons Learned Look Like
During a disaster, your business continuity plan should remind personnel to record information received, decisions made, and the outcomes. This activity log and a team review after the conclusion of a disruptive incident are key opportunities to capture and prioritize lessons learned.

Lessons learned will answer one or more of the following:

  • Did we respond in a timely manner?
  • Did we successfully recover within established timeframes?
  • What issues arose that caused recovery to exceed established timeframes (or could have potentially resulted in that)?
  • What strengths and weaknesses were identified during the response and recovery process?
  • Did the plans include adequate instruction (describing how to recover and how to operate in “recovery mode”), or do they need to be expanded?
  • Were the right people involved in a timely manner and did they have the knowledge and skills necessary to participate?

Good lessons learned will provide a starting point for identifying root causes of performance issues.  Identifying lessons learned during and after an incident is important to improving future performance, but ensuring they are well documented and tracked until the issue or root cause is addressed – in a prioritized manner – is equally important.  Good lessons learned will document the expected outcome, the actual outcome, and the conditions that may have contributed to the gap between expected and actual outcomes.  Often, this information is documented in the post-incident or exercise report.  The objective in documenting lessons learned is to provide enough information that a root cause analysis can be conducted, if necessary, and subsequent corrective actions can be developed and implemented.
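As a minimal sketch of what such a record might hold, the following Python example captures the expected outcome, the actual outcome, and the contributing conditions for a single lesson learned. The class name, field names, and sample values are hypothetical and are not prescribed by ISO 22301; the completeness check is simply one assumed way to judge whether enough detail exists to begin a root cause analysis.

```python
from dataclasses import dataclass, field


@dataclass
class LessonLearned:
    """One lesson learned, as it might appear in a post-incident or exercise report."""
    title: str
    source: str  # "incident", "exercise", or "periodic review"
    expected_outcome: str
    actual_outcome: str
    contributing_conditions: list[str] = field(default_factory=list)

    def ready_for_root_cause_analysis(self) -> bool:
        # Assumption: a lesson is usable for root cause analysis once both
        # outcomes and at least one contributing condition are recorded.
        return bool(
            self.expected_outcome
            and self.actual_outcome
            and self.contributing_conditions
        )


# Hypothetical example values, for illustration only.
lesson = LessonLearned(
    title="Payroll recovery exceeded its established timeframe",
    source="incident",
    expected_outcome="Payroll processing restored within 24 hours",
    actual_outcome="Payroll processing restored after 31 hours",
    contributing_conditions=["Backup credentials had expired"],
)
print(lesson.ready_for_root_cause_analysis())
```

A record with no contributing conditions would fail the check, which is a prompt to return to the activity log or the team review for more detail.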

How to Document and Track Lessons Learned
ISO 22301, like similar standards, requires documentation of a post-incident report after any disruptive incident. Often, these reports are similar to those generated after conducting an exercise, with the focus being the capture of participants, objectives, results (strengths, opportunities for improvement, and corrective actions), and participant feedback. Depending on the detail included in the post-incident report, a summary may be provided to senior management (requesting acceptance of the findings and input on prioritization).

Documenting and tracking lessons learned may make it easy to identify the corrective actions to take; however, not all lessons learned have clear root causes. As in any response and recovery effort, there can be hidden causes for issues that arise and, as business continuity professionals, it is our responsibility to investigate the root cause(s).

Identifying Root Causes
With complicated recovery efforts that result in lessons learned, we often recommend performing a root cause analysis – even for those lessons learned that may have clear corrective actions – as a method to understand all contributing factors and identify corrective actions.  We have put together a resource for using root cause analysis in a business continuity context: Applying Root Cause Analysis (RCA) to Business Continuity.

Root cause analysis is a good tool for identifying contributing factors to non-conformities in a business continuity program. The result of conducting root cause analysis should be one or more corrective actions that can and/or will be implemented to reduce the gap identified in the lesson learned. Not all lessons learned need to go through a root cause analysis, but lessons learned that are persistent and complex should go through root cause analysis or a similar process to identify underlying, often unseen, causes. Either way, lessons learned that go through the root cause analysis process, as well as those with easily identifiable corrective actions, should result in specific actions that can be documented, implemented, and assigned for closure.

The results of the root cause analysis plus the list of corrective actions and lessons learned resulting from an incident can lead to a long list of outcomes and actions taken that can easily overwhelm senior managers (not to mention take too much time from already crammed schedules).  So, one key issue is selecting which lessons learned should be presented to management.

Tracking Corrective Actions
After a disaster or exercise, lessons learned will often lead to corrective actions that need to be taken to improve performance of the recovery process or the business continuity program. Tracking corrective actions is a key task for any business continuity program and another requirement of ISO 22301. When tracking corrective actions, be sure to document (and report to management) the following:

  • What corrective action is to be taken (and the associated root cause)
  • Who is assigned responsibility for implementing the corrective action
  • When the corrective action is expected to be closed out
  • When the corrective action is actually closed out

Sometimes corrective actions include researching potential solutions to an issue identified in the business continuity program, the result of which is information for management to make decisions (potential solutions, costs associated with mitigating risk, and any residual risk left after implementing a solution).  It is possible that a corrective action leads to management choosing to accept a risk rather than implement a risk mitigation strategy that costs too much.  In this case, the decision should be noted and the corrective action closed out.
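The tracking fields listed above, plus the risk-acceptance outcome, can be sketched as a small Python record. This is an illustrative sketch only: the class, its field names, the "implemented" and "risk accepted" labels, and the sample dates are assumptions made for the example, not terms defined by ISO 22301.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CorrectiveAction:
    description: str          # what action is to be taken
    root_cause: str           # the associated root cause
    owner: str                # who is responsible for implementing it
    expected_close: date      # when it is expected to be closed out
    actual_close: Optional[date] = None   # when it was actually closed out
    resolution: Optional[str] = None      # e.g. "implemented" or "risk accepted"

    def close(self, on: date, resolution: str = "implemented") -> None:
        # Record the closure date and how the action was resolved.
        self.actual_close = on
        self.resolution = resolution

    def is_overdue(self, today: date) -> bool:
        # Still open and past its expected closure date.
        return self.actual_close is None and today > self.expected_close


# Hypothetical example values, for illustration only.
actions = [
    CorrectiveAction("Refresh call tree contact details",
                     "No scheduled contact review",
                     "BC coordinator", date(2024, 3, 1)),
    CorrectiveAction("Evaluate a second data center",
                     "Single-site dependency",
                     "IT director", date(2024, 6, 30)),
]

# Management accepts the risk instead of funding the mitigation:
# the decision is noted and the corrective action is closed out.
actions[1].close(date(2024, 5, 15), resolution="risk accepted")

overdue = [a.description for a in actions if a.is_overdue(date(2024, 4, 1))]
print(overdue)
```

Keeping the expected and actual closure dates side by side makes it simple to report open, overdue, and closed items to management without re-reading each post-incident report.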

Selecting Lessons Learned to Include in Management Reviews
Not all corrective actions need to be reported to senior management. Our standard approach is that minor corrective actions which do not directly pose a threat to effective recovery do not need to be reported to management as long as:

  1. They are documented and corrected in a timely fashion
  2. They are not a systemic issue

One example of a minor lesson learned would be that one member of the recovery team could not be contacted since his/her contact information was not updated prior to the incident.  Although the issue hindered recovery efforts, the issue is easily corrected and limited to one person on one recovery team.  The key to maintaining management support for the business continuity program is to document the information they want and need to see.

There are three key things to think about when selecting lessons learned to include in management reviews:

  1. Is the lesson learned significant enough to facilitate or impede recovery of products or services?
  2. Will the lesson learned help the management team understand the success or failure of the recovery effort?
  3. Can the corrective action only be acted upon with management’s intervention (allocating funding, risk acceptance, senior-level intervention, or other resource allocation)?

Answering these questions can help identify which lessons learned should be highlighted in the management review while the others can be documented and made available through the post-incident report.  A post-incident review is a key opportunity to highlight how the business continuity program has reduced the organization’s risk, as well as how it will implement additional measures to further reduce risk.
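One way to apply these questions consistently is to record the answers for each lesson learned and filter on them. The Python sketch below is an assumption-laden illustration: the field names are hypothetical, and treating a "yes" to any single question (or a systemic or uncorrected issue) as enough to include the item is our simplification for the example, not a rule taken from ISO 22301.

```python
from dataclasses import dataclass


@dataclass
class LessonSummary:
    title: str
    affects_recovery: bool          # question 1: facilitates or impedes recovery
    explains_outcome: bool          # question 2: helps explain success or failure
    needs_management_action: bool   # question 3: requires management intervention
    corrected_in_timely_fashion: bool = True
    systemic: bool = False


def include_in_management_review(lesson: LessonSummary) -> bool:
    """Return True if the lesson should be highlighted to management.

    Assumption: any "yes" to the three questions qualifies the lesson, and
    minor items are also escalated if they are systemic or were not
    corrected in a timely fashion.
    """
    return (
        lesson.affects_recovery
        or lesson.explains_outcome
        or lesson.needs_management_action
        or lesson.systemic
        or not lesson.corrected_in_timely_fashion
    )


# Hypothetical example values, for illustration only.
contact_info = LessonSummary(
    title="One team member's contact information was out of date",
    affects_recovery=False,
    explains_outcome=False,
    needs_management_action=False,
)
funding = LessonSummary(
    title="Alternate site lacks capacity for all critical staff",
    affects_recovery=True,
    explains_outcome=True,
    needs_management_action=True,
)

for item in (contact_info, funding):
    print(item.title, "->", include_in_management_review(item))
```

Lessons that return False here are not discarded; they remain documented and available through the post-incident report.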

Following an incident, it is easy to be caught up in the return to normal operations and overlook the need to document and act on lessons learned.  However, lessons learned and the resulting corrective actions are a critical part of any recovery – they help to improve future program performance.  Lessons learned help us to identify and document factors that cause gaps, which in turn can be corrected (or accepted by management).  Regardless of where the lessons learned come from, we need to ensure that we are capturing, documenting, and acting on lessons learned – not only to remain in compliance with ISO 22301, but also to ensure that business continuity procedures can meet the organization’s recovery objectives.

Continue to visit our blog for more posts in Castellan’s Conforming to ISO 22301 series.

In the meantime, don’t hesitate to reach out to us to discuss aligning to the standard or pursuing certification. We look forward to hearing from you!
