Have you ever walked into an organization to access a service, be it purchasing products or making some kind of payment? You would agree with me that the most intimidating thing is to be told that “our system is down”. You then ask when it will be up again, and the lady or gentleman behind the computer tells you, “We don’t know yet”. This answer is strong evidence that the organization does not have well-implemented business continuity and disaster recovery plans.
Today I want to talk about business continuity and disaster recovery in an organization; you might pick up one or two hints. The topic is split into two focus areas: Business Continuity and Disaster Recovery.
What is Business Continuity?
Well, this is basically concerned with ensuring that the business continues to offer the same level of service regardless of any hindrances that may result from a disaster. It is very important to ensure that your business is running smoothly and that all critical services are offered 24/7 (well, only if you work 24/7). To achieve this, a company needs a properly crafted Business Continuity Plan (BCP).
The business continuity plan (BCP)
This is a document which outlines the methods and steps that the organisation adopts during a crisis in order to ensure continuity of operations. The BCP has to focus on 3 major objectives:
- Availability – services should always be available for clients to consume.
- Confidentiality – regardless of the disaster, the privacy of information must always be preserved, so your BCP should focus on maintaining information privacy.
- Integrity – the BCP should ensure that there is no unauthorized alteration of information.
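To make the integrity objective a little more concrete, one common way to detect alteration is to record a cryptographic hash of a file or record before a disaster and compare it after recovery. The sketch below is only an illustration under assumptions: the function names `fingerprint` and `is_unaltered` and the sample record are made up for this example and do not come from any BCP standard.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, expected_digest: str) -> bool:
    """Check that the data still matches the digest recorded earlier."""
    return fingerprint(data) == expected_digest


# Hypothetical record, fingerprinted before the disaster.
original = b"account=1001;balance=250.00"
recorded_digest = fingerprint(original)

# After recovery, compare the restored copy against the recorded digest.
restored = b"account=1001;balance=250.00"
print(is_unaltered(restored, recorded_digest))
```

A check like this only detects alteration. Preventing it still depends on access controls and on keeping the recorded digests somewhere the disaster cannot reach.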
Creating the BCP
The process of creating the BCP involves 5 major phases:
- INITIATION – this stage involves carrying out a risk assessment so that you fully understand the potential risks associated with your business, gathering the team that will form the BCP committee and those who will participate in the BCP implementation, preparing a report to present to management for approval, and coming up with a work plan.
- BUSINESS IMPACT ANALYSIS (BIA) – this is where you identify the key business operations and analyse the impact on the business if those operations are idle for a period of time. You analyse the impact on revenue, customer base, regulatory requirements, company image, etc. This analysis will help you come up with an ideal Maximum Tolerable Downtime (MTD), which is basically the amount of time a business can afford to be down during business hours. The MTD is very critical; any large investors out there would agree with me.
- RECOVERY STRATEGIES – these are the approaches you will adopt to recover from a disaster. This is part of the Disaster Recovery Plan, which we will talk about in a bit.
- DEVELOPMENT & IMPLEMENTATION – this is when you craft the detailed plan for recovery, summing up all the information gathered during the previous steps. This is where it becomes clear who does what when a specific sort of disaster occurs. The plan also outlines how it is going to be maintained and tested, and how awareness will be delivered to the concerned stakeholders.
- TESTING – you cannot say that you have a working BCP without testing it. One way you can test a BCP is to carry out drills. Ever heard of a fire drill? That’s one way of testing a BCP.
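The business impact analysis phase above can be sketched in a few lines of code. This is only a minimal illustration, assuming each operation has been given an MTD in hours and an estimated loss per hour of downtime; the `Operation` class, the function names and all the figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    name: str
    mtd_hours: float      # Maximum Tolerable Downtime, in hours
    loss_per_hour: float  # estimated revenue lost per hour of downtime


def recovery_priority(operations):
    """Order operations so the one with the shortest MTD is restored first."""
    return sorted(operations, key=lambda op: op.mtd_hours)


def exceeds_mtd(op, outage_hours):
    """True when an outage has lasted longer than the operation can tolerate."""
    return outage_hours > op.mtd_hours


def estimated_loss(op, outage_hours):
    """Rough revenue impact of an outage, using a flat hourly rate."""
    return op.loss_per_hour * outage_hours


# Made-up figures, for illustration only.
OPERATIONS = [
    Operation("payroll", 72, 50.0),
    Operation("online payments", 4, 2000.0),
    Operation("customer support", 24, 300.0),
]

for op in recovery_priority(OPERATIONS):
    print(op.name, op.mtd_hours, exceeds_mtd(op, 6), estimated_loss(op, 6))
```

Sorting by MTD is just one simple way to rank operations. A real BIA would also weigh regulatory requirements and company image, which are harder to put a number on.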
Now let me talk about the Disaster Recovery Plan (DRP)
Just like the BCP, a DRP is also a plan, but it focuses on how to recover after a disaster rather than how to continue operating after a disaster. I will be more technical now. I want to focus on the recovery of I.C.T infrastructure because I feel that many organizations in Zimbabwe do not have sound DRPs in place when it comes to I.C.T infrastructure. As an example, take any organization you can think of. Imagine if a fire burns its head office today: would they still be able to offer the same service to their clients within, say, 24 hours? I will not raise the National Registry as an example.
These are some of the technical recovery strategies that you can adopt for your organization:
- Subscription services – where you set up recovery sites depending on your budget. You can set up an identical office building in another location which is fully equipped and has the required network infrastructure and servers. Under such a setting, the servers are fully operational and hold identical copies of the transactions in the main production environment. When a disaster occurs, it is a matter of evacuating the main office and going to the recovery site to resume operations.
- Mutual aid agreements – this is when two organisations agree to share infrastructure when a disaster occurs to one of them. I don’t know what happens if a disaster occurs to both of them.
- Redundant processing centers – you can set up infrastructure which performs the same operations within the same company. A simple example is an Uninterruptible Power Supply (UPS), which acts as a fail-safe if a problem arises with the main power supply.
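The idea shared by these strategies, switching to a standby when the primary fails, can be sketched as a simple failover check. This is a rough illustration only: the site names, the `status` table and the `choose_active_site` function are hypothetical, and a real setup would rely on proper monitoring rather than a hard-coded dictionary.

```python
from typing import Callable, Optional, Sequence


def choose_active_site(
    sites: Sequence[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first site, in priority order, that passes the health check."""
    for site in sites:
        if is_healthy(site):
            return site
    return None  # no site is available


# Hypothetical status table standing in for real monitoring.
status = {"head-office": False, "recovery-site": True}

active = choose_active_site(
    ["head-office", "recovery-site"],
    lambda site: status.get(site, False),
)
print(active)
```

With the head office marked as down, the check falls through to the recovery site. If every site is down, the function returns `None`, which is exactly the case a DRP should plan for explicitly.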