A hybrid of remote working and office culture is set to continue, according to recent research. In this climate, Chris Williams discusses three key areas internal audit teams should be aware of when assessing technology resilience.
While some organisations will no doubt return to the ‘old ways’ of working, research conducted by the Business Continuity Institute shows that more than 50% of organisations do not plan to go back to their old business model, and even more indicate that there will be increased home-working opportunities for their staff.
In addition to employees working remotely, the number of customers shopping or interacting with businesses online has increased significantly. Overall online sales grew by 36% in 2020, the highest growth seen for 13 years, with mobile commerce growing by 73%.
In an environment where technology outages impact a workforce’s ability to operate and customers expect organisations to be ‘always on’, near-instantaneous recovery is often required with minimal data loss. As a result, the threat and impact of technology outages are an even greater concern for organisations.
However, the backdrop of multiple third-party cloud solutions, agile workforces, fragile supply chains, and cyber threats means technology resilience is more complicated than ever.
Traditionally, internal audit teams may have taken a high-level or light-touch approach to technology resilience, for example by reviewing backup and recovery arrangements or checking whether there is a regularly tested technology disaster recovery plan.
These are all important, but internal audit’s focus must now move beyond these component parts to look at all the interconnected layers of the technology environment: infrastructure, environment, platform, applications and data.
A failure within any one of these could cause a critical system or process to fall over. Below, I've laid out three key aspects internal audit should consider when assessing technology resilience:
Whether your organisation’s technology environment is on-premise, in the cloud, or a combination of the two, your environment will be made up of inter-related layers, each with its own risks and resilience requirements:
Infrastructure is the foundation or base layer. It consists of components such as power, network connectivity, physical security and environmental controls, and the data centres providing the hosting environment.
The 'environment' is the storage and compute functions of the systems, whether in the cloud or on physical or virtual servers.
The 'platform' is the operating systems, databases, and the management of storage and compute resources that host the applications.
The applications are the software tools that are in an organisation’s ‘service catalogue’ and allow it to perform its business and operational processes.
The data layer is the information and raw data entered into applications, by users or by automated processes, for processing.
We often find that technology resilience plans haven't considered all the layers or their interconnectivity.
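To make the point concrete, here is a minimal sketch of how a reviewer might record which layers a resilience plan addresses and list the gaps. The layer names follow the descriptions above; the `ResiliencePlan` class, the `missing_layers` helper and the payroll example are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field

# The five layers described above, from the base layer upwards.
LAYERS = ["infrastructure", "environment", "platform", "applications", "data"]


@dataclass
class ResiliencePlan:
    """Hypothetical record of the layers a plan has arrangements for."""
    system: str
    covered_layers: set = field(default_factory=set)


def missing_layers(plan):
    """Return the layers the plan does not cover, in base-to-top order."""
    return [layer for layer in LAYERS if layer not in plan.covered_layers]


# Illustrative data only: a plan that covers backups and application failover.
payroll_plan = ResiliencePlan("payroll", {"data", "applications"})
print(missing_layers(payroll_plan))
# prints ['infrastructure', 'environment', 'platform']
```

A real assessment would also need to record how the layers depend on one another, which a simple coverage list like this does not capture.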
The adoption of cloud solutions has increased significantly, and it is now common for an organisation's data to be stored outside its network perimeter, such as in cloud data centres. An increasing number of organisations are also adopting cloud versions of software (software-as-a-service or SaaS).
Recent research by the Cloud Industry Forum showed that 94% of organisations are using at least one cloud service and that 91% of decision-makers claimed that the cloud played an important part in their response to the COVID-19 situation, with 40% describing its role as critical.
While these cloud solutions bring many benefits, they also give rise to several resilience risks that can be technically and contractually complex to manage. Examples include defining what the cloud provider is responsible for and what your in-house team does when your technology environment experiences a disruptive event, or deciding what data should be mirrored, and where, to mitigate an outage at the cloud provider.
Internal audit can play a role in ensuring that these requirements are considered upfront and regularly reviewed. If this isn't done, updating arrangements or switching providers can become a difficult and costly exercise.
Internal audit needs to review the extent to which technology resilience plans and arrangements include both proactive risk management and response planning.
Is your technology environment able to ‘absorb shock’ and continue operating when experiencing potentially disruptive events, such as a cyber attack or a data centre outage? Can you use proactive risk management to mitigate the impact when an outage occurs?
Your plans and arrangements should consider all the layers and the integrated systems (ie how data flows upstream and downstream from feeder systems to data warehouses). For example, traditional backup solutions can mitigate the risk of data loss at the ‘data layer’, but other solutions, such as diverse routing, separate circuits and burstable capacity, are also needed to absorb shock in the infrastructure layer.
DRaaS (Disaster Recovery-as-a-Service) products and services are available at all layers. These are often procured in isolation, however, without considering how systems integrate throughout the layers.
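One way to see why integration matters is to trace which systems sit downstream of a failed one. The sketch below assumes a hypothetical map of data flows (the `FEEDS` dictionary and its system names are invented for illustration) and walks it breadth-first to list every system that directly or indirectly receives data from a given feeder.

```python
from collections import deque

# Hypothetical data flows: each key feeds data to the systems listed against it.
FEEDS = {
    "order_entry": ["billing", "data_warehouse"],
    "billing": ["general_ledger"],
    "general_ledger": ["data_warehouse"],
    "data_warehouse": ["reporting"],
    "reporting": [],
}


def downstream_of(system, feeds):
    """Return every system that directly or indirectly receives data from `system`."""
    seen = set()
    queue = deque(feeds.get(system, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another route
        seen.add(current)
        queue.extend(feeds.get(current, []))
    return seen


print(sorted(downstream_of("billing", FEEDS)))
# prints ['data_warehouse', 'general_ledger', 'reporting']
```

In this illustrative map, a recovery service procured for billing alone would leave three dependent systems without a matching arrangement.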
'Bounce back' refers to the organisation's ability to recover from a disruptive event. As with 'absorb shock', these arrangements need to consider all the layers and integrated systems. A common misunderstanding is that systems in the cloud are more resilient. This is not correct, and cloud solutions often require extra provisions to make them resilient, similar to on-premises solutions.
Cloud providers offer disaster recovery options, and there are many available for private cloud, dual cloud and multi-cloud environments. However, these are not always provided by default and often need to be tailored to an individual organisation’s requirements.
The need for your technology environments and systems to be resilient has increased significantly in the last year, and boards, management teams, and audit committees are looking to internal audit to provide assurance over resilience arrangements.
In addition to considering how well your organisation can recover after an event, you should be assessing how well your technology environments can continue unaffected during potential disruptions or attacks. For smaller organisations with only a handful of critical systems and processes, this may not require too much additional effort. But for complex organisations, it will require significantly more time and technical capability.
The requirement for assurance extends beyond the organisation’s own systems to any third parties it relies upon, including SaaS and cloud solution providers. This will require internal audit to continually develop its understanding of evolving cloud technologies.
More-mature organisations with tried-and-tested resilience arrangements may want to adopt novel testing approaches, for example by facilitating crisis management exercises using scenarios such as a data breach, a cyber attack, or the failure of one of their cloud providers. This may enhance assurance to stakeholders while helping to upskill people and embed working practices.