Recent global events have highlighted the need to strengthen operational resilience arrangements across the financial services sector. This comes at a time when enterprises are continually put to the test by an avalanche of cyber security threats, complex FinCrime networks and operational vulnerabilities, to name just a few challenges.
Operational resilience is a strategic priority for regulatory bodies, and since the UK FCA and PRA's new operational resilience rules took effect in March 2022, the financial services sector has faced significant demands to build more secure and operationally resilient IT environments, including a requirement for firms to identify their important business services and set an impact tolerance for each one. Supervisory Statement SS2/21 sets out how the PRA expects regulated firms to comply with regulatory requirements relating to outsourcing and third-party risk management. The Supervisory Statement addresses areas such as business continuity, governance, operational resilience, and risk management (including but not limited to cyber risk), and lists the areas where the PRA expects written agreements relating to material outsourcing.
The infographic below by Corporater illustrates the operational resilience compliance timeline.
It was also reported earlier this year that UK financial regulators plan to intensify scrutiny of cloud computing providers “amid growing fears that an outage or hack of their services could severely disrupt a banking system increasingly reliant on them”.
Similarly, the European Union's Digital Operational Resilience Act (DORA) lays out a unified set of requirements for a wide range of EU financial services in the areas of cyber and ICT risk management, resilience testing, incident reporting, and third-party outsourcing. DORA also defines a new framework for the oversight of Critical ICT Third-Party Providers (CTPPs), including Cloud Service Providers (CSPs).
Regulatory bodies in many other jurisdictions are also taking a hard look at financial, reputational, operational, and other risks faced by the financial services sector as the shift from antiquated applications hosted on legacy systems to digital infrastructure accelerates.
Financial digitalisation and supervisory efforts are set to ramp up - here's why:
- The shock and disruptive effects of the COVID-19 pandemic, climate change, and macroeconomic and geopolitical events have heightened the urgency for organisations to reassess the risks to integrated digital networks, legacy infrastructure and supply chains. An increasing number of governmental agencies across the globe are working to make operational resilience a regulatory requirement.
- Evolving consumer demands and competition from nimble start-ups mean that new enterprise capabilities will be rolled out over the coming years (migrating assets to multi-cloud/hybrid environments, artificial intelligence, advanced data analytics, robotic process automation, new remote/virtual work models etc.). This will invariably elevate the level of risk exposure to the business as the cyber attack surface expands.
- Plans to implement new payment messaging standards and fintech solutions (#ISO20022, #realtimepayments, #embeddedfinance, #openbanking, #openfinance, etc.) continue to gain ground in many jurisdictions, providing wide-ranging opportunities to improve financial inclusion, create new business models and deliver differentiated customer experiences using the rich new data sets.
- Many large incumbents have successfully reverse-engineered significant segments of their monolithic application portfolios and migrated to smaller, interconnected microservices-based architectures as part of the digital transformation roadmap. A recent survey by #TechRepublic found that organisations using microservices were reaping clear benefits, with 69% experiencing faster deployment of services, 61% gaining greater flexibility to respond to changing conditions, and 56% seeing material gains from rapidly scaling up new features into large applications. That’s not to say that every legacy application stands to gain from being decomposed and rebuilt using a microservices architecture: organisations should consider the trade-offs, such as the loss of simplicity and the operational overhead that comes with managing hundreds of services.
As we approach a raft of new regulatory requirements, and financial institutions big and small assess the value and ROI of digital initiatives, operational resilience will be front and centre of many C-suite discussions. Compliance with Cyber Security and Risk Management department rules and regulations is an absolute must. However, with 60 to 80 percent of IT incidents and service disruptions arising from change (www.cio.com), and with the lack of governance and transparency that permeates many large IT departments, the technical and operational landscape warrants far greater attention if operational resilience is to be strengthened in a timely and sustainable manner.
Here are four steps you can take to help strengthen the resilience of the digital estate:
1) Risk Awareness:
The ownership and economics of risk management are two topics that are often subject to lengthy debates among digital transformation stakeholders. In a 2021 study, McKinsey found that, unlike financial risk management, where firms generally have established roles and processes (e.g. model risk management), firms rarely have established roles, processes, or indeed a collective understanding of digital and analytics risk drivers. The corporate leaders surveyed said their biggest challenge in managing digital and analytics risks is simply identifying them, which probably goes some way to explaining why managing risk in enterprises with a heavy digital footprint can be challenging.
By implementing a highly responsive risk awareness initiative that engages people (including staff, partners, third-party vendors etc.), processes, technology, policy, and other key corporate assets, the digital transformation journey becomes less arduous and more streamlined. This requires strong governance controls, training, policies, and well-defined performance metrics, but enterprise-wide risk awareness initiatives rooted in collaboration and a shared sense of purpose and ownership are more likely to yield positive results.
On the economic front, it is interesting to see that McKinsey's survey results show no correlation between IT spending levels and overall risk-management maturity for digital and analytics transformations (see chart below).
The finding that higher spending does not go hand in hand with greater risk-management maturity is perhaps further evidence that a well-coordinated, enterprise-wide risk awareness programme matters more than increased budgetary allocation.
2) Organisational Design:
Take a holistic look across the enterprise and determine which divisions and teams will benefit from a recalibration of the core business areas (e.g. performance, resilience, risk, and recovery) in anticipation of, and in response to, new regulations, unexpected disruption, and other external factors. Ensure people, structures, data, policies and processes are organised with flexibility, staff wellbeing and knowledge sharing in mind, so that they can adapt to changing operational needs and market conditions and continue to create value through disruption. Traditional thinking often discourages the restructuring of project delivery and BAU teams, but the evidence shows that building next-generation digital capabilities that are efficient, financially transparent, and operationally resilient requires a fresh and dynamic culture shift. For example, #Spotify's people-centric approach for scaling #Agile teams gained traction in engineering circles because it places less emphasis on formal processes and ceremonies, and a greater focus on flexibility and autonomy. #Twilio and #Netflix organise teams to use chaos engineering principles to intentionally break and test the resilience of their cloud-based services to ensure they can withstand unexpected disruption.
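To make the chaos engineering idea more concrete, here is a minimal Python sketch of a fault-injection experiment. It is a toy illustration only and does not represent the tooling used by any of the companies mentioned above. All names (`with_faults`, `call_with_retry`, `run_experiment`) and the simulated balance lookup are hypothetical. The sketch injects random failures into a dependency and measures how often a caller with a simple retry policy still succeeds.

```python
import random
from typing import Callable


class InjectedFault(Exception):
    """Raised by the fault injector to simulate a dependency failure."""


def with_faults(func: Callable[[], str], failure_rate: float,
                rng: random.Random) -> Callable[[], str]:
    """Wrap func so that each call fails with probability failure_rate."""
    def wrapper() -> str:
        if rng.random() < failure_rate:
            raise InjectedFault("simulated dependency outage")
        return func()
    return wrapper


def call_with_retry(func: Callable[[], str], attempts: int) -> str:
    """Call func up to `attempts` times, re-raising the last injected fault."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except InjectedFault:
            if attempt == attempts - 1:
                raise
    raise AssertionError("unreachable")


def run_experiment(failure_rate: float, attempts: int,
                   trials: int = 1000, seed: int = 42) -> float:
    """Return the fraction of trials that succeed despite injected faults."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    # Hypothetical dependency: stands in for a real downstream service call.
    flaky_lookup = with_faults(lambda: "balance: 100.00", failure_rate, rng)
    successes = 0
    for _ in range(trials):
        try:
            call_with_retry(flaky_lookup, attempts)
            successes += 1
        except InjectedFault:
            pass
    return successes / trials


if __name__ == "__main__":
    for attempts in (1, 3):
        rate = run_experiment(failure_rate=0.3, attempts=attempts)
        print(f"attempts={attempts}: success rate {rate:.1%}")
```

Comparing the success rate with one attempt against three attempts gives a simple, measurable view of how much resilience a retry policy adds under a given failure rate. A real experiment would run against live or staging services with safeguards to limit the impact.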
3) Add Observability to your Site Reliability Engineering (SRE) strategy:
Demands for increased security, reliability, and new customer experiences are leading to shorter timelines for delivering new digital products and services. This can be extremely challenging for many organisations heavily invested in traditional monitoring solutions that scan for issues or outages across application, data, and infrastructure assets. Implementing #observability in addition to traditional monitoring enhances the SRE team's ability to understand and act on abnormal behaviour detected in the expansive data captured across the digital ecosystem, respond quickly to any issues or downtime, and meet the targets set in Service Level Agreements (SLAs) and Service Level Objectives (SLOs), as measured by Service Level Indicators (SLIs).
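As a rough illustration of how SLIs and SLOs relate, the Python sketch below computes an availability SLI from request counts and compares it with an SLO target, using the common "error budget" convention (the number of failures the target tolerates in a window). The function name `evaluate_slo`, the `SloReport` fields and the sample figures are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    sli: float               # measured fraction of successful requests
    error_budget: float      # failed requests the SLO tolerates in the window
    budget_remaining: float  # negative once the budget is overspent
    slo_met: bool


def evaluate_slo(total_requests: int, failed_requests: int,
                 slo_target: float) -> SloReport:
    """Compare a measured availability SLI against an SLO target."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the interval (0, 1]")

    sli = (total_requests - failed_requests) / total_requests
    error_budget = total_requests * (1.0 - slo_target)
    return SloReport(
        sli=sli,
        error_budget=error_budget,
        budget_remaining=error_budget - failed_requests,
        slo_met=sli >= slo_target,
    )


# Hypothetical 30-day window: 1,000,000 requests, 400 failures, 99.9% target.
report = evaluate_slo(total_requests=1_000_000, failed_requests=400,
                      slo_target=0.999)
print(f"SLI {report.sli:.4%}, SLO met: {report.slo_met}, "
      f"budget remaining: {report.budget_remaining:.0f} requests")
```

In this made-up example, a 99.9% target over one million requests tolerates about 1,000 failures, so 400 failures leaves roughly 600 in reserve. Tracking the remaining budget over time is one way a team might decide when to slow down change in favour of stability.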
4) Finally, consider applying a multidimensional operational resilience framework to drive policies and best practices across people, data, technology, and processes. There is no shortage of post-pandemic operational resilience frameworks and guidelines aimed at providing a pathway for financial services and other highly regulated institutions to plug business continuity maturity gaps, and many of them quite rightly place an emphasis on impacts, risk appetites and tolerance levels for disruption. This is certainly a good starting point, but unlike the global pandemic, not all threats to operational stability and safety are slow to develop, prolonged and symmetric (think cyber attacks, major network outages, global financial market volatility etc.). It follows that the jury is still out on the efficacy of many operational resilience frameworks and guidelines until resilience test results and performance metrics have been standardised, aggregated, and subject to audit.