While organisations are starting to understand the value of being data-centric, they're finding it increasingly challenging to manage and process data for existing/legacy processes. Nikhil Asthana explains how automation helped us optimise our own corporate governance review. 

As large amounts of information are being processed from multiple different sources, manual methods of obtaining data are becoming increasingly challenging for firms. It's more likely that information obtained will be erroneous, complex and unstructured, meaning that pinpointing key information and gaining insight is difficult.

We look at an example of our latest automated system that we developed to deal with these data challenges using various advanced tools and techniques.

Identifying key outcomes

We wanted to strengthen our corporate governance review process by automating the collection of annual reports and providing a streamlined method of visualising the data. The process of automation covered a range of factors to enhance our systems, including using structured data-processing and a centralised storage unit that could be easily integrated into our current infrastructure.

We also wanted to highlight the key outcomes of improving our data automation tools. Through a range of testing methods and analysis, we identified the main factors that explained our overall approach:

  • Annual reports are manually sourced by the in-house team and many resources are required to complete the corporate governance review. The process is also time-consuming, so an easier way to collect this data was needed
  • Extracted data often appears unstructured and difficult to filter through. A process of data recognition was required to highlight the relevant metrics and improve the standard of reporting
  • A centralised data platform was also needed to provide direct access to the team and increase the security of the information. This would also need to be available to access 24/7 and provide all the relevant information in one place
  • Structuring the data would also require a clear method of visualisation to provide strong insight on the large data sets. This user interface (UI) would allow individuals to access real-time information and identify data inaccuracies

Our previous method of completing an annual corporate governance review required answering 200+ questions on the latest annual reports produced by each FTSE 250 company. Our team collected the data and answered the questions manually, which took approximately a day per annual report. The team found this method time-consuming and had difficulty extracting the strongest data.

Our solution was to automate the collection of annual reports and visualise the results. The dashboard broke down the information by different categories, such as inclusion and diversity, environmental, social and governance (ESG), and audit risk.

Distinguishing data

To improve the process, we identified a series of solutions that would automate data collection and could be integrated easily into our current systems. This required using robotic process automation (RPA), machine learning (ML) and artificial intelligence (AI) to automatically retrieve the latest annual reports from each company and extract the information from them.

The information would need to be stored in an internal network, so the solution was to create an in-house web app to host and manage this information that could be integrated into our current systems. Identifying the key strengths and weaknesses of each company was also necessary to get the most out of the information, so we developed a dashboard that clearly indicated the key metrics and allowed the client to easily distinguish data.

Corporate governance automation components


How we added value

By establishing an automated system for collecting data, we found a range of benefits across the board. Our solution provided more accurate data at a faster rate and, in most cases, the information was processed in five minutes. Incorporating a web application into this process improved functionality: it stored the latest annual reports and made it easier for us to review the information through a centralised method that could be used by the whole team.

Previously, the corporate governance process relied on team members working through each report individually and using resources to answer the questions. Our feedback showed that the automated system produced more accurate information and required fewer overall resources to answer the questions. In data collation, we saved 40 minutes per report and a further 20 minutes per report in QA, which totalled 32 days of manual work saved across all reports.
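
The arithmetic behind the time saving can be checked with a short sketch. The per-report figures come from the paragraph above; the number of reports (250) and the length of a working day (8 hours) are assumptions made for illustration, so the result only approximates the 32 days reported.

```python
# Per-report savings quoted in the article (minutes).
COLLATION_MINUTES_SAVED = 40
QA_MINUTES_SAVED = 20


def working_days_saved(report_count: int, hours_per_day: float) -> float:
    """Convert per-report minutes saved into working days across all reports."""
    minutes_per_report = COLLATION_MINUTES_SAVED + QA_MINUTES_SAVED
    total_hours = report_count * minutes_per_report / 60
    return total_hours / hours_per_day


# Assumed inputs (not stated in the article): 250 reports, 8-hour working day.
print(working_days_saved(report_count=250, hours_per_day=8))
```

With a slightly shorter working day or a few more reports, the same calculation lands on the quoted figure.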

Making the most of the system required a clear method of visualisation. The original UI of the corporate governance system was a basic dashboard that had limited visuals and no roll-over for future years, which slowed the process of obtaining data and required more manual observation to distinguish key metrics. Our updated system enhanced the dashboard with a complete picture of our key areas and a method of data comparison to competitors in the industry. This sped up the process of obtaining the relevant data and clearly highlighted the findings of the reports.


Web application

  • Allows authorised users to log in via a website
  • Users can automatically collect or manually upload annual reports
  • Users can select which report to process through the machine learning engine
  • Users are notified and sent the output of the machine learning engine

Web scraper 

  • Built in Python, the web scraper can automatically fetch the latest annual reports
  • User is notified once annual reports have been collected
  • Annual reports are stored within Azure storage for easy access to all users
  • Web scraper can collect FTSE 350 annual reports within 5 minutes
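
As a rough illustration of the scraping step, the sketch below picks the most recent annual report PDF out of an investor-relations page. It is not our production scraper: the sample page, the URL, the class and function names, and the rule of matching "annual report" plus a four-digit year in the link text are all assumptions, and it uses only the Python standard library.

```python
import re
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collects (href, link text) pairs for every <a> element that points at a PDF."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().endswith(".pdf"):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def latest_annual_report(html: str, base_url: str) -> Optional[str]:
    """Return the absolute URL of the most recent annual report PDF, or None."""
    parser = PdfLinkParser()
    parser.feed(html)
    candidates = []
    for href, text in parser.links:
        if "annual report" not in text.lower():
            continue
        year = re.search(r"\b(20\d{2})\b", text)
        if year:
            candidates.append((int(year.group(1)), urljoin(base_url, href)))
    return max(candidates)[1] if candidates else None


# Hypothetical investor-relations page, used purely for illustration.
SAMPLE_PAGE = """
<ul>
  <li><a href="/docs/ar-2021.pdf">Annual Report 2021</a></li>
  <li><a href="/docs/ar-2022.pdf">Annual Report 2022</a></li>
  <li><a href="/docs/sustainability-2022.pdf">Sustainability Report 2022</a></li>
</ul>
"""

print(latest_annual_report(SAMPLE_PAGE, "https://example.com/investors/"))
```

In a real scraper the HTML would be fetched over the network and the PDF uploaded to storage; those steps are left out so the example stays self-contained.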

Machine learning engine

Using AI and data science techniques, the engine extracts meaningful insights from the annual reports. A data science module uses natural language processing (NLP) to automatically identify the contents page of each annual report. The contents page is then used to parse information from the different sections of the report.

A combination of Object Detection, Optical Character Recognition (OCR) and Named Entity Recognition is applied to extract domain-specific information (eg Board of Directors page).
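
To make the contents-page step concrete, here is a minimal sketch that uses a regular expression as a simple stand-in for the NLP module. It assumes the report text has already been extracted page by page and that contents entries look like "Title ..... 12"; the function names, the sample pages and the page count are hypothetical.

```python
import re
from typing import Optional

# Assumed entry layout: a title, optional dot leaders, then a 1-3 digit page number.
ENTRY = re.compile(r"^(?P<title>[A-Za-z][^\d]*?)[\s.]*(?P<page>\d{1,3})$")


def parse_contents(page_text: str) -> list:
    """Extract (section title, start page) pairs from one page of text."""
    entries = []
    for line in page_text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append((match.group("title").strip(), int(match.group("page"))))
    return entries


def find_contents_page(pages: list, min_entries: int = 3) -> Optional[int]:
    """Return the index of the page with the most contents-style entries."""
    best_index, best_count = None, 0
    for index, text in enumerate(pages):
        count = len(parse_contents(text))
        if count >= min_entries and count > best_count:
            best_index, best_count = index, count
    return best_index


def section_ranges(entries: list, last_page: int) -> dict:
    """Map each section title to its (first page, last page) range."""
    ranges = {}
    for position, (title, start) in enumerate(entries):
        is_last = position == len(entries) - 1
        end = last_page if is_last else entries[position + 1][1] - 1
        ranges[title] = (start, end)
    return ranges


# Hypothetical page texts, used purely for illustration.
PAGES = [
    "Example plc\nAnnual Report and Accounts 2022",
    "Contents\nStrategic report ..... 2\nGovernance report ..... 48\n"
    "Board of Directors ..... 52\nFinancial statements ..... 110",
    "Strategic report\nOur business performed well in 2022.",
]

contents_index = find_contents_page(PAGES)
sections = section_ranges(parse_contents(PAGES[contents_index]), last_page=180)
print(sections["Board of Directors"])
```

Once the page range for a section such as the Board of Directors is known, only those pages need to be passed to the heavier object detection, OCR and named entity recognition steps.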

Data visualisation

  • Power BI dashboard to show information regarding key topic areas asked in the machine learning/data collation phase
  • Users can drill down by different filters, including question type and company year-end date
  • Users can compare companies against each other plus FTSE/Industry averages 
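
The comparison against averages is done in Power BI, but the underlying calculation can be sketched in a few lines of Python. The company names, industries and scores below are hypothetical, and a single score per company is an assumption made to keep the example short.

```python
from statistics import mean

# Hypothetical governance scores out of 100, for illustration only.
SCORES = [
    {"company": "Company A", "industry": "Retail", "score": 80},
    {"company": "Company B", "industry": "Retail", "score": 60},
    {"company": "Company C", "industry": "Mining", "score": 90},
    {"company": "Company D", "industry": "Mining", "score": 70},
]


def compare_to_averages(rows: list, company: str) -> dict:
    """Return a company's score alongside its industry and overall averages."""
    target = next(row for row in rows if row["company"] == company)
    peers = [row["score"] for row in rows if row["industry"] == target["industry"]]
    return {
        "company": company,
        "score": target["score"],
        "industry_average": mean(peers),
        "overall_average": mean(row["score"] for row in rows),
    }


print(compare_to_averages(SCORES, "Company A"))
```

Here the overall average stands in for the FTSE average; in practice each average would be computed per question or topic area rather than from one headline score.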

Enhancing your existing business process

Having access to the strongest data is necessary to improve your processes. Building systems that obtain the strongest metrics on a centralised platform is an important step to building an effective infrastructure and streamlining the overall process of reporting.

Meet the needs of your firm for developing stronger data tools

We can support firms looking to enhance their systems by automating their existing processes and integrating a tailored solution that will provide more accurate data.