That Excel problem
Public Health England, and by extension the government itself, has taken a lot of criticism this week as a result of the Excel debacle that led to almost 16,000 positive test results not being uploaded to the test & trace dashboards in a timely fashion.
Given our own client experience, it’s reasonable to imagine any number of business leaders now checking their own reliance on Excel spreadsheets and identifying any potential vulnerabilities. After all, there remains an astonishing number of organisations managing critical data through Excel.
And while we could talk about the urgent need for greater digitalisation in government and business, or introduce the different levels of data maturity (clue: dependence on Excel is ground zero in terms of data maturity), we thought it might be more useful to provide a simple solution to the specific issue.
The most dangerous software in the world?
But first, it’s worth reflecting on Excel itself. First launched for the Mac in 1985 and for Windows in 1987, it has dominated the world of spreadsheets ever since, benefiting from Windows’ dominance of the desktop market and attracting anything up to 2bn users across the globe.
Excel is a fantastically useful tool when used for the right purposes and in the right context. When it is deployed as a de facto database, however, it suffers from many limitations, including poor version control, accidental or deliberate alteration of formulae, and a general lack of transparency.
Ultimately, it’s perhaps the easy familiarity of Excel that is its biggest problem today. That familiarity tempts users to deploy it for inappropriate tasks, which leads to a lack of oversight, governance, and QA, and ultimately to problems like the one we have just experienced with Public Health England.
First, it’s important to emphasise that we sympathise with all the civil servants and public sector personnel scrambling to respond to the multiple challenges arising from the pandemic. Working in haste to create a solution tends to mean that you rely on what you know – and what most people know (to a greater or lesser extent) is Excel.
Second, there have to be oversight and QA processes in place. If the risks of relying on Excel were not identified when the project was being set up, then this is the stage at which they should be spotted. It is vital that projects aren’t simply set up and forgotten; they must be subject to ongoing review.
As Vasileios Vasileiou from the Profusion data team put it, if we build a lift for a building of a certain size – which is subsequently extended to a skyscraper – then it becomes vital that we upgrade the lift to meet the new requirements (or people will be walking up a lot of stairs). Thanks, Vas!
Working at speed, especially on something as sensitive and important as this, can’t be an excuse for not following reasonable processes. Henrik Nordmark, Head of Data at Profusion, adds that when the team receives batch data from customers, it is always compared with previous data deliveries, with any anomalies automatically detected and flagged for a response. We are then able to query such anomalies with the client team.
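To make the idea of comparing a new delivery with previous ones concrete, here is a minimal Python sketch. It is an illustration only, not Profusion's actual process: the function name `check_delivery`, its parameters, and the three-standard-deviation tolerance are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def check_delivery(previous_row_counts, new_rows, expected_columns, tolerance=3.0):
    """Compare a new batch with earlier deliveries and return a list of warnings.

    previous_row_counts: row counts of earlier deliveries (hypothetical history).
    new_rows: the new batch as a list of dicts, one dict per record.
    expected_columns: the column names every record should carry.
    tolerance: how many standard deviations from the mean counts as unusual
               (an assumed default, to be tuned per project).
    """
    if not new_rows:
        return ["delivery is empty"]

    warnings = []

    # Schema check: every record should have exactly the expected columns.
    expected = set(expected_columns)
    for index, row in enumerate(new_rows):
        if set(row) != expected:
            warnings.append(f"row {index} has unexpected columns: {sorted(row)}")
            break  # one example is enough to raise a query with the client

    # Volume check: is the row count in line with previous deliveries?
    if len(previous_row_counts) >= 2:
        average = mean(previous_row_counts)
        spread = stdev(previous_row_counts)
        if spread > 0 and abs(len(new_rows) - average) > tolerance * spread:
            warnings.append(
                f"row count {len(new_rows)} is far from the usual {average:.0f}"
            )

    return warnings
```

An empty list means nothing unusual was detected. Anything else is a prompt for a person to look at the delivery and, if needed, go back to the client with a question.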
To be clear, there should be at least two tiers of automation: one for checking the incoming data (data ingestion) and one for ensuring that the automation itself is working correctly (e.g. did it run overnight?). This can obviously continue to higher levels; at Profusion, for example, we have a higher-level solution monitoring all projects.
No matter the number of layers, the principle remains that the system will flag potential issues for the project team to address. In this instance, it appears that any warnings received were not seen or actioned, which only emphasises the need for active (human) monitoring.
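The two tiers described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names, the required-field rule, and the 24-hour window are hypothetical and would differ in any real project.

```python
from datetime import datetime, timedelta


def ingestion_check(records, required_fields):
    """Tier 1: return the records that are missing a required field."""
    return [
        record
        for record in records
        if any(record.get(field) in (None, "") for field in required_fields)
    ]


def heartbeat_check(last_success, now, max_age=timedelta(hours=24)):
    """Tier 2: return True if the ingestion job succeeded recently enough."""
    return last_success is not None and now - last_success <= max_age


def build_alerts(records, required_fields, last_success, now):
    """Combine both tiers into a list of messages for the project team."""
    alerts = []
    failed = ingestion_check(records, required_fields)
    if failed:
        alerts.append(f"{len(failed)} record(s) failed ingestion checks")
    if not heartbeat_check(last_success, now):
        alerts.append("ingestion job has not completed in the last 24 hours")
    return alerts


if __name__ == "__main__":
    # Hypothetical records and timestamps, purely for demonstration.
    sample = [{"id": "a", "result": "positive"}, {"id": "b", "result": ""}]
    last_run = datetime(2020, 10, 2, 23, 0)
    for alert in build_alerts(sample, ["id", "result"], last_run, datetime(2020, 10, 5, 9, 0)):
        print(alert)
```

Note that the code only produces the alerts. Somebody still has to read and act on them, which is exactly the human monitoring step that appears to have been missed.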
Third, even if no alternative to Excel is considered feasible in the time available (for procurement, security, infrastructure, skills, or other reasons), it is possible to mitigate the risks by introducing automated system checks as well as performing regular QA on data outputs.
Fourth, and most importantly perhaps, no individual public official should be left to make these decisions on their own. There has to be appropriate support and access to the right knowledge and skills.
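As one example of an automated check for a process that has to stay on Excel, the Python sketch below tests whether a batch will fit on a worksheet and reconciles what was sent against what arrived. The row limits are the worksheet sizes Microsoft documents for the two file formats. The function names, the single header row, and the 90% warning level are assumptions made for illustration.

```python
# Worksheet row limits documented by Microsoft for each file format.
MAX_ROWS = {"xls": 65_536, "xlsx": 1_048_576}


def worksheet_capacity_status(record_count, file_format, header_rows=1, warn_ratio=0.9):
    """Return "ok", "warning" or "error" for writing record_count rows to one sheet.

    header_rows and warn_ratio are assumed defaults, not fixed rules.
    """
    limit = MAX_ROWS[file_format]
    needed = record_count + header_rows
    if needed > limit:
        return "error"  # rows beyond the limit cannot be stored on the sheet
    if needed >= warn_ratio * limit:
        return "warning"  # close to the limit: time to plan an alternative
    return "ok"


def reconcile(source_count, loaded_count):
    """Return how many records were lost between source and destination."""
    return source_count - loaded_count


if __name__ == "__main__":
    # Hypothetical batch sizes, purely for demonstration.
    print(worksheet_capacity_status(70_000, "xls"))
    print(worksheet_capacity_status(70_000, "xlsx"))
    print(reconcile(70_000, 65_535))
```

Checks like these are cheap to run on every load. A non-zero result from `reconcile` is a clear signal that records have gone missing somewhere along the way.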
The first database option to explore for Excel users is the MS Access database, which is included with many editions of Office (and in those cases is not subject to any additional costs). We appreciate that this hasn’t always been the most popular package, but we suspect that in many cases this is because of a general lack of training and psychological resistance to shifting from the familiarity of Excel itself.
Beyond Access, the next solution to this problem is to look at a MySQL database. MySQL is an open-source database solution, freely available to download, compatible with all major programming languages, and most importantly, fully compatible with Excel (there is even a MySQL for Excel package).
Finally, with a database in place, it is important to look for an effective Business Intelligence (dashboard) solution that can really bring your data to life for users. Note that the best of these tools will also provide automated checks and preparation for the data to be used.
At Profusion, our BI partner of choice is Sisense, with whom we have partnered for a number of years and whom we would have no hesitation in recommending. But don’t just take our word for it: check out their 2020 Visionary Magic Quadrant rating from Gartner.
Ultimately, the solution to this and similar issues is better education, training, and especially awareness around data and technology. Let’s be honest: if your office was emblazoned with a poster saying that Excel is the most dangerous software on the planet, you might think twice about deciding to rely on it for your project!
For any aspiring data organisation today, it’s essential that everyone has a baseline of data understanding while also appreciating the importance of (quality) data as an organisational asset. This extends from front-line data entry to senior leadership decision making.
And that is exactly what the nascent Profusion Data Academy is designed to support, with dedicated modules targeted at Grads and Apprentices, Managers and Departmental Teams, as well as C-Suite Executives. Talk to us about how we can accelerate your organisational learning.
After all, data is too important to be left to the data team (alone)!
Michael Brennan, Consultant at Profusion