Discussion Prompt: When, if ever, is predictive policing effective, fair, and legitimate? What is the role of data reliability in this?


Terms like ‘predictive’ and ‘intelligence-led’ policing seemingly crop up everywhere these days, but their meaning and usage are far from clear. The UK’s West Midlands Police seeks to demystify these strategies, address the technical shortcomings of the tools and tactics involved, and, as a moral imperative, harness relevant data to save lives.


After 24 years in the UK’s West Midlands Police, I have seen policing strategies and tactics evolve in step with technology. Predictive policing is one such strategy, and as national lead for data analytics I have worked hard within West Midlands Police to counter the risks associated with it. As Detective Chief Superintendent and Head of Professional Standards, I hope I can influence others to follow in this same vein.


Assumptions & misconceptions

When I started looking at how best to answer this question, I coincidentally came across an article in the US media which conveniently captures some of the misconceptions about predictive policing. It quite succinctly highlights the risks of using the term without a clear definition.

Dallas police recently publicised a decision to be intentionally ‘intelligence-led’, and the headline that followed was “‘Predictive policing’ part of Chief Hall’s crime plan, raises concerns”. The article discusses the risks of marginalising and disadvantaging those from black and minority ethnic communities. When I read and re-read the article, however, I realised it was not clear what exactly it was that Dallas police were doing (i.e. how exactly they were implementing ‘intelligence-led’ policing), and therein lies the problem.

Too often, assumptions are made either about what ‘predictive policing’ is, or that data analytics in a policing context is necessarily ‘predictive policing’. When such assumptions are inaccurate or untested they can lead to mistrust, the development of perceptions of unfairness, and ultimately the undermining of legitimacy. So how do we address this?

The first thing is to clarify that when we talk about data analytics, we are simply talking about analysis of information sped up by the use of tools. This is the sort of work human analysts have always done using spreadsheets, and before that, with pencil and tally marks. It is descriptive and nothing more. While it is true that some might take further explanatory insight from such analysis and determine next actions based upon it, I would contend this is not ‘predictive policing’ as the term is usually used.
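To make that concrete, here is a minimal, hypothetical sketch of what such descriptive analysis amounts to. The data, field names and categories are invented for illustration; the point is that the output is simply a tally of what has already been recorded.

```python
from collections import Counter

# Hypothetical, simplified crime reports; the field names are illustrative only.
reports = [
    {"offence": "burglary", "beat": "A1", "hour": 22},
    {"offence": "vehicle crime", "beat": "A1", "hour": 23},
    {"offence": "burglary", "beat": "B2", "hour": 14},
    {"offence": "criminal damage", "beat": "A1", "hour": 21},
]

# Purely descriptive: tallies of what has already been reported, nothing more.
by_beat = Counter(r["beat"] for r in reports)
by_offence = Counter(r["offence"] for r in reports)

print("Reports per beat:", dict(by_beat))
print("Reports per offence type:", dict(by_offence))
```

Nothing in this summary recommends an action; any ‘prediction’ would be a further, separate step layered on top of it.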

When the literature refers to ‘predictive policing’, it generally means applications of algorithmic decision-making through automated analysis of datasets, and perhaps even of integrated datasets. This is the sort of capability promoted by some well-known software providers, and it has since been dropped by law enforcement agencies in both the UK and the US after trials that sometimes proved inconclusive. Critics of these systems and trials have suggested that the data used is drawn from those areas within our communities that are already ‘over-policed’ and disproportionately populated by black and minority ethnic groups; thus the interventions delivered by police, as a result of decisions made by algorithms, disproportionately affect people within those communities, which clearly isn’t fair.


Professional judgement 

So this leads us to ask more concretely: what is an algorithm? And are all algorithms equal, or inherently unfair? An algorithm is simply a “process or set of rules to be followed in problem-solving. It is a structured process. It follows logical steps”.

On its face, this doesn’t sound too problematic. As a young community beat officer I used to turn up for a shift and flick through, or analyse, the crime reports, offending behaviours and intelligence records received since my last shift, cross-reference these with my areas of responsibility and the timing of my shift, and add in a bit of professional judgement. Articulating exactly what that professional judgement amounted to is difficult. Generally, though, it involved a process or set of rules I felt were logical and that helped me determine where I should go, and when, whilst on patrol, in order to solve the age-old problem of reducing crime and disorder, protecting the vulnerable, and keeping our communities safe.
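Purely to illustrate the parallel, the caricature below expresses that kind of patrol reasoning as explicit rules. It is a hypothetical sketch, not an operational tool and not anything West Midlands Police runs; the names, fields and weighting are invented.

```python
# A hypothetical caricature of the "professional judgement" described above,
# written as explicit rules. It simply shows that the reasoning can be
# expressed as a structured, logical process.

def prioritise_beats(recent_reports, my_beats, shift_hours):
    """Rank the areas I am responsible for by how much recent activity
    falls within them and within the hours I will be on patrol."""
    scores = {beat: 0 for beat in my_beats}
    for report in recent_reports:
        if report["beat"] in scores and report["hour"] in shift_hours:
            scores[report["beat"]] += 1
    # Highest recent activity first: where to go, and when, on patrol.
    return sorted(scores, key=scores.get, reverse=True)

# Example: a late shift (18:00 to midnight) covering two beats.
recent = [
    {"beat": "A1", "hour": 22},
    {"beat": "A1", "hour": 19},
    {"beat": "B2", "hour": 9},
]
print(prioritise_beats(recent, my_beats=["A1", "B2"], shift_hours=range(18, 24)))
```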

People like beat bobbies (police officers who patrol the streets on foot); they trust them. The vast majority of the public believe them to be fair in their approach to policing, but they don’t trust algorithms.

Here we must look again at my description of professional judgement, and then look again at the definition of an algorithm. Of course I chose to use the same terminology in my own description, but I did so precisely because the words used to describe algorithms so accurately capture my process of policing as a beat bobby.

So why does mistrust creep in when we start to work with automated analytical tools, but not when we exercise professional judgement based on experience? This is especially true of tools that suggest or ‘predict’ what we should do in response to that analysis, and especially where that recommendation is based upon the application of an algorithm.

The answer is that I could always be held to account for my decisions. I could be directly questioned by those that tasked me, my seniors, my critics, and members of the public that I served.

Embedding an interpretation of my thought process into a coded algorithm, though, risks creating a sense, amongst those experiencing the impact of the decisions being made, of being subject to a power which is fixed and over which they have no control or means of redress.

Is it therefore the algorithm or the decision making which is the problem? Or is it the lack of transparency, explainability, and accountability that leads to perceptions of unfairness and undermines legitimacy?

I would suggest that just as a beat bobby’s professional judgement can be noble, well researched, and accurately and fairly applied, so too can an algorithmic process be a tool for public good. That being said, we need to show: 1) how it works; 2) what data is being fed into it and why; 3) what outcomes it produces; 4) how those outcomes will influence the delivery of public services; and 5) what such interventions look like. It is this illustration that is often missing, and was certainly missing from the reporting about Dallas Police’s intelligence-led approach to ‘predictive policing’.


Data quality

This, of course, leads to the second question: that of data reliability and its role.

I think there are two main areas of concern around data reliability. First, the quality and consistency of the data within our systems; and second, the validity of the sources feeding those systems.

When we developed our own Data Driven Insights capability in West Midlands Police, we set about integrating disparate datasets using a common data model. This now gives us confidence that we can draw upon a single version of the truth across those datasets with respect to POLE entities: Person, Object, Location, Event. Before adopting this approach, we ‘produced’ data about such entities in various forms, different across different systems and too frequently varying even within the same system. We then dumped all that data, poorly structured, into a warehouse, and if anyone wanted it we would point them to the warehouse and say, “it’s in there somewhere, help yourself. And good luck!”[1] Can you imagine any other business treating its ‘product’ in such a careless way? This is why policing needs a common data model. With confidence in the quality and accuracy of the data, we can eliminate more errors and use it in a more meaningful and fair way.
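For readers unfamiliar with the idea, the sketch below shows, in very simplified form, what organising data around POLE entities in a common model can look like. It is purely illustrative and is not West Midlands Police’s actual schema; every class and field name here is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative sketch of a common data model built around POLE entities.
# Each Person, Object, Location and Event is recorded once and linked by
# identifier, rather than duplicated in different forms across systems.

@dataclass
class Person:
    person_id: str
    name: str

@dataclass
class Object:
    object_id: str
    description: str

@dataclass
class Location:
    location_id: str
    address: str

@dataclass
class Event:
    event_id: str
    event_type: str
    location_id: str
    person_ids: List[str] = field(default_factory=list)
    object_ids: List[str] = field(default_factory=list)

# One "version of the truth": each entity exists once and events reference it.
alice = Person("P001", "Alice Example")
car = Object("O001", "Blue hatchback")
high_st = Location("L001", "12 High Street")
theft = Event("E001", "vehicle crime", high_st.location_id,
              person_ids=[alice.person_id], object_ids=[car.object_id])
```

The design point is simply that each person, object, location and event is recorded once and referenced by identifier, which is what allows a single version of the truth to be drawn across datasets.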

The second point, however, is that we must also have confidence that the data we collect is the right data. To do this we need to be an intelligent organisation: we need to consider what data we require as an output of every process at the point we first build that process. Too often in public services we try to retrofit data collection onto existing processes in order to satisfy a subsequent consideration, such as a superimposed inspection regime.

With the right will and application, this can be achieved easily enough. The greater challenge lies in those areas of policing that naturally generate data collection, and in the suggestion that this in itself leads to bias through the over-policing of certain communities.

This is a very relevant challenge, but it is more complex than some would suggest. It cannot be dismissed simply by saying that we respond to public demand; nor is it resolved by suggesting that areas of high demand correlate with areas of disproportionate populations, and that all resulting data is therefore skewed and should be disregarded.


Data as responsibility

I intend to commission more detailed work in this area, but for now I will highlight just one of the risks the latter viewpoint presents. Domestic abuse is often a hidden crime. We know many victims will undergo multiple attacks before reporting to the authorities, and some may never report their abuse. The point at which domestic abuse becomes visible without exception, though, is when it results in homicide. Every domestic homicide in England and Wales is subject to a multi-agency domestic homicide review. Almost invariably these reviews highlight areas where information or data was not adequately shared between agencies, if it was shared at all. In many cases, that data sharing could have prevented a death.

There is no doubt, therefore, that there is an inordinate amount of relevant data that can help policing better fulfil its obligation to protect our communities and uphold their Article 2 and Article 3 rights: the right to life and the right to protection from torture and inhuman or degrading treatment. We would be negligent if we disregarded this data.


Consent and fairness

In West Midlands Police, therefore, we have gone to great effort to address the risks described above, and as national lead for data analytics I hope I can encourage others to adopt these practices. We have developed an independent data ethics committee. The panel was recruited not by members of the police service, but by our elected overseers.[2] It comprises lawyers, ethicists, philosophers, criminologists, data scientists, independent policing specialists, and informed community members. All the documents, minutes, decisions, and advice offered are published openly. As a matter of principle, the work of our data science laboratory will not move from tasking to the operational arena without oversight and scrutiny by this committee.

Having iterated this process several times to secure the committee’s satisfaction, we have now presented the work to a more representative group from our communities, and all of this is still before it reaches the operational arena. The ethics committee and the broader community group are drawn together, again, not by members of the police service but by a social enterprise, and they incorporate representatives of diverse communities, including those with relevant lived experience. Both the committee and the community group are, in turn, observed by colleagues from civil society, who give independent and objective feedback on top of community insight.

This, we hope, is how we achieve transparency and ensure fairness in the eyes of those directly affected by our actions. This, we hope, is how we develop trust and build legitimacy. For nearly two hundred years now, we have been drilling ourselves in the Peelian principles of policing by consent, principles that have stood the test of time: “The power of the police to fulfil their functions and duties is dependent on public approval of their existence, actions and behaviour, and on their ability to secure and maintain public respect”.

Predictive policing can and absolutely should be a force for public good, but we need to understand exactly what it entails.



[1] Police data has naturally been held securely, managed in accordance with the Management of Police Information (MOPI) requirements, protected in line with the various iterations of the Data Protection Acts, and only accessible to those sufficiently trained and vetted in line with Information Security requirements. The sentence referenced above is purposely flippant for effect and to demonstrate that organisation of data can nevertheless be improved upon.

[2] Every Police Force in England and Wales is held to account by an elected official, the Police and Crime Commissioner.