Identifying “criminals” through machine learning – why ethics should be included in all new tech
September 9, 2018
Biometrics are measurements or records of a body that can be used to uniquely identify a person. Fingerprints, DNA samples, and retinal scans are all biometric data commonly used for ID purposes in Canada. If you are a government employee, work with a vulnerable population such as children, or hold a Nexus card, chances are high you have already provided at least one Canadian government branch with biometric data.
Biometrics are also used across Canada for law enforcement. For example, Canadian passports have included biometric data (a digital photo of the holder’s face) for several years. Many individuals entering Canada from Europe, Africa, or the Middle East must submit biometric information, including fingerprints, for their visa application to be considered.
But as biometrics become the new ID norm, courts are having to play catch-up with changing technologies and the ethical implications of their use, particularly with law enforcement.
A Modern Case of Phrenology
Facial recognition is an increasingly common type of biometric data. It is non-invasive and quick to collect. It is also unobtrusive, to the point where individuals may not even realize their data has been recorded. In part because it is so easy to collect, many law enforcement agencies believe facial recognition technology, drawing on biometric databases, could significantly improve public safety. It was with this goal of improving public safety that two computer science professors co-authored a 2016 study on using machine learning to help locate convicted criminals.
Xiaolin Wu at McMaster University in Hamilton, Canada, and his colleague Xi Zhang at Shanghai Jiao Tong University, China, initially set out to disprove the idea that there could be a reliable link between someone’s face and whether or not they held a criminal record. Instead, the algorithm they designed correctly identified individuals with an existing criminal record by their facial features alone, with as much as 89.9% accuracy. Needless to say, Wu stated in an interview with New Scientist that they were “very surprised by the result.” (The original paper has been archived online, and is available here.)
The study also claims that the algorithm noticed something else: there was more variation in the facial structures of individuals with criminal records than those in the general population. In other words, the facial features of individuals with existing criminal convictions were literally “deviant”.
The study included many controls when assembling its data sets. The age, gender, and ethnicity of the subjects in the ID photos were tightly controlled. No mug shots or police photos were included. The crimes that individuals in the criminal data set had been convicted of varied widely, from non-violent offences such as forgery through to kidnapping, sexual assault, and murder. Individuals without a criminal record also varied in their professions, ranging from truck drivers through to doctors, lawyers, and university professors. Further, individuals with particularly distinct facial features such as tattoos or scarring were excluded from the outset. This left what should have been a fairly homogeneous set of ID photos for the algorithm to sort through.
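To make the setup concrete, the sketch below shows, in Python, the general shape of the experiment the paper describes: a standard classifier trained on features extracted from ID photos, with a 0/1 label marking whether the person has a criminal conviction. This is not the authors' code; the feature files, the SVM model, and the cross-validation choices are illustrative assumptions only.

```python
# A minimal, illustrative sketch of the kind of experiment the study describes:
# a binary classifier asked to separate "has a conviction" from "no conviction"
# using only features extracted from ID photos. This is NOT the authors' code;
# the feature extraction, file names, and model choice are hypothetical.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Hypothetical inputs: one row of pre-extracted facial features per ID photo,
# and a 0/1 label indicating whether that person has a criminal conviction.
features = np.load("face_features.npy")    # shape: (n_photos, n_features)
labels = np.load("conviction_labels.npy")  # shape: (n_photos,)

# Train and score with cross-validation, as is standard for studies like this.
classifier = SVC(kernel="rbf")
scores = cross_val_score(classifier, features, labels, cv=10)
print(f"Mean cross-validated accuracy: {scores.mean():.3f}")

# A high number here only means the two sets of photos are separable.
# It says nothing about *why* they are separable, which is where bias enters.
```

Even a pipeline this simple can report impressively high accuracy whenever the two sets of photos differ systematically, for any reason at all.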
Where Bias Factors In
Methodologically, the study’s controls did not appear to be compromised. However, the method did not account for structural issues and biases already entrenched in judicial systems, such as who those systems are more or less likely to convict in the first place. This is particularly problematic because we already know that algorithms are remarkably good at quietly picking up on human bias.
Wu and Zhang were able to account only for the biases they were aware of, not for these structural biases, which often result in disproportionate rates of incarceration for minorities and individuals without economic power. For example, it is well documented that a jury presented with a clean-shaven man in a suit is likely to be much more sympathetic to him than to a bearded man in dirty jeans and a t-shirt.
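A toy example makes the mechanism visible. In the sketch below, which uses entirely invented data and has nothing to do with the study itself, whether someone offends is independent of appearance, but convictions are recorded far more often for people with one visible trait. A classifier trained on those conviction labels learns to flag the trait, which is exactly how a facial classifier trained on conviction records can absorb the bias of the system that produced them.

```python
# A toy demonstration (not from the study) of how a classifier absorbs bias
# baked into its labels. Here, whether someone offends is independent of
# appearance, but the chance of being *convicted* is higher for people with a
# certain appearance trait -- a stand-in for the "bearded man in dirty jeans"
# effect. All variable names and numbers are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 20_000

offended = rng.random(n) < 0.10            # true behaviour: independent of looks
scruffy = rng.random(n) < 0.50             # appearance trait the model can see

# Biased labelling: scruffy offenders are convicted far more often.
p_convicted = np.where(scruffy, 0.80, 0.30)
convicted = offended & (rng.random(n) < p_convicted)

X = scruffy.reshape(-1, 1).astype(float)   # the model only sees appearance
X_train, X_test, y_train, y_test = train_test_split(X, convicted, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]

# The model assigns a higher conviction risk to scruffy people, even though
# appearance had no effect on whether anyone actually offended.
print("Predicted risk, scruffy:  ", probs[X_test[:, 0] == 1].mean())
print("Predicted risk, clean-cut:", probs[X_test[:, 0] == 0].mean())
```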
This is the root of the problem with the study Wu and Zhang conducted. Although it was done with the best of intentions, their algorithm reinforced existing inequalities, particularly by adding the perceived objectivity of computer science.
The idea of an algorithm or computer program that operates like a crystal ball to predict criminality is alluring. But it can only work if the algorithm identifies criminals better than human beings.
Unfortunately, by using convicted criminals in the study without controlling for the disproportionate incarceration of vulnerable, “deviant-looking” persons, the researchers were feeding significantly compromised data into the algorithm. That algorithm could not accurately identify convicted criminals whose facial features were more common in the population of law-abiding citizens. In other words, if a criminal looked “normal” enough, the algorithm could not identify him. It was just as likely to miss a clean-shaven criminal dressed in a suit as a jury of human beings would be, because it had no data allowing it to consider that criminals sometimes wear suits.
The authors themselves noted that false positives and false negatives were present for all four of the facial classifiers tested. Sometimes individuals without criminal records were identified as criminals, and sometimes individuals who did have a criminal record slipped through undetected.
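In evaluation terms, those two error types are the false positive rate and the false negative rate. As a rough sketch, with placeholder files standing in for the study's actual held-out labels and predictions, they can be read straight off a confusion matrix:

```python
# A short sketch of the error analysis the authors report: for any one of the
# four classifiers tested, predictions split into false positives (people with
# no record flagged as "criminal") and false negatives (people with a record
# who slip through). The input files below are placeholders, not the study's data.
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.load("conviction_labels_test.npy")  # hypothetical held-out labels
y_pred = np.load("classifier_predictions.npy")  # hypothetical model outputs

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
false_positive_rate = fp / (fp + tn)  # law-abiding people flagged as criminals
false_negative_rate = fn / (fn + tp)  # convicted people the model misses

print(f"False positive rate: {false_positive_rate:.3f}")
print(f"False negative rate: {false_negative_rate:.3f}")
```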
To its credit, the algorithm did exactly what it was programmed to do: establish whether there were any consistent facial features associated with individuals convicted of a crime. Unfortunately, programming an algorithm to look for similarities in the facial structures of already-convicted criminals is more an exposition of our human desire to ostracize those who are different than a reliable way to predict whether a given person will engage in future criminal activity. This is one of those cases where correlation is not causation, and the cause of the correlation discovered should give anyone reading the study significant pause.
The Need For A Broader Perspective
This study beautifully demonstrates what can happen when a broader range of social scientists, such as anthropologists, sociologists, criminologists, and psychologists, are left out of the development of crime-fighting tech. These experts would have been well positioned not only to help the study’s authors navigate the ethical dilemmas raised by their results, but also to help them avoid using biased data in the first place.
The algorithm itself wasn’t bad science, just innocently ignorant science. Its authors didn’t know what they didn’t know, and in the end no one was hurt because, thankfully, the algorithm was not used for police work.
The good news is that there are plenty of resources out there to help inventors, innovators, researchers, and entrepreneurs move forward ethically with their work. The best way to avoid sinking time and effort into an accidentally ignorant or biased project is to stop and think: What if my product were being used by law enforcement? What if someone other than me got a hold of my data, and used it for a different purpose than I had intended? Could my research be negatively exploited?
If you’re on the fence, head to your local university or college and search for a social scientist or an ethics board familiar with digital innovation and ethics, and just ask.