Earlier this summer I completed a graduate student summer internship for the Mayor’s Office of the Chief Data Officer (also known as DataSF). Over the course of two months, I worked on a number of “starter” projects, like the Budget Visualization mentioned in a previous post, and a main project, which consisted of researching and summarizing the literature on “Ethics and Accountability in Algorithms”.
In the process of compiling information on the topic, I interviewed half a dozen scholars and practitioners in the field. This ended up being a particularly worthwhile experience, as I hadn’t conducted formal conversations with experts in support of my own research before. One of the leading thinkers I had the honor and pleasure of speaking with was Professor Ellen Goodman of Rutgers Law. Goodman was a panelist at the What Works Cities Summit discussing these issues with regard specifically to local government, and my team at DataSF (who also attended the conference) suggested I incorporate her input into my research summary. My hope is that the summary will be used to develop a toolkit or guiding framework for cities around the US to use when developing (or soliciting) and deploying algorithms.
Professor Goodman and Professor Robert Brauneis of George Washington University recently published their findings from open records requests about algorithmic programs in a paper titled ‘Algorithmic Transparency for the Smart City’. Below, I summarize the portions most salient to my developing interest, along with some preliminary reactions.
Opaque algorithms are particularly concerning in the public sector because they undermine what we’ve come to expect (at least ideally) from local government in terms of accountability to constituents. The authors filed 42 open records requests in 23 states seeking essential information about six different predictive algorithm programs, designed in various cases by for-profit, non-profit, and academic partners. “Meaningful algorithmic transparency” was not delivered in any instance, though Allegheny County did give substantial insight into their Family Screening Tool.
“An algorithmic process is perfectly transparent when its rules of operation and process of creation and validation are completely known. Meaningful, sufficing transparency is a lower standard. An algorithmic process is accountable when its stakeholders, possessed of meaningful transparency, can intervene to effect change in the algorithm, or in its use or implementation.”
What’s at risk when algorithms deployed in the public sector remain obscure is corporate capture of public power. The three primary drivers of this threat when we talk about algorithms and government, which I describe in more detail in following sections, are the lack of record keeping standards, insufficient pressure from officials who buy these solutions, and abuse of confidentiality claims by solution vendors.
“Use of big data and predictive algorithms is a form of governance – that is, a way for authorities to manage individual behavior and allocate resources.”¹
What’s the allure to governments of deploying algorithms? They bring with them the promises of predicting citizen behavior more accurately (by surfacing patterns in large, cross-sector datasets), distributing resources or services more efficiently, and reducing or eliminating bias in decision-making. The potential here is certainly enticing, especially for cash-strapped departments that face increasing pressure to do more with fewer people and/or less money (which, in my opinion, reflects how the general public more broadly expects technology to help the world). In ramping these things up, however, in both public and private sector settings, a whole host of pitfalls have been overlooked. It’s the sort of thing we’ll look back on 5-10+ years from now, reflecting on how recklessly we rushed in.
A predictive algorithm’s recommendation “actually masks an underlying series of subjective judgments on the part of the system designers about what data to use, include or exclude, how to weight the data, and what information to emphasize or deemphasize.”² Goodman and Brauneis highlight this counterintuitive notion of algorithmic subjectivity with an example about how classification algorithms are designed to handle false positives and negatives. They also delve into the significance of validation studies and opening up the design and results of such tests to public scrutiny.
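To make the point about false positives and negatives concrete, here is a minimal sketch of my own (it is not taken from the paper). The risk scores and outcomes are invented, and `count_errors` is a hypothetical helper. It simply shows that wherever a designer sets the cutoff for flagging a case, they are choosing which kind of error to tolerate more of.

```python
# Hypothetical risk scores (0 to 1) and true outcomes (1 = event occurred).
# These numbers are invented purely for illustration.
scores = [0.10, 0.25, 0.40, 0.55, 0.60, 0.75, 0.80, 0.95]
outcomes = [0, 1, 0, 0, 1, 0, 1, 1]


def count_errors(scores, outcomes, threshold):
    """Return (false_positives, false_negatives) when flagging scores >= threshold."""
    false_positives = 0
    false_negatives = 0
    for score, outcome in zip(scores, outcomes):
        flagged = score >= threshold
        if flagged and outcome == 0:
            false_positives += 1  # flagged, but the event did not occur
        elif not flagged and outcome == 1:
            false_negatives += 1  # not flagged, but the event did occur
    return false_positives, false_negatives


for threshold in (0.2, 0.5, 0.9):
    fp, fn = count_errors(scores, outcomes, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

With this toy data, the lowest cutoff produces three false positives and no false negatives, while the highest produces the reverse. Neither choice is "objective"; it depends on which mistake the designers and their government clients consider more costly.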
Results of Open Records Requests
Existing open data practices provide a great blueprint for how algorithmic transparency should work, but governments aren’t proactively “pushing” key details about their algorithms. “Pull” efforts, like the open records requests filed by Goodman and Brauneis, are coming up short as well. The parties currently responsible for handing over such information suffer from a dearth of documentation about the algorithmic programs their departments have purchased and put into place.
“These include records about model design choices, data selection, factor weighting, and validation designs. At an even more basic level, there would be a record of what problem the model was supposed to address and what are the metrics of success.”
Such records should conceivably be handed over by contracted vendors very early on, but, as the authors found, government clients aren’t exercising their leverage around disclosure limitations. Vendors aggressively assert claims of trade secrecy, and governments are content to go along with it because they themselves are afraid of what the public might think about how they’re using algorithms to govern, or about the fact that they’re using them at all. Furthermore, the groups selling these software products lock governments into long-term, burdensome contracts that stifle innovation and require them to relinquish control of public data.
Results, Potential Fixes, Conclusion
The authors encountered a variety of responses to their requests. Some jurisdictions simply didn’t respond. Others claimed an open records exemption. When information was provided, its scope and quality were all over the map. As mentioned, none of the responses met the bar for what might be considered meaningful transparency. The six programs they examined included:
- Public Safety Assessment
- Eckerd Rapid Safety Feedback
- Allegheny Family Screening Tool
- New York City Value-Added Measures
Though the Allegheny FST can’t be considered an entirely open source project, it gets closer than any of the other five covered.
In addition to putting more pressure on vendors to play ball with transparency standards, governments should be maintaining their own records about all sorts of information regarding the algorithms they deploy.
“Algorithmic accountability in the public sphere requires that government actually be held accountable for the algorithms it deploys… the challenge is to specify a degree and form of transparency that is meaningful for the public and practical for developers and governments.”
While additional, multi-stakeholder deliberation will help further shape what is meant by ‘meaningful algorithmic transparency’ as it relates to local governance, the paper identified eight categories of documentation that we the people deserve ready access to:
- the algorithmic model’s general predictive goal
- relevant, available, and collectible data
- considered exclusion of data
- specific predictive criteria
- analytic techniques used
- principal policy choices made
- results of validation studies and audits
- explanation of the predictive algorithm and the algorithm output
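One way a department might keep such records is as a simple structured checklist. The sketch below is my own illustration, not something proposed in the paper: the `TransparencyRecord` class, its field names, and the sample values are all hypothetical, with one field for each of the eight categories above.

```python
from dataclasses import dataclass, fields


@dataclass
class TransparencyRecord:
    """One field per documentation category; the field names are my own shorthand."""
    predictive_goal: str = ""
    data_considered: str = ""
    data_excluded: str = ""
    predictive_criteria: str = ""
    analytic_techniques: str = ""
    policy_choices: str = ""
    validation_results: str = ""
    output_explanation: str = ""


def missing_categories(record):
    """Return the names of categories that were left blank in the record."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Hypothetical, partially completed record for an imaginary program.
record = TransparencyRecord(
    predictive_goal="Estimate likelihood of a missed court appearance",
    analytic_techniques="Logistic regression",
)
print(missing_categories(record))
```

Running a check like this before a contract is signed, or before responding to a records request, would at least make the documentation gaps visible.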
There already exist very real examples of injustice inflicted by exactly the type of algorithmic programs that the authors submitted open records requests for and describe in their paper. Cathy O’Neil’s Weapons of Math Destruction does a wonderful job of covering some of the more frightening cases. As O’Neil details in her book, it’s the ability of algorithms, and the bias many conceal, to scale unchecked that makes this whole phenomenon so alarming. In wrapping up the paper, Goodman and Brauneis leave us with a bit of urbane advice for urban administrators: “What public entities should be focused on is undertaking the design, procurement, and implementation of algorithmic processes in more thoughtful and transparent ways.”