Medicine is a notoriously
unpredictable science. A treatment that provides a miraculous recovery for one
patient may do nothing for the next. A new chemotherapy drug may extend patient
lives by two years on average, but that average is made up of some patients who
live many years longer and some patients whose lives are not extended at all,
or even are shortened. And with new drugs costing more and more money, personalizing
medicine is increasingly important, so that doctors can predict disease
risk and choose treatments tailored for individual patients.
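The arithmetic behind that point can be made concrete with a small sketch. The ten survival figures below are invented purely for illustration and do not come from any real trial; they simply show how a two-year average gain can coexist with many patients who see no benefit at all.

```python
from statistics import mean

# Hypothetical change in survival (in years) for ten patients on a new drug.
# These values are made up for illustration; they are not real trial data.
survival_change_years = [8.0, 6.0, 5.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0]

# The headline number: the average gain across all patients.
average_gain = mean(survival_change_years)

# What the average hides: how individual patients actually fared.
helped = sum(1 for x in survival_change_years if x > 0)
no_benefit = sum(1 for x in survival_change_years if x == 0)
harmed = sum(1 for x in survival_change_years if x < 0)

print(f"average gain: {average_gain:.1f} years")
print(f"helped: {helped}, no benefit: {no_benefit}, harmed: {harmed}")
```

In this invented group the average gain is two years, yet only four of the ten patients benefit, five see no change, and one does worse. Personalized medicine aims to tell those groups apart before treatment begins.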
This unpredictability has a simple
cause. The human body is extraordinarily complex, with endless genetic
variations, biological pathways, protein expression patterns, metabolite
concentrations, and exercise patterns (to name just a few of the dozens of
variables) affecting each person differently. And only a few of these variables
are well understood by scientists. When a drug doesn’t work, then, or a patient
develops a rare disease, it could be because of some genetic variation, or a
particular metabolite concentration, or several of these things acting together
in ways doctors may never understand.
Black-box medicine, the use of big data and sophisticated machine-learning techniques
in opaque medical applications, could be the answer. It takes significant time, money, and luck
for scientists to discover the precise combination of variables that makes a
drug work or not—if it can be discovered that way at all—but with enough data,
a machine-learning algorithm could find a predictive correlation much more
rapidly. Using datasets of genetic and health information, then, researchers
can uncover previously unknown connections between patient characteristics,
symptoms, and medical conditions. And these connections promise to yield new
diagnostic tests and treatments and to enable individually tailored medical
decisions.
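As a rough illustration of what finding a predictive correlation means, the toy sketch below scans a handful of patient variables and reports which one best tracks whether a drug worked. It is a deliberately simple stand-in for real machine-learning methods, and every variable name and record in it (`variant_a`, `variant_b`, `high_metabolite`, `responded`) is hypothetical.

```python
# Hypothetical patient records. Each holds a few yes/no measurements (1 or 0)
# and whether the drug worked. All names and values are invented.
patients = [
    {"variant_a": 1, "variant_b": 0, "high_metabolite": 1, "responded": 1},
    {"variant_a": 1, "variant_b": 1, "high_metabolite": 0, "responded": 1},
    {"variant_a": 1, "variant_b": 0, "high_metabolite": 1, "responded": 1},
    {"variant_a": 0, "variant_b": 1, "high_metabolite": 1, "responded": 1},
    {"variant_a": 0, "variant_b": 1, "high_metabolite": 0, "responded": 0},
    {"variant_a": 0, "variant_b": 0, "high_metabolite": 1, "responded": 0},
    {"variant_a": 0, "variant_b": 1, "high_metabolite": 0, "responded": 0},
    {"variant_a": 0, "variant_b": 0, "high_metabolite": 1, "responded": 0},
]


def agreement(records, feature):
    """Fraction of patients whose response equals the feature's value."""
    matches = sum(1 for r in records if r[feature] == r["responded"])
    return matches / len(records)


# Score every measured variable by how well it tracks the outcome.
features = [name for name in patients[0] if name != "responded"]
scores = {name: agreement(patients, name) for name in features}
best_feature = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: matches outcome in {score:.1%} of patients")
print(f"best predictor: {best_feature}")
```

In this invented dataset `variant_a` matches the outcome most often, but the search says nothing about why. That gap between prediction and explanation is what makes such models a black box.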
Big-data techniques are only as
powerful as the input data and the methods used to analyze those data. Health
care is especially ripe for a big-data
revolution, though, because of the sheer quantity of data available:
researchers can obtain a wide variety of data points from
millions of patients. And because assembling and analyzing such large-scale
datasets is becoming steadily easier
and cheaper, many different researchers, from both industry and
the academy, are working on ways of using data for everything from guiding
choices between different drugs to best allocating scarce hospital resources
among different patients.
The sheer scale and scope of
health data available to researchers, and the sensitivity of that data, lead to
two related but opposing problems. The first problem is algorithmic
accountability. Because of the black-box nature of big-data techniques and the
sheer complexity of biological systems, it can be difficult or impossible to
know if conclusions drawn are incomplete, inaccurate, or biased, whether due to
data or analytical limitations or due to intentional interference. These
conclusions can sometimes be validated by researchers or government agencies,
but doing so can be expensive and difficult, and can require access to the same
extensive medical data from which the conclusions were drawn.
The second problem is privacy.
Medical information is some of the most private and sensitive information that
exists, and black-box medicine requires access to a lot of that information. It
also creates new information, like predictions based on the models developed
with big data. And this information may be used in ways that harm individuals,
whether through marketing, sales to others, or discrimination in employment,
insurance, or other decisions. Even when it is not used in these ways, its
collection, disclosure, and use can infringe individual autonomy and decisional
privacy.
These two problems are
interrelated because efforts to reduce one will usually make the other worse.
The solution to the accountability problem is to validate black-box models, but
that requires access to more information, which can exacerbate the privacy
problem. And the solution to the privacy problem is to limit the amount of
information to which researchers, companies, and the government have access,
but that can make it harder to validate models and easier to hide or overlook
algorithmic problems. Algorithms need to be validated to ensure high-quality
medicine, but at the same time, a data free-for-all would eviscerate patient
privacy.
Solutions to the accountability
and privacy problems, then, must consider the broader effects on black-box
medicine. We propose three pillars to an effective verification system that
respects patient privacy. The first is a system of limitations on the collection,
use, and dissemination of medical data, so that data that is gathered and used
to develop and verify black-box algorithms is not also used for illegitimate
purposes. The second is a system of independent gatekeepers to govern access
to, and transmission of, patient data, so that government and independent
researchers can work to verify big-data models. And the third is robust
information-security provisions, so that unintended outsiders cannot obtain,
use, or disseminate patient data. The design of these verification systems can
draw on the ongoing debate over the disclosure of clinical-trial
data, which has addressed related issues of how to promote data sharing
without sacrificing patient privacy. A verification system built on these three
pillars offers the best means of ensuring that black-box medicine lives up to
its promise while keeping patient data protected.