Friday, October 17, 2014

Ensuring Appropriate Use of Health Data Without Violating the First Amendment

Frank Pasquale

For the conference Public Health in the Shadow of the First Amendment.

Back in 2002, Dan Solove made the following observation on data brokers:
Consolidating various bits of information, each itself relatively unrevealing, can, in the aggregate, begin to paint a portrait of a person's life ... a ‘digital biography.’ A growing number of private sector organizations are using public records to construct digital biographies on millions of individuals .... These uses are resulting in a growing dehumanization, powerlessness, and vulnerability for individuals.
Twelve years later, these profiles have become even more invasive. Health data, in particular, has become ubiquitous. Brokers have lists of thousands of people with sexually transmitted diseases, diabetes, cancer, Alzheimer’s, dementia and AIDS. They don't need access to medical records to get this information. As Nicolas Terry has shown, patient-curated and medically-inflected data can be used as proxies for health status. When a firm has a sufficient quantity of customers and loyalty programs, it can even predict whether women are pregnant. Consumer health websites and online surveys are also analyzed for clues as to health status.

But not all the resulting lists are accurate. For example, reporters have documented mixups on "diabetes interest" lists and MS lists. I believe we deserve some rights to review, correct, and annotate this data. Some say that regulation along those lines will impede innovation. But in an era of big bad data, we need to encourage processes that verify and improve information sources.

First Amendment rights to disseminate inaccurate information should be limited. And note that one of the key cases defending opportunities to lie--United States v. Alvarez--proposed as a "less restrictive means" the creation of an accurate government database:
[W]hen the Government seeks to regulate protected speech, the restriction must be the “least restrictive means among available, effective alternatives.” There is, however, at least one less speech-restrictive means by which the Government could likely protect the integrity of the military awards system. A Government-created database could list Congressional Medal of Honor winners. Were a database accessible through the Internet, it would be easy to verify and expose false claims. It appears some private individuals have already created databases similar to this, see Brief for Respondent 25, and at least one database of past winners is online and fully searchable, see Congressional Medal of Honor Society, Full Archive,
Presumably, if the government created such a database, and then mandated that all other (purported) databases of Medal of Honor winners indicate that they are not the official, government-created one, that would not offend the First Amendment (see, e.g., the Gay Olympics case). Concerns about the proliferation of inaccurate databases are not fanciful: consider how credit reporting agencies responded to a mandate that they give out free reports. Rather than making it easy to find the official website, they promoted ersatz sites. As Solove observed then, "By maintaining the website, Experian [caused confusion and exploited] its legal requirement to provide free credit reports to hawk its expensive credit monitoring service instead."

The idea of maintaining official sources of data, for certain important uses, should not offend the First Amendment. Nor should required disclosures about sources of data. To help prevent the infiltration of health data into, say, scores and files used by employers, government could mandate that employers reveal the sources and data they use in making employment decisions. Without such transparency, we may never know whether discrimination based on health status--often illegal under the ADA--has occurred.

In an era of runaway data, we need some assurance that certain types of information are not polluting reputational databases. Improving their accuracy is an important public goal--a way to foster the production of knowledge.

And if the owners of such databases successfully insist that they have a First Amendment right to create inaccurate profiles, then lawmakers should prevent important decisionmakers from using such information in critical contexts. No one has a First Amendment right to fire an employee for being sick, or for appearing to be sick according to a database and algorithms.

Such reinforcement of extant anti-discrimination law would not require those collecting data to steer clear of including health-inflected material in their dossiers on individuals--they are free to create as much “expression” in this way as they want. They would just need to be sure that their database was clear of sensitive information before it was used in certain important decisionmaking contexts. The distinction between collection and use of data is a critical bulwark against inaccurate, unfair, or irrelevant information eroding the scientific validity of data-driven decisionmaking. It should also support fair opportunity for those who want to be judged on the basis of their aptitude, rather than their health status. Collection of data may persist, for some, as a "say anything" realm of unsupported assertions and associations. But important data uses can legitimately be confined to better vetted sources.

