Thursday, March 26, 2015

Understanding Value in a Black Box Society

Frank Pasquale

For the Innovation Law Beyond IP 2 conference, March 28-29 at Yale Law School

A “big data revolution” is afoot in the social sciences. The increasing volume, variety, and velocity of data are irresistible raw material for inquiry. For its most optimistic exponents, the “datistic turn” renews social science by focusing inquiry on objective, verifiable, and measurable facts.* Explicit models of behavior premised on (quasi-)experimental evidence may render once-soft fields as hard as biology, chemistry, or physics. On this account, experimental or quantitative social science has led the way, and other fields must conform their methods accordingly, or risk marginalization or extinction.

The datistic turn should revive interest in a neglected meta-field: the philosophy of social science. Lively debates raged in the mid-20th century between some forerunners of today’s big data devotees (behaviorists) and interpretive social scientists committed to more narrative, normative, and holistic inquiry. The behaviorists’ tendency to treat mental processes as a “black box” is uncannily echoed in many current researchers’ uncritical acceptance of extant corporate data sets (and limits imposed on their use) as objective records.

Given firms’ triple layers of real secrecy, legal secrecy, and obfuscation, journals should be wary of such research until it is truly reproducible. Moreover, given the importance of key firms themselves to understanding our society, their internal decisionmaking should be archived for eventual release (even if it is decades in the future). Social scientists might consider going beyond analysis of extant data, and joining coalitions of activists, to assure a more expansive, comprehensible, and balanced set of “raw materials” for analysis, synthesis, and critique. In short, rather than solely watching society, social science must now commit to assuring the representativeness and relevance of what is watched. The only alternative to “future-forming” research is to let the most powerful pull the strings in comfortable obscurity, while scholars’ agendas are dictated by the information that, by happenstance or design, is readily available.

The same cautions should govern legal scholarship on the platform economy. Digital labor remains highly controversial. For example, Uber has very creatively orchestrated a series of studies and alliances purporting to demonstrate the value and importance of its services. However, to truly understand its social costs, as Brishen Rogers shows, we would need access to far more information, which is now proprietary and hidden. For example, who approved its fake ride requests to undermine its competitor, Lyft? What types of returns are investors being promised? How much of the firm’s success is due to real, productive innovation, and how much simply reflects regulatory arbitrage (akin to Amazon’s famous tax advantages over brick-and-mortar retailers)? Qualitative researchers who study platforms have as much to say about such questions as quantitative ones do, if not more.

Similarly, the extraordinary controversy over the only partially available FTC staff report on the agency’s antitrust investigation of Google shows how even innovation policy itself can remain “in the dark” when it is politically convenient for it to remain so. I called for release of the report in 2013, only to be met with stony silence by the agency. Now, every other page of the report has been inadvertently released, and even this partial disclosure contains several damning allegations and pieces of evidence. Until the full report is released (as well as some indication of the scope and nature of the controversy between the enforcement and economics staff working on the case), critical digital competition policy in the US remains opaque. Given what we have now, it’s hard to resist the conclusion that brute political calculations overrode the agency’s expert judgment. In this environment, it is no wonder that European policymakers remain suspicious of US firms and their coadjutants in key US agencies.

When state and trade secrecy impose severe limits on the availability and use of sources, we must be very cautious about drawing conclusions too quickly about the nature of the digital economy. Leading firms have an agenda, which researchers can unwittingly advance when they focus inquiry on data that (executives have decided) are innocuous enough to be disclosed. A diverse coalition of watchdog groups, archivists, open data activists, and public interest attorneys is now working to assure a more balanced and representative set of “raw materials” for analysis. The critical and emancipatory potentials of social science and legal scholarship depend on the success of such efforts.

*For contrast, I am trying to evoke here the "linguistic turn," with this neologistic merger of "data" and "statistics." I hope to develop the ideas in this post further in a lecture later this spring.

Frank Pasquale is a Professor at the University of Maryland School of Law and a member of the Council for Big Data, Ethics, and Society. He can be reached at
