E-mail:
Jack Balkin: jackbalkin at yahoo.com
Bruce Ackerman bruce.ackerman at yale.edu
Ian Ayres ian.ayres at yale.edu
Corey Brettschneider corey_brettschneider at brown.edu
Mary Dudziak mary.l.dudziak at emory.edu
Joey Fishkin joey.fishkin at gmail.com
Heather Gerken heather.gerken at yale.edu
Abbe Gluck abbe.gluck at yale.edu
Mark Graber mgraber at law.umaryland.edu
Stephen Griffin sgriffin at tulane.edu
Jonathan Hafetz jonathan.hafetz at shu.edu
Jeremy Kessler jkessler at law.columbia.edu
Andrew Koppelman akoppelman at law.northwestern.edu
Marty Lederman msl46 at law.georgetown.edu
Sanford Levinson slevinson at law.utexas.edu
David Luban david.luban at gmail.com
Gerard Magliocca gmaglioc at iupui.edu
Jason Mazzone mazzonej at illinois.edu
Linda McClain lmcclain at bu.edu
John Mikhail mikhail at law.georgetown.edu
Frank Pasquale pasquale.frank at gmail.com
Nate Persily npersily at gmail.com
Michael Stokes Paulsen michaelstokespaulsen at gmail.com
Deborah Pearlstein dpearlst at yu.edu
Rick Pildes rick.pildes at nyu.edu
David Pozen dpozen at law.columbia.edu
Richard Primus raprimus at umich.edu
K. Sabeel Rahmansabeel.rahman at brooklaw.edu
Alice Ristroph alice.ristroph at shu.edu
Neil Siegel siegel at law.duke.edu
David Super david.super at law.georgetown.edu
Brian Tamanaha btamanaha at wulaw.wustl.edu
Nelson Tebbe nelson.tebbe at brooklaw.edu
Mark Tushnet mtushnet at law.harvard.edu
Adam Winkler winkler at ucla.edu
To investigate how modern AI systems handle constitutional interpretation, we conducted a simple simulation using ChatGPT4 and Claude 3 Opus to decide the questions presented in two highly salient recent Supreme Court decisions, Dobbs v. Jackson Women's Health Organization and Students for Fair Admissions v. Harvard. Our goal was to compare these two tools and test the impact of different framing choices on large language model (LLM) outputs. We also wanted to test the robustness of LLM responses in the face of counterarguments.
We began by posing the precise questions presented in Dobbs and Students for Fair Admissions to ChatGPT4 and Claude3Opus and asking them to decide these cases, without specifying an interpretive method. We then proceeded to ask the models, in separate conversations, to decide the same questions under different interpretive approaches, including a relatively spare and neutral description of original public-meaning originalism and a more fulsome and controversial description of that interpretive approach.
The results were impressively consistent across both models. When we didn't specify an interpretive method, both AI systems adhered to existing Supreme Court precedent, upholding both abortion rights and affirmative action. When instructed to decide as "liberal living constitutionalists" in the tradition of Justice William Brennan, they reached the same results. But when told to apply originalism, both systems reversed course and voted to overrule those same precedents.
Most remarkably, both Claude and ChatGPT reversed themselves in every case when presented with standard counterarguments that any first-year law student could formulate. Experts refer to this phenomenon of LLMs tailoring their outputs to match user preferences as "AI sycophancy," and it raises serious questions about the reliability and malleability of LLMs as constitutional interpreters. More generally, the extent to which human inputs drive LLM outputs suggests that the use of LLMs for constitutional interpretation will implicate substantially the same theoretical issues that today confront human constitutional interpreters.
For a fuller explanation, see our new paper, “Artificial Intelligence and Constitutional Interpretation.”