Innovation Policy Colloquium

Professors Ignacio Cofone and Katherine Strandburg
Spring 2025
Thursdays, 4:45-6:45pm, Vanderbilt Hall, Room 208

LW.10930
3 credits

The Colloquium on Innovation Policy focuses each year on different aspects of the law’s role in promoting creativity, invention, and new technology. This year, we will discuss the theoretical and practical challenges at the intersection of the regulation of data privacy (or data protection) and artificial intelligence, with attention to the current global sociotechnical situation.

Schedule of Presenters

Thursday, JANUARY 23
Talia Gillis, Associate Professor of Law, Columbia Law School
"Price Discrimination" Discrimination

Abstract: Credit price personalization, where lenders set prices based on individual borrower and loan characteristics, is a common practice across many loan types. And conventional accounts of its harms focus on the ways in which risk-based pricing, or setting prices based on borrowers’ credit risk, can lead to disparities for protected groups like racial minorities and women. This Article examines an often-overlooked yet potentially harmful form of price personalization—charging borrowers different rates based on their willingness-to-pay, known as price discrimination—and argues that this practice can exploit vulnerable borrowers, including protected groups like racial minorities and women, by imposing higher costs unrelated to their credit risk, resulting in what I term “price discrimination” discrimination. Beyond entrenching financial disparities, price discrimination can exacerbate default risks, especially as the use of big data and artificial intelligence can make price discrimination more pervasive. Despite the potential risks of price discrimination for protected groups, the existing discrimination legal framework treats price discrimination categorically, as either entirely permissible or entirely impermissible, without providing clear or consistent criteria for when such practices are justified. In contrast, I propose a harm-based approach to addressing price discrimination discrimination, which evaluates the permissibility of pricing policies based on the extent of harm they cause. This approach considers two key factors: the magnitude of the disparities and the legitimacy of the pricing strategy. Focusing on these dimensions offers a more direct approach to addressing price discrimination concerns and aligns with the statutory framework prohibiting unfair, deceptive, and abusive acts or practices.

Thursday, JANUARY 30

Ignacio Cofone, Professor of Law and Regulation of AI, The Faculty of Law, University of Oxford; visiting at NYU School of Law, Spring 2025
The Privacy Fallacy: Harm and Power in the Information Economy

Abstract: Our privacy is besieged by tech companies. They can do this because our laws are built on outdated ideas that trap lawmakers, regulators, and courts into wrong assumptions about privacy, resulting in ineffective protection of one of the most pressing concerns of our generation. Drawing on behavioral science, sociology, and economics, Ignacio Cofone challenges existing laws and reform proposals and dispels enduring misconceptions about data-driven interactions. This exploration offers readers a holistic view of why current laws and regulations fail to protect us against corporate digital harms, particularly those created by AI. Cofone proposes a better response: meaningful accountability for the consequences of corporate data practices, which ultimately entails creating a new type of liability that recognizes the value of privacy.

Thursday, FEBRUARY 6
Julie Cohen, Mark Claster Mamolen Professor of Law and Technology, Georgetown Law
Oligarchy, State, and Cryptopia

Abstract: Theoretical accounts of power in networked digital environments typically do not give systematic attention to the phenomenon of oligarchy—to extreme concentrations of material wealth deployed to obtain and protect durable personal advantage. The biggest technology platform companies are dominated to a singular extent by a small group of very powerful and extremely wealthy men who have played uniquely influential roles in structuring technological development in particular ways that align with their personal beliefs and who now wield unprecedented informational, sociotechnical, and political power. Developing an account of oligarchy and, more specifically, of tech oligarchy within contemporary political economy therefore has become a project of considerable urgency. This essay undertakes that project. As I will show, tech oligarchs’ power derives partly from legal entrepreneurship related to corporate governance and partly from the infrastructural character of the functions the largest technology platform firms (and through them, their oligarchic leaders) now perform. This account of tech oligarchy has important implications for three large categories of hotly debated issues. First, it sheds new light on the much-remarked inability of nation states to govern giant technology platform firms effectively. Second, it explains why efforts to rebalance the scales by recoding networked digital environments for decentralization—using means such as cryptocurrencies, decentralized social media protocols, and so-called decentralized autonomous organizations designed to devolve decision-making authority—have not produced and will not produce the utopian results their backers promise. Third, it counsels more careful attention to an array of oligarchic projects—from dreams of space colonization to genome editing to the quest to develop artificial general intelligence—that have struck many observers as fantastical. These projects may or may not pan out, but through them, tech oligarchs are working to dismantle existing forms of social and political organization and define a human future that they alone control.

Thursday, FEBRUARY 13
James Grimmelmann, Tessler Family Professor of Digital and Information Law, Cornell Tech
The Files are in the Computer: On Copyright, Memorization, and Generative AI

Abstract: The New York Times’s copyright lawsuit against OpenAI and Microsoft alleges that OpenAI’s GPT models have “memorized” Times articles. Other lawsuits make similar claims. But parties, courts, and scholars disagree on what memorization is, whether it is taking place, and what its copyright implications are. Unfortunately, these debates are clouded by deep ambiguities over the nature of “memorization,” leading participants to talk past one another. In this Essay, we attempt to bring clarity to the conversation over memorization and its relationship to copyright law. Memorization is a highly active area of research in machine learning, and we draw on that literature to provide a firm technical foundation for legal discussions. The core of the Essay is a precise definition of memorization for a legal audience. We say that a model has “memorized” a piece of training data when (1) it is possible to reconstruct from the model (2) a near-exact copy of (3) a substantial portion of (4) that specific piece of training data. We distinguish memorization from “extraction” (in which a user intentionally causes a model to generate a near-exact copy), from “regurgitation” (in which a model generates a near-exact copy, regardless of the user’s intentions), and from “reconstruction” (in which the near-exact copy can be obtained from the model by any means, not necessarily the ordinary generation process). Several important consequences follow from these definitions. First, not all learning is memorization: much of what generative-AI models do involves generalizing from large amounts of training data, not just memorizing individual pieces of it. Second, memorization occurs when a model is trained; it is not something that happens when a model generates a regurgitated output. Regurgitation is a symptom of memorization in the model, not its cause. Third, when a model has memorized training data, the model is a “copy” of that training data in the sense used by copyright law. Fourth, a model is not like a VCR or other general-purpose copying technology; it is better at generating some types of outputs (possibly including regurgitated ones) than others. Fifth, memorization is not just a phenomenon that is caused by “adversarial” users bent on extraction; it is a capability that is latent in the model itself. Sixth, the amount of training data that a model memorizes is a consequence of choices made in the training process; different decisions about what data to train on and how to train on it can affect what the model memorizes. Seventh, system design choices also matter at generation time. Whether or not a model that has memorized training data actually regurgitates that data depends on the design of the overall system: developers can use other guardrails to prevent extraction and regurgitation. In a very real sense, memorized training data is in the model—to quote Zoolander, the files are in the computer.

Wednesday, MARCH 5 (2:35-4:35pm, Furman Hall, Room 212)
Jeremias Adams-Prassl, Professor of Law, Magdalen College, The Faculty of Law, University of Oxford

Thursday, MARCH 6
Omri Ben-Shahar, Leo and Eileen Herzel Distinguished Service Professor of Law, The University of Chicago Law School
Privacy Protection, At What Cost?

Abstract: Why Fear Data is a book that challenges how we think about the technologies that process personal data. Let me briefly bring you up to speed—a background to the chapters enclosed. Our data protection laws, I argue, are barking up the wrong tree. They diagnose the central problem of the digital society as one of data privacy and seek solutions in the form of privacy protections. People are given legal rights to determine how their data are tracked (and shown countless daily reminders of these rights on every website). The cardinal flaw in this scheme is one of misdiagnosis. By focusing on private rather than social harms – on the potential intrusions suffered by individuals whose personal data are collected, instead of the effect on public institutions – it misses the bigger picture. This misdiagnosis is responsible for two striking distortions. First, big data’s most troubling societal harms – things like algorithmic discrimination, dissemination of fake content, political polarization, erosion of communal norms, social media’s impact on youth, or harmful meddling by foreign governments – remain beyond the reach of data privacy laws with their focus on the well-being of individuals rather than environments. Second, many of data analytics’ promising benefits, like reduced auto accidents, improved health treatments, and prevention of heinous crimes, known to save thousands of lives, millions of injuries, and trillions of dollars, are tragically restricted by heavy-handed data privacy laws. In a part of the book not distributed here, I develop a theoretical framework of what I call “data pollution” – an account that shifts the lens from private injuries to social harms. I apply that framework to suggest various interventions, of the types deployed to control industrial pollution and other externalities – a “data environmental law.” (I’ll briefly demonstrate this argument in my colloquium opening remarks.) The two chapters here consist of part III of the book, where I examine another blind zone of data privacy law – the indifference to data’s social benefits. I show how privacy-protective regulation of data technologies restricts data applications with enormous social value. Surprisingly, this question – privacy protection, at what cost? – is rarely studied. Exposing the overlooked costs of privacy law could reshape the design of data protection law.

Thursday, APRIL 3
Alexander Feder Cooper, Postdoctoral Researcher, Microsoft Research
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice

Abstract: “Machine unlearning”—an approach for purging unwanted information from machine-learning (ML) models—has been increasingly embraced to support law and policy aims. We articulate fundamental mismatches between technical methods for machine unlearning in generative AI and documented aspirations for broader impact that these methods could have in practice. These aspirations are both numerous and varied, motivated by issues that pertain to privacy, copyright, safety, and more. For example, unlearning is often invoked as a solution for removing the effects of targeted information from a generative-AI model’s parameters, e.g., a particular individual’s personal data or in-copyright expression of Spiderman that was included in the model’s training data. Unlearning is also proposed as a way to prevent a model from generating targeted types of information in its outputs, e.g., generations that closely resemble a particular individual’s data or reflect the concept of “Spiderman.” Both of these goals—the targeted removal of information from a model and the targeted suppression of information from a model’s outputs—present various technical and substantive challenges. We provide a framework for thinking rigorously about these challenges, which enables us to be clear about why unlearning is not a general-purpose solution for circumscribing generative-AI model behavior in service of broader positive impact. We provide recommendations for how ML experts should focus their research and how policymakers can adjust their expectations concerning reasonable best efforts when using unlearning in practice. Our contributions require expertise in ML, law, and policy, and we intend our audience to include members of all of those communities. We organized a team of experts in each discipline, and the resulting paper reflects the efforts of a large-scale collaboration across academic institutions, civil society, and industry labs. We aim for conceptual clarity and to encourage more thoughtful communication among ML, law, and policy experts who seek to develop and apply technical methods for compliance with policy objectives.

Thursday, APRIL 10
Anita Allen, Henry R. Silverman Professor of Law and Professor of Philosophy, University of Pennsylvania Carey Law School


Questions about the Colloquium should be addressed to Nicole Arzt. Those interested in attending any of the talks without an NYU ID should email her in advance to indicate that they plan to attend.