by Dennis Crouch
In my view, some of the weakest anti-AI copyright claims have fallen under 17 U.S.C. § 1202(b)(1), a provision of the Digital Millennium Copyright Act (DMCA) that prohibits intentional removal or alteration of copyright management information (CMI). The statute broadly defines CMI to include not just copyright notices, but also titles, author information, owner information, terms of use, and other identifying information conveyed with copies of works. A violation also requires proof that the CMI-remover knew or had “reasonable grounds to know” that the removal would induce, enable, facilitate, or conceal copyright infringement.
In Raw Story v. OpenAI, the online news organization alleged that OpenAI violated § 1202(b)(1) by removing CMI from thousands of its news articles when incorporating them into training datasets for ChatGPT. Notably, the plaintiffs did not bring direct copyright infringement claims, instead focusing solely on alleged CMI removal. The articles in question were published online with author, title, and copyright information, which plaintiffs claimed OpenAI stripped away when creating its training sets. Because OpenAI has not published the contents of these training sets, plaintiffs relied on “approximations” suggesting their articles appeared without CMI. They argued that this was evidence of intentional CMI removal, reasoning that if ChatGPT had been trained on articles with intact CMI, it would output such information when generating responses.
The most recent news in the case is that S.D.N.Y. Judge Colleen McMahon has dismissed the claims brought by Raw Story (and AlterNet Media) — holding that the plaintiffs lacked Article III standing to pursue the case.