The discussion around AI models and copyright has ramped up, as The New York Times claims various AI chatbots are trained on millions of its articles.
The New York Times ended 2023 with a bang by launching a legal battle against OpenAI and Microsoft for alleged copyright infringement.
The US media outlet claims AI chatbots made by these companies – including the popular ChatGPT – are trained on millions of articles published by The New York Times. The media outlet also claims that it now competes with these chatbots as a source of reliable information.
Multiple groups have spoken out in recent years with claims that AI models are being trained on copyrighted material without the permission of creators – and without providing credit or compensation.
What is The New York Times case about?
In its court filing, the US media outlet claims that AI models from OpenAI and Microsoft (the defendants) copied and used millions of copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides and other articles. The outlet also claims that this threatens its ability to provide its journalism services.
“Defendants seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment,” the outlet said in its court filing.
In one example, The Times claims Microsoft’s Bing search index generates responses that contain “verbatim excerpts and detailed summaries” of articles from The Times, which are “significantly longer and more detailed than those returned by traditional search engines”.
“By providing Times content without The Times’s permission or authorisation, defendants’ tools undermine and damage The Times’s relationship with its readers and deprive The Times of subscription, licensing, advertising and affiliate revenue,” the media outlet said.
What does The New York Times want?
While a specific monetary figure is not listed, the filing claims The New York Times wants to hold OpenAI and Microsoft responsible for “the billions of dollars in statutory and actual damages that they owe” for the alleged unlawful copying and use of its material.
One part of the filing says the US media outlet is “entitled” to statutory damages, actual damages, restitution of profits and “other remedies provided by law, including full costs and attorneys’ fee”.
Did OpenAI and Microsoft respond?
Microsoft has declined requests for comment from multiple media outlets, but OpenAI released a statement that was shared by The New York Times. In this statement, a company spokesperson said it had been “moving forward constructively” in conversations with The New York Times and that it was “surprised and disappointed” by the lawsuit.
“We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models,” the spokesperson said. “We’re hopeful that we will find a mutually beneficial way to work together, as we are doing with many other publishers.”
Have other groups raised concerns over AI models?
Concerns around AI models and copyrighted material have been developing for some time now, even before the launch of ChatGPT. Groups of artists spoke out against AI text-to-image generators in 2022, with claims that these models were trained on the copyrighted material of artists.
In 2023, a group of authors filed joint lawsuits against OpenAI and Meta, with allegations that their AI products used copyrighted materials without permission.
The decision by The New York Times to sue OpenAI and Microsoft could open the floodgates to more lawsuits by media outlets. A report from the News Media Alliance in November 2023 claimed many large language models use training datasets that contain copyrighted content from news, magazine and digital media organisations.
The report also claimed that some of the most “widely used LLMs” have a preference for publisher content over “generic” content scraped from the internet.