US programmer sues Microsoft and OpenAI for open-source piracy

7 Nov 2022


Matthew Butterick is alleging that the way AI coding assistant GitHub Copilot uses public open-source code without attribution is illegal.

Microsoft, GitHub and OpenAI are facing a class-action lawsuit over GitHub Copilot, the AI tool that is like predictive text for programming.

The lawsuit was brought forward by US programmer and lawyer Matthew Butterick “on behalf of a proposed class of possibly millions of GitHub users”.

Butterick argues that Copilot is trained on public GitHub repositories whose authors released their code under specific open-source licenses, and that Microsoft-owned GitHub is violating the rights of these programmers by using their work without attribution.

First revealed in June 2021 as an AI assistant for programmers, Copilot is powered by the OpenAI Codex algorithm, which was trained on a large dataset of public source code.

Butterick is alleging that Copilot violates at least 11 open-source licenses that require attribution of the author’s name and copyright.

He claimed that Microsoft and the other defendants have also violated GitHub’s own terms of service and privacy policies, the California Consumer Privacy Act, and the provision of the US Digital Millennium Copyright Act that forbids the removal of copyright-management information.

“In the weeks ahead, we will likely amend this complaint to add other parties and claims,” Butterick wrote on a website dedicated to the litigation.

GitHub kept Copilot in technical preview until June this year, during which period it was used by more than 1.2m developers. By June, GitHub said Copilot was writing nearly 40pc of the code in files where it was enabled, in popular languages such as Python.

“Just like the rise of compilers and open-source, we believe AI-assisted coding will fundamentally change the nature of software development, giving developers a new tool to write code easier and faster,” GitHub CEO Thomas Dohmke said at the time.

While Copilot depends on open-source code written by thousands of programmers, it is available to developers for $10 a month or $100 a year following a 60-day free trial. It is free for students and maintainers of popular open-source projects.

The lawsuit has been filed in San Francisco by Butterick and the Joseph Saveri Law Firm.

Butterick said the lawsuit is the first step in “what will be a long journey”, and that he believes it is the first – but not the last – class-action case in the US challenging the training and output of AI systems.

“AI systems are not exempt from the law. Those who create and operate these systems must remain accountable. If companies like Microsoft, GitHub and OpenAI choose to disregard the law, they should not expect that we the public will sit still,” he said.

“AI needs to be fair and ethical for everyone. If it’s not, then it can never achieve its vaunted aims of elevating humanity. It will just become another way for the privileged few to profit from the work of the many.”


Vish Gain is a journalist with Silicon Republic
