New AI model gets ‘fairly trained’ licence from non-profit

21 Mar 2024

Image: © iuriimotov/Stock.adobe.com

The non-profit Fairly Trained issued a Licensed Model certificate to KL3M and said generative AI can exist without exploiting copyrighted work.

A new large language model (LLM) has entered the playing field and claims to tackle one of the biggest complaints surrounding generative AI: the use of copyrighted content.

Various media organisations and creatives have filed lawsuits against companies such as OpenAI, with claims that their models are trained on copyrighted works without getting permission from the creators.

OpenAI has previously claimed that it would be impossible to create tools such as its ChatGPT product without access to copyrighted material. But a non-profit organisation called Fairly Trained claims that one LLM – KL3M – has been trained in a fair way, without using copyrighted content taken without permission.

The non-profit issued KL3M with a Licensed Model certification, which means its training data meets certain criteria such as being “fully owned by the model developer” or being “in the public domain globally”. The LLM, developed by 273 Ventures, is the first to be granted the certification by Fairly Trained.

Fairly Trained also issued its first certification to a company offering AI speech and singing models: Voicemod.

The non-profit claims to have gained more industry supporters recently – including The Authors Guild, which previously criticised AI models for using copyrighted works, and SAG-AFTRA.

“Expanding the Fairly Trained certification into text and speech generation is a significant milestone, because it shows that it’s possible to train generative AI models in a fairer way in any creative field,” Fairly Trained said. “Generative AI can exist without exploiting copyrighted work without permission. We’re pleased that we continue to meet and certify great AI companies and developers who prove this.”

KL3M claims to be the first “clean” LLM, avoiding both IP and “toxicity” issues. Its creator, 273 Ventures, claims its family of models has no copyright issues, does not “scrape” the internet for content and avoids “toxic sources”.

The company claims that its models are able to outperform those with 10 times the number of parameters. However, a graph shared on KL3M’s website compares its products with earlier models such as OpenAI’s GPT-2 – so it is unclear how it compares with modern models such as GPT-4.


Leigh Mc Gowran is a journalist with Silicon Republic

editorial@siliconrepublic.com