Meta said the scientific community needs to be able to work together to advance AI research and probe for vulnerabilities.
Facebook’s parent company Meta is making its large language model – which has 175bn parameters and was trained on publicly available datasets – available to AI researchers.
The social media giant said it is sharing access to both the pretrained models and the code needed to train and use them. It added that this will allow for “more community engagement in understanding this foundational new technology”.
“Access to the model will be granted to academic researchers, those affiliated with organisations in government, civil society and academia, along with industry research laboratories around the world,” Meta AI said in a blogpost yesterday (3 May).
Large language models are natural language processing (NLP) systems that are trained on massive volumes of text. Such models can answer reading comprehension questions, solve basic maths problems and generate text.
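To give a sense of the scale involved, a rough back-of-envelope calculation shows the memory needed just to store a model the size of Meta’s 175bn-parameter release. The 2-bytes-per-parameter figure assumes 16-bit floating-point weights, a common storage format for large models, not a detail specified in Meta’s announcement:

```python
# Rough memory estimate for storing a 175bn-parameter model.
# Assumes 2 bytes per parameter (16-bit floats) - an illustrative
# choice, not a figure from Meta's announcement.
params = 175_000_000_000
bytes_per_param = 2
total_gb = params * bytes_per_param / 1e9
print(f"{total_gb:.0f} GB just to store the weights")  # prints "350 GB just to store the weights"
```

A footprint of that order is one reason full access to such models has, as Meta notes below, been limited to well-resourced labs.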
Meta said full research access to large language models is usually restricted to a “few highly resourced labs”, which hinders efforts to increase their “robustness” and remove issues such as bias and toxicity within the models.
“For AI research to advance, the broader scientific community must be able to work together with cutting-edge models to effectively explore their potential while also probing for their vulnerabilities at the same time,” the company said.
“Meta AI believes that collaboration across research organisations is critical to the responsible development of AI technologies.”
The social media company said it designed its model – called OPT-175B – to be energy efficient, and that training it produced roughly 14pc of the carbon footprint of training OpenAI’s GPT-3 model.
Meta also said it is releasing a suite of “smaller-scale baseline models”, trained on the same dataset and with similar settings to OPT-175B.
Meta has been investing in AI research for some time. In February the company shared some of the AI research projects it is focused on, including universal speech translation, AI that can learn like a human and a more conversational AI assistant.
In January, Meta also revealed that its AI research team has been working for years on a supercomputer that could be the world’s “largest and fastest” when fully built out, which it hoped to achieve by mid-2022.
Meta isn’t the only company looking into large language models. Last October, tech giants Microsoft and Nvidia teamed up to create a language model with 105 layers and 530bn parameters – three times as many as OpenAI’s GPT-3.