Microsoft and Nvidia join up to train massive AI-powered language model

12 Oct 2021


The new model has 105 layers and 530bn parameters, but the tech giants said that bias is still a problem for the system.

Tech giants Microsoft and Nvidia have teamed up to create what they claim is “the largest and the most powerful monolithic transformer language model trained to date”.

The Megatron-Turing Natural Language Generation (MT-NLG) model has 105 layers and 530bn parameters, roughly three times as many parameters as OpenAI’s GPT-3, which has 175bn.

Language models use machine learning to analyse text and calculate how likely a word is to appear, given the other words around it. This technology is most commonly seen in the form of predictive text on our phones and email clients.
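The word-prediction idea can be sketched with a toy bigram model: count which words follow which in a corpus, then suggest the most frequent follower. This is only an illustrative sketch with a made-up corpus; large models such as MT-NLG learn these probabilities with deep neural networks rather than raw counts.

```python
from collections import Counter, defaultdict

# Illustrative toy corpus (not from any real training set).
corpus = "the cat sat on the mat the cat ate the food".split()

# Count bigrams: for each word, tally which words follow it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most likely next word, as in predictive text."""
    counts = following[word]
    if not counts:
        return None
    return counts.most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often here
```

Neural language models generalise this idea: instead of literal counts, they learn a probability distribution over the next word conditioned on all preceding context.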

These models are a core component of natural language processing, which is based on deep learning and enables computers to acquire meaning from inputs given by users.

The more parameters and layers a system has, the bigger and more complex it is, meaning it can acquire a more nuanced understanding of language.

MT-NLG’s 105 layers and 530bn parameters make it more powerful than the previous transformer-based systems trained by both companies.

“The 105-layer, transformer-based MT-NLG improved upon the prior state-of-the-art models in zero-, one-, and few-shot settings and set the new standard for large-scale language models in both model scale and quality,” the companies said in a blog post.

According to Microsoft and Nvidia, MT-NLG demonstrates “unmatched accuracy” in a broad set of natural language tasks, including text prediction, reading comprehension, common-sense reasoning, natural language inferences and word-sense disambiguation.

Bias still prevails

While this latest advancement in the world of language processing and deep learning is an impressive feat, the ongoing problem of bias within AI remains.

Some of the most notable examples include an MIT image library used to train AI systems, which was found last year to contain racist and misogynistic terms, and Microsoft’s own chatbot, which in 2016 began producing racist and misogynistic messages within 24 hours of its launch.

Microsoft and Nvidia found that the MT-NLG model also picks up stereotypes and biases from the data on which it is trained.

“Microsoft and Nvidia are committed to working on addressing this problem. We encourage continued research to help in quantifying the bias of the model,” they said.

“In addition, any use of MT-NLG in production scenarios must ensure that proper measures are put in place to mitigate and minimise potential harm to users.”


Jenny Darmody is the editor of Silicon Republic