The tech giant said its Universal Speech Model is a ‘critical first step’ towards creating an AI that can understand and translate 1,000 languages.
Google has shared details of its universal speech AI model, which is designed to understand hundreds of spoken languages.
The company said its Universal Speech Model (USM) is trained on 12m hours of speech and 28bn sentences of text, which span more than 300 languages.
Google said the AI model is designed to be used in creating captions on YouTube videos and can currently perform automatic speech recognition on 100 languages.
As machine translation models require lots of data for training, it can be difficult to develop tools for languages that have few written examples online.
The tech giant said some of these languages are spoken by fewer than 20m people worldwide, which makes it “very hard to find the necessary training data”.
In a new research paper, Google Research claims the AI model was able to recognise under-represented languages by pre-training the model’s encoder and “fine-tuning on a smaller set of labeled data”.
The researchers also said the training process behind USM makes it “effective for adapting to new languages and data”. The AI model’s API is being made available to other researchers upon request.
Google said USM achieved an average word error rate of less than 30pc across 73 languages, which is a “milestone we have never achieved before”.
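For context, word error rate is conventionally calculated as the number of word substitutions, deletions and insertions needed to turn a system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. It is not Google's evaluation code, the function name `word_error_rate` is a hypothetical chosen for this example, and it assumes words are separated by whitespace, which will not hold for every language.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of `hypothesis` measured against `reference`.

    Illustrative only: assumes whitespace-separated words and does no
    text normalisation (casing, punctuation), which real evaluations often do.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (word missing from hypothesis)
                curr[j - 1] + 1,                  # insertion (extra word in hypothesis)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    # One wrong word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

By this measure, a word error rate below 30pc means that fewer than three in every 10 reference words, on average, needed correcting.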
Google said the new model is a “critical first step” on its mission to build an AI model that can support 1,000 spoken languages. The company said this initiative, first announced last November, will help bring “greater inclusion to billions of people” in marginalised communities worldwide.
“The development of USM is a critical effort towards realising Google’s mission to organise the world’s information and make it universally accessible,” the Google researchers said in a blog post.
“We believe USM’s base model architecture and training pipeline comprise a foundation on which we can build to expand speech modelling to the next 1,000 languages.”
Last July, Meta claimed it had developed an AI model that was the first to translate 200 different languages.
Google seems to have rekindled its focus on AI in recent months, after ChatGPT caused a shake-up in the industry.
At around the time that Microsoft announced it would use ChatGPT to boost its Bing search engine, Google revealed its own Bard chatbot to try to rival the OpenAI creation.