Meta is using two Nvidia GPU clusters to train Llama 3

13 Mar 2024


The company said each of these clusters contains more than 24,000 Nvidia H100 GPUs and is being used to support its AI developments.

Meta has shared details on two of its new GPU clusters that it is using to train its future AI models, including a successor to Llama 2.

The company said it has two “data centre scale” clusters that each contain more than 24,000 Nvidia H100 GPUs. These clusters are being used to develop Llama 3, the company’s planned next-generation AI model, as well as to support Meta’s broader AI research and development.

Meta said the new training clusters are an evolution of its AI Research SuperCluster (RSC), which the company revealed in 2022. This cluster featured 16,000 Nvidia A100 GPUs and was used to help Meta build its “first generation of advanced AI models”.

“Our newer AI clusters build upon the successes and lessons learned from RSC,” the company said in a blogpost. “We focused on building end-to-end AI systems with a major emphasis on researcher and developer experience and productivity.”

The company said the two new AI training cluster designs are “a part of our larger roadmap for the future of AI”.

“By the end of 2024, we’re aiming to continue to grow our infrastructure build-out that will include 350,000 Nvidia H100s as part of a portfolio that will feature compute power equivalent to nearly 600,000 H100s.”

Meta previously focused much of its research on the metaverse – a pivot that coincided with its name change back in 2021 – but the company’s commitments to this vision have been costly. Reality Labs, the division handling Meta’s VR operations and metaverse ambitions, has cost the company billions.

In January, Meta CEO Mark Zuckerberg announced that the company’s long-term vision is to build “general intelligence, open source it responsibly and make it widely available”. He said the company’s two main research groups – FAIR and GenAI – were being expanded and brought closer together to accelerate this goal.

Despite the apparent switch in focus, Zuckerberg claimed the work on AI is interlinked with the company’s metaverse goals.

“The two major parts of our vision – AI and the metaverse – are connected,” Zuckerberg said. “By the end of the decade, I think lots of people will talk to AIs frequently throughout the day using smart glasses like what we’re building with Ray-Ban Meta.”


Leigh Mc Gowran is a journalist with Silicon Republic

editorial@siliconrepublic.com