Stability AI unveils new model that can turn images into 3D objects

19 Mar 2024

Image: © ZeedLa/Stock.adobe.com

The London-based AI start-up detailed the new 3D model in a paper and said that it is now available for commercial use through a Stability AI membership.

Stability AI, the start-up behind the AI-powered text-to-image generator Stable Diffusion, has unveiled its latest creation: a generative AI model called Stable Video 3D that can turn still images into 3D objects and videos.

In an announcement yesterday (18 March), Stability AI said that Stable Video 3D, or SV3D, is a significant improvement over existing models in the space and can generate multiview videos of an object.

“When we released Stable Video Diffusion, we highlighted the versatility of our video model across various applications. Building upon this foundation, we are excited to release Stable Video 3D,” the London-headquartered start-up wrote.

“This new model advances the field of 3D technology, delivering greatly improved quality and multiview when compared to the previously released Stable Zero123, as well as outperforming other open-source alternatives such as Zero123-XL.”

To generate multiview videos of an object, the company adapted its image-to-video diffusion model with the addition of what it calls camera path conditioning. Technical details of the two variants of SV3D were published in a paper on Hugging Face yesterday.
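For readers wondering what a camera path might look like in practice, here is a minimal Python sketch. It is not taken from Stability AI's code or paper. The function names, the frame count, the default elevation and the orbit radius are all assumptions made for illustration. It describes a simple circular orbit as a list of elevation and azimuth angles, one pair per video frame, and converts each pair into a camera position around an object at the origin.

```python
import math
from typing import List, Tuple


def orbit_camera_path(num_frames: int = 8,
                      elevation_deg: float = 10.0) -> List[Tuple[float, float]]:
    """Return (elevation, azimuth) pairs in degrees for one full orbit.

    Hypothetical helper: the frame count and elevation are illustrative
    defaults, not values confirmed by the article.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    step = 360.0 / num_frames  # evenly spaced azimuth angles
    return [(elevation_deg, i * step) for i in range(num_frames)]


def camera_position(elevation_deg: float, azimuth_deg: float,
                    radius: float = 2.0) -> Tuple[float, float, float]:
    """Convert spherical camera angles to an (x, y, z) position.

    Assumes the object sits at the origin and z points up; the radius
    is an arbitrary illustrative distance.
    """
    elev = math.radians(elevation_deg)
    azim = math.radians(azimuth_deg)
    x = radius * math.cos(elev) * math.cos(azim)
    y = radius * math.cos(elev) * math.sin(azim)
    z = radius * math.sin(elev)
    return (x, y, z)


if __name__ == "__main__":
    # Print one camera position per frame of the orbit.
    for elevation, azimuth in orbit_camera_path():
        print(elevation, azimuth, camera_position(elevation, azimuth))
```

In a sketch like this, a conditioned model would receive one angle pair per frame alongside the input image. Varying the elevation from frame to frame, rather than holding it fixed, would describe a less regular path around the object.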

“The use of video diffusion models, in contrast to image diffusion models as used in Stable Zero123, provides major benefits in generalisation and view consistency of generated outputs,” the start-up said.

“Additionally, we propose improved 3D optimisation leveraging this powerful capability of Stable Video 3D to generate arbitrary orbits around an object. By further implementing these techniques with disentangled illumination optimisation as well as a new masked score distillation sampling loss function, Stable Video 3D is able to reliably output quality 3D meshes from single image inputs.”

SV3D is available for commercial use through a Stability AI membership. The company said that the model is also available for non-commercial use by downloading the model weights from Hugging Face.

Led by founder and CEO Emad Mostaque, Stability AI recently appointed a new chief technology officer, Christian Laforte, after a rough 2023 that saw more than 10 senior executives leave the start-up, according to Sifted.

One of them was Ed Newton-Rex, who led the audio team at Stability AI. Newton-Rex cited his disagreement with the company’s opinion that training generative AI models on copyrighted works is “fair use” as his reason for resigning.

“There are lots of people at Stability who are deeply thoughtful about these issues,” wrote Newton-Rex, who joined the company in November 2022 and became vice-president of audio in February last year.

“I’m proud that we were able to launch a state-of-the-art AI music generation product trained on licensed training data, sharing the revenue from the model with rights-holders. I’m grateful to my many colleagues who worked on this with me and who supported our team.”


Vish Gain is a journalist with Silicon Republic
