Nvidia says its AI model can turn 2D photos into 3D scenes in seconds

29 Mar 2022

Image: © MichaelVi/Stock.adobe.com

Nvidia said this tech could be used to create virtual world avatars, reconstruct scenes for 3D maps or train self-driving cars to understand real-world objects.

AI researchers at Nvidia are developing new technology that can turn collections of 2D images into a 3D scene in a matter of seconds.

Nvidia said this is achieved through inverse rendering, a process in which AI approximates how light behaves in the real world. This allows researchers to reconstruct a 3D scene from a set of 2D photos taken at different angles.

The research team said this task can be completed almost instantly using a combination of neural network training and rapid rendering. They have applied this approach to a new technology called neural radiance fields, or NeRF.
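For readers curious about how a radiance field becomes an image, here is a minimal sketch of the standard NeRF volume-rendering step: samples along one camera ray are blended into a single pixel colour. It is an illustration and not Nvidia's code. In a real NeRF the density and colour of each sample come from a trained neural network, whereas here they are passed in by hand, and the function name composite_ray is an assumption made for this example.

```python
import math


def composite_ray(densities, colors, deltas):
    """Blend samples along one camera ray into a single RGB pixel.

    densities: non-negative density at each sample (hand-supplied here;
               a NeRF would predict these with a neural network).
    colors:    (r, g, b) tuple at each sample, each channel in [0, 1].
    deltas:    distance between consecutive samples along the ray.
    Returns (pixel, remaining_transmittance).
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(densities, colors, deltas):
        # Opacity of this segment: denser or longer segments block more light.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for channel in range(3):
            pixel[channel] += weight * rgb[channel]
        # Light that survives this segment carries on to the next sample.
        transmittance *= 1.0 - alpha
    return pixel, transmittance
```

A ray that passes only through empty space (all densities zero) returns a black pixel with full transmittance, while a ray that hits a very dense red sample first returns red and hides everything behind it.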

Nvidia has called its new process Instant NeRF and described it as being more than 1,000 times faster than other NeRF techniques. The tech giant said its model can be trained in seconds using a few dozen still photos, before rendering the 3D scene within tens of milliseconds.


As a tribute to Polaroid images, Nvidia recreated an iconic photo of Andy Warhol taking an instant photo, turning it into a 3D scene using Instant NeRF. Image: Nvidia

“If traditional 3D representations like polygonal meshes are akin to vector images, NeRFs are like bitmap images: they densely capture the way light radiates from an object or within a scene,” Nvidia VP for graphics research David Luebke said.

“In that sense, Instant NeRF could be as important to 3D as digital cameras and JPEG compression have been to 2D photography – vastly increasing the speed, ease and reach of 3D capture and sharing.”

Researchers said there are a variety of applications for this technology. It could be used to train robots and self-driving cars to understand the size and shape of real-world objects. The technology could also be used to create avatars for virtual worlds, capture video conference participants and their environments, or reconstruct scenes for 3D digital maps.

Nvidia is one of many companies focusing on the potential of the metaverse. Last week, the tech giant shared plans to make its platform for real-time 3D simulation and design collaboration accessible in the cloud, to accelerate the development of virtual worlds.

Nvidia’s research team is also exploring whether the Instant NeRF input encoding technique can help with AI challenges such as reinforcement learning, language translation and general-purpose deep learning algorithms.
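The article does not describe the input encoding itself. Nvidia's published Instant NeRF research uses a multiresolution hash encoding, in which a 3D point is looked up in several grids of increasing resolution and the stored feature vectors are interpolated and concatenated before being fed to a small neural network. The sketch below is a simplified illustration of that idea and not Nvidia's implementation: the table sizes, resolutions and function names are assumptions, every level is hashed for simplicity, and the tables hold random values where a real system would learn them during training.

```python
import math
import random

# Multipliers used to mix the three integer grid coordinates into one index.
PRIMES = (1, 2654435761, 805459861)


def hash_index(ix, iy, iz, table_size):
    """Map integer grid coordinates to a slot in a fixed-size table."""
    return ((ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])) % table_size


def make_tables(num_levels, table_size, feature_dim, seed=0):
    """Create one table of small random feature vectors per resolution level."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1e-4, 1e-4) for _ in range(feature_dim)]
         for _ in range(table_size)]
        for _ in range(num_levels)
    ]


def encode(point, tables, base_res=16, growth=2.0):
    """Encode a point in [0, 1]^3 as concatenated per-level features.

    base_res and growth are illustrative values chosen for this sketch.
    """
    features = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)  # grid resolution at this level
        scaled = [p * res for p in point]
        base = [int(math.floor(s)) for s in scaled]
        frac = [s - b for s, b in zip(scaled, base)]
        out = [0.0] * len(table[0])
        # Trilinear interpolation over the 8 corners of the enclosing cell.
        for dx in (0, 1):
            for dy in (0, 1):
                for dz in (0, 1):
                    weight = ((frac[0] if dx else 1.0 - frac[0])
                              * (frac[1] if dy else 1.0 - frac[1])
                              * (frac[2] if dz else 1.0 - frac[2]))
                    slot = hash_index(base[0] + dx, base[1] + dy,
                                      base[2] + dz, len(table))
                    for k in range(len(out)):
                        out[k] += weight * table[slot][k]
        features.extend(out)
    return features
```

Calling encode((0.3, 0.5, 0.7), make_tables(4, 1024, 2)) returns a list of eight numbers, two features from each of the four levels. Because the lookup is a handful of table reads instead of a pass through a large network, this kind of encoding is one plausible reason training can be so fast.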

The company teamed up with Microsoft last year to create an AI-powered language model. The tech companies claimed their new model was “the largest and the most powerful monolithic transformer language model trained to date”.
