Nvidia’s Neuralangelo AI Turns 2D Video Clips into 3D Worlds

Nvidia Research is releasing a new AI model called Neuralangelo that can turn 2D iPhone video clips into 3D structures, virtually replicating sculptures, buildings and other real-world objects in great detail. Named for Michelangelo’s life-like creations from blocks of marble, Neuralangelo is able to accurately capture repetitive texture patterns, homogeneous colors, and strong color variations, tasks that were problematic for earlier AI models. Neuralangelo accomplishes the feat using instant neural graphics primitives, the technology behind Nvidia Instant NeRF.

Roof shingles, panes of glass and veined marble are some of the complex surfaces Neuralangelo can translate from 2D to 3D. “The high fidelity makes its 3D reconstructions easier for developers and creative professionals to rapidly create usable virtual objects for their projects using footage captured by smartphones,” Nvidia explains in a blog post.

Last year, Nvidia Research introduced 3D MoMa, a tool that easily turns photographs into 3D objects, targeting it to architects, concept artists, designers and game developers as a way to quickly import objects into a graphics engine. “Neuralangelo builds on that concept to allow for far larger and more detailed spaces and objects to be imported,” writes PetaPixel.

“The 3D reconstruction capabilities Neuralangelo offers will be a huge benefit to creators, helping them recreate the real world in the digital world,” Nvidia senior director of research Ming-Yu Liu said of the new model, explaining it “will eventually enable developers to import detailed objects — whether small statues or massive buildings — into virtual environments for video games or industrial digital twins.”

“By using a video of an object or a scene filmed from various angles, Neuralangelo selects several of the frames that capture different viewpoints,” PetaPixel explains, adding that “it then determines the camera position in each frame and then creates a rough 3D representation of the scene” and “optimizes the render to sharpen the details before producing the final 3D object.”

Neuralangelo is one of almost 30 projects Nvidia Research will present at the Conference on Computer Vision and Pattern Recognition, June 18-22 in Vancouver. The CVPR papers span topics including pose estimation, 3D reconstruction and video generation. Included is one on DiffCollage, a diffusion method that creates large-scale content (including long landscape orientation, 360-degree panorama and looped-motion images).

Last week at Computex, Nvidia unleashed a tsunami of new tech, including the Avatar Cloud Engine (ACE), a “foundry for intelligent in-game characters powered by generative AI,” detailed in Fast Company along with Neuralangelo.
