December 5, 2018
Nvidia used processing power and neural networks to create a very convincing virtual city, which will be open for tours by attendees of this year’s NeurIPS AI conference in Montreal. Nvidia’s system, which uses existing videos of scenery and objects to create these interactive environments, also makes it easier for artists to create similar virtual worlds. Nvidia vice president of applied deep learning Bryan Catanzaro said generative models are key to making the process of creating virtual worlds cost-effective.
Engadget reports that, “researchers trained the fledgling neural model with dashcam videos taken from self-driving car trials in cities for about a week on one of the company’s DGX-1 supercomputers.” (According to Nvidia chief executive Jensen Huang, the DGX-1 is the equivalent of “250 servers in a box.”) Then, a research team relied on Unreal Engine 4 to create a “semantic map,” which “essentially assigns every pixel on-screen a label,” such as ‘car’ or ‘tree’.
Unreal Engine then produced a sketch of a scene that was fed into Nvidia’s neural model, where “AI applied the visuals for what it knew a ‘car’ looked like to the clump of pixels labeled ‘car’ and repeated the same process for every other classified object in the scene.”
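The conditioning step Engadget describes, in which every pixel carries a class label and the generator fills each labeled region with appropriate imagery, can be sketched in a few lines. This is only an illustration: the class IDs, the palette, and the `toy_generator` function below are hypothetical stand-ins, since Nvidia’s actual model learns the label-to-appearance mapping from training video rather than using a fixed color table.

```python
import numpy as np

# Hypothetical class IDs for the semantic map; the real system uses a
# richer label set (car, tree, road, building, and so on).
ROAD, CAR, TREE = 0, 1, 2
NUM_CLASSES = 3

def one_hot(label_map: np.ndarray, num_classes: int) -> np.ndarray:
    """Convert an (H, W) map of per-pixel class IDs into an (H, W, C)
    one-hot tensor, the form in which a semantic map is typically fed
    to a conditional image generator."""
    return np.eye(num_classes, dtype=np.float32)[label_map]

def toy_generator(onehot: np.ndarray) -> np.ndarray:
    """Stand-in for the learned neural generator: paints each class a
    flat RGB color. The trained model would instead synthesize realistic
    texture for every region labeled 'car', 'tree', etc."""
    palette = np.array([[90, 90, 90],    # road: gray
                        [200, 30, 30],   # car: red
                        [30, 160, 30]],  # tree: green
                       dtype=np.float32)
    return (onehot @ palette).astype(np.uint8)

# A tiny 2x2 "scene sketch" standing in for Unreal Engine's output.
label_map = np.array([[ROAD, CAR],
                      [TREE, CAR]])
image = toy_generator(one_hot(label_map, NUM_CLASSES))
```

In the real pipeline the one-hot label map conditions a neural network trained on dashcam footage, so the "palette" is effectively learned imagery rather than flat colors, but the data flow is the same: sketch in, labeled pixels classified, appearance rendered per class.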
Catanzaro reported that the AI rendered it all in real time, and that the car simulation ran at 25 frames per second. The team also “used this new video-to-video synthesis technique to digitally coax a team member into dancing like Psy,” in which the AI had to figure out the dance poses, turn them into stick figures and then render another person’s appearance on top of them.
Engadget notes that the images aren’t as “graphically rich or as detailed” as a typical AAA video game scene, but that they “offer glimpses at digital cities filled with objects that do sort of look real.”
Nvidia’s code is all open source, but developers won’t be using it any time soon; “the company was quick to point out the neural network’s limitations … [saying that] its model wasn’t great at rendering vehicles as they turn because its label maps lacked sufficient information.” In other words, the images are still “a long way from being photo-realistic for long periods of time.”
Engadget adds that, “it’s sadly not hard to see how these techniques could be used for unsavory purposes,” such as deepfakes. But Catanzaro noted that, “most of the time they’re used for good things.” “We’re focused on the good applications,” he said, pointing out that, “Stalin was Photoshopping people out of pictures in the ’50s before Photoshop even existed.” Still, “the existence of these tools also means the line between real events and fabricated ones will continue to grow more tenuous.”