Nvidia Audio2Face AI Avatar-Generator Is Now Open Source

Nvidia has open-sourced Audio2Face, a potential boon for game developers and for other 3D applications such as customer service avatars. The generative AI facial animation system speeds the creation of lifelike, expressive avatars through real-time facial animation and lip-sync. It works by analyzing acoustic features of speech to create a stream of animation data that is then mapped onto a character’s facial poses. The data translates to “accurate lip-sync and emotional expressions,” says Nvidia, noting that the animation can be rendered offline for pre-scripted content or streamed in real time for dynamic characters.
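To make the idea of turning acoustic features into a stream of animation data concrete, here is a toy Python sketch. It is not Nvidia’s method or the Audio2Face SDK: Audio2Face uses trained models, while this example swaps in a simple loudness measure (RMS energy per frame) and maps it to a single facial pose weight. The function names, the `jawOpen` key, the `gain` parameter and the frame size are all assumptions made for illustration.

```python
import math

def frame_rms(samples, frame_size):
    """Split samples into fixed-size frames and return each frame's RMS energy."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        frames.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return frames

def rms_to_jaw_open(rms_values, gain=2.0):
    """Map each RMS value to a hypothetical 'jawOpen' pose weight in [0, 1]."""
    return [{"jawOpen": min(1.0, gain * r)} for r in rms_values]

# Synthetic input (an assumption): 0.1 s of silence, then 0.1 s of a 220 Hz tone.
sample_rate = 16000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(1600)]

# 533 samples per frame is roughly 30 animation frames per second at 16 kHz.
animation_stream = rms_to_jaw_open(frame_rms(silence + tone, frame_size=533))

for index, pose in enumerate(animation_stream):
    print(index, round(pose["jawOpen"], 3))
```

Each dictionary in `animation_stream` stands in for one frame of animation data: the weight stays at zero during silence and rises once the tone begins. A real system would drive many pose weights per frame from much richer features than loudness alone.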

“Nvidia is open sourcing the Audio2Face models and SDK so every game and 3D application developer can build and deploy high fidelity characters with cutting edge animations,” the company explains in a technical post. The Audio2Face training framework is also being made freely available “so anyone can fine-tune and customize our pre-existing models for specific use cases.”

PC Gamer reports that, in theory, the release should make it easier “for a wide range of game developers” to create AI-driven characters with convincing facial expressions, “including during real-time conversations with gamers.”

The tools being released in open source form include “the Audio2Face SDK, audio plugins for inputting voice streams, training frameworks, sample training data, a library of facial models and a specific Unreal Engine 5 plugin,” writes PC Gamer, adding that “the open source release also includes Audio2Emotion Models, which can ‘infer’ emotional state from audio in real time.”

Nvidia includes a complete table of resources in its post and directs those who want to get started to a developer page to download files and documentation.

Thus far, Audio2Face “has been primarily used in video game development and in the customer service industry,” writes NDTV Profit, listing clients including “NetEase (developer of ‘Marvel Rivals’), Streamlabs, UneeQ Digital Humans, Codemasters and more.”

PC Gamer tried Audio2Face last year and called it “frighteningly good,” noting that “the only really obvious giveaway that you’re dealing with an early, experimental system is the slight delay in responses, which made for ‘awkward pauses’ in conversation.”

Audio2Face is part of Nvidia’s ACE for Games platform, a suite of digital human technologies that powers actionable and conversational game characters.
