Researchers from Meta Platforms’ Reality Labs and the University of Texas at Austin have developed audio tools that Meta CEO Mark Zuckerberg says will deliver a more realistic metaverse experience. Among the new tools is a visual acoustic matching model called AViTAR, which adapts any audio clip to a chosen environment using a photograph of the space, rather than advanced geometry, to create a simulation. Also in the pipeline is Visually-Informed Dereverberation of Audio (VIDA), which will “remove reverberation,” isolating source sounds for a fully immersive effect.
Zuckerberg said in a Friday blog post that by isolating sounds, including spoken commands, VIDA can make understanding sound cues easier for both humans and machines. As for AViTAR, whether “you’re at a concert, or just talking with friends around a virtual table, a realistic sense of where sound is coming from makes you feel like you’re actually there,” Zuckerberg blogged.
Also part of what Engadget calls a “trio of open source audio ‘understanding tasks’” aimed at more lifelike VR/AR engagement is VisualVoice, which “does the same as VIDA but for voices,” using both visual and audio cues to separate voices from background noise.
“Meta anticipates this model getting a lot of work in the machine understanding applications and to improve accessibility,” Engadget writes, listing use cases including “more accurate subtitles, Siri understanding your request even when the room isn’t dead silent or having the acoustics in a virtual chat room shift as people speaking move around the digital room.”
“We envision a future where people can put on AR glasses and relive a holographic memory that looks and sounds the exact way they experienced it from their vantage point, or feel immersed by not just the graphics but also the sounds as they play games in a virtual world,” Zuckerberg blogged, adding that the technologies all require further training and development before they’ll be ready for public release.
A Meta news release says the three new models, “designed to push us toward a more immersive reality at a faster rate,” are being made available to developers.
“Meta’s already factored this in, to at least some degree, with the first generation version of its Ray-Ban Stories glasses, which include open air speakers that deliver sound directly into your ears,” writes Social Media Today, explaining the speakers are positioned to enable “fully immersive audio without the need for earbuds.”
In an interview with CNBC’s Jim Cramer, Zuckerberg said that he hopes to get “around a billion people in the metaverse doing hundreds of dollars of commerce each, buying digital goods, digital content, different things to express themselves.”