Adobe’s Firefly Video model has gained new features, including Generate Sound Effects, now in beta, and a text-to-avatar tool that lets users turn scripts into avatar-led videos “in just a few clicks.” Firefly becomes the second video model to generate audio, joining Veo 3, although unlike Google’s AI video tool, Firefly does not yet generate dialogue. What it can do is output foley-like sound and sound effects, while text-to-avatar can generate speech. As with Firefly’s generative visuals, Adobe says Generate Sound Effects is “commercially safe,” which means it is trained only on licensed or publicly available material.
What really stands out about Adobe’s Generate Sound Effects feature is “the amount of control users have when inputting their own audio,” ZDNet writes, explaining that it found a demo “truly impressive” with the generated audio matching “the input audio’s flow, while also incorporating the text prompt to create a sound that actually sounded like the intended output,” simulating a lion’s roar. Users can layer written and voice prompts so vocal inflections guide the created sound.
CNET suggests there is a big difference between the type of audio Firefly can generate and what Veo 3 produces: “The AI audio you can generate through Firefly is the kind of audio that could be created by a foley artist, like sound effects and impact noises,” adding “that doesn’t include dialogue, though.” However, Adobe has the AI avatar tool queued up in beta for dialogue creation, CNET points out.
Each prompt triggers four Generate Sound Effects clips, “usually 8 seconds long each,” CNET notes.
Firefly can follow “the energy and rhythm of your voice” in matching the audio to the video “with cinematic timing,” Adobe explains in a blog post, adding that once a video is complete, it can be exported directly to Adobe Express for further polish, resulting in “share-ready content for all your social channels.” Or it can be exported to Premiere Pro “to add the video into your existing timeline,” Adobe says.
One downside of audio generation in Firefly is “you have to manually synchronize your AI audio to your video clips” in a process similar to using the timeline feature in Premiere Pro. For people “who don’t need or want that kind of manual, hands-on control, Veo 3’s automatically matching will take a lot of the work out of creating AI videos,” CNET observes.
As for Firefly’s text-to-avatar, it generates video of what looks like a live person reading a script. “When picking an avatar, you can browse through the library of avatars, pick a custom background and accents, and then Firefly creates the final output,” ZDNet says.