ElevenLabs Launches an AI Tool for Generating Sound Effects

ElevenLabs has launched its text-to-sound generator Sound Effects for all users, available now at the company’s website. The new AI tool can create audio effects, short instrumental tracks, soundscapes and even character voices. Sound Effects “has been designed to help creators — including film and television studios, video game developers, and social media content creators — generate rich and immersive soundscapes quickly, affordably and at scale,” according to the startup, which developed the tool in partnership with Shutterstock, using its library of licensed audio tracks. Continue reading ElevenLabs Launches an AI Tool for Generating Sound Effects

Adobe Considers Sora, Pika and Runway AI for Premiere Pro

Adobe plans to add generative AI capabilities to its Premiere Pro editing platform and is exploring the update with third-party AI technologies including OpenAI’s Sora, as well as models from Runway and Pika Labs, making it easier “to draw on the strengths of different models” within everyday workflows, according to Adobe. Editors will gain the ability to generate and add objects into scenes or shots, remove unwanted elements with a click, and even extend frames and footage length. The company is also developing a video model for its own Firefly AI for video and audio work in Premiere Pro. Continue reading Adobe Considers Sora, Pika and Runway AI for Premiere Pro

Microsoft’s VASA-1 Can Generate Talking Faces in Real Time

Microsoft has developed VASA, a framework for generating lifelike virtual characters with vocal capabilities including speaking and singing. The premiere model, VASA-1, can perform the feat in real time from a single static image and a vocalization clip. The research demo showcases realistic audio-enhanced faces that can be fine-tuned to look in different directions or change expression in video clips of up to one minute at 512 x 512 pixels and up to 40fps “with negligible starting latency,” according to Microsoft, which says “it paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors.” Continue reading Microsoft’s VASA-1 Can Generate Talking Faces in Real Time

Midjourney Creates a Feature to Advance Image Consistency

Artificial intelligence imaging service Midjourney has been embraced by storytellers who have also been clamoring for a feature that enables characters to regenerate consistently across new requests. Now Midjourney is delivering that functionality with the addition of the new “–cref” tag (short for Character Reference), available for those who are using Midjourney v6 on the Discord server. Users can achieve the effect by adding the tag to the end of text prompts, followed by a URL that contains the master image subsequent generations should match. Midjourney will then attempt to repeat the particulars of a character’s face, body and clothing characteristics. Continue reading Midjourney Creates a Feature to Advance Image Consistency

Lightricks LTX Studio Is a Text-to-Video Filmmaking Platform

Lightricks, the company behind apps including Facetune, Photoleap and Videoleap, has come up with a text-to-video tool called LTX Studio that it is being positioned as a turnkey AI tool for filmmakers and other creators. “From concept to creation,” the new app aims to enable “the transformation of a single idea into a cohesive, AI-generated video.” Currently waitlisted, Lightricks says it will make the web-based tool available to the public for free, at least initially, beginning in April, allowing users to “direct each scene down to specific camera angles with specialized AI.” Continue reading Lightricks LTX Studio Is a Text-to-Video Filmmaking Platform

Pika Taps ElevenLabs Audio App to Add Lip Sync to AI Video

On the heels of ElevenLabs’ demo of a text-to-sound app unveiled using clips generated by OpenAI’s text-to-video artificial intelligence platform Sora, Pika Labs is releasing a feature called Lip Sync that lets its paid subscribers use the ElevenLabs app to add AI-generated voices and dialogue to Pika-generated videos and have the characters’ lips moving in sync with the speech. Pika Lip Sync supports both uploaded audio files and text-to-audio AI, allowing users to type or record dialogue, or use pre-existing sound files, then apply AI to change the voicing style. Continue reading Pika Taps ElevenLabs Audio App to Add Lip Sync to AI Video

Runway Teams with Getty on AI Video for Hollywood and Ads

The Google and Nvidia-backed AI video startup Runway is partnering with Getty Images to develop Runway Getty Images Model (RGM), which it is positioning as a new type of generative AI model capable of “providing a new way to bring ideas and stories to life through video” for enterprise customers using copyright compliant means. Targeting Hollywood studios, advertising, media and broadcast clients, RGM will “provide a baseline model upon which companies can build their own custom models for the generation of video content,” Runway explains. Continue reading Runway Teams with Getty on AI Video for Hollywood and Ads

Stability Introduces GenAI Video Model: Stable Video Diffusion

Stability AI has opened research preview on its first foundation model for generative video, Stable Video Diffusion, offering text-to-video and image-to-video. Based on the company’s Stable Diffusion text-to-image model, the new open-source model generates video by animating existing still frames, including “multi-view synthesis.” While the company plans to enhance and extend the model’s capabilities, it currently comes in two versions: SVD, which transforms stills into 576×1024 videos of 14 frames, and SVD-XT that generates up to 24 frames — each at between three and 30 frames per second. Continue reading Stability Introduces GenAI Video Model: Stable Video Diffusion