QuickVid Uses AI to Create Short Videos from Text Prompts

QuickVid is a new AI-driven text-to-video platform aimed at a mass-market user base. The tool draws on various generative AI systems to automatically create short-form videos for YouTube, Instagram, TikTok and other platforms. Created by former Meta Platforms programmer Daniel Habib “in a matter of weeks,” QuickVid is quite rudimentary, though Habib says he plans to continue fine-tuning it and adding features. Unlike Google and Meta, which introduced their nascent text-to-video systems through research papers and industry previews, QuickVid has bypassed those formalities and jumped directly to a public-facing website.

However, “due to a huge influx of traffic” QuickVid.ai now has a waitlist posted. “Create YouTube Shorts 10x faster,” the site coaxes, though that speed apparently comes at a price. “Tools like QuickVid threaten to flood already-crowded channels with spammy and duplicative content,” writes TechCrunch, noting “they also face potential backlash from creators who opt not to use the tools, whether because of cost ($10 per month) or on principle.”

Digital Trends calls it “a very early entry into the race for a text-to-video solution,” and cautions “this isn’t equivalent to generating thousands of Stable Diffusion stills and assembling them to create a video or getting access to the most advanced AI systems in the world for true video generation.”

As TechCrunch explains it, “QuickVid chooses a background video from a library, writes a script and keywords, overlays images generated by DALL-E 2 and adds a synthetic voiceover and background music from YouTube’s royalty-free music library.” For the background video, QuickVid relies on the Pexels royalty-free stock media catalog.
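The assembly steps TechCrunch describes can be sketched roughly as follows. This is an illustrative outline only, not QuickVid's actual code: every name here (`VideoPlan`, `STOCK_LIBRARY`, `build_plan` and the helper functions) is hypothetical, the generative steps are replaced by simple stubs, and picking the background clip from the script's keywords is an assumption about how the pieces might fit together.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a stock clip library: clip name -> descriptive tags.
STOCK_LIBRARY = {
    "city_timelapse.mp4": {"city", "travel"},
    "coffee_pour.mp4": {"coffee", "morning"},
    "forest_walk.mp4": {"nature", "hiking"},
}


@dataclass
class VideoPlan:
    """Hypothetical container for the pieces of one short video."""
    prompt: str
    script: str = ""
    keywords: list = field(default_factory=list)
    background_clip: str = ""
    overlay_images: list = field(default_factory=list)
    voiceover: str = ""
    music_track: str = ""


def write_script_and_keywords(prompt):
    # Stub: a real tool would call a text-generation model here.
    keywords = [w.lower().strip(".,!?") for w in prompt.split() if len(w) > 3]
    script = f"Here are three things to know about {prompt}."
    return script, keywords


def choose_background(keywords, library):
    # Pick the clip whose tags overlap the keywords most (ties go to the first clip).
    wanted = set(keywords)
    return max(library, key=lambda name: len(library[name] & wanted))


def build_plan(prompt):
    plan = VideoPlan(prompt=prompt)
    plan.script, plan.keywords = write_script_and_keywords(prompt)
    plan.background_clip = choose_background(plan.keywords, STOCK_LIBRARY)
    # Stubs: placeholders where image generation, text-to-speech and music selection would go.
    plan.overlay_images = [f"image_for_{k}.png" for k in plan.keywords]
    plan.voiceover = f"tts({plan.script})"
    plan.music_track = "royalty_free_track.mp3"
    return plan
```

Calling `build_plan("morning coffee rituals")` returns a plan whose parts a separate rendering step would combine. The point of the sketch is that each stage only selects or requests an existing asset, so none of the footage is generated from scratch.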

In other words, “QuickVid amalgamates existing AI to exploit the repetitive, templated format of B-roll-heavy short-form videos, getting around the problem of having to generate the footage itself,” reports TechCrunch, concluding that “QuickVid certainly isn’t pushing the boundaries of what’s possible with generative AI.”

While the voiceover is currently output via Google Cloud’s text-to-speech API, Habib tells TechCrunch “users will soon be able to clone their voice” when combining the various elements into a video.

In terms of quality, TechCrunch calls QuickVid’s results “a mixed bag.” The underlying videos “tend to be a bit random or only tangentially related to the topic, which isn’t surprising given QuickVid’s being currently limited to the Pexels catalog.” The DALL-E 2-generated images “exhibit the limitations of today’s text-to-image tech, like garbled text and off proportions.”

These being early days, Digital Trends predicts “next year could see many more text-to-video solutions arrive.”
