September 21, 2022
Artificial intelligence company D-ID has launched a new presentation platform that can generate video from a single image and text. Creative Reality Studio offers a choice of 270 voices and 119 languages that users can pair with one of the company’s original avatar creations or an uploaded photo. The product is aimed at markets including education, the metaverse, advertising and sales. The company is offering a limited free 14-day trial, after which users must switch to a $49 per month Pro subscription or a higher-end Enterprise plan (pricing available on request).
Those who pay more “access premium presenters who are more ‘expressive,’” with better facial movement and hand gestures, per TechCrunch. The algorithmically generated video can either voice a written script or sync to an uploaded audio clip. Available intonations and expressions include cheerful, friendly, adolescent, excited, serious and sad.
In a blog post, D-ID promises more “engaging content” via AI narrators at a fraction of the cost of a traditional video production with human presenters.
“Platform exclusives include presenters with both facial and upper body gestures” and a Microsoft PowerPoint plug-in “for direct, in-slide customization,” working from “the users’ own photos or any image they have the rights to use,” according to MarTech Series.
D-ID CEO Gil Perry told TechCrunch that the cost and effort of producing educational content are “a big problem for organizations” for whom such work can be “dry and boring. Plus, they have to spend thousands of dollars to hire actors and create educational videos.” D-ID uses its AI “to create presenters and tutors to reenact humans and make the content more engaging and effective,” Perry added.
Israel-based D-ID was established in 2017 by Perry, COO Sella Blondheim and CTO Eliran Kuta. Its products include Live Portrait, which imbues photos with motion and sound, “just one element of D-ID’s AI Face Platform.”
The company’s technology was used in a feature called Deep Nostalgia, marketed by ancestry firm MyHeritage “to make old photos of your relatives wink, nod, dance, and more,” per Singularity Hub.
D-ID is striving to ensure its creations pose no danger as deepfakes, since the company “has put guardrails like filtration of swear words and racist remarks, as well as image recognition to avoid the usage of famous people’s faces,” TechCrunch reports, adding that “it uses the Microsoft Azure text moderation API to weed out sexual remarks and offensive language in video scripts.”