OpenAI Voice Cloning Tool Needs Only a 15-Second Sample

OpenAI has debuted a new text-to-voice generation platform called Voice Engine, available in limited access. Voice Engine can generate a synthetic voice from a 15-second clip of someone’s voice. The synthetic voice can then read a provided text, even translating to other languages. For now, only a handful of companies are using the tech under a strict usage policy as OpenAI grapples with the potential for misuse. “These small scale deployments are helping to inform our approach, safeguards, and thinking about how Voice Engine could be used for good across various industries,” OpenAI explained.

ElevenLabs Promotes Its Latest Advances in AI Audio Effects

“What if you could describe a sound and generate it with AI?” asks startup ElevenLabs, which set out to do just that and says it has succeeded. The two-year-old company explains it “used text prompts like ‘waves crashing,’ ‘metal clanging,’ ‘birds chirping,’ and ‘racing car engine’ to generate audio.” Best known for using machine learning to clone voices, the AI firm founded by Google and Palantir alums has yet to make its new text-to-sound model publicly available but began teasing it by releasing online demos this week. Some see the technology as a natural complement to the latest wave of image generators.

Amazon Claims ‘Emergent Abilities’ for Text-to-Speech Model

Researchers at Amazon have trained what they are calling the largest text-to-speech model ever created, which they claim is exhibiting “emergent” qualities: the ability to inherently improve itself at speaking complex sentences naturally. Called BASE TTS, for Big Adaptive Streamable TTS with Emergent abilities, the new model could pave the way for more human-like interactions with AI, reports suggest. Trained on 100,000 hours of public domain speech data, BASE TTS offers “state-of-the-art naturalness” in English as well as some German, Dutch and Spanish. Text-to-speech models are used to build voice assistants for smart devices and apps, as well as for accessibility.

Newsom Report Examines Use of AI by California Government

California Governor Gavin Newsom has released a report examining the beneficial uses and potential harms of artificial intelligence in state government. Potential benefits include improving access to government services by identifying groups that are hindered by language barriers or other obstacles, while the dangers it identifies highlight the need to prepare citizens with next-generation skills so they don’t get left behind in the GenAI economy. “This is an important first step in our efforts to fully understand the scope of GenAI and the state’s role in deploying it,” Newsom said, calling California’s strategy “a nuanced, measured approach.”

Captions Debuts AI Lipdub with Translation and Gen Z Slang

Captions, which leverages AI to help its customers produce “studio quality videos directly from their mobile devices,” has launched a new app called Lipdub that automatically translates and dubs content into 28 languages. The free download lets users dub anyone “and experience familiar voices and faces in a suite of new languages.” Lipdub’s translations not only duplicate what the company says is “the subject’s exact voice,” but also sync lip movements to match. The app also incorporates dialects and idioms, with options like Gen Z and Texas slang.

Magic Studio from Canva Offers AI Design for All Skill Levels

Web-based design app Canva has raised the curtain on its AI-powered Magic Studio as part of the company’s 10-year anniversary outreach. Canva is positioning Magic Studio as a collection of diverse AI tools that provides a “comprehensive AI-design platform” for business and home users who want to use generative artificial intelligence to automate labor-intensive tasks such as creating and editing images and outputting to different formats. Created for “the 99 percent of the world without complex design skills,” Canva’s Magic Studio offers many of the features now being built into smartphones and software suites, but easier and “all in one place.”

Adobe Pursues Ethical, Responsible AI in the Creative Space

As a next step in its advances in ethical AI, Adobe has announced its Firefly generative AI platform now supports text prompts in more than 100 international languages. The company says its Firefly AI app has generated over one billion images in Firefly and Photoshop since implementation in March. Adobe has also deployed artificial intelligence in Express, Illustrator and the Creative Cloud. Positioning its latest news as an expansion of global proportions, Adobe says its generative AI products will now support text prompts in native dialects in the standalone Firefly web service, with localization coming to more than 20 additional languages.

YouTube Introduces Multi-Language Audio Tracks Worldwide

Following several months of tests, YouTube is launching its multi-language audio track feature worldwide, with popular vlogger MrBeast helping to promote the new feature’s benefits. MrBeast, who has over 135 million global subscribers, is hoping to attract new subscribers to his channel now that his most popular videos are dubbed into 11 different languages. The multi-language audio feature allows creators to dub new and existing videos. YouTube says more than 3,500 multi-language videos have been uploaded to the site in 40-plus languages since January of this year.

Business World Asks if Generative AI is Ready for Enterprise

IT pros are grappling with the ways ChatGPT can be worked into the enterprise stack. The generative artificial intelligence from OpenAI has demonstrated the ability to compile reports, craft marketing pitches and write software code, which makes it seem convenient for business use. Yet concerns remain, including potential security risks and sometimes erratic or inappropriate data feedback. In the past week, one third-party tester had ChatGPT pledge love for its interlocutor, while another received a detailed lecture on why cow eggs are bigger than chicken eggs.

CES: Startup Leverages AI to Address Problematic Acoustics

A growing number of companies are working on technologies that strive to make a person’s voice more intelligible to the listener over speakers, headphones, hearing aids and other consumer audio devices. Augmented Hearing, a Danish startup launched two years ago, is one of the more interesting companies at CES 2023 focusing on this space. The firm’s software-based solution runs on iOS, Windows and other CE operating systems. Its solution could mitigate the current trend of people across all age groups turning on closed captioning because they often find video dialogue difficult to understand.

Facebook Adds 24 Languages to Rosetta Translation Feature

Facebook’s Rosetta is a machine learning system that extracts text in many languages from over one billion images in real time. Facebook built its own optical character recognition system that can process such a huge amount of content, day in and day out. In a recent blog post, Facebook explained how Rosetta works, using a convolutional neural network to recognize and transcribe text, even non-Latin alphabets and non-English words. The system was trained with a mix of human- and machine-annotated public images.

Twitter Doubles the Longstanding Character Limit for Tweets

After more than a decade of limiting tweets to 140 characters, Twitter announced yesterday that the limit has been doubled in most countries. The new 280-character limit has been in testing since September in hopes that it would increase engagement. “In addition to more tweeting, people who had more room to tweet received more engagement (likes, retweets, @mentions), got more followers, and spent more time on Twitter,” the company explained in a blog post. Twitter considered expanding character limits in the past, but retreated due to negative response from its community.

New Voice-Powered App Takes On Leading Digital Assistants

Santa Clara-based startup SoundHound has developed a voice-powered digital assistant that could take on early players in the field, including Siri, Google Now and Cortana. Like the others, the Hound app (for iOS and Android) allows users to interact via voice so that it can perform requested tasks. However, Hound claims to be faster and smarter than its competitors. The app has been in beta with 150,000 testers since last summer, and is now publicly available along with new Yelp and Uber partnerships for restaurant info and ride hailing from within the app.

Curator Tool Will Help Media Publishers Share Tweets and Vines

Twitter this week unveiled its new Curator media tool, designed to make more tweets available to a larger audience outside of its own site and, ideally, to build on its base of 288 million users who log in at least once a month. The tool will help media organizations locate tweets and Vine videos that can be posted with stories and broadcasts. Publishers regularly create and share compelling content related to live events and breaking news. Twitter aims to leverage these publishers to help address its slowing user growth rate.

IBM’s SyNAPSE Chip Mimics the Workings of a Human Brain

IBM recently unveiled the second generation of a new type of computer chip that consumes less power and performs faster than traditional chips based on the von Neumann architecture. The SyNAPSE chip, which is still in development, was designed to function like the human brain, using more than a million “neurons” communicating through electrical spikes. This new technology requires a new type of programming language as well, but the performance gains are massive.