OpenAI Rolls Out Open-Source Speech Recognition System

OpenAI has released a new open-source AI speech recognition model called Whisper that can recognize and translate audio at levels it says compare in accuracy and robustness to human abilities. Use cases include transcription of speeches, interviews, podcasts and conversations. “Moreover, it enables transcription in multiple languages, as well as translation from those languages into English,” says OpenAI, which is open-sourcing models and inference code on GitHub “to serve as a foundation for building useful applications and for further research on robust speech processing.”

Companies Turn to AI for New Approaches to Audio Solutions

Understanding speech visually, by reading lips, as well as aurally is an advantage for which AI has been waiting, according to researchers at Meta Platforms (formerly Facebook). The company says it has developed Audio-Visual Hidden Unit BERT (AV-HuBERT), a framework that learns by watching, and that it is 75 percent more accurate than competing automated speech recognition systems on several metrics. Meta claims that AV-HuBERT outperforms the former best audiovisual speech recognition system with only one-tenth the input, which makes it potentially useful for languages with little or no audio data.

Microsoft to Buy AI and Speech Recognition Provider Nuance

Microsoft is on track to acquire Nuance Communications, an AI and speech recognition software company, for about $16 billion. The company intends to expand its offerings in medical computing; Nuance already has speech and text data related to healthcare, an established customer base and the transcription tool Dragon. According to Microsoft, the purchase will “double the size of the healthcare market where it competed to almost $500 billion.” With the purchase, Microsoft could also develop advanced AI solutions for the workplace across numerous industries. Microsoft’s last big purchase was LinkedIn, for $26.2 billion in 2016.

Facebook Using Self-Supervised Models to Build AI Systems

Facebook debuted Learning from Videos, a project designed to learn audio, images and text from publicly available Facebook videos to improve its core AI systems. By culling data from hundreds of languages and countries, said Facebook, the project will also help to enable “entirely new experiences.” Learning from Videos, which began in 2020, has also helped to improve recommendations in Instagram Reels. Facebook, Google and others are focused on self-supervised techniques rather than labeled datasets to improve AI.

CES: Panel Examines Issues of Gender and Racial Bias in AI

During a CES 2021 panel moderated by The Female Quotient chief executive Shelley Zalis, AI industry executives probed issues related to gender and racial bias in artificial intelligence. Google head of product inclusion Annie Jean-Baptiste, SureStart founder and chief executive Dr. Taniya Mishra and ResMed senior director of health economics and outcomes research Kimberly Sterling described the parameters of such bias. At Google, Jean-Baptiste noted that, “the most important thing we need to remember is that inclusion inputs lead to inclusion outputs.”

BBC Partners with Microsoft to Release Beeb Voice Assistant

The BBC partnered with Microsoft to release an early version of Beeb, its digital voice assistant. Its U.K. debut will be part of Microsoft’s Windows Insider Program to encourage users to help improve Beeb prior to a wider rollout. The BBC first announced Beeb last year, noting that the aim was to integrate voice services into all its products. The public broadcaster will collect data by requiring users to log in to Beeb with their BBC accounts, but such data will not be used for targeted ads.

Apple Researchers Improving Accuracy of Virtual Assistant

Over 50 million people worldwide use Apple’s virtual assistant Siri. Apple, focused on improving Siri’s capabilities, published research on how to improve voice trigger detection, speaker verification and language identification for multiple speakers. Apple researchers suggest that a single AI model be trained for both automatic speech recognition and speaker recognition. Rather than approach these as two independent tasks, the researchers showed that the tasks might actually help one another to “estimate both properties.”

CES 2020: The Next Decade Brings the Intelligence of Things

At Sunday’s opening CES event, CTA’s VP of research Steve Koenig and director of research Lesley Rohrbaugh revealed trends for CES 2020, as we move “into the data age.” “In the previous decade, we could describe the dynamic in hardware, software, apps and even content as IoT, the Internet of Things,” said Koenig. “In the new decade, we’ll be increasingly confronted with a new IoT: the Intelligence of Things. This new IoT bears testimony to the fact that AI is permeating commerce and culture.”

Congress Calls For End to Tech Firms’ Audio Transcriptions

A bipartisan group of Congress members castigated Facebook for hiring contractors to transcribe audio clips and urged regulation to prevent it in the future. The transcriptions were made to help Facebook improve its artificial intelligence-enabled speech recognition, and are part of a move to improve the capabilities of voice assistants (Amazon, Apple and Google are among companies that have taken similar approaches). Last year, Senator Ron Wyden (D-Oregon) circulated a draft law that would impose steep fines and even prison for executives who failed to protect users’ personal data.

Google Stops Human Review of Assistant Voice Clips in EU

Google is pausing Google Assistant voice transcriptions in the European Union for at least three months. In mid-July, it admitted that about 1,000 private communications were made available to human contractors evaluating Assistant’s speech recognition accuracy, revealing personal and private information. A Google spokesperson reported that the company ceased voice transcription involving human moderators after learning of additional leaks in the Netherlands. Amazon will allow Alexa users to opt out of the human review of recordings and Apple has halted its program allowing human contractors to listen in on Siri recordings.

Google and Amazon Use AI to Improve Speech Recognition

Google’s artificial intelligence researchers made an unexpected discovery with their new SpecAugment data augmentation method for automatic speech recognition. Rather than augmenting input audio waveforms, SpecAugment applies augmentation directly to the audio spectrogram. Researchers discovered, to their surprise, that models trained with SpecAugment outperformed all other speech recognition methods, even without a language model. Amazon also revealed research on improving Alexa’s speech recognition by 15 percent.
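
To make the idea of augmenting the spectrogram rather than the waveform concrete, here is a minimal sketch of two of the masking steps associated with SpecAugment: blanking a random band of frequency bins and a random span of time frames. It is an illustration only, not Google’s implementation. The function name `mask_spectrogram`, its parameters, the list-of-lists spectrogram layout and the choice of zero as the masked value are all assumptions made for this example, and the method’s time-warping step is left out.

```python
import random


def mask_spectrogram(spec, max_freq_width=2, max_time_width=3, seed=None):
    """Return a masked copy of a spectrogram (hypothetical helper).

    `spec` is assumed to be a list of frequency rows, each a list of
    per-frame values. One random frequency band and one random time span
    are set to zero; the input is not modified.
    """
    rng = random.Random(seed)
    n_freq = len(spec)
    n_time = len(spec[0])
    out = [row[:] for row in spec]  # copy so the caller's data is untouched

    # Frequency masking: zero out f consecutive frequency bins.
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [0.0] * n_time

    # Time masking: zero out t consecutive time frames in every row.
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for row in out:
        for j in range(t0, t0 + t):
            row[j] = 0.0

    return out


if __name__ == "__main__":
    # Toy 8-bin by 10-frame "spectrogram" filled with ones.
    toy = [[1.0] * 10 for _ in range(8)]
    for row in mask_spectrogram(toy, seed=0):
        print(" ".join(f"{v:.0f}" for v in row))
```

In training, a fresh random mask would typically be drawn each time an utterance is seen, so the model never receives exactly the same input twice.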

Startup Within to Release Augmented Reality App for Children

Los Angeles-based immersive media startup Within plans to release Wonderscope, an augmented reality app for children, later this month. With Wonderscope, mobile AR superimposes characters, scenes and stories onto an iPad camera view of a real-world environment. Within chief executive Chris Milk noted that, with Wonderscope and a smartphone, anyone can have “this new magical ability.” “It’s like a lens for invisible magical things that you couldn’t see with your naked eye,” he added.

Microsoft Builds on Existing Tech, Voices Moral Conscience

At its Build developer conference this week, Microsoft is showing products that highlight its changed direction under the aegis of chief executive Satya Nadella. Among them is a DJI drone loaded with Microsoft software to identify oil pipeline faults without an Internet connection. Although Microsoft is helping customers enhance their existing gear, the company promised “big things ahead” to those entirely in the Microsoft ecosystem. Because Microsoft has stayed out of recent data scandals, some deem it to be the tech industry’s moral conscience.

NAB 2018: IBM Watson on Refining AI for Closed Captioning

Closed captioning isn’t just for the hard-of-hearing anymore. According to Digiday, 85 percent of Facebook video is viewed without sound. That signals a trend of viewers who prefer to watch closed captioning, putting the heat on solutions providers to come up with compliant systems that are also accurate and speedy. With artificial intelligence, says IBM Watson Media senior offering manager David Kulczar, closed captioning can be enhanced to go beyond transcription, and automatically identify background audio descriptions.

Google Offers Its AI Chips to All Comers via Cloud Computing

Google, which created tensor processing units (TPUs) for its artificial intelligence systems some years ago, will now make those computer chips available to other companies via its cloud computing service. Google is currently focusing on computer vision technology, which allows computers to recognize objects; Lyft used these chips for its driverless car project. Amazon is also building its own AI chips for use with the Alexa-powered Echo devices to shave seconds off its response time and potentially increase sales.