OpenAI Rolls Out Open-Source Speech Recognition System

OpenAI has released a new open source AI speech recognition model called Whisper that can recognize and translate audio at levels it says compare in accuracy and robustness to human abilities. Case uses include transcription of speeches, interviews, podcasts and conversations. “Moreover, it enables transcription in multiple languages, as well as translation from those languages into English,” says OpenAI, which is open-sourcing models and inference code on GitHub “to serve as a foundation for building useful applications and for further research on robust speech processing.” Continue reading OpenAI Rolls Out Open-Source Speech Recognition System

Apple Researchers Improving Accuracy of Virtual Assistant

Over 50 million people worldwide use Apple’s virtual assistant Siri. Apple, focused on improving Siri’s capabilities, published research on how to improve voice trigger detection, speaker verification and language identification for multiple speakers. Apple researchers suggest that an AI model be trained for automatic speech recognition and speaker recognition. Rather than approach it as two independent tasks, the researchers proved that those tasks might actually help one another to “estimate both properties.” Continue reading Apple Researchers Improving Accuracy of Virtual Assistant

Google and Amazon Use AI to Improve Speech Recognition

Google’s artificial intelligence researchers made an unexpected discovery with its new SpecAugment data augmentation model for automatic speech recognition. Rather than augmenting input audio waveforms, SpecAugment applies augmentation directly to the audio spectrogram. Researchers discovered, to their surprise, that models trained with SpecAugment out-performed all other speech recognition methods, even without a language model. Amazon also revealed research on improving Alexa’s speech recognition by 15 percent. Continue reading Google and Amazon Use AI to Improve Speech Recognition