Meta Says Its AI-Compressed Audio Codec Beats MP3 by 10x

Meta Platforms says its vision for the metaverse will rely heavily on compression technology “to deliver high-quality, uninterrupted experiences for everyone.” With that in mind, it’s trained its Fundamental AI Research (FAIR) lab on developing “hypercompression” solutions. First up is EnCodec, an audio technology that Meta says compresses audio about 10 times more efficiently than MP3 at 64 kbps, with no loss in quality. The EnCodec protocol has the potential to greatly improve the sound and reliability of speech over low-bandwidth connections (such as when your mobile phone is getting only one bar). It also works for music.

Google and Amazon Use AI to Improve Speech Recognition

Google’s artificial intelligence researchers made an unexpected discovery with their new SpecAugment data augmentation method for automatic speech recognition. Rather than augmenting input audio waveforms, SpecAugment applies augmentation directly to the audio spectrogram. Researchers discovered, to their surprise, that models trained with SpecAugment outperformed all other speech recognition methods, even without a language model. Amazon also revealed research on improving Alexa’s speech recognition by 15 percent.
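To make the idea of augmenting the spectrogram rather than the waveform concrete, here is a minimal sketch in plain Python. It is not Google's implementation: the function name `mask_spectrogram`, its parameters, and the mask widths are hypothetical choices for illustration. It shows only the masking half of the idea, blanking one random band of frequency bins and one random span of time frames, and leaves out the time warping that the published SpecAugment policy also includes.

```python
import random


def mask_spectrogram(spec, max_freq_width=8, max_time_width=20, seed=None):
    """Return a masked copy of `spec`, a list of frequency rows,
    each row holding one value per time frame.

    Hypothetical helper for illustration only; widths are arbitrary.
    """
    rng = random.Random(seed)
    out = [row[:] for row in spec]  # copy so the input is left untouched
    n_freq = len(out)
    n_time = len(out[0]) if n_freq else 0

    # Frequency mask: zero a random band of consecutive frequency bins.
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [0.0] * n_time

    # Time mask: zero a random span of consecutive time frames in every row.
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for row in out:
        row[t0:t0 + t] = [0.0] * t

    return out


if __name__ == "__main__":
    # A dummy 40-bin by 50-frame "spectrogram" filled with ones.
    dummy = [[1.0] * 50 for _ in range(40)]
    augmented = mask_spectrogram(dummy, seed=0)
    print(len(augmented), len(augmented[0]))
```

In training, a fresh random mask would typically be drawn each time an utterance is used, so the model rarely sees exactly the same input twice.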

Facebook Introduces Open-Source Image Processing Library

Facebook unveiled Spectrum, an open-source image processing library to help improve the quality and reliability of images uploaded through its own apps. Spectrum, which Facebook first showed publicly and launched in beta in November, is now on GitHub, available to the developer community. As higher quality cameras on smartphones have become a key selling point, consumers are dealing with larger image files, which can be a stumbling block since they eat up more device memory and more network bandwidth.

Facebook Adds 24 Languages to Rosetta Translation Feature

Facebook’s Rosetta is a machine learning system that extracts text in many languages from over one billion images in real time. Facebook built its own optical character recognition system that can process such a huge amount of content, day in and day out. In a recent blog post, Facebook explained how Rosetta works, using a convolutional neural network to recognize and transcribe text, including non-Latin alphabets and non-English words. The system was trained with a mix of human- and machine-annotated public images.

New Sony Media Player to Access 4K Library and Stream Netflix

As a follow-up to its original $700 4K media player, Sony has announced a new model, the FMP-X10, that will provide access to Sony’s Video Unlimited 4K download library and be able to stream 4K Netflix content. The new player, available this summer, will be compatible with Sony Ultra HD sets and include 1 terabyte of storage. A price has yet to be announced. Sony’s Video Unlimited 4K library currently features more than 200 titles (45GB-60GB files), about 50 of which are available for free.