By Paula Parisi, October 27, 2023
The University of Science and Technology of China (USTC) and Tencent YouTu Lab have released a research paper on a new framework called Woodpecker, designed to correct hallucinations in multimodal large language models (MLLMs). “Hallucination is a big shadow hanging over the rapidly evolving MLLMs,” writes the group, describing the phenomenon as when MLLMs “output descriptions that are inconsistent with the input image.” Solutions to date focus mainly on “instruction-tuning,” a form of retraining that is data- and computation-intensive. Woodpecker takes a training-free approach that purports to correct hallucinations working from the generated text. Continue reading Woodpecker: Chinese Researchers Combat AI Hallucinations
By Paula Parisi, October 25, 2023
OpenAI is developing an AI tool that can identify images created by artificial intelligence, specifically those made in whole or in part by its Dall-E 3 image generator. Calling it a “provenance classifier,” company CTO Mira Murati began publicly discussing the detection app last week but said not to expect it in general release anytime soon, despite her claim that it is “almost 99 percent reliable.” That is still not good enough for OpenAI, which knows there is much at stake when the public perception of artists’ work can be impacted by a filter applied by AI, which is notoriously capricious. Continue reading OpenAI Developing ‘Provenance Classifier’ for GenAI Images
By Paula Parisi, October 25, 2023
Nvidia Research has debuted Eureka, an AI agent that autonomously teaches robots complex motor skills. Powered by OpenAI’s GPT-4, Eureka has successfully trained a robotic hand to handle a pen with the dexterity of a human, which Nvidia says is a first. Eureka has also enabled robots to do things like open drawers, manipulate scissors and toss and catch balls, along with dozens of other tasks. “Eureka is a first step toward developing new algorithms that integrate generative and reinforcement learning methods to solve hard tasks,” said Nvidia Senior Director of AI Research Anima Anandkumar. Continue reading Nvidia Leverages OpenAI’s GPT-4 to Train Dexterous Robots
By Paula Parisi, October 11, 2023
OpenAI began previewing vision capabilities for GPT-4 in March, and the company is now starting to roll out the image input and output to users of its popular ChatGPT. The multimodal expansion also includes audio functionality, with OpenAI proclaiming late last month that “ChatGPT can now see, hear and speak.” The upgrade vaults GPT-4 into the multimodal category with what OpenAI is apparently calling GPT-4V (for “Vision,” though equally applicable to “Voice”). “We’re rolling out voice and images in ChatGPT to Plus and Enterprise users,” OpenAI announced. Continue reading ChatGPT Goes Multimodal: OpenAI Adds Vision, Voice Ability
By Paula Parisi, October 11, 2023
Startup Reka AI is releasing in preview its first artificial intelligence assistant, Yasa-1. The multimodal AI is described as “a language assistant with visual and auditory sensors.” The year-old company says it “trained Yasa-1 from scratch,” including pretraining foundation models “from ground zero,” then aligning them and optimizing them for its training and server infrastructure. “Yasa-1 is not just a text assistant, it also understands images, short videos and audio (yes, sounds too),” said Reka AI co-founder and Chief Scientist Yi Tay. Yasa-1 is available via Reka’s APIs and as Docker containers for on-site or virtual private cloud deployment. Continue reading Yasa-1: Startup Reka Launches New AI Multimodal Assistant
By Paula Parisi, October 9, 2023
Likewise, a startup discovery platform backed by Bill Gates, is launching its own free chatbot named Pix. Billed as “the world’s first personal entertainment companion,” Pix helps users find TV shows, movies, books and podcasts, drawing from 600 million consumer data points. Trained on OpenAI models, Pix uses natural-language processing to answer user questions submitted by text, email or on the web at Likewise.com. Responses are promised “within seconds,” and Pix will learn users’ preferences over time. Likewise claims to have more than six million registered users. Continue reading Likewise: Startup Backed by Bill Gates Launches Pix Chatbot
By Paula Parisi, October 5, 2023
LinkedIn is unveiling new AI features to improve job hunting, marketing and sales tools for its nearly 1 billion users. The Recruiter talent sourcing platform, LinkedIn Learning and more are all getting AI assists. A central use of AI is “to take on some of workers’ day-to-day drudgery, freeing extra time for the more people-centric, strategic aspects of their job,” according to the social business platform, which just wrapped its 12th annual Talent Connect Summit. The proliferation of evolving generative AI tools is triggering new workflows for recruiters, job hunters and employees. Continue reading LinkedIn Taps OpenAI to Upgrade Business Marketing Tools
By Paula Parisi, October 2, 2023
In a move to put “generative AI at the fingertips of every business, from startups to enterprises,” Amazon Web Services is commercially rolling out the Bedrock service it announced in April. Bedrock offers a wide range of foundation models (FMs), from Amazon’s own Titan to products from Anthropic, Stability AI and soon Meta Platforms. The fully managed Bedrock service makes its generative FMs operable through a single, simple API. This means customers can experiment with various leading FMs and customize simple apps in-house, without the need for a third party diving into their proprietary data. Continue reading AWS Rolls Out Bedrock Generative AI Service, Adds Llama 2
By Paula Parisi, October 2, 2023
Nvidia’s Picasso continues to gain market share among visual companies looking for an AI foundry to train models for generative use. Getty Images has partnered with Nvidia to create custom foundation models for still images and video. Generative AI by Getty Images lets customers create visuals using Getty’s library of licensed photos. The tool is trained on Getty’s own creative library and has the company’s guarantee of “full indemnification for commercial use.” Getty joins Shutterstock and Adobe among enterprise clients using Picasso. Runway and Cuebric are using it, too — and Picasso is still in development. Continue reading Getty GenAI Tool for Images and Video Is Powered by Nvidia
By Paula Parisi, October 2, 2023
A new AI-first form factor could be coming to market as the result of a partnership between OpenAI CEO Sam Altman, former Apple design guru Jony Ive and SoftBank CEO Masayoshi Son. Altman and Ive are said to be developing — and SoftBank potentially funding — an AI device to succeed the smartphone. Since co-founding OpenAI in 2015, Altman has been vocal about the need for a new type of device, purpose-built to leverage the capabilities of artificial intelligence. Ive, meanwhile, has been looking for a second act since exiting Apple after leading design on the iPhone, iPod and MacBook Air. Continue reading Purpose-Built AI Device May Be Coming from Ive and Altman
By Paula Parisi, September 29, 2023
Fox Corporation’s Tubi TV video streaming service is rolling out a proprietary movie recommendation app called “Rabbit AI” in a beta test for iOS customers in the U.S., with other platforms to follow. Powered by OpenAI’s GPT-4, currently available only to enterprise and other paying customers, Rabbit AI provides “a new way to navigate” Tubi’s library of more than 200,000 movies and TV episodes, “providing hyper-personalized recommendations based on the contextual meaning of the terms,” the company says. A Rabbit AI plugin for ChatGPT is also now available to OpenAI subscribers, Tubi says. Continue reading Tubi Chooses ChatGPT to Power Content Recommendations
By Paula Parisi, September 27, 2023
OpenAI is experimenting with new voice and image capabilities in ChatGPT. According to the company, users can now “speak with ChatGPT and have it talk back,” thanks to an intuitive new interface that, in addition to facilitating voice conversations, will allow users to show ChatGPT an image to discuss. “Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it,” OpenAI explains, alternatively suggesting you “snap pictures of your fridge and pantry to figure out what’s for dinner” or have it help with homework based on pictures of a math problem. Continue reading OpenAI’s ChatGPT Upgraded with ‘Talk’ Tech, Image Search
By Paula Parisi, September 27, 2023
Spotify is using AI to drive podcast language translation in what sounds like the podcaster’s own voice, which has obvious implications for film and television dubbing. Working with podcast notables including Dax Shepard, Monica Padman and Bill Simmons, Spotify used AI to mimic their voices in Spanish, French and German for several episodes. The proprietary Spotify technology uses OpenAI’s new text-to-speech voice-generation technology as well as its open-source Whisper speech recognition system, which transcribes spoken words into text. The result, Spotify says, is “more authentic” and “more personal and natural” than traditional dubbing. Continue reading Spotify Uses AI to Copy Host Voices for Podcast Translations
By Paula Parisi, September 26, 2023
Amazon has entered into a strategic investment in San Francisco-based Anthropic, founded by former members of OpenAI. The AI startup will use AWS Trainium and Inferentia chips to train and deploy future foundation models, with AWS as its primary cloud provider. In turn, Amazon says it will invest up to $4 billion in Anthropic, as it strives to compete with other technology firms in the race to develop generative AI, seeding growth for what is shaping up to be an entirely new economic and social landscape. Continue reading Amazon Plans to Invest Up to $4 Billion in AI Startup Anthropic
By Paula Parisi, September 25, 2023
During its Surface and AI event in New York City on Thursday, Microsoft introduced a pair of new Surface laptops and an array of generative AI upgrades to Bing Chat, Windows Copilot and more. Taking center stage in hardware was the company’s more powerful Surface Laptop Studio 2 and the ultra-portable Surface Laptop Go 3. Also unveiled was the Surface Go 4 for Business, the latest miniature version of its Surface Pro tablet, and the company’s large touchscreen Surface Hub, designed for office use. Beginning this month, Microsoft rolls out Copilot — “your everyday AI companion” — in a free Windows 11 update, followed by Bing, Edge, and Microsoft 365 this fall. Continue reading Microsoft Unveils Next-Gen Surface Devices, New AI Features