OpenAI Operator Agent Available to ChatGPT Pro Subscribers

OpenAI has launched Operator, a semi-autonomous AI agent that uses a proprietary web browser to execute tasks like planning a vacation using Tripadvisor or booking restaurant reservations through OpenTable. “It can look at a webpage and interact with it by typing, clicking and scrolling,” explains OpenAI. Operator is powered by a new model called Computer-Using Agent (CUA), and is available in research preview to ChatGPT Pro subscribers in the U.S. Combining GPT-4o’s computer vision capabilities with advanced reasoning, CUA is trained to interact with graphical user interfaces (GUIs) — parsing menus, clicking buttons and reading screen text.
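OpenAI has not published Operator’s internals beyond the description above, but the perceive-reason-act loop it describes (look at the page, then type, click or scroll) can be sketched generically. The helper names below (capture_screenshot, propose_action, execute_action) are hypothetical stubs for illustration, not OpenAI’s Operator or CUA API:

```python
# Illustrative sketch of a generic computer-using agent loop.
# The helpers here are hypothetical stand-ins, not OpenAI APIs.
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # e.g. "click", "type", "scroll", "done"
    x: int = 0
    y: int = 0
    text: str = ""


def capture_screenshot() -> bytes:
    """Hypothetical: grab the current browser viewport as an image."""
    raise NotImplementedError


def propose_action(screenshot: bytes, goal: str) -> Action:
    """Hypothetical: a vision-language model maps pixels + goal to one GUI action."""
    raise NotImplementedError


def execute_action(action: Action) -> None:
    """Hypothetical: dispatch the click/type/scroll back to the browser."""
    raise NotImplementedError


def run_agent(goal: str, max_steps: int = 25) -> None:
    """Observe the screen, let the model pick an action, apply it, repeat."""
    for _ in range(max_steps):
        action = propose_action(capture_screenshot(), goal)
        if action.kind == "done":
            return
        execute_action(action)


# run_agent("Book a table for two at 7pm via OpenTable")
```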

Nvidia Targets Consumers with $249 Compact Supercomputer

Nvidia is hoping interest in artificial intelligence will translate to consumer sales of a relatively low-priced computer optimized for basic AI functionality. Last month, the company upgraded its Jetson line with a $249 “compact AI supercomputer,” the Jetson Orin Nano Super Developer Kit. At half the price of the original, the model aims to attract students, developers, hobbyists, small- and medium-sized businesses, and anyone who is AI-curious. “As the AI world is moving from task-specific models into foundation models, it provides an accessible platform to transform ideas into reality,” according to Nvidia.

OpenAI Previews Two New Reasoning Models: o3 and o3-Mini

OpenAI has unveiled a new frontier model, OpenAI o3, which it claims can “reason” through challenges involving math, science and computer programming. Available to safety and research testers, it is expected to be available to individuals and businesses this year. OpenAI o3 is said to be over 20 percent more efficient at common programming tasks than its predecessor OpenAI o1, and it beat a company scientist on a programming test. Model o3 is part of a broader effort to create AI systems that can reason through complex problems. In late December, Google debuted a similar offering, the experimental Gemini 2.0 Flash Thinking mode.

CES: Google TV Integrates Gemini AI for a Conversational Feel

Google TV is incorporating Gemini AI to make it easier to converse with a voice assistant and to generate helpful onscreen information. New Google TV devices will also feature an upgraded, Gemini-powered voice experience capable of handling more complex voice commands. “You and your family will be able to gather together and have a natural conversation with your TV,” Google announced at CES 2025, where it shared a preview of the new capabilities. The Gemini model also lets Google TV users create customized artwork, control smart home devices and get an overview of the day’s news.

CES: AI Pioneer Yann LeCun on AI Agents, Human Intelligence

During CES 2025 in Las Vegas this week, Meta Vice President and Chief AI Scientist Yann LeCun had a compelling conversation with Wing Venture Capital Head of Research Rajeev Chand on the latest hot-button topics in the rapidly evolving field of artificial intelligence. Among the takeaways: AI agents will become ubiquitous, but not for 10 to 15 years; human intelligence means different things to different AI experts; and nuclear power remains the best and safest source for powering AI. And for those looking for more of LeCun’s tweets, he said he no longer posts on X.

Microsoft AI Forecast Includes $80B in Data Center Spending

Microsoft anticipates spending $80 billion to construct AI data centers in fiscal 2025, which ends in June. More than half of that investment will fund U.S. infrastructure, according to company Vice Chair and President Brad Smith. The move aims to keep Microsoft, which owns a stake in OpenAI, a leader in artificial intelligence and to bolster the nation’s position in the global AI race, which Smith says the U.S. currently leads, “thanks to the investment of private capital and innovations by American companies of all sizes, from dynamic startups to well-established enterprises.”

Veo 2 Is Unveiled Weeks After Google Debuted Veo in Preview

Attempting to stay ahead of OpenAI in the generative video race, Google announced Veo 2, which it says can output 4K clips of more than two minutes at 4096 x 2160 pixels. Competitor Sora can generate video of up to 20 seconds at 1080p. However, TechCrunch says Veo 2’s supremacy is “theoretical” since it is currently available only through Google Labs’ experimental VideoFX platform, which is limited to videos of up to eight seconds at 720p. VideoFX is also waitlisted, but Google says it will expand access this week (with no comment on expanding the cap).

Ray-Ban Meta Gets Live AI, RT Language Translation, Shazam

In time for the holidays, Meta has added new features to its Ray-Ban Meta smart glasses via a firmware update that makes them “the gift that keeps on giving,” per Meta marketing. “Live AI” adds computer vision, letting Meta AI see and record what you see “and converse with you more naturally than ever before.” Along with Live AI, Live Translation is available for Meta Early Access members. Translation of Spanish, French or Italian will pipe through as English (or vice versa) in real time as audio in the glasses’ open-ear speakers. In addition, Shazam support has been added for users interested in easily identifying songs.

Twelve Labs Creating AI That Can Search and Analyze Video

Twelve Labs has raised $30 million in funding for its efforts to train video-analyzing models. The San Francisco-based company has received strategic investments from notable enterprise infrastructure providers Databricks and SK Telecom as well as Snowflake Ventures and HubSpot Ventures. Twelve Labs targets customers who work with video across a variety of fields, including media and entertainment, professional sports leagues, content creators and business users. The funding coincides with the release of Twelve Labs’ new video foundation model, Marengo 2.7, which applies a multi-vector approach to video understanding.
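The announcement does not detail Marengo 2.7’s architecture. As a rough illustration of what a multi-vector approach to retrieval can mean, the sketch below represents each clip by several embeddings (stand-ins for, say, visual, motion and speech signals) and scores a query against all of them rather than against a single pooled vector. All names, vectors and dimensions here are illustrative, not Twelve Labs’ implementation:

```python
# Minimal sketch of multi-vector retrieval over a toy index (illustrative only).
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))


def score_clip(query_vec: np.ndarray, clip_vecs: list[np.ndarray]) -> float:
    """Score a query against every vector of a clip and keep the best match."""
    return max(cosine(query_vec, v) for v in clip_vecs)


# Toy index: two clips, each represented by three 4-d embeddings.
rng = np.random.default_rng(0)
index = {
    "clip_a": [rng.standard_normal(4) for _ in range(3)],
    "clip_b": [rng.standard_normal(4) for _ in range(3)],
}
query = rng.standard_normal(4)

ranked = sorted(index, key=lambda c: score_clip(query, index[c]), reverse=True)
print(ranked)  # clips ordered by best per-vector similarity to the query
```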

Pika 2.0 Video Generator Adds Character Integration, Objects

Pika Labs has updated its generative video model to Pika 2.0, adding more user control and customizability, the company says. Improvements include better “text alignment,” making it easier to have the AI follow through with intricate prompts. Enhanced motion rendering is said to deliver more “naturalistic movement” and better physics, including greater believability in transformations that tend toward the surreal, which has typically been a challenge for genAI tools. The biggest change may be “Scene Ingredients,” which lets users add their own images when building Pika-generated videos.

Grok-2 Chatbot Is Now Available Free to All Users of X Social

Elon Musk’s xAI has been rolling out an updated Grok-2 model that is now available free to all users of the X social platform. Prior to last week, the “unfiltered” chatbot, which debuted in November 2023, was available only by paid subscription. Now Grok is coming to X’s masses, but those on the free tier can ask the chatbot only 10 questions every two hours, while Premium and Premium+ users will “get higher usage limits and will be the first to access any new capabilities.” There is also now a Grok button featured on X that aims to encourage exploration.

Google Releases Gemini 2.0 in Shift Toward Agentic Era of AI

Google has introduced Gemini 2.0, the latest version of its multimodal AI model, signaling a shift toward what the company is calling “the agentic era.” The upgraded model promises not only to outperform previous iterations on standard benchmarks but also introduces more proactive, or agentic, functions. The company announced that “Project Astra,” its experimental assistant, would receive updates that allow it to use Google Search, Lens, and Maps, and that “Project Mariner,” a Chrome extension, would enable Gemini 2.0 to navigate a user’s web browser to complete tasks autonomously.

OpenAI Releases Sora, Adding It to ChatGPT Plus, Pro Plans

Ten months after its preview, OpenAI has officially released its Sora video model in a new version called Sora Turbo. Described as “hyperrealistic,” Sora Turbo generates clips of 10 to 20 seconds from text or image inputs. It outputs video in widescreen, vertical or square aspect ratios at resolutions from 480p to 1080p. The new product is being made available to ChatGPT Plus and Pro subscribers ($20 and $200 per month, respectively) but is not yet included with ChatGPT Team, Enterprise, or Edu plans, or available to minors. The company explains that Sora videos contain C2PA metadata indicating that they were generated by AI.

Meta’s Llama 3.3 Delivers More Processing for Less Compute

Meta Platforms has packed more artificial intelligence into a smaller package with Llama 3.3, which the company released last week. The open-source large language model (LLM) “improves core performance at a significantly lower cost, making it even more accessible to the entire open-source community,” Meta VP of Generative AI Ahmad Al-Dahle wrote on X. The 70-billion-parameter text-only Llama 3.3 is said to perform on par with the 405-billion-parameter model that was part of Meta’s Llama 3.1 release in July, with less computing power required, significantly lowering its operational costs.
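As a minimal sketch of trying the released weights, the snippet below loads the instruction-tuned checkpoint with Hugging Face Transformers. It assumes approved access to the gated meta-llama/Llama-3.3-70B-Instruct repository and enough GPU memory (in practice, multiple GPUs or a quantized build); it is not Meta’s official serving setup:

```python
# Minimal sketch: chatting with Llama 3.3 70B Instruct via Hugging Face Transformers.
# Assumes gated-access approval for the checkpoint and sufficient GPU memory.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.3-70B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard the 70B weights across available GPUs
)

messages = [
    {"role": "user", "content": "In one sentence, what does Llama 3.3 claim to improve?"}
]
out = generator(messages, max_new_tokens=80)
print(out[0]["generated_text"][-1]["content"])  # assistant reply appended to the chat
```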

Perplexity Expands Publisher Program Despite AI Controversy

Perplexity is expanding the publisher program associated with its AI-powered search engine. Added to the list of participants who will share ad revenue and access performance data are Adweek, LA Times, Mexico News Daily, The Independent, Germany’s Stern, the World Encyclopedia and about 10 other media brands. They join existing partners including Time, Fortune and Der Spiegel. Emphasizing its ongoing investment in publishers, Perplexity named Jessica Chan, formerly with LinkedIn and its content partner program, as head of publisher partnerships. News of Perplexity’s content deals appears to be generating mixed feelings in newsrooms.