DeepSeek Debuts Its V3.2 Reasoning Model in Two Versions

DeepSeek-V3.2 is now in release, integrating thinking directly into tool-use for the first time and improving on its predecessor, DeepSeek-V3.2 Experimental. The model supports tool-use in both thinking and non-thinking modes. China-based DeepSeek began disrupting the U.S. AI market in January with the debut of free foundation models that rival those from Google and OpenAI. The company released internal benchmark scores indicating its new model can compete with OpenAI’s GPT-5 in reasoning benchmarks and agentic tasks. A variation, DeepSeek-V3.2-Speciale, has been released for specialized math and is said to perform comparably to Google’s Gemini 3 Pro.

Anthropic and OpenAI Report Findings of Joint AI Safety Tests

OpenAI and Anthropic, rivals in the AI space who guard their proprietary systems, joined forces for a misalignment evaluation, safety testing each other’s models to identify when and how they fall short of human values. Among the findings: reasoning models including Anthropic’s Claude Opus 4 and Sonnet 4, and OpenAI’s o3 and o4-mini resisted jailbreaks, while conversational models like GPT-4.1 were susceptible to prompts or techniques intended to bypass safety protocols. Although the test results were unveiled as users complain chatbots have become overly sycophantic, the tests were “primarily interested in understanding model propensities for harmful action,” per OpenAI.

Grok 4 Offered Free in xAI Move on ChatGPT-5 Market Share

Elon Musk’s xAI has made Grok 4 available on its free tiers as it seeks to take advantage of initial user dissatisfaction with OpenAI’s new GPT-5. The company has positioned Grok as freewheeling and uncensored, a contrast to GPT-5, which has been criticized on Reddit and other social platforms as a “corporate beige zombie” with too many guardrails. After its February debut, Grok 3 was reined in with checks including removal of its native image generator in March. Grok 4 was released in July with integrated image and video features as well as a “Spicy” mode for creating risqué content.

OpenAI Announces Launch of GPT-5 Model Across All Tiers

OpenAI is rolling out a new foundation model, GPT-5, via API for developers and enterprise users in three branded sizes (gpt-5, gpt-5-mini and gpt-5-nano) “to give developers more flexibility to trade off performance, cost, and latency.” The company said Thursday that it is also making GPT-5 available to all ChatGPT Plus, Pro, Team and Free tier users. Enterprise and Education tier users are promised access this week. While GPT-5 in the API platform is the reasoning model that powers maximum performance in ChatGPT, “GPT-5 in ChatGPT is a system of reasoning, non-reasoning, and router models,” OpenAI explains.

Anthropic Seeks to Raise $5 Billion, Debuts Claude Opus 4.1

Anthropic has released Claude Opus 4.1, an upgrade to Opus 4 that reportedly improves on agentic tasks, computer coding and reasoning. Pricing has not increased from what customers were paying for Opus 4, and the company promises “substantially larger improvements to our models in the coming weeks.” The move comes as Anthropic nears a new funding round targeting $3 to $5 billion, which could place a valuation of up to $170 billion on the startup. Recurring revenue hit $5 billion as of late July, which could increase to $9 billion by the end of the year. Claude Opus 4.1 was released two days before OpenAI unleashed GPT-5, and performs comparably in coding benchmarks.

DeepSeek’s New Update Heightens Rivalry with U.S. AI Firms

DeepSeek-R1-0528 is here, and this latest iteration is generating almost as much stir as the initial open-source R1 reasoning model did in January. The model from the Chinese startup, which is owned by quantitative analysis firm High-Flyer Capital, is described by one media outlet as reaching “near parity in reasoning capabilities with proprietary paid models such as OpenAI’s o3 and Google Gemini 2.5 Pro.” Promised are stronger capabilities in complex reasoning centered on math, science, business and coding, along with improved features for developers and researchers. As with the earlier release, DeepSeek-R1-0528 is available under the MIT License, which supports commercial use and allows customization.

New Reasoning Model Improves Smarts of OpenAI Operator

OpenAI has upgraded its autonomous web browsing agent Operator to the new reasoning model OpenAI o3 from the prior GPT-4o multimodal LLM engine. The update is being released globally in research preview this month for those who subscribe to OpenAI’s ChatGPT Pro for $200 per month. Operator is powered by OpenAI’s “computer-using agent” (CUA), a model trained to interact with graphical interfaces, and uses the Web to perform tasks for people. “Using its own browser, it can look at a webpage, and interact with it much like a human would by typing, clicking, scrolling and more,” OpenAI explains.

Alibaba Touts Advance in Open-Source AI with Qwen3 Series

China’s Alibaba Group has released a Qwen3 LLM series said to be at the leading edge of open-source models, nearly achieving the performance of proprietary models from AI competitors OpenAI and Google. Alibaba says Qwen3 offers improvements in reasoning, tool use, instruction following and multilingual abilities. The Qwen3 series features eight new models: two that are mixture-of-experts and six built on dense neural networks. Their sizes range from 600 million to 235 billion parameters. The size and scope of the Alibaba slate maintain China’s accelerated AI pace in the wake of DeepSeek’s game-changing debut.

OpenAI Introduces New Models That Can Reason with Images

OpenAI has released two new AI models that use images as part of their reasoning process, “thinking with images.” OpenAI o3 and o4-mini “are the smartest models we’ve released to date, representing a step change in ChatGPT’s capabilities for everyone from curious users to advanced researchers,” the company says. The new entries in the “o” series also have agentic capabilities and can independently “use and combine every tool within ChatGPT, including searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images.”

Researchers Debut Preview of DeepCoder Reasoning Model

A new open-source code reasoning model called DeepCoder-14B-Preview has hit the market. Built atop DeepSeek-R1 and Qwen2.5 using reinforcement learning (RL), it aims to provide more flexibility by combining high-performance code generation with reasoning capabilities for real-world applications. Its performance is said to be comparable to OpenAI’s o3-mini, “but with a smaller footprint,” say its developers, the research-driven AI companies Together AI and Agentica. “We democratize the recipe for training a small model into a strong competitive coder,” explains Together AI.

Google Debuts Next-Gen Reasoning Models with Gemini 2.5

Google has released what it calls its most intelligent AI model yet, Gemini 2.5. The first 2.5 model release, an experimental version of Gemini 2.5 Pro, is a next-gen reasoning model that Google says outperformed OpenAI o3-mini and Claude 3.7 Sonnet from Anthropic on common benchmarks “by meaningful margins.” Gemini 2.5 models “are thinking models, capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy,” according to Google. The new model comes just three months after Google released Gemini 2.0 with reasoning and agentic capabilities.

Real-Time Web Access Informs Claude 3.7 Sonnet Responses

Anthropic’s Claude can now search the Internet in real time, allowing it to provide timely and relevant responses that are also more accurate than what the chatbot previously offered, according to the company. Claude incorporates direct citations for its Web-retrieved material, so users can fact-check its sources. “Instead of finding search results yourself, Claude processes and delivers relevant sources in a conversational format,” the company explains. While this is not exactly groundbreaking (ChatGPT, Grok 3, Copilot, Perplexity and Gemini all have real-time Web retrieval and most include citations), Claude takes a slightly different approach.

Baidu Releases New LLMs that Undercut Competition’s Price

Baidu has launched two new AI systems, the native multimodal foundation model Ernie 4.5 and deep-thinking reasoning model Ernie X1. The latter supports features like generative imaging, advanced search and webpage content comprehension. Baidu is touting Ernie X1 as comparable in performance to another Chinese model, DeepSeek-R1, but at half the price. Both Baidu models are available to the public, including individual users, through the Ernie website. Baidu, the dominant search engine in China, says its new models mark a milestone in both reasoning and multimodal AI, “offering advanced capabilities at a more accessible price point.”

Foxconn AI Trained in Four Weeks, Suggesting Industry Shift

Taiwan’s Foxconn, the contract manufacturer that assembles Apple’s iPhones, has built its own AI. The company says the large language model, called FoxBrain, was trained in just four weeks with help from Nvidia, using 120 of that company’s H100 chips. FoxBrain has reasoning and mathematical skills and can analyze data and generate code. Foxconn says it intends to open source the model, which was initially built for in-house use, and hopes it will become a collaborative tool for its partners and enable advancements in manufacturing techniques and supply-chain management.

Alibaba Says Qwen Reasoning Model on Par with DeepSeek

Alibaba is making AI news again, releasing another Qwen reasoning model, QwQ-32B, which was trained and scaled using reinforcement learning (RL). The Qwen team says RL “has the potential to enhance model performance beyond conventional pretraining and post-training methods.” QwQ-32B, a 32 billion parameter model, “achieves performance comparable to DeepSeek-R1, which boasts 671 billion parameters (with 37 billion activated),” Alibaba claims. While parameters refer to the total set of adjustable weights and biases in the model’s neural network, “activated” parameters are a subset used for a specific inference task, like generating a response.
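
To put the total-versus-activated distinction in numbers, here is a minimal Python sketch. The three parameter counts come from the figures quoted above. The `top_k_experts` function and its scores are hypothetical, included only to illustrate how a router might select a subset of a model's components for one input; they are not DeepSeek's or Alibaba's actual routing logic.

```python
# Parameter counts quoted in the item above, in billions.
TOTAL_PARAMS_B = 671      # DeepSeek-R1 total parameters
ACTIVATED_PARAMS_B = 37   # DeepSeek-R1 parameters activated for an inference task
QWQ_PARAMS_B = 32         # QwQ-32B parameters


def fraction(part: float, whole: float) -> float:
    """Return part / whole as a plain ratio."""
    return part / whole


def top_k_experts(scores: list[float], k: int) -> list[int]:
    """Toy router (an assumption for illustration): return the indices
    of the k highest-scoring experts, in ascending index order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


activated_share = fraction(ACTIVATED_PARAMS_B, TOTAL_PARAMS_B)
size_ratio = fraction(QWQ_PARAMS_B, TOTAL_PARAMS_B)

print(f"Activated share of DeepSeek-R1: {activated_share:.1%}")
print(f"QwQ-32B size relative to DeepSeek-R1 total: {size_ratio:.1%}")

# Hypothetical router scores for 8 experts; only 2 are selected for this input.
selected = top_k_experts([0.1, 0.7, 0.05, 0.3, 0.9, 0.2, 0.15, 0.4], k=2)
print(f"Experts selected: {selected}")
```

By these figures, roughly 5.5% of DeepSeek-R1's parameters are activated for a given task, and QwQ-32B's full parameter count is about 4.8% of DeepSeek-R1's total.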