Anthropic and OpenAI Report Findings of Joint AI Safety Tests

OpenAI and Anthropic, rivals in the AI space who guard their proprietary systems, joined forces for a misalignment evaluation, safety testing each other’s models to identify when and how they fall short of human values. Among the findings: reasoning models, including Anthropic’s Claude Opus 4 and Sonnet 4 and OpenAI’s o3 and o4-mini, resisted jailbreaks, while conversational models like GPT-4.1 were susceptible to prompts or techniques intended to bypass safety protocols. Although the test results were unveiled as users complain that chatbots have become overly sycophantic, the tests were “primarily interested in understanding model propensities for harmful action,” per OpenAI.

Alibaba Is Rolling Out Its ‘Most Agentic Code Model to Date’

Alibaba’s Qwen team has launched Qwen3-Coder, which it calls its “most agentic code model to date.” While it will be made available in multiple sizes, the most powerful variant, Qwen3-Coder-480B-A35B-Instruct, is being released first. The 480-billion-parameter mixture-of-experts model has 35 billion active parameters and supports a context length of 256,000 tokens natively and 1 million tokens with extrapolation methods for “exceptional performance in both coding and agentic tasks,” explains the group, which claims the quasi-open source model has agentic coding, agentic browser use, and agentic tool use comparable to Anthropic’s proprietary Claude Sonnet 4.

Anthropic Touts Mobile Voice Mode, Free Search for Claude

Anthropic’s new mobile conversation voice mode for its large language model Claude lets it search Google Docs, Drive, Calendar and more on smartphones. Just a week after debuting two new LLMs, Claude Opus 4 and Sonnet 4, Anthropic announced the mobile updates for its Claude AI chatbot for iOS and Android and said it is extending web search to all users on free Claude plans. While Claude’s conversational voice interface is currently available only in English and only via mobile, an API for desktop use and browser-based support are part of future plans. Amazon and Google both have investment stakes in San Francisco-based Anthropic.