Anthropic and OpenAI Report Findings of Joint AI Safety Tests

OpenAI and Anthropic, rivals in the AI space who guard their proprietary systems, joined forces for a misalignment evaluation, safety testing each other’s models to identify when and how they fall short of human values. Among the findings: reasoning models, including Anthropic’s Claude Opus 4 and Sonnet 4 and OpenAI’s o3 and o4-mini, resisted jailbreaks, while conversational models like GPT-4.1 were susceptible to prompts or techniques intended to bypass safety protocols. Although the results were unveiled as users complain that chatbots have become overly sycophantic, the tests were “primarily interested in understanding model propensities for harmful action,” per OpenAI.

Open-Weight Models Are a First from OpenAI in AWS Catalog

OpenAI is releasing two lower-cost, open-weight reasoning models in an effort to be more competitive with Meta, Mistral and DeepSeek. They will also be the first OpenAI models available from Amazon. The new offerings, gpt-oss-120b and gpt-oss-20b, will be among the model choices on AWS’s Bedrock and SageMaker AI services. Both models are said to be well-suited for agentic use. The gpt-oss-120b model performs comparably to OpenAI o4-mini on core reasoning and can run on a single 80GB GPU. The gpt-oss-20b model is described as comparable to OpenAI o3-mini and can run on edge devices with just 16GB of memory.

OpenAI Introduces New Models That Can Reason with Images

OpenAI has released two new AI models that use images as part of their reasoning process, an approach the company calls “thinking with images.” OpenAI o3 and o4-mini “are the smartest models we’ve released to date, representing a step change in ChatGPT’s capabilities for everyone from curious users to advanced researchers,” the company says. The new entries in the “o” series also have agentic capabilities and can independently “use and combine every tool within ChatGPT, including searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images.”