By Paula Parisi, February 6, 2025
Anthropic has created a method to defend AI models against “jailbreaks,” unauthorized workarounds that get an AI model to do things it was trained not to do, like providing instructions for building chemical weapons. Called Constitutional Classifiers, the system was 95 percent effective in identifying and preventing jailbreaks of Anthropic’s Claude 3.5 Sonnet in a test environment. In an effort to drum up real-world red-teaming, the company offered cash prizes of up to $15,000 to anyone who could jailbreak its Sonnet AI model. After some 3,000 hours of attempts by 185 participants, none claimed an award. Now the company is offering additional incentives.
By Paula Parisi, February 5, 2025
Cloudflare is making it easier to assess the authenticity of online images by adopting the Content Credentials system advanced by Adobe and embraced by many others. Images hosted using Cloudflare now integrate Content Credentials, ensuring metadata remains intact. The platform tracks ownership and subsequent modifications, including whether artificial intelligence was used to edit the images. With touchpoints to an estimated 20 percent of Internet traffic, connectivity firm Cloudflare substantively expands the reach of the Content Authenticity Initiative (CAI), founded in 2019.
By Paula Parisi, February 5, 2025
Google is pushing back against malware and backdoor infections by adding VPN app verification to the Google Play Store, including a badge for trusted downloads. Google has indicated that simply selecting reputable brand-name VPNs (virtual private networks) is no longer an effective way of avoiding trouble, as nefarious actors have found ways to infect legitimate VPN apps with malware. Last month, the Google Managed Defense team warned that malware known as Playfulghost had reportedly infected some popular VPNs, using them to inject malware and remotely control infected devices.
By Paula Parisi, February 5, 2025
Most people know Hugging Face as a resource-sharing community, but it also builds open-source applications and tools for machine learning. Its recent release of vision-language models small enough to run on smartphones while outperforming competitors that rely on massive data centers is being hailed as “a remarkable breakthrough in AI.” The new models, SmolVLM-256M and SmolVLM-500M, are optimized for “constrained devices” with less than about 1GB of RAM, making them well suited to mobile devices, including laptops, and convenient for those interested in processing large amounts of data cheaply and with a low energy footprint.
By Paula Parisi, February 4, 2025
ChatGPT has a new “deep research” agent that OpenAI says uses reasoning to synthesize large amounts of online information and complete multi-step research tasks. “It accomplishes in tens of minutes what would take a human many hours,” OpenAI suggests, claiming it will “synthesize hundreds of online sources to create a comprehensive report at the level of a research analyst.” Powered by a version of the upcoming OpenAI o3 model optimized for web browsing and data analysis, the company says the deep research agent will typically take 5 to 30 minutes to complete its work. The agent is described as an ideal research tool for areas such as finance, science and engineering.