OpenAI has upgraded the model behind its autonomous web browsing agent, Operator, from the GPT-4o multimodal LLM to the new reasoning model OpenAI o3. The update is being released globally in research preview this month for those who subscribe to OpenAI’s ChatGPT Pro for $200 per month. Operator is powered by OpenAI’s “computer-using agent” (CUA), a model trained to interact with graphical interfaces, and it uses the web to perform tasks for people. “Using its own browser, it can look at a webpage, and interact with it much like a human would by typing, clicking, scrolling and more,” OpenAI explains.
“The idea is to go beyond the chatbot interface of ChatGPT and allow OpenAI’s powerful AI models to start taking more actions on behalf of the user,” VentureBeat writes of Operator, which debuted in January.
The update to o3 aims to improve Operator’s performance across “several key dimensions,” including “persistence and accuracy during browser interactions,” VB reports. In practical terms, that means “it is more likely to complete user tasks successfully and with less need for correction or repetition,” offering “responses that are clearer, more structured, and more comprehensive.” The API version of the model will, however, remain based on GPT-4o.
“Compared with other models in the o3 family, o3 Operator was fine-tuned with additional safety data for computer use, including safety datasets designed to teach the model our decision boundaries on confirmations and refusals,” OpenAI says in an update advisory, noting that while “o3 Operator inherits o3’s coding capabilities, it does not have native access to a coding environment or Terminal.”
“The o3 model is supposed to be smarter in a more intellectual way than GPT-4o. It can be more focused and is better at step-by-step thinking,” writes TechRadar. Its enhanced persistence “will help it work through the unexpected obstacles of web browsing, like login requests, pop-ups, and CAPTCHA requests.”
“The upgraded Operator stands to significantly enhance the workflows of professionals in AI engineering, orchestration, data management, and IT security,” according to VB. “For those building or maintaining machine learning models, the model’s improved accuracy and structured outputs reduce the overhead of test validation and troubleshooting.”
Data engineers will find they can “delegate manual web interactions — such as data verification and scraping — with more confidence, freeing time for higher-level optimization work,” VB adds, providing benchmark data indicating the new version may improve performance by close to 20 percentage points.