OpenAI has launched Operator, a largely autonomous AI agent designed to take your simple text prompts and turn them into real-world tasks completed via the internet. In theory, you can ask it to do almost anything that's possible via a web browser. In practice, early users seem to be finding the results rather hit and miss.
Operator can, for example, book travel, make restaurant reservations for a certain time, or buy concert tickets for a specific band within a given price range.
Operator is currently a research preview available only to ChatGPT Pro subscribers rather than a fully baked product. It is based on OpenAI's Computer-Using Agent (CUA) model, which combines GPT-4o's computer vision capabilities with training on graphical user interfaces (GUIs) and advanced reasoning to create a tool capable of browsing the web, formulating multi-step tasks from a text prompt and executing the whole shebang.
Arguably, Operator is not unique, what with ByteDance's UI-TARS and Anthropic’s Computer Use having a somewhat similar remit. But perhaps what makes Operator a little different is that it doesn't need APIs.
"Operator can 'see' (through screenshots) and 'interact' (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations," OpenAI says.
That said, it does seem like it helps if web services are optimized for Operator. "We're collaborating with companies like DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, Uber, and others to ensure Operator addresses real-world needs while respecting established norms," OpenAI says.
Presumably, your results—or should that be Operator's results?—won't be as accurate with non-optimized services.
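To picture the screenshot-and-input loop OpenAI describes, here is a minimal Python sketch. It is purely illustrative and not OpenAI's implementation: `FakeBrowser`, `Action`, `choose_action` and `run_agent` are all hypothetical names, the "screenshot" is a plain dictionary rather than pixels, and a hard-coded rule stands in for the CUA model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # name of the form field or button
    text: str = ""     # text to enter when kind == "type"


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser session."""
    fields: dict = field(default_factory=dict)
    submitted: bool = False

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this dict describes what is "visible".
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        # Simulate keyboard and mouse input changing the page state.
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def choose_action(goal: dict, observation: dict) -> Action:
    """Toy policy standing in for the model: fill missing fields, then submit."""
    for name, value in goal.items():
        if observation["fields"].get(name) != value:
            return Action("type", target=name, text=value)
    if not observation["submitted"]:
        return Action("click", target="submit")
    return Action("done")


def run_agent(goal: dict, browser: FakeBrowser, max_steps: int = 10) -> list:
    """Observe, decide, act; repeat until done or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        observation = browser.screenshot()
        action = choose_action(goal, observation)
        if action.kind == "done":
            break
        browser.apply(action)
        history.append(action)
    return history


if __name__ == "__main__":
    browser = FakeBrowser()
    steps = run_agent({"party_size": "2", "time": "19:30"}, browser)
    print([step.kind for step in steps], browser.submitted)
```

The point of the sketch is that nothing in the loop calls a site-specific API: the agent only looks at the page and sends input, which is the distinction OpenAI draws.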
Exactly how good Operator currently is at taking a prompt and running with it isn't
Read more on pcgamer.com