OpenAI is developing a new artificial intelligence assistant known as "Operator," which can automate complex tasks such as coding, travel booking, and online shopping. This product is anticipated to be released in January 2025, initially as a research preview and development tool, with API access for developers.
The new AI assistant is part of OpenAI's ongoing efforts to create intelligent agents capable of performing tasks in web browsers. These AI agents are designed to perceive their environment, make decisions, and execute actions, aiming to offer personalized applications for consumers and cost-effective solutions for businesses.
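To illustrate the pattern rather than any specific product, the sketch below shows a bare-bones perceive-decide-act loop. All class and function names here are hypothetical placeholders, not OpenAI's implementation; a real browser agent would replace the `decide` step with calls to a language model and the `act` callback with actual browser automation.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """A snapshot of the browser state the agent can perceive."""
    url: str
    page_text: str

@dataclass
class Action:
    """A single step the agent decides to execute, e.g. a click or navigation."""
    kind: str          # "click", "type", "navigate", or "done"
    target: str = ""
    value: str = ""

def decide(observation: Observation, goal: str) -> Action:
    """Placeholder policy: a real agent would call a language model here."""
    if goal.lower() in observation.page_text.lower():
        return Action(kind="done")
    return Action(kind="navigate", target="https://example.com/search?q=" + goal)

def run_agent(goal: str, perceive, act, max_steps: int = 10) -> None:
    """Perceive -> decide -> act loop until the goal is reached or steps run out."""
    for _ in range(max_steps):
        observation = perceive()          # perceive: read the current page
        action = decide(observation, goal)  # decide: pick the next step
        if action.kind == "done":
            return
        act(action)                       # act: execute the step in the browser
```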
OpenAI's CEO, Sam Altman, has expressed optimism about AI agents, signaling that he sees them as the next major breakthrough for the field. OpenAI's Chief Product Officer, Kevin Weil, expects 2025 to be the year AI agents become mainstream. With increasing pressure on OpenAI to commercialize its technology, the release of an innovative product such as Operator is seen as essential.
OpenAI has already released Swarm, an experimental open-source framework for coordinating multiple AI agents and handing tasks off between them. This development is a step towards achieving Artificial General Intelligence (AGI), with AI assistants potentially revolutionizing the mobile internet and app ecosystems.
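A minimal sketch of how Swarm is used, based on the quickstart in the project's README (exact details may have changed since release): agents are plain objects with instructions, and a function that returns another agent hands the conversation off to it. The travel-booking scenario below is an illustrative example, not from OpenAI's documentation.

```python
# pip install git+https://github.com/openai/swarm.git  (requires an OpenAI API key)
from swarm import Swarm, Agent

client = Swarm()

def transfer_to_travel_agent():
    """Handoff function: returning another Agent transfers the conversation to it."""
    return travel_agent

triage_agent = Agent(
    name="Triage Agent",
    instructions="Route the user to the right specialist.",
    functions=[transfer_to_travel_agent],
)

travel_agent = Agent(
    name="Travel Agent",
    instructions="Help the user book flights and hotels.",
)

response = client.run(
    agent=triage_agent,
    messages=[{"role": "user", "content": "I need to book a flight to Tokyo."}],
)
print(response.messages[-1]["content"])
```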
Major tech companies worldwide are also racing to introduce AI assistant products. Microsoft has open-sourced OmniParser, a screen-parsing tool for GUI agents, integrated AI agents into Dynamics 365, and introduced an agent evaluation framework called Windows Agent Arena. Google's "Project Jarvis" is set to preview soon, focusing on automating tasks like research and online shopping. Anthropic has added a "computer use" capability to its Claude models, enabling the AI to control a computer, while Apple is integrating ChatGPT into Siri for enhanced interactions.
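For a sense of what Claude's computer-use capability looks like to developers, the sketch below follows the beta Anthropic documented at launch; model names, beta flags, and parameters may have changed since, so treat it as an assumption-laden illustration rather than current reference material.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[
        {
            "type": "computer_20241022",   # built-in tool for screen, mouse, and keyboard
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
        }
    ],
    messages=[{"role": "user", "content": "Open the browser and search for flights to Tokyo."}],
    betas=["computer-use-2024-10-22"],
)

# The model responds with tool_use blocks (screenshots to take, clicks to perform);
# the calling application is responsible for actually executing them on the machine.
for block in response.content:
    print(block)
```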
Chinese companies are also in the AI assistant race. Huawei has unveiled a new smartphone control architecture, and AI startup Zhipu AI launched AutoGLM, which can perform various mobile tasks via voice commands, promising convenience and efficiency for consumers.
AutoGLM and similar technologies offer a new interaction model in which complex, multi-step operations can be triggered by a single voice command, providing substantial convenience to users. This capability is expected to become a highlight feature of AI-powered devices and a driver of consumer upgrades.