Microsoft adds Agent Mode and Office Agent to Microsoft 365 Copilot

Microsoft adds Agent Mode and Office Agent to Microsoft 365 Copilot; OpenAI and Anthropic models generate, evaluate and refine Word and Excel outputs.

Manisha Sharma

Microsoft has expanded Microsoft 365 Copilot with two new capabilities: Agent Mode in Word and Excel, and a new Office Agent that creates Word and PowerPoint artefacts from natural-language prompts. Both tools aim to let users generate documents and spreadsheets using conversational instructions rather than manual authoring.

“In the same way vibe coding has transformed software development, the latest reasoning models in Copilot unlock agentic productivity for Office artefacts,” Sumit Chauhan, Corporate Vice President, Microsoft's Office Product Group, wrote in a blog post.

Microsoft Agent Mode benchmark

Microsoft says Agent Mode not only generates outputs but will “evaluate results, fix issues, and repeat the process until the outcome is verified.” Agent Mode runs on OpenAI's models, and Microsoft frames it as “democratising access to expert-level capabilities.”

Microsoft shared example use cases for Agent Mode in Word and Excel (with PowerPoint access coming soon), including financial analysis, loan calculations, updating monthly reports, document clean-up and personal budgeting. The company also published results from the SpreadsheetBench benchmark showing Agent Mode in Excel at 57.2 per cent accuracy. That is lower than the reported human accuracy of 71.3 per cent, but Microsoft presents it as higher than the scores of Shortcut.ai, ChatGPT, ChatGPT agent, and Claude Files Opus 4.1.

Microsoft's New Office Agent

The release also introduces Office Agent, a tool that “creates polished PowerPoint presentations and ready-to-use Word documents from chat in Copilot, and coming soon to Excel,” Chauhan says. Microsoft says Office Agent can interpret prompt specifics such as file length, visual theme, key focus areas and target audience; it can conduct web research to gather relevant information and provide a live preview of slides.

Office Agent is described as powered by Anthropic’s models, a model family that Microsoft has incorporated into recent Copilot features alongside OpenAI models.

What Agent Mode and Office Agent do for users

  • Let users author and refine documents and spreadsheets using plain language prompts.

  • Provide iterative evaluation and automated fixes until the agent “verifies” an outcome.

  • Offer task templates and examples spanning reporting, analysis and content clean-up.

  • Surface live previews for slide decks and perform background research where requested.

These capabilities are framed as ways to speed routine tasks and reduce manual effort for common Office workflows, while Microsoft positions the features as part of a broader move to embed reasoning-capable models into productivity tools.

Technical and market implications

Accuracy and human oversight: The SpreadsheetBench figure (57.2%) implies a substantial gap to human performance (71.3%). That gap underscores the need for human-in-the-loop verification, especially for financial analysis, legal language or other high-stakes documents where errors have real consequences.
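To make the size of that gap concrete, the short calculation below works only from the two percentages quoted in this article. The variable names are illustrative, and treating the complement of accuracy as an "error rate" is an assumption made for this sketch rather than a metric Microsoft reported.

```python
# Figures quoted in the article (SpreadsheetBench, as reported by Microsoft).
agent_accuracy = 57.2  # Agent Mode in Excel, per cent
human_accuracy = 71.3  # reported human accuracy, per cent

# Absolute gap in percentage points.
gap_points = round(human_accuracy - agent_accuracy, 1)

# Agent accuracy expressed as a share of human accuracy.
share_of_human = round(agent_accuracy / human_accuracy * 100, 1)

# Assumption: error rate = 100 - accuracy. Ratio of agent errors to human errors.
error_ratio = round((100 - agent_accuracy) / (100 - human_accuracy), 2)

print(f"Gap: {gap_points} percentage points")                       # Gap: 14.1 percentage points
print(f"Agent reaches {share_of_human}% of human accuracy")         # 80.2%
print(f"Agent error rate is {error_ratio}x the human error rate")   # 1.49x
```

Read this way, the agent gets roughly four in five of the tasks a human would get right, and fails about one and a half times as often, which is why review remains necessary for high-stakes output.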

Agentic loops and verification: The stated design — generate, evaluate, fix, repeat — creates an automated loop that can improve outputs but depends on the reliability of evaluation metrics. If evaluation is imperfect, the loop may reinforce errors. Tooling to surface provenance, editable audit trails and explicit confidence signals will be important for enterprise adoption.
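The loop described above can be sketched in a few lines. This is a minimal illustration, not Microsoft's implementation: `agentic_loop`, `LoopResult` and the toy `generate`, `evaluate` and `fix` functions are all hypothetical names invented here, and the "draft" is a tiny stand-in for a spreadsheet.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Draft = Dict[str, Any]  # hypothetical stand-in for a generated spreadsheet


@dataclass
class LoopResult:
    output: Draft
    verified: bool
    history: List[str] = field(default_factory=list)  # simple audit trail


def agentic_loop(
    generate: Callable[[], Draft],
    evaluate: Callable[[Draft], List[str]],
    fix: Callable[[Draft, List[str]], Draft],
    max_rounds: int = 3,
) -> LoopResult:
    """Generate, evaluate, fix, repeat, with a bounded number of rounds."""
    draft = generate()
    history: List[str] = []
    for round_no in range(1, max_rounds + 1):
        issues = evaluate(draft)
        history.append(f"round {round_no}: issues={issues}")
        if not issues:
            return LoopResult(draft, True, history)
        if round_no < max_rounds:
            draft = fix(draft, issues)
    # Budget exhausted: hand back an unverified draft for human review.
    return LoopResult(draft, False, history)


# Toy components (all hypothetical).
def generate() -> Draft:
    return {"items": [100, 250], "total": 300}  # deliberately wrong total


def evaluate(draft: Draft) -> List[str]:
    if draft["total"] != sum(draft["items"]):
        return ["total does not match sum of items"]
    return []


def fix(draft: Draft, issues: List[str]) -> Draft:
    return {**draft, "total": sum(draft["items"])}


def lenient_evaluate(draft: Draft) -> List[str]:
    return []  # an imperfect evaluator that never finds a problem


good = agentic_loop(generate, evaluate, fix)
bad = agentic_loop(generate, lenient_evaluate, fix)
print(good.verified, good.output["total"])  # True 350
print(bad.verified, bad.output["total"])    # True 300
```

The second call shows the risk in miniature: with an evaluator that misses the error, the loop marks a wrong total as verified. The `history` list is a simple stand-in for the audit trail that enterprise users would want to inspect.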

Multi-vendor model stack: Microsoft’s use of OpenAI models for Agent Mode and Anthropic for Office Agent highlights a multi-vendor approach. That diversification can reduce dependence on a single supplier but raises product complexity: model selection, routing logic, latency, cost and consistent behaviour across models become engineering and product-design challenges.
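As a rough illustration of what routing logic means in a multi-vendor stack, the sketch below maps each feature to a preferred vendor and falls back when that vendor is unavailable. The table mirrors only the pairing reported in this article; the function name, the fallback rule and the vendor labels are assumptions for illustration, and nothing here reflects how Copilot actually routes requests.

```python
from typing import Dict, Set

# Hypothetical routing table based on the pairing described in the article.
ROUTES: Dict[str, str] = {
    "agent_mode": "openai",
    "office_agent": "anthropic",
}


def route(feature: str, available: Set[str]) -> str:
    """Return the vendor to use for a feature, given which vendors are reachable."""
    preferred = ROUTES.get(feature)
    if preferred is None:
        raise KeyError(f"no route configured for feature {feature!r}")
    if preferred in available:
        return preferred
    # Assumed fallback rule: pick any other available vendor, in a stable order.
    fallbacks = sorted(available - {preferred})
    if not fallbacks:
        raise RuntimeError("no model vendor available")
    return fallbacks[0]


print(route("agent_mode", {"openai", "anthropic"}))  # openai
print(route("office_agent", {"openai"}))             # openai (fallback)
```

Even this toy version surfaces the design question raised above: once a fallback swaps the underlying model, the same prompt may behave differently, so consistency has to be tested per vendor.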

Benchmarks and benchmarking context: Published benchmark numbers provide an early yardstick, but practical value will depend on how well agents perform on domain-specific tasks and how organisations validate outputs. Buyers will want independent, repeatable test cases relevant to their workflows.
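A repeatable in-house check can be as simple as a fixed list of prompts with expected answers, scored the same way on every run. The harness below is a sketch under stated assumptions: `accuracy`, `stub_agent` and the test cases are hypothetical, and a real evaluation would call the actual product and use far richer comparisons than exact string matching.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, str]  # (prompt, expected answer)


def accuracy(agent: Callable[[str], str], cases: List[Case]) -> float:
    """Percentage of cases where the agent's answer matches exactly."""
    if not cases:
        raise ValueError("at least one test case is required")
    passed = sum(1 for prompt, expected in cases if agent(prompt).strip() == expected)
    return 100 * passed / len(cases)


# Hypothetical workflow-specific cases an organisation might maintain.
cases: List[Case] = [
    ("Sum of 100 and 250", "350"),
    ("Sum of 10, 20 and 30", "60"),
    ("Average of 4 and 8", "6"),
    ("Difference between 90 and 45", "45"),
]

# Stub standing in for the agent under test; one answer is deliberately wrong.
CANNED = {
    "Sum of 100 and 250": "350",
    "Sum of 10, 20 and 30": "60",
    "Average of 4 and 8": "12",
    "Difference between 90 and 45": "45",
}


def stub_agent(prompt: str) -> str:
    return CANNED.get(prompt, "")


score = accuracy(stub_agent, cases)
print(f"{score:.1f}% of cases passed")  # 75.0% of cases passed
```

Keeping the case list under version control lets a buyer rerun the same check after each product update and compare scores over time.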

Enterprise controls and compliance: Office workflows often include sensitive data and regulatory constraints. Features that perform web research or store intermediate results will need clear controls for data handling, privacy, and record-keeping. Enterprises will evaluate Copilot extensions on the basis of auditability, data residency and integration with existing document governance.

Productivity vs. risk trade-offs: Agentic productivity promises to reduce repetitive effort, but it shifts responsibility onto model behaviour and verification processes. Organisations that deploy these tools at scale will need policies for review, escalation and change management to avoid downstream risks from incorrect or biased outputs.

Agent Mode and Office Agent mark an extension of Microsoft’s strategy to embed reasoning models into mainstream productivity apps. The launch brings concrete capabilities — natural-language generation, iterative evaluation and model-driven slide and document creation — into Word, Excel and PowerPoint workflows. Early benchmark data and Microsoft’s multi-model approach point to both opportunity and caution: the features can speed routine tasks but will require verification, governance and integration work before they replace established human workflows in sensitive or mission-critical contexts.