AI News of the Week (6th March, 2026)
TLDR: OpenAI ships GPT-5.3 Instant, launches the Codex App on Windows, releases GPT-5.4 with native computer use, and previews Codex Security. Anthropic introduces Claude Marketplace and faces a Pentagon supply chain risk designation. ChatGPT integrates with Excel and Google Sheets.
OpenAI Ships GPT-5.3 Instant: Less Preachy, More Accurate, New Default for All Users
On March 3, OpenAI released GPT-5.3 Instant as the new default model for all ChatGPT users. The company acknowledged that GPT-5.2 Instant could feel "cringe": overbearing and prone to a performative empathy that frustrated users. The update addresses three problems: conversational tone, web search quality, and hallucination rates. The model cuts unnecessary refusals and moralising preambles, and its reworked web search surfaces the most relevant information rather than loose lists of links. Hallucinations dropped 26.8% in high-stakes domains and 19.7% on internal knowledge versus GPT-5.2. Microsoft added the model to M365 Copilot the same day. Within an hour of launch, OpenAI posted "5.4 sooner than you think" on X.
OpenAI Launches Codex App on Windows and Opens Access to Free Users
On March 4, OpenAI completed Codex's cross-platform rollout with a Windows app, and at the same time opened access to Free and Go users and doubled rate limits for paid tiers. The app acts as a command centre for running multiple AI coding agents in parallel across long-running workflows of up to 30 minutes. A configurable system-level sandbox is built in; by default it limits agents to file edits and cached web search, and elevated commands require explicit permission. OpenAI also launched Automations, cloud-based triggers that keep agents running even when the local machine is off. Since GPT-5.2-Codex shipped in December, usage has doubled, and more than one million developers now use Codex weekly.
OpenAI Launches GPT-5.4: First General-Purpose Model with Native Computer Use
On March 5, OpenAI released GPT-5.4 in three tiers: Thinking (Plus, Team, Pro), Pro (Pro and Enterprise), and a standard API version with a 1M-token context window. The defining new capability is native computer use — GPT-5.4 can operate desktop apps, browsers, and spreadsheets via keyboard and mouse in a virtual environment, scoring 75.0% on OSWorld-Verified, above the 72.4% human baseline. A new Tool Search feature cuts token usage by 47% on tool-heavy tasks. The model is 33% less likely to make false factual claims than GPT-5.2, and on GDPval — OpenAI's knowledge-work benchmark across 44 occupations — matches or exceeds industry professionals in 83% of comparisons.
OpenAI Launches Codex Security: An Application Security Agent That Validates Its Own Findings
On March 6, OpenAI launched Codex Security in research preview for Enterprise, Business, and Edu customers, competing directly with Anthropic's Claude Code Security, which launched two weeks earlier. Evolved from an internal project called Aardvark, it builds deep project context, identifies complex vulnerabilities, then pressure-tests them in sandboxed environments and generates proof-of-concept exploits before surfacing a finding, an approach designed to eliminate false positives. In 30 days of beta testing across 1.2 million commits, it found 792 critical and 10,561 high-severity issues. OpenAI also committed $10M in API credits for cyber defence research. Shares of CrowdStrike and Zscaler both fell on the news.
Anthropic Launches Claude Marketplace: One Enterprise Commitment, Multiple Claude-Powered Tools
Anthropic launched the Claude Marketplace in limited preview, letting enterprise customers apply existing Anthropic spend toward third-party Claude-powered solutions under a single contract. Launch partners include GitLab, Harvey, Lovable, Replit, Rogo, and Snowflake. The launch is paired with new department-specific plugins for HR and investment banking, connectors to Google Workspace, DocuSign, and LegalZoom, and a self-serve Enterprise purchasing option bundling Claude, Claude Code, and Cowork. The pressure on SaaS incumbents continues to mount: since Anthropic's Cowork launch in January, shares of ServiceNow are down 23%, Salesforce 22%, Snowflake 20%, Intuit 33%, and Thomson Reuters 31%.
Pentagon Formally Designates Anthropic a Supply Chain Risk — First Time for a US Company
On March 5, the U.S. Department of War designated Anthropic a national security supply chain risk, making it the first American company ever to receive a classification previously reserved for foreign adversaries like Huawei. The dispute centres on Anthropic's refusal to grant unrestricted "all lawful purposes" access to Claude; the company insists on prohibitions against mass domestic surveillance and autonomous lethal weapons. Anthropic says the designation is legally unsound and will be challenged in court, and confirmed that its scope covers only direct DoW contracts. Despite the designation, Claude was reportedly in use during U.S. military strikes on Iran. Meanwhile, Claude's consumer downloads surged, with more than one million new signups a day making it the top AI app on the App Store in 20+ countries.
ChatGPT Launches Native Excel and Google Sheets Integrations for Financial Analysis
Alongside the GPT-5.4 launch on March 5, OpenAI announced ChatGPT for Excel and Google Sheets, embedding GPT-5.4 directly into spreadsheet cells and workflows. Users can generate summaries, run formulas, build financial models, and execute multi-step data tasks without leaving their spreadsheet. A new financial data feed integration also allows real-time and historical market data to flow directly into ChatGPT workflows within Excel. The release deepens OpenAI's push into professional knowledge work — territory Anthropic's Cowork has been targeting since January — and has been widely characterised as a further escalation of the "SaaSpocalypse" narrative around AI displacing point-solution enterprise software.