On the same day Anthropic told Pro account holders they could no longer run OpenClaw, they shipped Ultraplan. The timing is not accidental.
OpenClaw used Anthropic's API inside agentic loops to automate coding, email, and browser tasks. Anthropic's new policy classified these workloads as heavy infrastructure strain, and told users to switch to pay-as-you-go or use API keys. The same day, they released Ultraplan: first-party agentic planning built directly into Claude Code.
The pattern matches every major platform company that banned third-party clients once its first-party tools caught up. The difference here is that the timeline is compressed into a single 24-hour window.
To understand why this matters, you need to understand what OpenClaw was doing. It was not a wrapper that called Claude once per task. It ran Claude inside loops: an outer agent that planned, an inner agent that executed, tool calls that fed results back into the context, and another loop that evaluated outputs and decided whether to retry.
The whole system could make dozens of calls to complete a single engineering task. Pro and Max plans include a set number of messages per month. Agentic loops burned through that budget in hours, not days. Anthropic's infrastructure was absorbing the cost of multi-turn agent loops priced as if they were single-turn conversations.
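OpenClaw's internals are not public, but the loop shape described above is easy to sketch. Everything below (the `Step` type, `runAgentLoop`, the scripted model) is hypothetical, shown only to make the call-count economics concrete:

```typescript
// Minimal sketch of an agentic loop: the model either requests a tool run
// (result fed back into context) or declares the task done. Each iteration
// is one billed API call.

type Step =
  | { kind: "tool"; name: string; args: string } // model wants a tool run
  | { kind: "done"; output: string };            // model considers the task finished

// Stand-in for a real API client; a real loop would call Anthropic's API here.
type Model = (context: string[]) => Step;

function runAgentLoop(task: string, model: Model, maxCalls = 30): { output: string; calls: number } {
  const context: string[] = [`TASK: ${task}`];
  let calls = 0;
  while (calls < maxCalls) {
    const step = model(context); // one billed API call
    calls++;
    if (step.kind === "done") {
      // an outer evaluator could inspect step.output and retry; omitted here
      return { output: step.output, calls };
    }
    // feed the tool result back into the context and loop again
    context.push(`ran ${step.name}(${step.args})`);
  }
  return { output: "gave up", calls };
}

// Demo: a scripted "model" that needs five tool calls before finishing.
function scriptedModel(n: number): Model {
  let remaining = n;
  return () =>
    remaining-- > 0
      ? { kind: "tool", name: "grep", args: "TODO" }
      : { kind: "done", output: "patch ready" };
}
```

With the scripted model, one task costs six API calls; a real run with retries and an evaluator loop multiplies that further, which is exactly the budget problem described above.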
Ultraplan does not have this problem because Anthropic controls both the pricing and the infrastructure. When Ultraplan runs Opus 4.6 in the cloud for 30 minutes across a multi-agent loop, the cost is charged against a different budget entirely. That budget is Anthropic's, calibrated to their cloud costs, not to message-count pricing designed for conversational use. This is the structural reason first-party tools can do things third-party tools cannot: the economics are different at the infrastructure layer.
What Ultraplan Actually Does
Most coverage frames it as a smarter planner. That misses the structural shift. Ultraplan is a workflow handoff: your local command line initiates the task, and the planning runs remotely on Anthropic's Cloud Container Runtime with Opus 4.6.
While the cloud session works, your terminal stays available for other tasks. This is the actual product change. Not smarter output, but a different place for the thinking to happen.
The cloud session runs for up to 30 minutes. Your command line polls for status every 3 seconds. When the plan is ready, you open a browser, leave inline comments on specific passages, use emoji reactions to flag sections, and choose where execution happens: cloud (auto PR) or back to your terminal.
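The client side of that handoff reduces to a polling loop. A sketch of it, where the status strings, the `fetchStatus` callback, and the injectable `sleep` are illustrative assumptions rather than Anthropic's actual client API:

```typescript
// Poll the cloud session every 3 s until the plan is ready or the
// 30-minute cloud budget elapses. Status values are assumed, not documented.

type Status = "running" | "ultraplan ready" | "failed";

async function pollForPlan(
  fetchStatus: () => Promise<Status>,
  intervalMs = 3_000,
  timeoutMs = 30 * 60_000,
  // sleep is injectable so tests can skip real waiting
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<Status> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const status = await fetchStatus();
    if (status !== "running") return status; // ready or failed: stop polling
    await sleep(intervalMs);
  }
  return "failed"; // cloud session exceeded its window
}
```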
The Browser Review Is the Real Product
When the status indicator in your terminal changes to "ultraplan ready," you open the session link in a browser. What you see is not a text dump. It is a structured review interface built specifically for plan iteration before any code changes.
You can highlight any passage in the plan and leave an inline comment for Claude to address. You can drop an emoji reaction on a section to signal approval or flag a concern without writing a full response. An outline sidebar lets you jump between sections of a long plan without scrolling. This matters on migration plans that span 40+ files: you can go straight to the "risk section" and annotate it without reading the whole document top to bottom.
When you ask Claude to address your comments, it revises the plan and presents an updated draft. You can iterate as many rounds as needed before choosing execution. This is the workflow gap that terminal-based planning never solved: targeted feedback on specific sections rather than rejecting or accepting the whole plan.
Once you approve, two execution paths appear. "Approve and start coding in your browser" hands off to the same cloud session, which implements the plan and opens a pull request. "Approve and teleport back to terminal" sends the plan to your waiting local session, where you choose: implement in the current conversation, start a fresh session with only the plan as context, or store the plan to a file and return to it later.
[Diagram: Ultraplan: From Terminal to Cloud to Code. Local command line runs /ultraplan → cloud runtime + Opus 4.6 (up to 30 min) → browser review (comment, approve). Three plan variants, A/B assigned per the leaked source: simple_plan (no subagents, direct file exploration), diagram_plan (adds Mermaid/text-based diagrams of data flow), multi-agent (3 explorer agents + 1 critic that synthesizes). On approval, choose where execution happens. Source: code.claude.com/docs + leaked npm source, March 31 + April 2026]
Here Is the Part Nobody Is Talking About
On March 31, 2026, a packaging error pushed 512,000+ lines of Claude Code's TypeScript source to npm. Inside were the Ultraplan system prompts. They reveal that Ultraplan is not one planner but at least three variants, assigned through A/B testing.
Variant 1 (simple_plan): No subagents. Claude uses Glob, Grep, and Read to explore your codebase directly, then calls ExitPlanMode. This is regular plan mode running on cloud hardware.
Variant 2 (diagram_plan): Same as simple_plan with an added instruction to generate Mermaid or text-based diagrams showing dependency sequence, data flow, and change shape.
Variant 3 (multi-agent): Three parallel explorer agents each independently approach the problem in separate context windows. A dedicated critic agent receives all three outputs, evaluates them against the original task, and synthesizes the final plan. The critic cannot see the explorers working, which prevents self-consistency bias.
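The leaked prompts suggest one entry point dispatching to three prompt configurations. Here is a sketch of the deterministic A/B bucketing such a rollout typically uses; the hash scheme and all names are assumptions, not the actual source:

```typescript
// One planner entry point, three prompt configurations, chosen per session.

type Variant = "simple_plan" | "diagram_plan" | "multi_agent";

const VARIANTS: Variant[] = ["simple_plan", "diagram_plan", "multi_agent"];

// Deterministic bucket: the same session id always lands in the same arm,
// which keeps the A/B comparison stable across reruns.
function assignVariant(sessionId: string): Variant {
  let h = 0;
  for (const ch of sessionId) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return VARIANTS[h % VARIANTS.length];
}
```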
Why the Multi-Agent Variant Works Differently
To understand why the multi-agent variant produces different output than simple_plan, you need to understand a core limitation of single-agent planning. When a language model generates tokens sequentially, each token conditions the next. The first paragraph of a plan shapes everything that follows.
If the model opens with "migrate sessions to JWTs by adding middleware," that framing anchors all subsequent reasoning. Alternative approaches get filtered out, not because they are wrong, but because they conflict with the trajectory already in progress. This is the self-consistency problem: a single agent reasoning through a plan tends to converge on its first instinct.
The multi-agent variant breaks this by running three explorer agents in parallel, each in a completely separate context window with no knowledge of what the others are producing. One explorer might prioritize performance. Another might prioritize rollback safety. A third might identify a dependency the first two missed entirely.
Because they share no intermediate state, they can arrive at genuinely different conclusions. This is not redundancy. It is a structured form of independent verification, running in parallel so the cloud session does not take three times as long.
The critic agent then receives all three outputs alongside the original task specification. Its job is not to pick the best plan outright. It evaluates which explorer's approach best addresses the stated requirements, identifies strong elements from the others worth incorporating, and flags contradictions where the explorers disagree. The synthesis is grounded in the original spec, not in which explorer happened to be most confident in tone.
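A minimal sketch of that explorer/critic flow, with the critic reduced to a scoring function for brevity (the real critic is a fourth model call that also merges strong elements and flags contradictions); every name here is hypothetical:

```typescript
// Explorers draft plans independently; the critic sees all drafts side by
// side and judges them against the original task, not against each other.

type Explorer = (task: string) => Promise<string>;

async function multiAgentPlan(
  task: string,
  explorers: Explorer[],
  // critic stand-in: score how well a draft covers the spec's requirements
  score: (task: string, draft: string) => number,
): Promise<{ best: string; drafts: string[] }> {
  // explorers share no intermediate state: each call sees only the task
  const drafts = await Promise.all(explorers.map((e) => e(task)));
  let best = drafts[0];
  for (const d of drafts) if (score(task, d) > score(task, best)) best = d;
  return { best, drafts };
}
```

The key structural property survives even in this toy form: generation is parallel and isolated, and evaluation is grounded in the task specification rather than in any single draft's framing.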
This pattern appears in research under the name Mixture of Agents. Work from Together AI showed that aggregating outputs from multiple independent model runs outperforms single-model runs on reasoning benchmarks. Ultraplan applies the same principle inside a single model family by using independent sampling to create diversity. Separating generation from evaluation produces better outputs than having one agent do both.
In practice, the multi-agent variant is most valuable for tasks with high blast radius: migrations touching authentication, database schema changes, or anything where an incomplete plan causes cascading failures across services. For a simple feature addition to a single service, simple_plan is faster and produces comparable quality. The three-explorer overhead only pays off when the cost of a wrong plan is high enough to justify it.
[Diagram: Ultraplan Architecture. Local session → Cloud Runtime · Opus 4.6 · up to 30 min, running one of three variants: simple_plan, diagram_plan (+ text diagrams), or multi-agent (3 explorers + critic; each explorer runs in a separate context window with no shared state, and the critic evaluates all three and synthesizes) → browser review → execution in the cloud or back in the terminal. Source: code.claude.com/docs + leaked npm source, April 2026]
Ultraplan may matter more as planning infrastructure than as a fixed planner. Anthropic is using the cloud review loop to test and refine planning strategies across millions of runs. The slash command is the product. The A/B data is the real asset.
How to Run It
You need Claude Code v2.1.91 or later, Claude Code on the Web enabled, and a GitHub repository connected. Ultraplan does not run on Bedrock, Vertex, or Foundry.
Three trigger paths: type /ultraplan [task] directly. Or include the word "ultraplan" anywhere in a prompt and Claude detects it. Or, after a local plan finishes and the approval dialog appears, choose "No, refine with Ultraplan." That third path is the strongest: the cloud session starts with your local plan as context rather than from scratch.
One constraint: if Remote Control is active, it disconnects when Ultraplan starts. Both use the claude.ai/code interface. One can run at a time.
Getting Good Results
Lean on that third trigger path: run a local plan first, let it finish, then choose "No, refine with Ultraplan" at the approval dialog so the cloud session inherits your local plan as starting context.
You get the speed and review surface of Ultraplan without losing the codebase familiarity your local session already built up. This is the path Anthropic recommends for complex migrations.
Use the outline sidebar before leaving any comments. On large plans, it is easy to annotate a detail in section 3 when the real issue is in section 7. Read the structure first, then target your feedback. Claude addresses comments in the order they appear, so sequencing matters.
If you want to monitor the cloud session from your terminal while it runs, type /tasks and select the ultraplan entry. You get the session link, live agent activity, and a Stop action. You do not need to keep the browser open while the plan is being drafted.