Cost demo
A central orchestrator over lane orchestrators over bounded workers, with the durable state on disk so any piece respawns from a few KB.
The operating layer is Cursor. The lower cost of a long run is what this structure buys.
The problem
A monolithic agent keeps its plan, its state and its work in a single growing thread, so no part can be split off, replaced or audited on its own.
That thread re-sends its whole context every turn, so the cumulative input tokens, which dominate the bill, grow faster than linearly as the run goes on.
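The growth can be made concrete with a small arithmetic sketch. It assumes, purely for illustration, that every turn adds a fixed number of tokens to the thread and that nothing is cached or truncated. The function name and the numbers are hypothetical, not taken from the run.

```python
def cumulative_input_tokens(turns: int, tokens_added_per_turn: int) -> int:
    """Total input tokens sent when each turn re-sends the whole thread.

    Illustrative model only: constant growth per turn, no caching,
    no truncation.
    """
    total = 0
    context = 0
    for _ in range(turns):
        context += tokens_added_per_turn  # the thread grows by one turn
        total += context                  # the whole thread is sent again
    return total


short_run = cumulative_input_tokens(10, 1_000)
long_run = cumulative_input_tokens(100, 1_000)
growth_factor = long_run / short_run
```

Under these assumptions the total is tokens_added_per_turn * n * (n + 1) / 2 for n turns, so ten times as many turns costs far more than ten times the input tokens.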
The shape
Put the durable state on disk so the thread becomes disposable, then split the work into lanes that a tiered orchestration routes and reconciles.
The architecture
Orchestration is tiered, so no single agent carries the whole program.
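A minimal sketch of the three tiers, assuming a central orchestrator that routes tasks to lanes and a lane orchestrator that hands each task to its own worker. All class, method, lane and task names here are hypothetical and stand in for whatever the real program uses.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Bounded worker: sees exactly one task and nothing else."""
    task: str

    def run(self) -> str:
        return f"done:{self.task}"


@dataclass
class LaneOrchestrator:
    """Owns one lane and spawns a fresh worker per task."""
    name: str
    tasks: list[str] = field(default_factory=list)

    def run(self) -> list[str]:
        return [Worker(task).run() for task in self.tasks]


@dataclass
class CentralOrchestrator:
    """Routes tasks to lanes and reconciles the lane results."""
    lanes: dict[str, LaneOrchestrator] = field(default_factory=dict)

    def route(self, lane: str, task: str) -> None:
        self.lanes.setdefault(lane, LaneOrchestrator(lane)).tasks.append(task)

    def reconcile(self) -> dict[str, list[str]]:
        return {name: lane.run() for name, lane in self.lanes.items()}


central = CentralOrchestrator()
central.route("api", "add-endpoint")
central.route("api", "write-tests")
central.route("docs", "update-readme")
results = central.reconcile()
```

The point of the shape is that the central tier only holds lane names and results, while task detail stays inside the worker that handles it.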
State on disk
The durable truth lives in a .cca folder with one lane per work area, so the folder is the source of truth and the chat is disposable.
Replace a worker
Because the worker's state is on disk, a fresh one reads a few KB and resumes, so it carries no stale or cross-task context and pays no replay tax.
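As a sketch of what replacing a worker could look like, the example below writes a small checkpoint and has a fresh worker resume from it. The .cca folder comes from the text above. The per-lane state.json file, its fields and the lane name are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_checkpoint(root: Path, lane: str, state: dict) -> Path:
    """Persist a worker's state under .cca/<lane>/state.json (assumed layout)."""
    lane_dir = root / ".cca" / lane
    lane_dir.mkdir(parents=True, exist_ok=True)
    path = lane_dir / "state.json"
    path.write_text(json.dumps(state, indent=2))
    return path


def resume_worker(root: Path, lane: str) -> dict:
    """A fresh worker reads only its lane's checkpoint, not any chat history."""
    return json.loads((root / ".cca" / lane / "state.json").read_text())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    checkpoint = {"task": "T-12", "done": ["parse"], "next": "validate"}
    saved = save_checkpoint(root, "billing", checkpoint)
    size_bytes = saved.stat().st_size
    resumed = resume_worker(root, "billing")
```

The resumed worker starts from the recorded next step and carries nothing from the chat that produced the checkpoint.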
Replace the orchestrator
O0 runs in its own chat and reconciles from the .cca workspace every turn, so it holds nothing irreplaceable and a fresh O0 picks up where it left off.
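The same idea applies one tier up. In the sketch below, a hypothetical hydrate function rebuilds the orchestrator's view by scanning every lane folder, so two separate instances arrive at the same picture. The state.json file name and the status field are assumptions, not the real workspace format.

```python
import json
import tempfile
from pathlib import Path


def hydrate(workspace: Path) -> dict[str, dict]:
    """Rebuild the orchestrator's view from disk, one entry per lane folder."""
    view = {}
    for state_file in sorted(workspace.glob("*/state.json")):
        view[state_file.parent.name] = json.loads(state_file.read_text())
    return view


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp) / ".cca"
    for lane, status in [("api", "in_progress"), ("docs", "done")]:
        (workspace / lane).mkdir(parents=True)
        (workspace / lane / "state.json").write_text(json.dumps({"status": status}))

    old_o0_view = hydrate(workspace)
    fresh_o0_view = hydrate(workspace)  # stands in for a newly started O0
```

Because the view is derived entirely from the folder, discarding the orchestrator's chat loses nothing that the next hydration cannot recover.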
Why cycling is safe
On a real program (verified)
The lane, task-home and checkpoint counts are read straight from the central step log. The cycled-chat count is an estimate.
What it buys (estimate)
70 to 85 percent fewer input tokens on the metered surface.
An estimate from one real run. The structural cause is verified: durable state on disk, bounded per-task context and an orchestrator cycled in one hydration turn. Dollar figures are illustrative, not measured.
Two levers (estimates labeled)
| Lever | What changes | Estimated effect |
|---|---|---|
| External orchestrator | Planning runs on a flat-fee chat, off the metered in-editor turns, cycled in one hydration turn. | token-heavy reasoning leaves the metered surface |
| Bounded workers | Each worker is scoped to one task home, near 10 to 25K tokens, where one monolithic thread grows to 80 to 150K. | about 70 to 80 percent fewer input tokens per task |
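The per-task figure in the table can be sanity-checked against the context sizes it quotes. The sketch below assumes a single input pass per task and ignores caching. The function name is hypothetical and the midpoints are only a convenience for comparison.

```python
def reduction(bounded_tokens: int, monolithic_tokens: int) -> float:
    """Fraction of input tokens saved by the bounded context."""
    return 1 - bounded_tokens / monolithic_tokens


# Bounds quoted in the table: 10 to 25K bounded, 80 to 150K monolithic.
least_favorable = reduction(25_000, 80_000)   # largest worker, smallest thread
most_favorable = reduction(10_000, 150_000)   # smallest worker, largest thread
midpoint = reduction(17_500, 115_000)         # midpoints of both ranges
```

Under these assumptions the saving ranges from roughly 69 percent to roughly 93 percent, so the table's estimate sits toward the conservative end of what the quoted sizes allow.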
Caching lowers the absolute dollars and the structural win holds either way.
Read more
The method, the worked program and the reading version of this demo.