Cascade is a CLI orchestration system that decomposes any task into a three-tier hierarchy — Administrator → Manager → Worker — routing work across the best available models automatically.
Every prompt flows through a hierarchy. T1 plans, spawns T2 managers in parallel, each of which orchestrates T3 workers — all streaming back into a single final answer.
Analyzes complexity · Selects models · Decomposes into sections · Compiles final output
Cascade classifies your prompt before dispatching — simple questions go direct to a T3 worker, complex implementations spin up a full hierarchy.
| Classification | Example | Route | T2 Managers |
|---|---|---|---|
| Simple | "What is a closure?" | T3 | — |
| Moderate | "Add pagination to the users API" | T2 → T3 ×2 | 1 |
| Complex | "Refactor auth module to JWT, add tests, open PR" | T1 → T2 ×3 → T3 ×n | 3–5 |
| Highly Complex | "Research, benchmark, and document the full auth ecosystem" | T1 → T2 ×5+ → T3 ×n | 5+ |
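The routing in the table above can be pictured as a classification-to-topology map. The sketch below is purely illustrative — the `Complexity` type, `Route` shape, and lookup table are hypothetical names, not Cascade's internals:

```typescript
// Hypothetical sketch of complexity-based routing (not Cascade's real code).
type Complexity = 'simple' | 'moderate' | 'complex' | 'highly-complex';

interface Route {
  tiers: string[];    // which tiers participate, top-down
  t2Managers: number; // how many T2 managers to spawn (0 = none)
}

// Mirrors the table: simple prompts skip straight to a T3 worker,
// harder ones stand up progressively more of the hierarchy.
const routes: Record<Complexity, Route> = {
  'simple':         { tiers: ['T3'],             t2Managers: 0 },
  'moderate':       { tiers: ['T2', 'T3'],       t2Managers: 1 },
  'complex':        { tiers: ['T1', 'T2', 'T3'], t2Managers: 3 },
  'highly-complex': { tiers: ['T1', 'T2', 'T3'], t2Managers: 5 },
};

function route(c: Complexity): Route {
  return routes[c];
}
```

So `route('simple').tiers` is just `['T3']`, while `route('complex')` stands up all three tiers with multiple managers.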
Works with every major model provider
No plugin store to browse. The tools your agents need are already wired in.
Watch the T1→T2→T3 hierarchy execute in real time directly in the terminal via Ink rendering.
Dangerous tool calls escalate through T2 → T1 → user before executing. Never a silent file delete.
Rate-limit hit? Cascade auto-switches providers with exponential backoff. Zero config required.
Shell, file CRUD, git, GitHub/GitLab PRs, Playwright browser automation, PDF creation, code interpreter.
React + ReactFlow live topology graph, session browser, cost tracker, JWT auth, WebSocket updates.
Connect any Model Context Protocol server. Its tools become available to every T3 worker automatically.
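The provider-failover behavior described above follows a familiar pattern: rotate to the next provider on failure and back off exponentially between attempts. The sketch below is a generic illustration of that pattern under assumed names (`Provider`, `callWithFailover`, the delay schedule) — not Cascade's actual implementation:

```typescript
// Illustrative provider failover with exponential backoff (not Cascade's real code).
type Provider = (prompt: string) => Promise<string>;

async function callWithFailover(
  providers: Provider[],
  prompt: string,
  maxAttempts = 4,
  baseDelayMs = 250,
): Promise<string> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Rotate through providers so a rate-limited one is skipped on the next try.
    const provider = providers[attempt % providers.length];
    try {
      return await provider(prompt);
    } catch (err) {
      lastError = err;
      // Exponential backoff: 250 ms, 500 ms, 1000 ms, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A rate-limited first provider simply costs one backoff delay before the second provider answers.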
Every result exposes costByTier, tokensByTier, and percentage attribution. Set a live session budget with /budget set 0.50 — Cascade warns you at 80% spend (configurable via warnAtPct) and stops new tasks the moment the cap is hit, with no config-file edits required.
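The budget check reduces to two thresholds: warn at warnAtPct of the cap, hard-stop at the cap itself. A minimal sketch using the numbers from the example above (`/budget set 0.50`, warnAtPct of 80); the `budgetStatus` function is illustrative, not Cascade's implementation:

```typescript
// Sketch of the session budget check (illustrative, not Cascade's source).
function budgetStatus(
  spentUsd: number,
  capUsd: number,
  warnAtPct = 80,
): 'ok' | 'warn' | 'stop' {
  if (spentUsd >= capUsd) return 'stop';                       // no new tasks past the cap
  if (spentUsd >= capUsd * (warnAtPct / 100)) return 'warn';   // proactive warning
  return 'ok';
}

console.log(budgetStatus(0.30, 0.50)); // ok
console.log(budgetStatus(0.42, 0.50)); // warn — past $0.40 (80% of $0.50)
console.log(budgetStatus(0.50, 0.50)); // stop
```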
First-run TUI collects API keys for every provider — including multiple Azure deployments and custom OpenAI-compatible endpoints. It fetches live model lists, then lets you assign T1/T2/T3 models yourself or hand the choice to Cascade Auto.
Redesigned terminal UI with a top status bar showing live tier models and cost, a compact agent tree for T1→T2→T3 progress, and a keyboard hint bar — all purpose-built for Cascade's multi-tier hierarchy.
Run /model inside the REPL for a three-step picker — provider → tier → model — with Auto at every step. Arrow keys, Tab, j/k and number keys all work; selections write .cascade/config.json and hot-swap the live router, no restart required.
Pass an AbortSignal to cascade.run() to stop any in-progress run mid-flight. All active tiers (T1 → T2 → T3) halt at the next safe checkpoint before the next LLM call — no mid-stream interruptions, no orphaned agents. A run:cancelled event fires with partial output so you can still surface what was produced. Prevents runaway token spend on long tasks.
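The checkpoint-based cancellation can be pictured with a plain AbortController. This is a self-contained sketch of the cooperative-cancellation pattern, not Cascade's source — the `runSteps` loop stands in for per-tier LLM calls, and its return shape is analogous to the run:cancelled event's partial output:

```typescript
// Sketch: cooperative cancellation at safe checkpoints (illustrative only).
async function runSteps(
  steps: Array<() => Promise<string>>,
  signal: AbortSignal,
): Promise<{ cancelled: boolean; partial: string[] }> {
  const partial: string[] = [];
  for (const step of steps) {
    // Checkpoint: check the signal *before* the next call — never mid-stream,
    // so an in-flight step always completes and its output is kept.
    if (signal.aborted) {
      return { cancelled: true, partial };
    }
    partial.push(await step());
  }
  return { cancelled: false, partial };
}

// Usage: abort from anywhere, the loop halts at the next checkpoint.
const ctrl = new AbortController();
// ... later, from a timeout or a user keypress:
// ctrl.abort();
```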
```jsonc
// .cascade/config.json
{
  "version": "1.0",
  "providers": [
    { "type": "anthropic", "apiKey": "sk-ant-..." },
    { "type": "ollama" }
  ],
  "models": {
    "t1": "claude-opus-4",
    "t2": "claude-sonnet-4",
    "t3": "llama3.2:3b"
  },
  "tools": {
    "shellBlocklist": ["rm -rf"],
    "requireApprovalFor": ["shell"]
  }
}
```
Cascade exposes a first-class TypeScript SDK. Bring your own approval flow, stream tokens to any UI, or wire it into a CI pipeline.
Full TypeScript types for every option and result
Token-by-token streaming via callback
Custom approval callbacks for tool gating
Per-tier cost & token breakdown — costByTier, tokensByTier, and percentage attribution in every result
Live budget management — /budget set <$amount> caps session spend at runtime; /budget shows a visual spend bar; proactive warning fires at 80% (configurable warnAtPct) before the hard stop
runCascade, createCascade, streamCascade — three entry points
```typescript
import { streamCascade } from 'cascade-ai';

await streamCascade(
  'Refactor auth module to use JWT, add tests, open a PR',
  (token) => process.stdout.write(token),
  {
    workspacePath: '/my/project',
    approvalCallback: async (req) => {
      console.log(`Allow ${req.toolName}?`);
      return { approved: true, always: false };
    },
  }
);
```
Open-source, MIT licensed. No telemetry by default. Runs local models via Ollama — your code never leaves your machine if you don't want it to.