The best frameworks don't constrain your thinking — they eliminate the thinking that shouldn't be yours. Parallel task graphs, self-spawning agents, runtime tool generation, native app control, built-in browser automation, polyglot code sandbox, and signal-driven learning: Cortex absorbs the infrastructure so your team ships the product.
Every AI team rebuilds the same stack. Cortex ships it pre-built, battle-tested, and driven entirely by config.
Anthropic, OpenAI, Gemini, Grok, Mistral, DeepSeek, AWS Bedrock, Azure AI — plus Ollama, LM Studio, and vLLM for fully offline runs. Swap providers via config with no code changes.
First-class SSE, stdio, and streamable-HTTP MCP tool servers. Dynamic tool discovery at session start. Any Cortex agent becomes an MCP server in one command.
LLM-generated dependency graph with parallel execution. Independent tasks run simultaneously. Topological ordering ensures dependencies always resolve. Cycle detection at compile time.
Build the whole agent in Python with CortexBuilder — no YAML required. The @node decorator wires plain functions in as graph nodes, LangGraph-style; the declared DAG runs verbatim as a static graph.
Every response scored on intent match, completeness, and coherence. Configurable quality floor. Responses below threshold are flagged and remediated automatically before the pipeline moves on.
Signal-gated end-of-session evolution. TaskComplexityScorer + validation thresholds gate learning. New task patterns stage as delta proposals. Distinct-principal confirmation gates promotion.
Orchestrator self-spawns specialist Cortex agents as MCP servers at runtime. Supervised, health-checked, auto-restarted. ToolForge generates brand-new MCP servers from LLM-written code.
Heuristic → LLM cascade classifier routes chat-shaped turns (greetings, small talk) directly to a streaming reply. Only task-shaped turns decompose. Zero LLM cost for conversational turns.
Built-in web frontend with file uploads, live task blueprint display, intent badges, workspace events, token usage tracking, full-text history search, and artifact ZIP download.
cortex publish docker · publish package · publish mcp · publish ui. One command per target. Docker, pip wheel, MCP server, or standalone chat UI.
Memory, SQLite (WAL mode), and Redis backends. Write-ahead log for crash recovery. Resumable sessions. Per-user concurrency limits. Session replay with cortex replay SESSION_ID.
Input sanitisation, credential scrubbing, sandboxed code execution, MCP output guard. WorkspaceBash enforces mandatory human-in-the-loop before any mutating file operation.
OpenTelemetry OTLP exporter, typed event stream (18 event types), token accounting per role, duration tracking, audit logs, anomaly detection, and configurable log levels.
The decomposer assigns forge_mcp tasks that generate brand-new MCP servers from LLM-written code, write them to disk, and register with Ant Colony at wave boundaries — dependent tasks see the new capability in the same session.
One # LANGUAGE: header picks the runtime — Python, Node, TypeScript, Deno, Shell, Ruby, Go, Rust, C, Java, or Kotlin. Per-ecosystem package install (npm, gem, go get) and background-process mode for long-running servers.
Launch and drive desktop apps via AppleScript (macOS), PowerShell + UI Automation (Windows), or a screenshot vision loop when no scripting interface exists. Auto-discovers each app's scripting dictionary. Every mutating action gated by HITL.
Playwright MCP starts inside the framework — agents get a browser capability automatically with chromium, firefox, or webkit. Session state persists across runs so logins survive. Zero tool-server wiring required.
The decomposer grades each task low / medium / high complexity and AMR maps the tier to a provider. Send fast tasks to Haiku, heavy reasoning to Opus — automatically. Per-task overrides always win.
No more guessing max_parallel_llm_calls in the setup wizard. The framework picks an initial ceiling from your provider+model (1 for local Ollama, 8 for Haiku, 4 for Opus) and then AdaptiveLLMGate self-tunes it at runtime via AIMD on observed latency and errors.
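The AIMD rule mentioned above can be illustrated with a small stand-alone sketch. This is not Cortex's AdaptiveLLMGate: the class name `AimdCeiling`, the latency target, the bounds, and the halving factor are all assumptions chosen for illustration. It shows only the general additive-increase / multiplicative-decrease idea: add one slot after a healthy call, cut the ceiling multiplicatively after an error or a slow call.

```python
from dataclasses import dataclass


@dataclass
class AimdCeiling:
    """Hypothetical AIMD controller for a concurrency ceiling (illustration only)."""

    ceiling: int = 4                 # starting ceiling, e.g. a per-model default
    min_ceiling: int = 1
    max_ceiling: int = 16
    latency_target_s: float = 10.0   # assumed cut-off for a "slow" call
    decrease_factor: float = 0.5     # assumed multiplicative decrease

    def observe(self, latency_s: float, error: bool) -> int:
        """Update the ceiling from one observed call and return the new value."""
        if error or latency_s > self.latency_target_s:
            # Multiplicative decrease: back off quickly under stress.
            self.ceiling = max(self.min_ceiling, int(self.ceiling * self.decrease_factor))
        else:
            # Additive increase: probe for more capacity one slot at a time.
            self.ceiling = min(self.max_ceiling, self.ceiling + 1)
        return self.ceiling


gate = AimdCeiling(ceiling=4)
history = [
    gate.observe(latency_s=2.0, error=False),    # healthy call
    gate.observe(latency_s=3.0, error=False),    # healthy call
    gate.observe(latency_s=1.0, error=True),     # provider error
    gate.observe(latency_s=30.0, error=False),   # slow call
]
print(history)
```

The asymmetry is the point of the design: capacity is reclaimed slowly and surrendered quickly, so a burst of rate-limit errors drains load much faster than a quiet period restores it.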
The entire agent — its identity, LLM, tools, tasks, quality bar, and deployment target — lives in one versioned YAML file.
agent:
  name: ResearchAgent
  description: Searches the web and writes reports
llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY
task_types:
  - name: web_research
    capability_hint: web_search
    output_format: md
  - name: write_report
    capability_hint: document_generation
    depends_on: [web_research]
validation:
  threshold: 0.75
learning:
  enabled: true
  auto_apply_delta: true
import asyncio

from cortex.framework import CortexFramework


async def main():
    framework = CortexFramework("cortex.yaml")
    await framework.initialize()
    result = await framework.run_session(
        user_id="user_1",
        request="Research the latest vector DB "
                "benchmarks and write a report",
        event_queue=asyncio.Queue(),
    )
    print(result.response)
    # validation, token usage, task completion: all in result.*


# Fan-out, tool calls, dependency resolution,
# synthesis, validation: all handled.
asyncio.run(main())
Prefer code to config? Build the whole agent with CortexBuilder and wire in code nodes — plain Python functions as graph nodes. No YAML, no decomposition LLM call: the DAG you declare is the plan.
import asyncio

from cortex import CortexBuilder, CortexFramework

agent = CortexBuilder("ResearchAgent",
                      "Searches the web, writes reports")
agent.llm("anthropic", model="claude-sonnet-4-6",
          api_key_env="ANTHROPIC_API_KEY")
agent.tool_server("brave",
                  url="http://localhost:9000/sse")


@agent.node()
async def web_research(ctx):
    return await ctx.call_tool(
        "brave", "search", query=ctx.request)


@agent.node(depends_on=["web_research"])
async def write_report(ctx):
    return await ctx.llm(
        f"Write a report:\n{ctx.deps['web_research']}")


async def main():
    framework = CortexFramework(config=agent.build())
    await framework.initialize()
    result = await framework.run_session(
        "user_1", "Research vector DB benchmarks")  # user ID is a string
    print(result.response)                       # synthesised
    print(result.node_outputs["write_report"])   # raw node


asyncio.run(main())
# Registering a code node switches the agent to
# static execution: the declared graph runs
# verbatim. You still keep the wave engine,
# validation gate, retries, streaming, and
# session persistence. Cortex just skips the
# planner. Mix .task() (LLM-routed) and .node()
# (Python) freely in one agent.
Not another LangChain wrapper. Cortex is an opinionated production stack with MCP end-to-end and config as the first-class primitive.
| Capability | Cortex | Typical frameworks |
|---|---|---|
| Configuration | Single cortex.yaml — or a Python CortexBuilder | Scattered code, env vars, multiple files |
| Task orchestration | LLM-generated DAG — or a hand-authored static DAG of code nodes | Sequential chain or hand-coded state machine |
| Tool protocol | Native MCP (SSE, stdio, streamable-HTTP) | Custom tool wrappers per integration |
| Multi-agent | Any agent becomes an MCP tool in one command | Bespoke inter-agent protocols |
| Self-expanding mesh | Ant Colony + ToolForge generate agents at runtime | Static tool lists, no self-expansion |
| Intent routing | Heuristic → LLM cascade; small talk skips pipeline | Same path for every turn |
| Quality gates | Built-in validation agent with scoring + remediation | Manual testing or nothing |
| Learning | Autonomic gate → delta proposals + draft blueprints | Prompt tweaking by hand |
| LLM providers | 8 cloud + local runtime — swap via config | Usually 1–2, hard-coded |
| Deployment | publish docker/package/mcp/ui — one command each | Write your own Dockerfile |
| Chat UI | Cortex Synapse — full frontend, built-in | Build your own or use a third-party tool |
| Setup | Visual wizard + CLI | Read docs, write boilerplate |
From solo developers to enterprise architects. If you're building AI agents, Cortex gets you to production faster.
Get a production-grade agent runtime in an afternoon. Focus on your domain, not the orchestration layer that every AI team rebuilds from scratch.
Consistent runtime with audit trails, quality gates, and per-user isolation for every product team. One framework, one operational model.
Independent scaling per agent, configurable security, compliance-friendly session encryption, and delegation chains for agent-to-agent provenance.
One cortex.yaml. No framework-of-the-month to learn. Run cortex setup, edit config, and cortex publish ui — you have a working agent.
Change LLM providers, models, and tools from config. Run experiments at scale. Compare outputs across providers without rewriting integration code.
Validation scores, session replay, token accounting, autonomic learning telemetry, OpenTelemetry hooks, and typed event stream — all built in.
Each use case below is drawn from the framework's UAT suite — run against a live LLM and verified.
Adaptive tutoring with multi-task pipelines: assess prior knowledge → explain concept → generate exercise → write and run solution code. Multi-student session isolation.
Clinical triage: symptom analysis → urgency classification → care pathway → clinical handoff summary. Validation gate enforces quality floor (≥ 0.65). Full-year audit trail.
Portfolio risk assessment, VaR computation via Python sandbox, rebalancing recommendations, and executive report — all in one four-task pipeline with token usage accounting.
Climate literature synthesis → Python temperature projection model → structured research report. Multi-task dependency chains with numerical code execution.
Plan → implement → execute → test. Full BST pipeline with code sandbox. CI failure analysis, automated fix recommendations, and code store persistence across sessions.
Contract risk analysis: clause extraction → risk identification (CRITICAL/HIGH/MEDIUM/LOW with GDPR detection) → redline language → one-page risk summary for legal partners.
Full reference for every feature, every config key, every CLI command — all here.
Understand what Cortex is, who it's for, and why it exists.
Cortex is a production-grade AI agent framework for Python. You define an agent — its identity, LLM, tools, task types, quality bar, and deployment target — in a single cortex.yaml file, or in pure Python with CortexBuilder. Cortex handles everything else: decomposing user requests into parallel task graphs, calling MCP tool servers, streaming live progress, scoring response quality, persisting sessions, and deploying as Docker, a Python package, an MCP server, or a ready-made chat UI.
framework = CortexFramework("cortex.yaml")
await framework.initialize()
result = await framework.run_session(user_id="u1", request="Analyse Q3 revenue")
# That's the integration. Everything else is handled.
Or build it in code — and wire plain Python functions in as graph nodes:
agent = CortexBuilder("MyAgent", "...").llm("anthropic", api_key_env="ANTHROPIC_API_KEY")

@agent.node()
async def step(ctx):
    return await ctx.llm(ctx.request)

framework = CortexFramework(config=agent.build())  # no YAML file
Every AI team eventually builds the same stack: task decomposition, parallel tool execution, streaming events, retry logic, session management, quality scoring, multi-provider routing, deployment. Most teams rebuild it two or three times before shipping. Cortex is that stack. Pre-built. Battle-tested. Driven by config, not code.
pip install cortex-agent-framework
cortex setup # visual wizard at localhost:7799
cortex publish ui # Cortex Synapse chat UI at localhost:8090
Three commands. You have a working agent with Cortex Synapse — a professional web frontend with task blueprint display, intent classification indicators, live workspace events, token usage tracking, full-text history search, and artifact downloads.
LLM provider? YAML. Task types? YAML. Concurrency limits? YAML. Validation threshold? YAML. Tool servers? YAML. Your Python code stays a thin wrapper — the agent's behavior lives in cortex.yaml, versioned, diffable, reviewable. Prefer code? CortexBuilder assembles the same config in Python, and @node functions become graph nodes for fully deterministic, LangGraph-style control.
Any Cortex agent can be published as an MCP server in one command. Another Cortex agent adds it to tool_servers and calls it like any tool. Standard MCP end-to-end, nothing custom.
Orchestrator → Research Agent    (MCP :8081) → brave-search, wikipedia
             → Code Review Agent (MCP :8082) → github, filesystem
             → Writing Agent     (MCP :8083) → document-gen
The autonomic Learning Engine fires automatically at end-of-session when complexity and validation scores clear their thresholds. New task patterns stage as delta proposals; once three distinct principals confirm the same pattern it auto-promotes into cortex.yaml. Blueprints capture workflow knowledge in versionable markdown, loaded into context on every run.
cortex.yaml or the CortexBuilder API. Config replaces boilerplate, not code.
You describe your agent, in YAML or in Python. Cortex gives you:
@node functions become a static DAG
Define once. Deploy anywhere. Let it learn.
Build, configure, and deploy your first Cortex agent in 5 minutes.
pip install cortex-agent-framework
cortex --help # lists: setup, dev, dry-run, publish, spec, replay, delta, migrate, ants
# cortex.yaml
agent:
  name: HelloAgent
  description: A minimal Cortex agent
llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY
    max_tokens: 2048
task_types:
  - name: answer
    description: Answer a user question
    output_format: md
    capability_hint: llm_synthesis
storage:
  base_path: ./cortex_storage
export ANTHROPIC_API_KEY=sk-ant-...
cortex dry-run "Explain gradient descent in two sentences"
cortex dev
web_search is a built-in capability: it works out of the box via DuckDuckGo with no API key. Just add a task type with capability_hint: web_search.
cortex setup # → http://localhost:7799
| Step | What you configure |
|---|---|
| Agent Identity | Name, description, interaction mode |
| LLM Provider | Model, API key — cloud or local (Ollama/LM Studio/vLLM) |
| Tool Servers | MCP integrations for external capabilities |
| Task Types | What your agent can do |
| Storage & Persistence | Memory / SQLite / Redis, retention, encryption |
| Adaptive Behaviour | Learning engine, validation, blueprints |
| Publish Mode | Docker, package, MCP, Chat UI |
import asyncio

from cortex.framework import CortexFramework


async def main():
    framework = CortexFramework("cortex.yaml")
    await framework.initialize()
    result = await framework.run_session(
        user_id="user_123",
        request="Analyse Q3 revenue trends",
        event_queue=asyncio.Queue(),
    )
    print(result.response)
    print(result.validation_report.composite_score)
    print(result.token_usage)
    print(result.duration_seconds)


asyncio.run(main())
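import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

# `framework` is an initialised CortexFramework; ResultEvent and EventType
# are the Cortex event types and must be imported from the cortex package.
app = FastAPI()


@app.post("/chat")
async def chat(body: dict):
    queue = asyncio.Queue()
    # Run the session in the background; events arrive on the queue.
    asyncio.create_task(
        framework.run_session(
            user_id=body["user_id"],
            request=body["message"],
            event_queue=queue,
        )
    )

    async def stream():
        while True:
            event = await queue.get()
            payload = {"type": event.event_type.value}
            if isinstance(event, ResultEvent):
                payload["content"] = event.content
            yield f"data: {json.dumps(payload)}\n\n"
            # Stop streaming on a terminal event.
            if event.event_type in (EventType.SESSION_END, EventType.ERROR):
                break

    return StreamingResponse(stream(), media_type="text/event-stream")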
cortex publish mcp --port 8081
# Other agents connect:
# tool_servers:
#   research:
#     url: http://localhost:8081/mcp
#     transport: sse
import asyncio

import click

from cortex.framework import CortexFramework


@click.command()
@click.argument("request")
def run(request):
    asyncio.run(_run(request))


async def _run(request):
    fw = CortexFramework("cortex.yaml")
    await fw.initialize()
    q = asyncio.Queue()
    result = await fw.run_session("cli_user", request, q)
    print(result.response)
    await fw.shutdown()


if __name__ == "__main__":
    run()
async def process_job(job: dict) -> str:
    q = asyncio.Queue()
    result = await framework.run_session(
        user_id=job["user_id"],
        request=job["prompt"],
        event_queue=q,
    )
    return result.response
User Request
      │
      ▼
[Primary Agent] ──── decomposes → task graph ────► [Task A]     [Task B]
                                                      │            │
                                                 [MCP Agent]  [MCP Agent]
                                                      │            │
                                                 [Task C depends on A+B]
                                                            │
                                               [Primary Agent synthesises]
                                                            │
                                                [Validation Agent scores]
                                                            │
                                               [Learning Engine observes]
                                                            │
                                                     Final Response
| Event | What the UI does |
|---|---|
| session_start | Show "thinking..." indicator |
| intent_classified | Show chat vs task routing decision |
| task_blueprint | Render the full DAG before execution starts |
| task_start | Show progress ("Searching web...", "Analysing...") |
| task_tool_call | Show which MCP tool is being invoked |
| result (partial) | Stream text into the chat bubble |
| file_output | Show download link for agent-produced file |
| session_token_usage | Display cumulative token counters |
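A front end consuming this stream is essentially a dispatch loop over event types. The sketch below uses plain dicts on an `asyncio.Queue` as stand-ins for Cortex's typed events, so it runs without the framework. The `"type"` key, the `session_end` and `error` string values, and the action strings are assumptions made for illustration; only the event names come from the table above.

```python
import asyncio

# Event names come from the table above; the action strings are illustrative.
UI_ACTIONS = {
    "session_start": "show thinking indicator",
    "intent_classified": "show routing decision",
    "task_blueprint": "render DAG",
    "task_start": "show progress",
    "task_tool_call": "show tool name",
    "result": "append text to chat bubble",
    "file_output": "show download link",
    "session_token_usage": "update token counters",
}


async def drain(queue: asyncio.Queue) -> list[str]:
    """Consume events until a terminal one arrives; return the UI actions taken."""
    actions = []
    while True:
        event = await queue.get()
        kind = event["type"]
        if kind in ("session_end", "error"):   # assumed terminal event names
            break
        # Unknown event types are skipped so new types do not break the UI.
        action = UI_ACTIONS.get(kind)
        if action is not None:
            actions.append(action)
    return actions


async def demo() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    for kind in ("session_start", "task_start", "heartbeat", "result", "session_end"):
        queue.put_nowait({"type": kind})
    return await drain(queue)


actions = asyncio.run(demo())
print(actions)
```

Ignoring unrecognised types rather than raising keeps an older UI working when the server starts emitting event types it does not know about.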
Complete capability matrix — orchestration, MCP, providers, learning, security.
| Feature | Description |
|---|---|
| Fan-out / fan-in | Dependency DAG; independent tasks run in parallel |
| Three execution modes | adaptive (LLM free-form), pinned (locked topology), scripted (Python handler) |
| Cycle detection | Task graph compiler rejects cyclic graphs before execution starts |
| Topological execution | Tasks run as soon as their dependencies complete |
| Intent Gate | Heuristic → LLM cascade routes chat turns directly; emits IntentClassifiedEvent |
| interaction_mode | interactive for chat/CLI, rpc for MCP/automation (never blocks on clarifications) |
| Smart synthesis | Keyword-grep excerpts (Tier 1) + concurrent LLM summaries (Tier 2) before synthesis pass |
| Clarification support | Agent can pause and ask follow-up questions via ClarificationEvent |
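Cycle rejection and dependency-ordered execution can both be sketched with Python's standard `graphlib` module. This is an illustration of the general technique, not Cortex's task graph compiler: the function name `plan_waves` is hypothetical, and grouping into waves is slightly stricter than starting each task the moment its own dependencies finish.

```python
from graphlib import CycleError, TopologicalSorter


def plan_waves(depends_on: dict[str, list[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave has all its dependencies met."""
    sorter = TopologicalSorter(depends_on)   # maps task -> its prerequisites
    sorter.prepare()                         # raises CycleError on a cyclic graph
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())   # tasks that can run right now
        waves.append(ready)
        sorter.done(*ready)                  # mark the whole wave as finished
    return waves


# Same shape as the architecture diagram: C depends on A and B.
waves = plan_waves({"task_a": [], "task_b": [], "task_c": ["task_a", "task_b"]})
print(waves)

try:
    plan_waves({"x": ["y"], "y": ["x"]})
    cycle_rejected = False
except CycleError:
    cycle_rejected = True
print(cycle_rejected)
```

Calling `prepare()` before any task runs is what makes the rejection happen up front, which mirrors the "before execution starts" behaviour described in the table.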
| Provider | Config value | Default env var |
|---|---|---|
| Anthropic | anthropic | ANTHROPIC_API_KEY |
| OpenAI | openai | OPENAI_API_KEY |
| Google Gemini | gemini | GEMINI_API_KEY |
| xAI Grok | grok | XAI_API_KEY |
| Mistral AI | mistral | MISTRAL_API_KEY |
| DeepSeek | deepseek | DEEPSEEK_API_KEY |
| AWS Bedrock | bedrock | AWS credentials |
| Azure AI | azure_ai | AZURE_AI_API_KEY |
| Local runtime | local | optional |
| Custom | custom | provide Python dotted path |
Per-task model routing: override the default model for specific task types via task_types[n].llm_provider, or enable Adaptive Model Routing (AMR) to let the decomposer select the LLM based on task complexity.
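The precedence between an explicit per-task override and complexity-based routing can be pictured as a two-step lookup. The sketch below is an assumption-laden illustration rather than the AMR implementation: `TIER_TO_MODEL`, `pick_model`, and the placeholder model names are hypothetical, and the only behaviour taken from the text is that an explicit override wins over the tier.

```python
# Hypothetical tier map; the model names are placeholders, not Cortex defaults.
TIER_TO_MODEL = {
    "low": "fast-model",
    "medium": "balanced-model",
    "high": "reasoning-model",
}


def pick_model(task_type: str, complexity: str, overrides: dict[str, str]) -> str:
    """An explicit per-task override wins; otherwise the complexity tier decides."""
    if task_type in overrides:
        return overrides[task_type]
    return TIER_TO_MODEL[complexity]


overrides = {"cheap_summary": "budget-model"}
print(pick_model("cheap_summary", "high", overrides))   # override applies
print(pick_model("web_research", "high", overrides))    # tier applies
```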
| Feature | Description |
|---|---|
| SSE transport | Connect to remote MCP servers over Server-Sent Events |
| stdio transport | Spawn MCP servers as subprocesses; full JSON-RPC discovery |
| streamable-HTTP | Full MCP 1.x streamable HTTP support |
| Publish as MCP server | Export your Cortex agent as a live MCP server for other agents to call |
| Event class | Key fields |
|---|---|
| StatusEvent | message, session_id, event_type, metadata |
| ResultEvent | content, partial, validation_score, metadata |
| ClarificationEvent | question, options, clarification_id |
| IntentClassifiedEvent | intent_mode, confidence, reasoning |
| TaskBlueprintEvent | tasks, waves (full DAG before execution) |
| TaskToolCallEvent | task_id, task_name, tool_name, tool_input |
| WorkspaceEvent | action, path, is_dir |
| FileOutputEvent | filename, mime_type, size_bytes |
| SessionTokenUsageEvent | input_tokens, output_tokens, cache tokens |
| LearningEvent | action, complexity_score, validation_score |
| Feature | Description |
|---|---|
| Composite scoring | Every response scored on intent match, completeness, coherence |
| Configurable threshold | Set a minimum acceptable score (hard floor: 0.60) |
| Per-session report | Returned on SessionResult.validation_report |
| Model override | Run validation with a different model than task execution |
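To make the relationship between the configured threshold and the hard floor concrete, here is a minimal sketch. It assumes two things the table does not state: that the composite is an unweighted mean of the three dimensions, and that the floor works by clamping any lower configured threshold up to 0.60. The function names are hypothetical.

```python
FLOOR = 0.60  # hard floor quoted in the table above


def effective_threshold(configured: float) -> float:
    """Clamp a configured threshold so it never drops below the floor (assumed)."""
    return max(FLOOR, configured)


def composite_score(intent_match: float, completeness: float, coherence: float) -> float:
    """Assumed aggregation: unweighted mean of the three dimensions."""
    return (intent_match + completeness + coherence) / 3


def passes(scores: dict[str, float], configured_threshold: float) -> bool:
    """True when the composite clears the effective threshold."""
    return composite_score(**scores) >= effective_threshold(configured_threshold)


good = {"intent_match": 0.9, "completeness": 0.8, "coherence": 0.7}
weak = {"intent_match": 0.5, "completeness": 0.6, "coherence": 0.6}
print(passes(good, 0.75))   # composite is about 0.8
print(passes(weak, 0.40))   # 0.40 is raised to the 0.60 floor first
```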
| Feature | Description |
|---|---|
| Signal-driven gate | Fires automatically when TaskComplexityScorer + validation score clear thresholds. No consent prompt. |
| Draft blueprints | On first stage, a draft blueprint is seeded so guidance accumulates before task is promoted |
| Distinct-principal accumulation | Promotion requires 3 distinct principals (configurable) |
| Auto-apply mode | Default on — deltas promote once confidence accumulates. Flip auto_apply_delta: false for manual review. |
| Human-in-the-loop | cortex delta review · delta apply · delta rollback |
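Distinct-principal accumulation amounts to counting unique confirmers per pattern rather than counting confirmations. The sketch below illustrates that idea only. `DeltaStage` and its methods are hypothetical names, not part of the Cortex API, and the default of 3 comes from the table above.

```python
from collections import defaultdict


class DeltaStage:
    """Hypothetical staging area: a pattern qualifies after N distinct principals."""

    def __init__(self, required_principals: int = 3):
        self.required = required_principals
        self._confirmations: dict[str, set[str]] = defaultdict(set)

    def confirm(self, pattern: str, principal: str) -> bool:
        """Record a confirmation; return True once the pattern qualifies."""
        # A set ignores repeats, so one user cannot promote a pattern alone.
        self._confirmations[pattern].add(principal)
        return len(self._confirmations[pattern]) >= self.required


stage = DeltaStage()
results = [
    stage.confirm("summarise_pdf", "alice"),
    stage.confirm("summarise_pdf", "alice"),   # repeat from the same principal
    stage.confirm("summarise_pdf", "bob"),
    stage.confirm("summarise_pdf", "carol"),   # third distinct principal
]
print(results)
```

Keying on the principal rather than the session is what stops a single enthusiastic user from pushing a pattern into the config by repetition.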
| Feature | Description |
|---|---|
| Input sanitisation | Prompt injection mitigation on user inputs |
| Credential scrubbing | Redacts secrets from logs and event streams |
| WorkspaceBash | Workspace-scoped file/command execution with mandatory HITL before any mutating operation |
| Code sandbox | Bash sandbox for code execution tasks in a sandboxed subprocess |
| Session ownership | Session resume gated by original user_id |
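Credential scrubbing is typically a pattern-based redaction pass applied before text reaches logs or event streams. The following is a minimal sketch of that technique with two illustrative patterns. It is not Cortex's scrubber, the patterns are assumptions, and a real implementation would need to cover many more secret formats.

```python
import re

# Illustrative patterns only (assumptions, not Cortex's rule set).
API_KEY = re.compile(r"sk-[A-Za-z0-9_-]{8,}")
BEARER = re.compile(r"\b(bearer\s+)[A-Za-z0-9._-]{8,}", re.IGNORECASE)


def scrub(text: str) -> str:
    """Replace secret-looking substrings with a fixed marker."""
    text = API_KEY.sub("[REDACTED]", text)
    # Keep the "Bearer " prefix so the log line stays readable.
    text = BEARER.sub(r"\1[REDACTED]", text)
    return text


print(scrub("key=sk-ant-abc123XYZ789 ok"))
print(scrub("Authorization: Bearer abcdefgh12345678"))
```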
| Tool | What it does |
|---|---|
| Setup wizard | Browser-based cortex.yaml generator at localhost:7799 |
| Config Studio | cortex config-ui — browser UI to inspect/edit cortex.yaml, blueprints, staged deltas |
| Dry-run validation | cortex dry-run validates config and task graph without LLM calls |
| Hot-reload dev mode | cortex dev --watch applies config changes live |
| Session replay | cortex replay shows request, response, task outcomes, validation report |
| Mock LLM client | cortex.testing.MockLLMClient for unit tests without API calls |
Real-world scenarios — every config drawn from the validated UAT suite.
These examples are drawn directly from the framework's User Acceptance Test suite — every config, task description, and assertion was executed against a live LLM and passed.
Multi-task pipeline (assess → explain → exercise → solution), code sandbox execution, multi-student session isolation, history persistence.
task_types:
  - name: assess_prior_knowledge
    description: Assess the student's prior knowledge
  - name: explain_concept
    description: Provide clear, age-appropriate explanation
  - name: generate_practice_exercise
    description: Create a practical coding exercise
  - name: write_solution_code
    description: Write a complete, runnable Python solution
    depends_on: [generate_practice_exercise]
Urgency classification for acute neurological symptoms, cardiac triage, diabetes crisis management. Validation gate enforces clinical quality floor (≥ 0.65).
validation:
  enabled: true
  threshold: 0.65
history:
  enabled: true
  retention_days: 365
learning:
  require_user_identity: true  # only from authenticated clinicians
Four-task pipeline: risk assessment → VaR computation (Python sandbox) → rebalancing recommendations → executive report. Token usage accounting per session.
BST implementation pipeline with code sandbox execution, CI failure analysis, automated fix recommendations, code store persistence.
code_sandbox:
  enabled: true
  timeout_seconds: 120
  allow_network: false
agent:
  concurrency:
    max_parallel_tasks: 2  # execute + test run in parallel
Clause extraction → risk identification (CRITICAL/HIGH/MEDIUM/LOW, GDPR detection) → negotiation redlines → risk summary. Session completion within defined SLA.
G20 policy analyst: 3 parallel domain analyses (Education, Finance, Healthcare) → Python AI Equity Index → government policy brief. All 5 tasks completed, learning engine fired, history written with full metadata.
agent:
  concurrency:
    max_parallel_tasks: 3  # three domain analyses run simultaneously
| If you need… | Usage mode |
|---|---|
| A chat UI for end users | Synapse UI (cortex publish ui) |
| An agent other agents call as a tool | MCP Server (cortex publish mcp) |
| A one-shot CLI tool for devs/ops | CLI (cortex dev or Click wrapper) |
| Batch processing of a job queue | Background worker (Celery/SQS + fw.run_session()) |
| AI feature inside an existing web app | Embedded library (pip install cortex-agent-framework) |
| Multi-tenant production service | Docker + Redis storage |
| Specialist agents at different scales | ANT Colony (AntColony.hatch()) |
Every cortex.yaml field — agent, LLMs, tools, storage, validation, learning.
agent: # Agent identity, concurrency, timeouts, intent gate
llm_access: # LLM provider routing
task_types: # Vocabulary of work the agent can do
tool_servers: # MCP tool server connections
storage: # Persistence configuration
sqlite: # SQLite backend settings
redis: # Redis backend settings
history: # Session history settings
validation: # Quality validation settings
learning: # Delta learning settings
ant_colony: # Self-spawning specialist agent mesh
tool_forge: # Runtime MCP server generation
workspace_bash: # Workspace-scoped file/command execution with HITL
code_sandbox: # Sandboxed Python code execution
ui: # Built-in chat UI settings
agent:
  name: MyAgent
  description: A helpful AI assistant
  interaction_mode: interactive  # "interactive" | "rpc"
  time:
    default_max_wait_seconds: 120
    default_task_timeout_seconds: 40
  concurrency:
    max_concurrent_sessions: 50
    max_concurrent_sessions_per_user: 3
    max_parallel_tasks: 5
    max_tasks_per_session: 20
  intent_gate:
    enabled: true
    heuristic_confidence_threshold: 0.7
    llm_provider: default
    timeout_seconds: 5.0
llm_access:
  default:
    provider: anthropic  # anthropic|openai|gemini|grok|mistral|deepseek|bedrock|azure_ai|local|custom
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY
    max_tokens: 4096
    temperature: 1.0
    thinking_budget_tokens: 0  # Extended thinking (Anthropic only)
    base_url: null  # For proxies / gateways
tool_servers:
  brave_search:
    transport: sse
    url: http://localhost:8051/sse
  filesystem:
    transport: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/workspace"]
task_types:
  - name: web_research
    description: Search the web for current information
    output_format: md  # text|md|json|file|html|csv|code
    capability_hint: web_search  # auto|llm_synthesis|web_search|bash|code_exec|app_control|browser|document_generation
    timeout_seconds: 60
  - name: analysis
    description: Analyse data and produce structured insights
    output_format: json
    capability_hint: llm_synthesis
    depends_on: [web_research]
storage:
  base_path: ./cortex_storage
sqlite:
  enabled: true
  path: ./cortex_storage/cortex.db
  wal_mode: true
# Or Redis for distributed deployments:
# redis:
#   enabled: true
#   url: ${REDIS_URL}
#   key_prefix: "cortex:prod:"
validation:
  threshold: 0.75  # Min quality score (floor: 0.60)
history:
  enabled: true
  retention_days: 90
learning:
  enabled: true
  validation_threshold: 0.75
  complexity_threshold: 0.6
  auto_apply_delta: true
  auto_apply_min_confidence: medium  # ≥ 3 distinct principals
workspace_bash:
  enabled: true
  hitl_enabled: true  # Cannot be disabled; enforced at runtime
app_control:  # Launch + drive native desktop applications
  enabled: true
  hitl_enabled: true  # Approve every launch / script / screenshot
  max_vision_steps: 10  # Fallback screenshot loop cap
playwright_mcp:  # Built-in browser automation
  enabled: true
  browser: chromium  # chromium | firefox | webkit
  headless: false  # Persistent session: logins survive runs
| Mode | Behaviour |
|---|---|
| interactive | Default. Chat UIs, CLI, dev mode. Intent Gate routes conversational turns directly. Clarifications allowed. |
| rpc | MCP / automation. Every turn forced to task path. No interactive clarifications, so automated callers never hang. Set automatically by cortex publish mcp. |

The CORTEX_INTERACTION_MODE=rpc environment variable beats the value in cortex.yaml.

| Variable | Description |
|---|---|
| ANTHROPIC_API_KEY | Anthropic provider API key |
| OPENAI_API_KEY | OpenAI provider API key |
| GEMINI_API_KEY | Google Gemini provider API key |
| XAI_API_KEY | xAI Grok provider API key |
| MISTRAL_API_KEY | Mistral AI provider API key |
| DEEPSEEK_API_KEY | DeepSeek provider API key |
| CORTEX_CONFIG | Override default config path |
| CORTEX_LOG_LEVEL | Logging level (DEBUG, INFO, WARNING, ERROR) |
| CORTEX_INTERACTION_MODE | Override agent.interaction_mode at runtime |
Every cortex subcommand — setup, dev, dry-run, publish, replay, delta.
cortex --help
cortex <command> --help
cortex setup [--port 7799] [--no-browser]
Browser-based setup wizard at localhost:7799. Walks through agent identity → LLM provider → tool servers → task types → storage → publish mode. Writes validated cortex.yaml. Re-running loads existing settings; fields that would break existing data are locked.
cortex config-ui [--port 7801]
Config Studio — browser UI to inspect and edit cortex.yaml, blueprints, staged deltas, and session metadata.
cortex dev [--config cortex.yaml] [--watch]
Runs Cortex in development mode. --watch enables hot-reload — edit cortex.yaml and changes apply instantly.
cortex dry-run [--config cortex.yaml] "REQUEST"
Validates config and compiles the task graph without making any LLM calls. Catches config errors, unreachable tool servers, and broken depends_on references before spending API credits. Use in CI to gate config changes.
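One of the checks described above, catching broken depends_on references, is easy to picture as a pass over the parsed task_types list. The sketch below is not the `cortex dry-run` implementation. It assumes the config has already been loaded into plain Python dicts, and `broken_references` is a hypothetical helper name.

```python
def broken_references(task_types: list[dict]) -> list[tuple[str, str]]:
    """Return (task, missing_dependency) pairs for unknown depends_on names."""
    known = {task["name"] for task in task_types}
    return [
        (task["name"], dep)
        for task in task_types
        for dep in task.get("depends_on", [])   # depends_on is optional
        if dep not in known
    ]


# Mirrors the task_types shape used in cortex.yaml, with one typo on purpose.
task_types = [
    {"name": "web_research"},
    {"name": "analysis", "depends_on": ["web_research"]},
    {"name": "write_report", "depends_on": ["analysis", "web_reserch"]},
]
problems = broken_references(task_types)
print(problems)
```

In CI, a non-empty result would fail the build before any request reaches a provider, which is the cost-saving property the command description emphasises.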
cortex publish docker [--tag my-agent:latest] [--with-ui]
cortex publish package [--output-dir dist]
cortex publish mcp [--port 8080]
cortex publish ui [--port 8090]
publish mcp automatically sets CORTEX_INTERACTION_MODE=rpc. publish docker --with-ui bundles Cortex Synapse into the image.
cortex replay SESSION_ID --user-id USER_ID
Shows request, task decomposition, task outcomes, token usage, validation score, final response, and duration. Requires history.enabled: true.
cortex delta review # show staged proposals
cortex delta apply [--min-confidence high] # write to cortex.yaml
cortex delta reject # reject a proposal
cortex delta rollback # restore previous config from .bak
cortex ants list # show all live ANTs
cortex ants hatch --name NAME --capability CAP
cortex ants stop --name NAME
cortex ants stop-all
cortex ants status
| Command | What it does |
|---|---|
| cortex spec [--format json] | Generate capability manifest (JSON or YAML) |
| cortex migrate | Validate config schema compatibility against target version |
| cortex --version | Show installed version |
Ship to production — Docker, Python package, MCP server, or chat UI.
| Mode | Consumer | When to use |
|---|---|---|
| Docker | End users / services | Production microservice, multi-tenant backend |
| Package | Python developers | Embed in an existing Django/FastAPI app |
| MCP server | Other agents | Multi-agent composition, IDE integrations |
| Chat UI | End users (browser) | Quick demo, internal tool, user-facing chat |
cortex publish docker --tag my-agent:latest
docker build -f Dockerfile.cortex -t my-agent:latest .
docker run --rm -p 8090:8090 -e ANTHROPIC_API_KEY=your_key my-agent:latest
# With Cortex Synapse chat UI bundled:
cortex publish docker --with-ui --tag my-agent:latest
Production checklist: use Redis (not SQLite) for multi-replica deployments. Pass API keys via -e KEY=val or a secret manager. Set max_concurrent_sessions to match your instance size. The UI server exposes /health for readiness probes.
cortex publish package --output-dir dist
pip install dist/cortex_agent_framework-*.whl
Use when you want to embed Cortex in an existing Python app (Django, FastAPI, Flask) or ship a pre-configured agent to internal users. No separate service to operate.
cortex publish mcp --port 8080
# → http://localhost:8080/mcp
Exposes your agent as a live MCP tool server. Automatically sets CORTEX_INTERACTION_MODE=rpc so callers never hang on interactive clarifications. Connect from another agent:
tool_servers:
  my_agent:
    url: http://host:8080/mcp
    transport: sse
cortex publish ui --port 8090
# → http://localhost:8090
Single-page web frontend with text + file uploads, live task blueprint display, intent classification badges, workspace event streaming, token usage, full-text history search, and artifact ZIP download. Configure title, host, port, and auth under the ui: block.
Run multiple Cortex agents on one machine — each needs its own directory, its own cortex.yaml, and its own ports.
~/agents/
├── research-agent/
│   ├── cortex.yaml      # MCP port 8081, storage ./storage
│   └── storage/
├── code-review-agent/
│   ├── cortex.yaml      # MCP port 8082, storage ./storage
│   └── storage/
└── orchestrator/
    ├── cortex.yaml      # references 8081 + 8082 as tool_servers
    └── storage/
# Terminal 1
cd ~/agents/research-agent && cortex publish mcp --port 8081
# Terminal 2
cd ~/agents/code-review-agent && cortex publish mcp --port 8082
# Terminal 3
cd ~/agents/orchestrator && cortex dev
Two Cortex processes pointing at the same sqlite.path will fail. Use Redis with a unique key_prefix per agent for shared storage.
agent:
  name: ProductionAgent
  concurrency:
    max_concurrent_sessions: 100
    max_concurrent_sessions_per_user: 5
llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY
redis:
  enabled: true
  url: ${REDIS_URL}
  key_prefix: "cortex:prod:"
Common gotchas and answers — installation, providers, multi-agent, intent gate.
A Python library (cortex-agent-framework) that gives you a production-grade multi-step AI agent driven entirely by a cortex.yaml config file. It handles task decomposition, parallel tool execution, MCP integration, streaming, validation, and session persistence so you can focus on your use case instead of rebuilding agent plumbing.
Yes. It has concurrency limits, session persistence with WAL replay, quality validation, typed streaming events, OpenTelemetry hooks, and three storage backends. The test suite covers core modules with industry-validated acceptance tests.
MIT. Use it commercially, fork it, modify it, ship it.
lsof -i :7799
--no-browser and open the URL manually
cortex setup --port 7800
Yes. The wizard just generates a YAML file. You can write cortex.yaml by hand; see the Configuration tab for every field.
Absolutely — that's the designed pattern. Each agent needs its own directory, its own cortex.yaml, and its own ports. The defaults are just defaults — all overridable.
No. SQLite locks the DB file; two Cortex processes pointing at the same sqlite.path will fail intermittently. Give each agent its own storage.base_path, or use Redis with a unique key_prefix per agent.
llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
  task_overrides:
    cheap_summary:
      provider: deepseek
      model: deepseek-chat
    heavy_reasoning:
      provider: anthropic
      model: claude-opus-4-7
      thinking_budget_tokens: 5000
llm_access:
  default:
    provider: anthropic_compatible
    base_url: https://my-gateway.internal/v1
    api_key_env_var: GATEWAY_KEY
    model: claude-sonnet-4-6
The task is marked failed, the session continues, and the Primary Agent synthesises what it can from the successful tasks. SessionResult.task_completion reports which tasks succeeded, failed, or timed out.
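The "one task fails, the session continues" behaviour can be sketched with plain asyncio: each task is wrapped so that its exception or timeout becomes a status instead of propagating. This is an illustration of the pattern, not Cortex's executor. The function names and the status strings `succeeded`, `failed`, and `timed_out` are assumptions.

```python
import asyncio


async def run_tasks(tasks: dict, timeout_s: float) -> dict[str, str]:
    """Run coroutine factories concurrently and classify each outcome."""

    async def guarded(name, factory):
        try:
            await asyncio.wait_for(factory(), timeout=timeout_s)
            return name, "succeeded"
        except asyncio.TimeoutError:
            return name, "timed_out"
        except Exception:
            # One failing task must not abort its siblings.
            return name, "failed"

    pairs = await asyncio.gather(*(guarded(n, f) for n, f in tasks.items()))
    return dict(pairs)


async def ok():
    return "done"


async def broken():
    raise ValueError("tool server unreachable")


async def slow():
    await asyncio.sleep(1.0)   # exceeds the timeout used below


outcome = asyncio.run(
    run_tasks({"research": ok, "lookup": broken, "crawl": slow}, timeout_s=0.05)
)
print(outcome)
```

A synthesis step would then read only the entries marked as succeeded, which is the shape of report the answer above attributes to SessionResult.task_completion.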
Controlled by agent.concurrency.max_parallel_tasks (default 5). Dependencies always take precedence — a task waits for its depends_on regardless of the parallel cap.
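How a parallel cap and depends_on interact can be shown with a semaphore plus one event per task. In this sketch, which is an illustration rather than Cortex's wave engine, a task waits for its dependencies first and only then competes for one of the limited slots. `run_graph` and the 10 ms stand-in workload are assumptions.

```python
import asyncio


async def run_graph(depends_on: dict[str, list[str]], max_parallel_tasks: int = 5):
    """Run stub tasks, honouring dependencies and a parallelism cap."""
    cap = asyncio.Semaphore(max_parallel_tasks)
    done = {name: asyncio.Event() for name in depends_on}
    order = []     # completion order
    running = 0
    peak = 0       # highest number of tasks observed running at once

    async def run_one(name):
        nonlocal running, peak
        # Wait for dependencies first; this wait does not occupy a slot.
        for dep in depends_on[name]:
            await done[dep].wait()
        async with cap:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)   # stand-in for real work
            running -= 1
        order.append(name)
        done[name].set()

    await asyncio.gather(*(run_one(name) for name in depends_on))
    return order, peak


graph = {"a": [], "b": [], "c": [], "report": ["a", "b", "c"]}
order, peak = asyncio.run(run_graph(graph, max_parallel_tasks=2))
print(order, peak)
```

Waiting on dependencies outside the semaphore matters: a blocked task that held a slot could starve the very tasks it is waiting for.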
The Intent Gate classified it as a chat turn and sent it through PrimaryAgent.converse() — skipping scout, decomposition, execution, validation, and learning. Turn it off with agent.intent_gate.enabled: false if every turn should decompose.
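The cascade idea, a cheap heuristic first and a model call only when the heuristic is unsure, can be sketched as follows. This is not the Intent Gate's actual logic. The regex, the word-count rules, the confidence values, and the stub classifier are assumptions, and the 0.7 default mirrors heuristic_confidence_threshold from the config reference.

```python
import re
from typing import Callable

# Illustrative greeting pattern (an assumption, not Cortex's rule set).
CHAT_PATTERN = re.compile(
    r"^(hi|hello|hey|thanks|thank you|good (morning|evening))\b", re.IGNORECASE
)


def heuristic(text: str) -> tuple[str, float]:
    """Cheap first pass returning (mode, confidence)."""
    stripped = text.strip()
    words = stripped.split()
    if CHAT_PATTERN.match(stripped) and len(words) <= 4:
        return "chat", 0.95
    if len(words) >= 8:
        return "task", 0.8
    return "task", 0.4   # short and ambiguous: low confidence


def classify(
    text: str, llm_classifier: Callable[[str], str], threshold: float = 0.7
) -> tuple[str, str]:
    """Return (mode, source); the LLM is consulted only below the threshold."""
    mode, confidence = heuristic(text)
    if confidence >= threshold:
        return mode, "heuristic"
    return llm_classifier(text), "llm"


llm_calls = []


def stub_llm(text: str) -> str:
    """Stand-in for a real model call; records that it was invoked."""
    llm_calls.append(text)
    return "chat"


print(classify("Hello there!", stub_llm))
print(classify("Research the latest vector DB benchmarks and write a report", stub_llm))
print(classify("What about Rust?", stub_llm))
print(len(llm_calls))
```

Only the third, ambiguous input reaches the stub, which is how a cascade keeps obvious greetings and obvious tasks free of classification cost.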
interactive: for chat UIs, CLIs, dev mode. Conversational turns skip the full pipeline. ClarificationEvents are emitted and a human answers.
rpc: for agents exposed as callables. Every turn forced to the task path. Interactive clarifications are suppressed. cortex publish mcp sets this automatically.
cortex publish ui --port 8090
# → http://localhost:8090
Serves Cortex Synapse — a single-page web frontend with text + file uploads, live task blueprint display, intent classification badges, workspace event streaming, token usage, full-text history search, and artifact ZIP download. Configure title, host, port, and auth (none / token / basic) under the ui: block in cortex.yaml.