Cortex Agent Framework

Capability	Cortex	Typical frameworks
Configuration	Single `cortex.yaml` — or a Python `CortexBuilder`	Scattered code, env vars, multiple files
Task orchestration	LLM-generated DAG — or a hand-authored static DAG of code nodes	Sequential chain or hand-coded state machine
Tool protocol	Native MCP (SSE, stdio, streamable-HTTP)	Custom tool wrappers per integration
Multi-agent	Any agent becomes an MCP tool in one command	Bespoke inter-agent protocols
Self-expanding mesh	Ant Colony + ToolForge generate agents at runtime	Static tool lists, no self-expansion
Intent routing	Heuristic → LLM cascade; small talk skips pipeline	Same path for every turn
Quality gates	Built-in validation agent with scoring + remediation	Manual testing or nothing
Learning	Autonomic gate → delta proposals + draft blueprints	Prompt tweaking by hand
LLM providers	8 cloud + local runtime — swap via config	Usually 1–2, hard-coded
Deployment	`publish docker/package/mcp/ui` — one command each	Write your own Dockerfile
Chat UI	Cortex Synapse — full frontend, built-in	Build your own or use a third-party tool
Setup	Visual wizard + CLI	Read docs, write boilerplate

Documentation

Everything you need to ship.

Full reference for every feature, every config key, every CLI command — all here.

Overview

Understand what Cortex is, who it's for, and why it exists.

What is Cortex?

Cortex is a production-grade AI agent framework for Python. You define an agent — its identity, LLM, tools, task types, quality bar, and deployment target — in a single cortex.yaml file, or in pure Python with CortexBuilder. Cortex handles everything else: decomposing user requests into parallel task graphs, calling MCP tool servers, streaming live progress, scoring response quality, persisting sessions, and deploying as Docker, a Python package, an MCP server, or a ready-made chat UI.

framework = CortexFramework("cortex.yaml")
await framework.initialize()
result = await framework.run_session(user_id="u1", request="Analyse Q3 revenue")
# That's the integration. Everything else is handled.

Or build it in code — and wire plain Python functions in as graph nodes:

agent = CortexBuilder("MyAgent", "...").llm("anthropic", api_key_env="ANTHROPIC_API_KEY")

@agent.node()
async def step(ctx):
    return await ctx.llm(ctx.request)

framework = CortexFramework(config=agent.build())   # no YAML file

Why teams choose Cortex

Skip months of framework engineering

Every AI team eventually builds the same stack: task decomposition, parallel tool execution, streaming events, retry logic, session management, quality scoring, multi-provider routing, deployment. Most teams rebuild it two or three times before shipping. Cortex is that stack. Pre-built. Battle-tested. Driven by config, not code.

Go from idea to running agent in minutes

pip install cortex-agent-framework
cortex setup            # visual wizard at localhost:7799
cortex publish ui       # Cortex Synapse chat UI at localhost:8090

Three commands. You have a working agent with Cortex Synapse — a professional web frontend with task blueprint display, intent classification indicators, live workspace events, token usage tracking, full-text history search, and artifact downloads.

Change behavior without changing code

LLM provider? YAML. Task types? YAML. Concurrency limits? YAML. Validation threshold? YAML. Tool servers? YAML. Your Python code stays a thin wrapper — the agent's behavior lives in cortex.yaml, versioned, diffable, reviewable. Prefer code? CortexBuilder assembles the same config in Python, and @node functions become graph nodes for fully deterministic, LangGraph-style control.

Multi-agent composition for free

Any Cortex agent can be published as an MCP server in one command. Another Cortex agent adds it to tool_servers and calls it like any tool. Standard MCP end-to-end, nothing custom.

Orchestrator → Research Agent (MCP :8081) → brave-search, wikipedia
             → Code Review Agent (MCP :8082) → github, filesystem
             → Writing Agent (MCP :8083) → document-gen

Your agent gets smarter over time

The autonomic Learning Engine fires automatically at end-of-session when both the complexity score and the validation score clear their thresholds. New task patterns are staged as delta proposals; once three distinct principals confirm the same pattern, it auto-promotes into cortex.yaml. Blueprints capture workflow knowledge in versionable markdown, loaded into context on every run.
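The distinct-principal rule can be pictured with a small counter. This is a minimal sketch, not Cortex's implementation: `DeltaStager` and `confirm` are hypothetical names, and the only detail taken from the text is that three different principals must confirm a pattern before it promotes.

```python
from collections import defaultdict


class DeltaStager:
    """Hypothetical stand-in for the staging step; not Cortex's real API."""

    def __init__(self, required_principals: int = 3) -> None:
        self.required_principals = required_principals
        # pattern name -> set of distinct principals that confirmed it
        self._confirmations = defaultdict(set)

    def confirm(self, pattern: str, principal: str) -> bool:
        """Record a confirmation; return True once the pattern may promote."""
        self._confirmations[pattern].add(principal)
        return len(self._confirmations[pattern]) >= self.required_principals


stager = DeltaStager()
print(stager.confirm("summarise_invoice", "alice"))  # False
print(stager.confirm("summarise_invoice", "alice"))  # False: same principal again
print(stager.confirm("summarise_invoice", "bob"))    # False
print(stager.confirm("summarise_invoice", "carol"))  # True: third distinct principal
```

Using a set per pattern means repeat confirmations from one user never count twice, which is the point of requiring distinct principals.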

What Cortex is not

Not a low-code builder. It's a Python library — drive it with cortex.yaml or the CortexBuilder API. Config replaces boilerplate, not code.
Not an LLM gateway. Bring your own API key.
Not a vector database. It calls MCP tools that do RAG — it doesn't implement retrieval itself.
Not a web framework. Cortex runs inside FastAPI/Django/Flask/Click.

The 60-second pitch

You describe your agent — in YAML or in Python. Cortex gives you:

Automatic task decomposition — LLM breaks requests into a typed dependency graph
Code-first option — build the agent in Python; @node functions become a static DAG
Parallel execution — independent tasks run simultaneously, not sequentially
Intent Gate — chat turns skip the full pipeline; only task-shaped turns decompose
MCP tool servers — connect any tool with three lines of YAML
8 cloud LLM providers + local runtime — switch models without code changes
Response validation — every output scored; regressions caught automatically
Autonomic learning — signal-gated evolution; new patterns stage themselves
Blueprints — reusable workflow knowledge that makes the agent better over time
Streaming events — 18 typed event types for any UI (SSE, WebSocket, CLI)
4 deployment targets — Docker, Python package, MCP server, Cortex Synapse chat UI
Visual setup wizard — configure everything from a browser, no docs required
Security built-in — input sanitisation, credential scrubbing, sandboxed code execution

Define once. Deploy anywhere. Let it learn.

Quick Start

Build, configure, and deploy your first Cortex agent in 5 minutes.

Quick Start (5 minutes)

1. Install

pip install cortex-agent-framework
cortex --help  # lists: setup, dev, dry-run, publish, spec, replay, delta, migrate, ants

2. Hello World (no external tools needed)

# cortex.yaml
agent:
  name: HelloAgent
  description: A minimal Cortex agent

llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY
    max_tokens: 2048

task_types:
  - name: answer
    description: Answer a user question
    output_format: md
    capability_hint: llm_synthesis

storage:
  base_path: ./cortex_storage

export ANTHROPIC_API_KEY=sk-ant-...
cortex dry-run "Explain gradient descent in two sentences"
cortex dev

Tip: web_search is a built-in capability — it works out of the box via DuckDuckGo with no API key. Just add a task type with capability_hint: web_search.

3. Run the Setup Wizard

cortex setup  # → http://localhost:7799

Step	What you configure
Agent Identity	Name, description, interaction mode
LLM Provider	Model, API key — cloud or local (Ollama/LM Studio/vLLM)
Tool Servers	MCP integrations for external capabilities
Task Types	What your agent can do
Storage & Persistence	Memory / SQLite / Redis, retention, encryption
Adaptive Behaviour	Learning engine, validation, blueprints
Publish Mode	Docker, package, MCP, Chat UI

4. The core integration

import asyncio

from cortex.framework import CortexFramework


async def main() -> None:
    framework = CortexFramework("cortex.yaml")
    await framework.initialize()

    # Events are pushed onto this queue while the session runs.
    result = await framework.run_session(
        user_id="user_123",
        request="Analyse Q3 revenue trends",
        event_queue=asyncio.Queue(),
    )

    print(result.response)
    print(result.validation_report.composite_score)
    print(result.token_usage)
    print(result.duration_seconds)

    await framework.shutdown()


asyncio.run(main())

Usage Modes

Chat UI (FastAPI + SSE)

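import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

# `framework` is an initialised CortexFramework; ResultEvent and EventType
# are Cortex's event classes.
app = FastAPI()
sessions = set()  # hold task references so running sessions are not garbage-collected

@app.post("/chat")
async def chat(body: dict):
    queue = asyncio.Queue()
    task = asyncio.create_task(
        framework.run_session(
            user_id=body["user_id"],
            request=body["message"],
            event_queue=queue,
        )
    )
    sessions.add(task)
    task.add_done_callback(sessions.discard)

    async def stream():
        while True:
            event = await queue.get()
            payload = {"type": event.event_type.value}
            if isinstance(event, ResultEvent):
                payload["content"] = event.content
            yield f"data: {json.dumps(payload)}\n\n"
            if event.event_type in (EventType.SESSION_END, EventType.ERROR):
                break

    return StreamingResponse(stream(), media_type="text/event-stream")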

MCP Server (Agent-to-Agent)

cortex publish mcp --port 8081
# Other agents connect:
#   tool_servers:
#     research:
#       url: http://localhost:8081/mcp
#       transport: sse

CLI Tool

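import asyncio
import click
from cortex.framework import CortexFramework

@click.command()
@click.argument("request")
def run(request):
    asyncio.run(_run(request))

async def _run(request):
    fw = CortexFramework("cortex.yaml")
    await fw.initialize()
    try:
        q = asyncio.Queue()
        result = await fw.run_session("cli_user", request, q)
        print(result.response)
    finally:
        await fw.shutdown()  # release resources even if the session fails

if __name__ == "__main__":
    run()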

Background Worker

async def process_job(job: dict) -> str:
    q = asyncio.Queue()
    result = await framework.run_session(
        user_id=job["user_id"],
        request=job["prompt"],
        event_queue=q,
    )
    return result.response

Architecture

User Request
     │
     ▼
[Primary Agent]  ──── decomposes → task graph ────► [Task A]  [Task B]
                                                          │         │
                                                    [MCP Agent] [MCP Agent]
                                                          │         │
                                                    [Task C  depends on A+B]
                                                          │
                                                  [Primary Agent synthesises]
                                                          │
                                                  [Validation Agent scores]
                                                          │
                                                  [Learning Engine observes]
                                                          │
                                                    Final Response
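The fan-out / fan-in shape in the diagram can be sketched with plain asyncio. This is an illustration under stated assumptions, not Cortex's scheduler: `run_task` and `run_graph` are hypothetical names, and a short sleep stands in for the work an MCP agent would do.

```python
import asyncio


async def run_task(name: str, delay: float, log: list) -> str:
    """Stand-in for a task handed to an MCP agent; sleeps instead of working."""
    await asyncio.sleep(delay)
    log.append(name)
    return f"{name}-result"


async def run_graph():
    log = []
    # Fan-out: A and B have no dependencies, so they run concurrently.
    a, b = await asyncio.gather(
        run_task("A", 0.02, log),
        run_task("B", 0.01, log),
    )
    # Fan-in: C starts only after both of its dependencies have finished.
    c = await run_task("C", 0.0, log)
    return f"synthesised({a}, {b}, {c})", log


if __name__ == "__main__":
    print(asyncio.run(run_graph()))
```

`asyncio.gather` returns results in argument order, so the synthesis step sees A's result first even if B finishes earlier.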

Streaming Events

Event	What the UI does
session_start	Show "thinking..." indicator
intent_classified	Show chat vs task routing decision
task_blueprint	Render the full DAG before execution starts
task_start	Show progress ("Searching web...", "Analysing...")
task_tool_call	Show which MCP tool is being invoked
result (partial)	Stream text into the chat bubble
file_output	Show download link for agent-produced file
session_token_usage	Display cumulative token counters
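A UI consumer for these events is essentially a dispatch loop. The sketch below uses plain dicts on an `asyncio.Queue` rather than Cortex's event classes; the terminal names `session_end` and `error` are assumptions modelled on `EventType.SESSION_END` and `EventType.ERROR`, and `drain` is a hypothetical helper.

```python
import asyncio

# UI reactions keyed by event name, taken from the table above.
UI_ACTIONS = {
    "session_start": "show thinking indicator",
    "task_start": "show progress",
    "task_tool_call": "show tool name",
    "result": "append text to chat bubble",
}
TERMINAL = {"session_end", "error"}  # assumed names for the closing events


async def drain(queue: asyncio.Queue) -> list:
    """Consume events until a terminal one arrives; return the UI actions taken."""
    actions = []
    while True:
        event = await queue.get()
        if event["type"] in TERMINAL:
            return actions
        # Unknown event types are ignored so new events do not break the UI.
        action = UI_ACTIONS.get(event["type"])
        if action is not None:
            actions.append(action)


async def demo() -> list:
    queue = asyncio.Queue()
    for name in ("session_start", "task_start", "workspace", "result", "session_end"):
        queue.put_nowait({"type": name})
    return await drain(queue)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Ignoring unrecognised types is a design choice for forward compatibility: a frontend written against eight events keeps working when the backend emits more.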

Features

Complete capability matrix — orchestration, MCP, providers, learning, security.

Core Orchestration

Feature	Description
Fan-out / fan-in	Dependency DAG; independent tasks run in parallel
Three execution modes	`adaptive` (LLM free-form), `pinned` (locked topology), `scripted` (Python handler)
Cycle detection	Task graph compiler rejects cyclic graphs before execution starts
Topological execution	Tasks run as soon as their dependencies complete
Intent Gate	Heuristic → LLM cascade routes chat turns directly; emits `IntentClassifiedEvent`
`interaction_mode`	`interactive` for chat/CLI; `rpc` for MCP/automation, which never blocks on clarifications
Smart synthesis	Keyword-grep excerpts (Tier 1) + concurrent LLM summaries (Tier 2) before synthesis pass
Clarification support	Agent can pause and ask follow-up questions via `ClarificationEvent`
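Cycle detection and topological execution can both be illustrated with the standard library's `graphlib`. This is a sketch of the general technique, not Cortex's task graph compiler; `plan_waves` is a hypothetical helper that groups tasks into waves that could run in parallel.

```python
from graphlib import CycleError, TopologicalSorter


def plan_waves(depends_on: dict) -> list:
    """Group tasks into waves; every task in a wave can run in parallel.

    Raises ValueError if the graph contains a cycle.
    """
    sorter = TopologicalSorter(depends_on)
    try:
        sorter.prepare()  # raises CycleError before anything executes
    except CycleError as exc:
        raise ValueError(f"cyclic task graph: {exc.args[1]}") from exc

    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


graph = {"a": [], "b": [], "c": ["a", "b"], "report": ["c"]}
print(plan_waves(graph))  # [['a', 'b'], ['c'], ['report']]
```

Calling `prepare()` up front is what makes the check happen before execution starts: a cyclic graph is rejected without running a single task.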

LLM Providers (8 built-in)

Provider	Config value	Default env var
Anthropic	`anthropic`	`ANTHROPIC_API_KEY`
OpenAI	`openai`	`OPENAI_API_KEY`
Google Gemini	`gemini`	`GEMINI_API_KEY`
xAI Grok	`grok`	`XAI_API_KEY`
Mistral AI	`mistral`	`MISTRAL_API_KEY`
DeepSeek	`deepseek`	`DEEPSEEK_API_KEY`
AWS Bedrock	`bedrock`	AWS credentials
Azure AI	`azure_ai`	`AZURE_AI_API_KEY`
Local runtime	`local`	optional
Custom	`custom`	provide Python dotted path

Per-task model routing: override the default model for specific task types via task_types[n].llm_provider, or enable Adaptive Model Routing (AMR) to let the decomposer select the LLM based on task complexity.

Model Context Protocol (MCP)

Feature	Description
SSE transport	Connect to remote MCP servers over Server-Sent Events
stdio transport	Spawn MCP servers as subprocesses; full JSON-RPC discovery
streamable-HTTP	Full MCP 1.x streamable HTTP support
Publish as MCP server	Export your Cortex agent as a live MCP server for other agents to call

Streaming Events (18 types)

Event class	Key fields
`StatusEvent`	message, session_id, event_type, metadata
`ResultEvent`	content, partial, validation_score, metadata
`ClarificationEvent`	question, options, clarification_id
`IntentClassifiedEvent`	intent_mode, confidence, reasoning
`TaskBlueprintEvent`	tasks, waves — full DAG before execution
`TaskToolCallEvent`	task_id, task_name, tool_name, tool_input
`WorkspaceEvent`	action, path, is_dir
`FileOutputEvent`	filename, mime_type, size_bytes
`SessionTokenUsageEvent`	input_tokens, output_tokens, cache tokens
`LearningEvent`	action, complexity_score, validation_score

Quality & Validation

Feature	Description
Composite scoring	Every response scored on intent match, completeness, coherence
Configurable threshold	Set a minimum acceptable score (hard floor: 0.60)
Per-session report	Returned on `SessionResult.validation_report`
Model override	Run validation with a different model than task execution
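To make the scoring rule concrete, here is a minimal sketch. The three dimensions and the 0.60 floor come from the table; the equal weights, and the reading of "hard floor" as a lower bound on the configured threshold, are assumptions. `composite_score` and `passes` are hypothetical names.

```python
HARD_FLOOR = 0.60  # the documented minimum for the threshold

# Equal weights are an assumption; the source does not say how the
# three dimensions are combined.
WEIGHTS = {"intent_match": 1 / 3, "completeness": 1 / 3, "coherence": 1 / 3}


def composite_score(scores: dict) -> float:
    """Weighted mean of the three dimension scores, each in [0, 1]."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def passes(scores: dict, threshold: float) -> bool:
    """Apply the configured threshold, never going below the hard floor."""
    effective = max(threshold, HARD_FLOOR)
    return composite_score(scores) >= effective


sample = {"intent_match": 0.9, "completeness": 0.6, "coherence": 0.75}
print(round(composite_score(sample), 2))  # 0.75
print(passes(sample, threshold=0.5))      # True: 0.75 clears the 0.60 floor
```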

Autonomic Learning

Feature	Description
Signal-driven gate	Fires automatically when TaskComplexityScorer + validation score clear thresholds. No consent prompt.
Draft blueprints	On first stage, a draft blueprint is seeded so guidance accumulates before the task type is promoted
Distinct-principal accumulation	Promotion requires 3 distinct principals (configurable)
Auto-apply mode	Default on — deltas promote once confidence accumulates. Flip `auto_apply_delta: false` for manual review.
Human-in-the-loop	`cortex delta review` · `delta apply` · `delta rollback`

Security

Feature	Description
Input sanitisation	Prompt injection mitigation on user inputs
Credential scrubbing	Redacts secrets from logs and event streams
WorkspaceBash	Workspace-scoped file/command execution with mandatory HITL before any mutating operation
Code sandbox	Runs code-execution tasks in a sandboxed subprocess
Session ownership	Session resume gated by original user_id
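Credential scrubbing is typically a pattern-based rewrite applied before text reaches logs or event streams. The sketch below shows the idea only: the two regular expressions and the `scrub` helper are illustrative assumptions, not the patterns Cortex ships.

```python
import re

# Illustrative patterns only; a real scrubber would cover many more formats.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{8,}"),            # API-key-like tokens
    re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._-]+"),  # Authorization headers
]


def _redact(match: re.Match) -> str:
    # Keep a captured prefix such as "Bearer " so the log stays readable.
    prefix = match.group(1) if match.groups() else ""
    return prefix + "[REDACTED]"


def scrub(text: str) -> str:
    """Replace anything that looks like a secret with a fixed marker."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(_redact, text)
    return text


print(scrub("calling API with key sk-ant-abc123XYZ789"))
# calling API with key [REDACTED]
```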

Developer Tooling

Tool	What it does
Setup wizard	Browser-based `cortex.yaml` generator at `localhost:7799`
Config Studio	`cortex config-ui` — browser UI to inspect/edit cortex.yaml, blueprints, staged deltas
Dry-run validation	`cortex dry-run` validates config and task graph without LLM calls
Hot-reload dev mode	`cortex dev --watch` applies config changes live
Session replay	`cortex replay` shows request, response, task outcomes, validation report
Mock LLM client	`cortex.testing.MockLLMClient` for unit tests without API calls

Use Cases

Real-world scenarios — every config drawn from the validated UAT suite.

Validated Industry Use Cases

These examples are drawn directly from the framework's User Acceptance Test suite — every config, task description, and assertion was executed against a live LLM and passed.

Education — Adaptive Tutoring

Multi-task pipeline (assess → explain → exercise → solution), code sandbox execution, multi-student session isolation, history persistence.

task_types:
  - name: assess_prior_knowledge
    description: Assess the student's prior knowledge
  - name: explain_concept
    description: Provide clear, age-appropriate explanation
  - name: generate_practice_exercise
    description: Create a practical coding exercise
  - name: write_solution_code
    description: Write a complete, runnable Python solution
    depends_on: [generate_practice_exercise]

Healthcare — Clinical Triage

Urgency classification for acute neurological symptoms, cardiac triage, diabetes crisis management. Validation gate enforces clinical quality floor (≥ 0.65).

validation:
  enabled: true
  threshold: 0.65
history:
  enabled: true
  retention_days: 365
learning:
  require_user_identity: true  # only from authenticated clinicians

Financial Analysis — Portfolio Risk

Four-task pipeline: risk assessment → VaR computation (Python sandbox) → rebalancing recommendations → executive report. Token usage accounting per session.

Software Engineering — Design → Implement → Test → Execute

BST implementation pipeline with code sandbox execution, CI failure analysis, automated fix recommendations, code store persistence.

code_sandbox:
  enabled: true
  timeout_seconds: 120
  allow_network: false
agent:
  concurrency:
    max_parallel_tasks: 2  # execute + test run in parallel

Legal & Compliance — Contract Risk

Clause extraction → risk identification (CRITICAL/HIGH/MEDIUM/LOW, GDPR detection) → negotiation redlines → risk summary. Session completion within defined SLA.

Cross-Domain Policy Analysis (Z5 Crown Jewel)

G20 policy analyst: 3 parallel domain analyses (Education, Finance, Healthcare) → Python AI Equity Index → government policy brief. All 5 tasks completed, learning engine fired, history written with full metadata.

agent:
  concurrency:
    max_parallel_tasks: 3  # three domain analyses run simultaneously

Architecture Patterns

If you need…	Usage mode
A chat UI for end users	Synapse UI (`cortex publish ui`)
An agent other agents call as a tool	MCP Server (`cortex publish mcp`)
A one-shot CLI tool for devs/ops	CLI (`cortex dev` or Click wrapper)
Batch processing of a job queue	Background worker (Celery/SQS + `fw.run_session()`)
AI feature inside an existing web app	Embedded library (`pip install cortex-agent-framework`)
Multi-tenant production service	Docker + Redis storage
Specialist agents at different scales	Ant Colony (`AntColony.hatch()`)

Configuration

Every cortex.yaml field — agent, LLMs, tools, storage, validation, learning.

Top-level structure

agent:           # Agent identity, concurrency, timeouts, intent gate
llm_access:      # LLM provider routing
task_types:      # Vocabulary of work the agent can do
tool_servers:    # MCP tool server connections
storage:         # Persistence configuration
sqlite:          # SQLite backend settings
redis:           # Redis backend settings
history:         # Session history settings
validation:      # Quality validation settings
learning:        # Delta learning settings
ant_colony:      # Self-spawning specialist agent mesh
tool_forge:      # Runtime MCP server generation
workspace_bash:  # Workspace-scoped file/command execution with HITL
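code_sandbox:    # Sandboxed Python code execution
app_control:     # Launch and drive native desktop applications
playwright_mcp:  # Built-in browser automation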
ui:              # Built-in chat UI settings

Full annotated example

agent:
  name: MyAgent
  description: A helpful AI assistant
  interaction_mode: interactive   # "interactive" | "rpc"
  time:
    default_max_wait_seconds: 120
    default_task_timeout_seconds: 40
  concurrency:
    max_concurrent_sessions: 50
    max_concurrent_sessions_per_user: 3
    max_parallel_tasks: 5
    max_tasks_per_session: 20
  intent_gate:
    enabled: true
    heuristic_confidence_threshold: 0.7
    llm_provider: default
    timeout_seconds: 5.0

llm_access:
  default:
    provider: anthropic       # anthropic|openai|gemini|grok|mistral|deepseek|bedrock|azure_ai|local|custom
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY
    max_tokens: 4096
    temperature: 1.0
    thinking_budget_tokens: 0  # Extended thinking (Anthropic only)
    base_url: null             # For proxies / gateways

tool_servers:
  brave_search:
    transport: sse
    url: http://localhost:8051/sse
  filesystem:
    transport: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/workspace"]

task_types:
  - name: web_research
    description: Search the web for current information
    output_format: md          # text|md|json|file|html|csv|code
    capability_hint: web_search  # auto|llm_synthesis|web_search|bash|code_exec|app_control|browser|document_generation
    timeout_seconds: 60
  - name: analysis
    description: Analyse data and produce structured insights
    output_format: json
    capability_hint: llm_synthesis
    depends_on: [web_research]

storage:
  base_path: ./cortex_storage

sqlite:
  enabled: true
  path: ./cortex_storage/cortex.db
  wal_mode: true

# Or Redis for distributed deployments:
# redis:
#   enabled: true
#   url: ${REDIS_URL}
#   key_prefix: "cortex:prod:"

validation:
  threshold: 0.75            # Min quality score (floor: 0.60)

history:
  enabled: true
  retention_days: 90

learning:
  enabled: true
  validation_threshold: 0.75
  complexity_threshold: 0.6
  auto_apply_delta: true
  auto_apply_min_confidence: medium  # ≥ 3 distinct principals

workspace_bash:
  enabled: true
  hitl_enabled: true         # Cannot be disabled — enforced at runtime

app_control:                 # Launch + drive native desktop applications
  enabled: true
  hitl_enabled: true         # Approve every launch / script / screenshot
  max_vision_steps: 10       # Fallback screenshot loop cap

playwright_mcp:              # Built-in browser automation
  enabled: true
  browser: chromium          # chromium | firefox | webkit
  headless: false            # Persistent session — logins survive runs

interaction_mode

Mode	Behaviour
`interactive`	Default. Chat UIs, CLI, dev mode. Intent Gate routes conversational turns directly. Clarifications allowed.
`rpc`	MCP / automation. Every turn forced to task path. No interactive clarifications — automated callers never hang. Set automatically by `cortex publish mcp`.

Override at runtime: the CORTEX_INTERACTION_MODE environment variable (for example, CORTEX_INTERACTION_MODE=rpc) takes precedence over the value in cortex.yaml.
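The precedence rule can be sketched in a few lines. `resolve_interaction_mode` is a hypothetical helper, and rejecting unknown values is an assumption; the source only states that the environment variable wins over cortex.yaml and that `interactive` is the default.

```python
import os

VALID_MODES = {"interactive", "rpc"}


def resolve_interaction_mode(config: dict, environ=os.environ) -> str:
    """Environment variable first, then cortex.yaml, then the documented default."""
    yaml_value = config.get("agent", {}).get("interaction_mode", "interactive")
    mode = environ.get("CORTEX_INTERACTION_MODE", yaml_value)
    if mode not in VALID_MODES:
        raise ValueError(f"unknown interaction mode: {mode!r}")
    return mode


config = {"agent": {"interaction_mode": "interactive"}}
print(resolve_interaction_mode(config, {"CORTEX_INTERACTION_MODE": "rpc"}))  # rpc
print(resolve_interaction_mode(config, {}))                                  # interactive
```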

Environment Variables

Variable	Description
`ANTHROPIC_API_KEY`	Anthropic provider API key
`OPENAI_API_KEY`	OpenAI provider API key
`GEMINI_API_KEY`	Google Gemini provider API key
`XAI_API_KEY`	xAI Grok provider API key
`MISTRAL_API_KEY`	Mistral AI provider API key
`DEEPSEEK_API_KEY`	DeepSeek provider API key
`AZURE_AI_API_KEY`	Azure AI provider API key
`CORTEX_CONFIG`	Override default config path
`CORTEX_LOG_LEVEL`	Logging level (DEBUG, INFO, WARNING, ERROR)
`CORTEX_INTERACTION_MODE`	Override agent.interaction_mode at runtime

CLI Reference

Every cortex subcommand — setup, dev, dry-run, publish, replay, delta.

cortex --help
cortex <command> --help

cortex setup

cortex setup [--port 7799] [--no-browser]

Browser-based setup wizard at localhost:7799. Walks through agent identity → LLM provider → tool servers → task types → storage → adaptive behaviour → publish mode. Writes a validated cortex.yaml. Re-running loads existing settings; fields that would break existing data are locked.

cortex config-ui

cortex config-ui [--port 7801]

Config Studio — browser UI to inspect and edit cortex.yaml, blueprints, staged deltas, and session metadata.

cortex dev

cortex dev [--config cortex.yaml] [--watch]

Runs Cortex in development mode. --watch enables hot-reload — edit cortex.yaml and changes apply instantly.

cortex dry-run

cortex dry-run [--config cortex.yaml] "REQUEST"

Validates config and compiles the task graph without making any LLM calls. Catches config errors, unreachable tool servers, and broken depends_on references before spending API credits. Use in CI to gate config changes.
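One of the checks described here, catching broken depends_on references, is easy to picture. The sketch below is not the code behind `cortex dry-run`; `broken_dependencies` is a hypothetical helper operating on a `task_types` list shaped like the one in cortex.yaml.

```python
def broken_dependencies(task_types: list) -> list:
    """Return one message per depends_on entry that names an unknown task type."""
    known = {task["name"] for task in task_types}
    problems = []
    for task in task_types:
        for dep in task.get("depends_on", []):
            if dep not in known:
                problems.append(f"{task['name']}: unknown dependency {dep!r}")
    return problems


task_types = [
    {"name": "web_research"},
    {"name": "analysis", "depends_on": ["web_research"]},
    {"name": "report", "depends_on": ["analysys"]},  # typo on purpose
]
print(broken_dependencies(task_types))
# ["report: unknown dependency 'analysys'"]
```

A check like this needs no network access and no LLM calls, which is why it suits a CI gate.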

cortex publish

cortex publish docker [--tag my-agent:latest] [--with-ui]
cortex publish package [--output-dir dist]
cortex publish mcp [--port 8080]
cortex publish ui [--port 8090]

publish mcp automatically sets CORTEX_INTERACTION_MODE=rpc. publish docker --with-ui bundles Cortex Synapse into the image.

cortex replay

cortex replay SESSION_ID --user-id USER_ID

Shows request, task decomposition, task outcomes, token usage, validation score, final response, and duration. Requires history.enabled: true.

cortex delta

cortex delta review              # show staged proposals
cortex delta apply [--min-confidence high]  # write to cortex.yaml
cortex delta reject              # reject a proposal
cortex delta rollback            # restore previous config from .bak

cortex ants

cortex ants list                 # show all live ANTs
cortex ants hatch --name NAME --capability CAP
cortex ants stop --name NAME
cortex ants stop-all
cortex ants status

Other commands

Command	What it does
`cortex spec [--format json]`	Generate capability manifest (JSON or YAML)
`cortex migrate`	Validate config schema compatibility against target version
`cortex --version`	Show installed version

Deployment

Ship to production — Docker, Python package, MCP server, or chat UI.

Deployment Targets

Mode	Consumer	When to use
Docker	End users / services	Production microservice, multi-tenant backend
Package	Python developers	Embed in an existing Django/FastAPI app
MCP server	Other agents	Multi-agent composition, IDE integrations
Chat UI	End users (browser)	Quick demo, internal tool, user-facing chat

Option A: Docker

cortex publish docker --tag my-agent:latest
docker build -f Dockerfile.cortex -t my-agent:latest .
docker run --rm -p 8090:8090 -e ANTHROPIC_API_KEY=your_key my-agent:latest

# With Cortex Synapse chat UI bundled:
cortex publish docker --with-ui --tag my-agent:latest

Production checklist:

Use Redis (not SQLite) for multi-replica deployments.
Pass API keys via -e KEY=val or a secret manager.
Set max_concurrent_sessions to match your instance size.
Use the UI server's /health endpoint for readiness probes.

Option B: Python Package

cortex publish package --output-dir dist
pip install dist/cortex_agent_framework-*.whl

Use when you want to embed Cortex in an existing Python app (Django, FastAPI, Flask) or ship a pre-configured agent to internal users. No separate service to operate.

Option C: MCP Server

cortex publish mcp --port 8080
# → http://localhost:8080/mcp

Exposes your agent as a live MCP tool server. Automatically sets CORTEX_INTERACTION_MODE=rpc so callers never hang on interactive clarifications. Connect from another agent:

tool_servers:
  my_agent:
    url: http://host:8080/mcp
    transport: sse

Option D: Cortex Synapse Chat UI

cortex publish ui --port 8090
# → http://localhost:8090

Single-page web frontend with text + file uploads, live task blueprint display, intent classification badges, workspace event streaming, token usage, full-text history search, and artifact ZIP download. Configure title, host, port, and auth under the ui: block.

Multi-Agent Deployment

Run multiple Cortex agents on one machine — each needs its own directory, its own cortex.yaml, and its own ports.

~/agents/
├── research-agent/
│   ├── cortex.yaml           # MCP port 8081, storage ./storage
│   └── storage/
├── code-review-agent/
│   ├── cortex.yaml           # MCP port 8082, storage ./storage
│   └── storage/
└── orchestrator/
    ├── cortex.yaml           # references 8081 + 8082 as tool_servers
    └── storage/

# Terminal 1
cd ~/agents/research-agent    && cortex publish mcp --port 8081
# Terminal 2
cd ~/agents/code-review-agent && cortex publish mcp --port 8082
# Terminal 3
cd ~/agents/orchestrator      && cortex dev

Never share a SQLite file between running agents. SQLite locks the DB file — two agents pointing at the same sqlite.path will fail. Use Redis with a unique key_prefix per agent for shared storage.

Production with FastAPI + Redis

agent:
  name: ProductionAgent
  concurrency:
    max_concurrent_sessions: 100
    max_concurrent_sessions_per_user: 5

llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY

redis:
  enabled: true
  url: ${REDIS_URL}
  key_prefix: "cortex:prod:"
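The `${REDIS_URL}` value reads like an environment-variable placeholder. How Cortex resolves it is not shown here; the sketch below illustrates one common approach, with `expand_env` as a hypothetical helper that fails loudly when a variable is missing instead of silently leaving the placeholder in place.

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, environ=os.environ) -> str:
    """Replace each ${NAME} with its environment value; fail loudly if unset."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in environ:
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return _PLACEHOLDER.sub(substitute, value)


print(expand_env("${REDIS_URL}", {"REDIS_URL": "redis://cache:6379/0"}))
# redis://cache:6379/0
```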

FAQ

Common gotchas and answers — installation, providers, multi-agent, intent gate.

General

What exactly is Cortex?

A Python library (cortex-agent-framework) that gives you a production-grade multi-step AI agent driven by a cortex.yaml config file or, if you prefer code, the CortexBuilder API. It handles task decomposition, parallel tool execution, MCP integration, streaming, validation, and session persistence so you can focus on your use case instead of rebuilding agent plumbing.

How is this different from LangChain / LlamaIndex / CrewAI / AutoGen?

Configuration-first. Most frameworks require writing Python to define an agent. Cortex defines agents in YAML. Change behavior by editing config, not code.
Fan-out / fan-in as a core primitive. Parallel tool execution with a dependency DAG is first-class, not an advanced feature you build yourself.
MCP-native. Tool servers and agent-to-agent composition both use MCP end-to-end. No bespoke inter-agent protocol.
Opinionated production stack. Session management, validation scoring, delta learning, replay, hot-reload, and deployment targets all ship in the box.

Is Cortex production-ready?

Yes. It has concurrency limits, session persistence with WAL replay, quality validation, typed streaming events, OpenTelemetry hooks, and three storage backends (memory, SQLite, Redis). The test suite covers the core modules and includes the industry acceptance tests described under Use Cases.

What's the license?

MIT. Use it commercially, fork it, modify it, ship it.

Installation & Setup

The setup wizard won't open

Check the port isn't in use: lsof -i :7799
Pass --no-browser and open the URL manually
Use a different port: cortex setup --port 7800

Can I run Cortex without the setup wizard?

Yes. The wizard just generates a YAML file. You can write cortex.yaml by hand; see the Configuration section for every field.

Running Multiple Agents

Can I run multiple Cortex agents on one machine?

Absolutely — that's the designed pattern. Each agent needs its own directory, its own cortex.yaml, and its own ports. The defaults are just defaults — all overridable.

Can two agents share a SQLite database?

No. SQLite locks the DB file; two Cortex processes pointing at the same sqlite.path will fail intermittently. Give each agent its own storage.base_path, or use Redis with a unique key_prefix per agent.

LLM Providers

Can I mix providers in one agent?

llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
  task_overrides:
    cheap_summary:
      provider: deepseek
      model: deepseek-chat
    heavy_reasoning:
      provider: anthropic
      model: claude-opus-4-7
      thinking_budget_tokens: 5000
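How an override might combine with the default can be sketched as a dictionary merge. This is an assumption about the semantics, not documented behaviour: `resolve_llm` is a hypothetical helper, and it assumes keys missing from an override fall back to `llm_access.default`.

```python
def resolve_llm(llm_access: dict, task_type: str) -> dict:
    """Merge a task-specific override on top of the default LLM settings."""
    default = llm_access["default"]
    override = llm_access.get("task_overrides", {}).get(task_type, {})
    return {**default, **override}  # override keys win, the rest fall through


llm_access = {
    "default": {"provider": "anthropic", "model": "claude-sonnet-4-6"},
    "task_overrides": {
        "cheap_summary": {"provider": "deepseek", "model": "deepseek-chat"},
    },
}
print(resolve_llm(llm_access, "cheap_summary")["provider"])  # deepseek
print(resolve_llm(llm_access, "answer")["provider"])         # anthropic
```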

Can I point Cortex at a proxy or gateway (LiteLLM, OpenRouter)?

llm_access:
  default:
    provider: anthropic
    base_url: https://my-gateway.internal/v1
    api_key_env_var: GATEWAY_KEY
    model: claude-sonnet-4-6

Task Graph & Execution

What happens if one task fails?

The task is marked failed, the session continues, and the Primary Agent synthesises what it can from the successful tasks. SessionResult.task_completion reports which tasks succeeded, failed, or timed out.
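The "one failure does not abort the session" behaviour can be sketched with asyncio. This illustrates the pattern only: `run_all` and the three stub coroutines are hypothetical, and the status strings are chosen to mirror the succeeded / failed / timed out wording above.

```python
import asyncio


async def ok() -> str:
    return "done"


async def boom() -> str:
    raise RuntimeError("tool server unreachable")


async def slow() -> str:
    await asyncio.sleep(1.0)
    return "too late"


async def run_all(tasks: dict, timeout: float) -> dict:
    """Run every task; classify each as succeeded, failed, or timed_out."""
    async def guarded(factory) -> str:
        try:
            await asyncio.wait_for(factory(), timeout)
            return "succeeded"
        except asyncio.TimeoutError:
            return "timed_out"
        except Exception:
            return "failed"  # one bad task must not abort the session

    names = list(tasks)
    statuses = await asyncio.gather(*(guarded(tasks[n]) for n in names))
    return dict(zip(names, statuses))


report = asyncio.run(run_all({"a": ok, "b": boom, "c": slow}, timeout=0.05))
print(report)  # {'a': 'succeeded', 'b': 'failed', 'c': 'timed_out'}
```

Catching errors inside each task wrapper, rather than around the whole `gather`, is what lets the remaining tasks finish and be synthesised.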

What's the max parallelism?

Controlled by agent.concurrency.max_parallel_tasks (default 5). Dependencies always take precedence — a task waits for its depends_on regardless of the parallel cap.
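A parallelism cap of this kind is commonly built on a semaphore. The sketch below is an illustration, not Cortex's executor: `run_capped` is a hypothetical helper that runs stub tasks and reports the highest concurrency it observed.

```python
import asyncio


async def run_capped(n_tasks: int, max_parallel: int) -> int:
    """Run n_tasks stub tasks and return the highest concurrency observed."""
    gate = asyncio.Semaphore(max_parallel)
    running = 0
    peak = 0

    async def task() -> None:
        nonlocal running, peak
        async with gate:  # at most max_parallel bodies run at once
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for real work
            running -= 1

    await asyncio.gather(*(task() for _ in range(n_tasks)))
    return peak


print(asyncio.run(run_capped(n_tasks=12, max_parallel=5)))  # 5
```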

Intent Gate & Interaction Modes

Why does "hi" no longer trigger a full task pipeline?

The Intent Gate classified it as a chat turn and sent it through PrimaryAgent.converse() — skipping scout, decomposition, execution, validation, and learning. Turn it off with agent.intent_gate.enabled: false if every turn should decompose.
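The heuristic-then-LLM cascade can be sketched as two classifiers and a confidence threshold. Everything here is illustrative: the small-talk list, `heuristic`, `classify`, and `fake_llm` are hypothetical, and only the 0.7 threshold echoes `heuristic_confidence_threshold` from the annotated config.

```python
SMALL_TALK = {"hi", "hello", "hey", "thanks", "thank you", "bye"}


def heuristic(text: str):
    """Cheap first pass: exact small-talk matches are confident chat turns."""
    normalised = text.strip().lower().rstrip("!.?")
    if normalised in SMALL_TALK:
        return "chat", 0.95
    return "task", 0.5  # not sure, so let the next stage decide


def classify(text: str, llm_fallback, threshold: float = 0.7) -> str:
    """Use the heuristic when it is confident; otherwise ask the fallback."""
    intent, confidence = heuristic(text)
    if confidence >= threshold:
        return intent
    intent, _ = llm_fallback(text)
    return intent


def fake_llm(text: str):
    """Stub standing in for an LLM call; returns (intent, confidence)."""
    return ("task", 0.9) if len(text.split()) > 3 else ("chat", 0.6)


print(classify("Hi!", fake_llm))                        # chat
print(classify("Analyse Q3 revenue trends", fake_llm))  # task
```

The cascade keeps the common case cheap: greetings never reach the fallback, so only ambiguous turns pay for a model call.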

What's the difference between interactive and rpc?

interactive — for chat UIs, CLIs, dev mode. Conversational turns skip the full pipeline. ClarificationEvents are emitted and a human answers.
rpc — for agents exposed as callables. Every turn forced to the task path. Interactive clarifications are suppressed. cortex publish mcp sets this automatically.

Chat UI (Cortex Synapse)

How do I get the built-in chat UI?

cortex publish ui --port 8090
# → http://localhost:8090

Serves Cortex Synapse — a single-page web frontend with text + file uploads, live task blueprint display, intent classification badges, workspace event streaming, token usage, full-text history search, and artifact ZIP download. Configure title, host, port, and auth (none / token / basic) under the ui: block in cortex.yaml.

Stop rebuilding the agent stack.

Ship the agent instead.

Everything an agent needs. Nothing you have to build.

8 LLM Providers + Local

MCP-Native Tools

Fan-out / Fan-in DAG

Code-First Agents

Quality Validation

Autonomic Learning

Ant Colony

Intent Gate

Cortex Synapse Chat UI

4 Deploy Targets

Session Persistence

Security Built-in

Observability

ToolForge

Polyglot Code Sandbox

Native App Control

Built-in Browser Automation

Adaptive Model Routing

Auto-tuned LLM concurrency

Define in YAML. Call in Python.

Code-first agents. LangGraph-style.

Built different.

Built for everyone shipping AI.

Skip 3–6 months of plumbing

Governed agent runtime

Multi-agent meshes at scale

Prototype to production in one file

Swap providers without touching code

Observability out of the box

Proven across every industry.

🎓 Education

🏥 Healthcare

💹 Financial Analysis

🔬 Scientific Research

💻 Software Engineering

⚖️ Legal & Compliance
