Stop rebuilding the agent stack.

Ship the agent instead.

The best frameworks don't constrain your thinking — they eliminate the thinking that shouldn't be yours. Parallel task graphs, self-spawning agents, runtime tool generation, native app control, built-in browser automation, polyglot code sandbox, and signal-driven learning: Cortex absorbs the infrastructure so your team ships the product.

v1.5.0 · MIT · Python 3.11+ · Production-grade
bash
$ pip install cortex-agent-framework
✓ Successfully installed cortex-agent-framework
$ cortex setup   # wizard → localhost:7799
✓ cortex.yaml saved
$ cortex publish ui  # chat UI → localhost:8090
✓ Cortex Synapse running at http://localhost:8090
8 LLM Providers + Local
MCP Native (SSE · stdio · HTTP)
Fan-out / Fan-in DAG
Native App Control
Built-in Browser Automation
Polyglot Code Sandbox
Autonomic Learning
Built-in Chat UI
Quality Validation
Ant Colony
Session Persistence
4 Deploy Targets
OpenTelemetry
Intent Gate
Config Studio
8+ LLM Providers
11 Sandbox Languages
4 Deploy Targets
3 Browser Engines
18 Streaming Event Types
1 YAML File to Configure All

Everything an agent needs. Nothing you have to build.

Every AI team rebuilds the same stack. Cortex ships it pre-built, battle-tested, and driven entirely by config.

🧠

8 LLM Providers + Local

Anthropic, OpenAI, Gemini, Grok, Mistral, DeepSeek, AWS Bedrock, Azure AI — plus Ollama, LM Studio, and vLLM for fully offline runs. Swap providers via config with no code changes.

🔌

MCP-Native Tools

First-class SSE, stdio, and streamable-HTTP MCP tool servers. Dynamic tool discovery at session start. Any Cortex agent becomes an MCP server in one command.

Fan-out / Fan-in DAG

LLM-generated dependency graph with parallel execution. Independent tasks run simultaneously. Topological ordering ensures dependencies always resolve. Cycle detection at compile time.

🐍

Code-First Agents

Build the whole agent in Python with CortexBuilder — no YAML required. The @node decorator wires plain functions in as graph nodes, LangGraph-style; the declared DAG runs verbatim as a static graph.

Quality Validation

Every response scored on intent match, completeness, and coherence. Configurable quality floor. Responses below threshold are flagged and remediated automatically before the pipeline moves on.

🔬

Autonomic Learning

Signal-gated end-of-session evolution. TaskComplexityScorer + validation thresholds gate learning. New task patterns stage as delta proposals. Distinct-principal confirmation gates promotion.

🐜

Ant Colony

Orchestrator self-spawns specialist Cortex agents as MCP servers at runtime. Supervised, health-checked, auto-restarted. ToolForge generates brand-new MCP servers from LLM-written code.

🎯

Intent Gate

A heuristic → LLM cascade classifier routes chat-shaped turns (greetings, small talk) directly to a streaming reply. Only task-shaped turns are decomposed, so conversational turns skip the planning pipeline and its LLM calls.
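
To make the cascade concrete, here is a minimal sketch of the idea, not Cortex's actual classifier. The regex, the word-count rules, and the function names are assumptions made for illustration; the 0.7 default mirrors the heuristic_confidence_threshold value shown in the configuration reference.

```python
import re
from typing import Callable, Tuple

# Hypothetical small-talk patterns; the framework's real heuristics are not shown here.
_CHAT_PATTERN = re.compile(
    r"^\s*(hi|hello|hey|thanks|thank you|bye)\b", re.IGNORECASE
)


def heuristic_classify(text: str) -> Tuple[str, float]:
    """Return (intent, confidence) using cheap string checks only."""
    words = text.split()
    if _CHAT_PATTERN.match(text) and len(words) <= 4:
        return "chat", 0.9
    if len(words) >= 8:
        return "task", 0.8
    # Short, but not obviously small talk: report low confidence.
    return "task", 0.4


def classify_intent(
    text: str,
    llm_classify: Callable[[str], str],
    threshold: float = 0.7,
) -> str:
    """Trust the heuristic when it is confident, otherwise ask the LLM."""
    intent, confidence = heuristic_classify(text)
    if confidence >= threshold:
        return intent
    return llm_classify(text)
```

With this shape, the LLM classifier is only consulted for ambiguous turns, which is where the cost saving in a cascade comes from.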

💬

Cortex Synapse Chat UI

Built-in web frontend with file uploads, live task blueprint display, intent badges, workspace events, token usage tracking, full-text history search, and artifact ZIP download.

🚀

4 Deploy Targets

cortex publish docker · publish package · publish mcp · publish ui. One command per target. Docker, pip wheel, MCP server, or standalone chat UI.

💾

Session Persistence

Memory, SQLite (WAL mode), and Redis backends. Write-ahead log for crash recovery. Resumable sessions. Per-user concurrency limits. Session replay with cortex replay SESSION_ID.

🔒

Security Built-in

Input sanitisation, credential scrubbing, sandboxed code execution, MCP output guard. WorkspaceBash enforces mandatory human-in-the-loop before any mutating file operation.

📊

Observability

OpenTelemetry OTLP exporter, typed event stream (18 event types), token accounting per role, duration tracking, audit logs, anomaly detection, and configurable log levels.

🔨

ToolForge

The decomposer assigns forge_mcp tasks that generate brand-new MCP servers from LLM-written code, write them to disk, and register with Ant Colony at wave boundaries — dependent tasks see the new capability in the same session.

🧪

Polyglot Code Sandbox

One # LANGUAGE: header picks the runtime — Python, Node, TypeScript, Deno, Shell, Ruby, Go, Rust, C, Java, or Kotlin. Per-ecosystem package install (npm, gem, go get) and background-process mode for long-running servers.
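
As a rough illustration of header-based runtime selection, the sketch below reads a leading # LANGUAGE: line and falls back to a default. The function name, the lowercase runtime identifiers, and the Python default are assumptions; only the header text and the list of eleven runtimes come from the description above.

```python
# Runtimes named above, written as lowercase identifiers (the spelling is an assumption).
SUPPORTED = {
    "python", "node", "typescript", "deno", "shell",
    "ruby", "go", "rust", "c", "java", "kotlin",
}

_PREFIX = "# LANGUAGE:"


def detect_language(source: str, default: str = "python") -> str:
    """Pick a runtime from a leading '# LANGUAGE: <name>' header, if present."""
    stripped = source.lstrip()
    first_line = stripped.splitlines()[0] if stripped else ""
    if first_line.upper().startswith(_PREFIX):
        name = first_line[len(_PREFIX):].strip().lower()
        if name not in SUPPORTED:
            raise ValueError(f"unsupported sandbox language: {name!r}")
        return name
    return default
```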

🖥️

Native App Control

Launch and drive desktop apps via AppleScript (macOS), PowerShell + UI Automation (Windows), or a screenshot vision loop when no scripting interface exists. Auto-discovers each app's scripting dictionary. Every mutating action gated by HITL.

🌐

Built-in Browser Automation

Playwright MCP starts inside the framework — agents get a browser capability automatically with chromium, firefox, or webkit. Session state persists across runs so logins survive. Zero tool-server wiring required.

🎚️

Adaptive Model Routing

The decomposer grades each task low / medium / high complexity and AMR maps the tier to a provider. Send fast tasks to Haiku, heavy reasoning to Opus — automatically. Per-task overrides always win.

⚙️

Auto-tuned LLM concurrency

No more guessing max_parallel_llm_calls in the setup wizard. The framework picks an initial ceiling from your provider and model (1 for local Ollama, 8 for Haiku, 4 for Opus), and AdaptiveLLMGate then self-tunes it at runtime via AIMD (additive increase, multiplicative decrease) on observed latency and errors.
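
AIMD itself is a small algorithm. The sketch below shows the general technique under assumed parameters; the class name, the floor and ceiling, and the step sizes are hypothetical and are not taken from AdaptiveLLMGate.

```python
from dataclasses import dataclass


@dataclass
class AIMDLimit:
    """Additive-increase / multiplicative-decrease concurrency limit."""

    limit: float = 4.0      # current ceiling on parallel LLM calls
    floor: float = 1.0      # never drop below one call in flight
    ceiling: float = 32.0   # assumed upper bound
    increase: float = 1.0   # added after a healthy call
    backoff: float = 0.5    # multiplier applied after an error or slow call

    def on_success(self) -> int:
        """Healthy call: probe for more capacity one step at a time."""
        self.limit = min(self.ceiling, self.limit + self.increase)
        return int(self.limit)

    def on_overload(self) -> int:
        """Error or latency spike: cut the limit sharply."""
        self.limit = max(self.floor, self.limit * self.backoff)
        return int(self.limit)
```

The asymmetry is the point: the limit creeps up slowly while calls succeed and halves as soon as the provider pushes back, so it settles near the capacity actually available.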

Define in YAML. Call in Python.

The entire agent — its identity, LLM, tools, tasks, quality bar, and deployment target — lives in one versioned YAML file.

YAML cortex.yaml
agent:
  name: ResearchAgent
  description: Searches the web and writes reports

llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY

task_types:
  - name: web_research
    capability_hint: web_search
    output_format: md
  - name: write_report
    capability_hint: document_generation
    depends_on: [web_research]

validation:
  threshold: 0.75

learning:
  enabled: true
  auto_apply_delta: true

Python app.py
import asyncio

from cortex.framework import CortexFramework


async def main():
    framework = CortexFramework("cortex.yaml")
    await framework.initialize()

    result = await framework.run_session(
        user_id="user_1",
        request="Research the latest vector DB "
                "benchmarks and write a report",
        event_queue=asyncio.Queue(),
    )

    print(result.response)
    # validation, token usage, task completion:
    # all available on result.*


# await is only valid inside a coroutine, so run main() in an event loop
asyncio.run(main())

# Fan-out, tool calls, dependency resolution,
# synthesis, validation — all handled.

Code-first agents. LangGraph-style.

Prefer code to config? Build the whole agent with CortexBuilder and wire in code nodes — plain Python functions as graph nodes. No YAML, no decomposition LLM call: the DAG you declare is the plan.

Python build the agent
from cortex import CortexBuilder, CortexFramework

agent = CortexBuilder("ResearchAgent",
                      "Searches the web, writes reports")
agent.llm("anthropic", model="claude-sonnet-4-6",
          api_key_env="ANTHROPIC_API_KEY")
agent.tool_server("brave",
                  url="http://localhost:9000/sse")

@agent.node()
async def web_research(ctx):
    return await ctx.call_tool(
        "brave", "search", query=ctx.request)

@agent.node(depends_on=["web_research"])
async def write_report(ctx):
    return await ctx.llm(
        f"Write a report:\n{ctx.deps['web_research']}")

Python run it
import asyncio


async def main():
    framework = CortexFramework(config=agent.build())
    await framework.initialize()

    # user_id is passed as a string
    result = await framework.run_session(
        "user_1", "Research vector DB benchmarks")

    print(result.response)                      # synthesised
    print(result.node_outputs["write_report"])  # raw node


asyncio.run(main())

# Registering a code node switches the agent to
# static execution — the declared graph runs
# verbatim. You still keep the wave engine,
# validation gate, retries, streaming, and
# session persistence. Cortex just skips the
# planner. Mix .task() (LLM-routed) and .node()
# (Python) freely in one agent.

Built different.

Not another LangChain wrapper. Cortex is an opinionated production stack with MCP end-to-end and config as the first-class primitive.

Capability | Cortex | Typical frameworks
Configuration | Single cortex.yaml, or a Python CortexBuilder | Scattered code, env vars, multiple files
Task orchestration | LLM-generated DAG, or a hand-authored static DAG of code nodes | Sequential chain or hand-coded state machine
Tool protocol | Native MCP (SSE, stdio, streamable-HTTP) | Custom tool wrappers per integration
Multi-agent | Any agent becomes an MCP tool in one command | Bespoke inter-agent protocols
Self-expanding mesh | Ant Colony + ToolForge generate agents at runtime | Static tool lists, no self-expansion
Intent routing | Heuristic → LLM cascade; small talk skips pipeline | Same path for every turn
Quality gates | Built-in validation agent with scoring + remediation | Manual testing or nothing
Learning | Autonomic gate → delta proposals + draft blueprints | Prompt tweaking by hand
LLM providers | 8 cloud + local runtime, swap via config | Usually 1–2, hard-coded
Deployment | publish docker/package/mcp/ui, one command each | Write your own Dockerfile
Chat UI | Cortex Synapse: full frontend, built-in | Build your own or use a third-party tool
Setup | Visual wizard + CLI | Read docs, write boilerplate

Built for everyone shipping AI.

From solo developers to enterprise architects. If you're building AI agents, Cortex gets you to production faster.

Startup Founder

Skip 3–6 months of plumbing

Get a production-grade agent runtime in an afternoon. Focus on your domain, not the orchestration layer that every AI team rebuilds from scratch.

Platform Team

Governed agent runtime

Consistent runtime with audit trails, quality gates, and per-user isolation for every product team. One framework, one operational model.

Enterprise Architect

Multi-agent meshes at scale

Independent scaling per agent, configurable security, compliance-friendly session encryption, and delegation chains for agent-to-agent provenance.

Solo Developer

Prototype to production in one file

One cortex.yaml. No framework-of-the-month to learn. Run cortex setup, edit config, and cortex publish ui — you have a working agent.

Researcher

Swap providers without touching code

Change LLM providers, models, and tools from config. Run experiments at scale. Compare outputs across providers without rewriting integration code.

MLOps Engineer

Observability out of the box

Validation scores, session replay, token accounting, autonomic learning telemetry, OpenTelemetry hooks, and typed event stream — all built in.

Proven across every industry.

Each use case below is drawn from the framework's UAT suite — run against a live LLM and verified.

🎓 Education

Adaptive tutoring with multi-task pipelines: assess prior knowledge → explain concept → generate exercise → write and run solution code. Multi-student session isolation.

assess_prior_knowledge explain_concept write_solution_code

🏥 Healthcare

Clinical triage: symptom analysis → urgency classification → care pathway → clinical handoff summary. Validation gate enforces quality floor (≥ 0.65). Full-year audit trail.

symptom_analysis urgency_classification care_pathway

💹 Financial Analysis

Portfolio risk assessment, VaR computation via Python sandbox, rebalancing recommendations, and executive report — all in one four-task pipeline with token usage accounting.

portfolio_risk_assessment compute_var_metrics executive_report

🔬 Scientific Research

Climate literature synthesis → Python temperature projection model → structured research report. Multi-task dependency chains with numerical code execution.

literature_synthesis build_projection_model research_report

💻 Software Engineering

Plan → implement → execute → test. Full BST pipeline with code sandbox. CI failure analysis, automated fix recommendations, and code store persistence across sessions.

plan_solution implement_solution execute_and_validate write_test_suite

⚖️ Legal & Compliance

Contract risk analysis: clause extraction → risk identification (CRITICAL/HIGH/MEDIUM/LOW with GDPR detection) → redline language → one-page risk summary for legal partners.

clause_extraction risk_identification negotiation_recommendations

Everything you need to ship.

Full reference for every feature, every config key, every CLI command — all here.

📖

Overview

Understand what Cortex is, who it's for, and why it exists.

What is Cortex?

Cortex is a production-grade AI agent framework for Python. You define an agent — its identity, LLM, tools, task types, quality bar, and deployment target — in a single cortex.yaml file, or in pure Python with CortexBuilder. Cortex handles everything else: decomposing user requests into parallel task graphs, calling MCP tool servers, streaming live progress, scoring response quality, persisting sessions, and deploying as Docker, a Python package, an MCP server, or a ready-made chat UI.

# Run inside an async function: await is not valid at module top level.
framework = CortexFramework("cortex.yaml")
await framework.initialize()
result = await framework.run_session(user_id="u1", request="Analyse Q3 revenue")
# That's the integration. Everything else is handled.

Or build it in code — and wire plain Python functions in as graph nodes:

agent = CortexBuilder("MyAgent", "...").llm("anthropic", api_key_env="ANTHROPIC_API_KEY")

@agent.node()
async def step(ctx):
    return await ctx.llm(ctx.request)

framework = CortexFramework(config=agent.build())   # no YAML file

Why teams choose Cortex

Skip months of framework engineering

Every AI team eventually builds the same stack: task decomposition, parallel tool execution, streaming events, retry logic, session management, quality scoring, multi-provider routing, deployment. Most teams rebuild it two or three times before shipping. Cortex is that stack. Pre-built. Battle-tested. Driven by config, not code.

Go from idea to running agent in minutes

pip install cortex-agent-framework
cortex setup            # visual wizard at localhost:7799
cortex publish ui       # Cortex Synapse chat UI at localhost:8090

Three commands. You have a working agent with Cortex Synapse — a professional web frontend with task blueprint display, intent classification indicators, live workspace events, token usage tracking, full-text history search, and artifact downloads.

Change behavior without changing code

LLM provider? YAML. Task types? YAML. Concurrency limits? YAML. Validation threshold? YAML. Tool servers? YAML. Your Python code stays a thin wrapper — the agent's behavior lives in cortex.yaml, versioned, diffable, reviewable. Prefer code? CortexBuilder assembles the same config in Python, and @node functions become graph nodes for fully deterministic, LangGraph-style control.

Multi-agent composition for free

Any Cortex agent can be published as an MCP server in one command. Another Cortex agent adds it to tool_servers and calls it like any tool. Standard MCP end-to-end, nothing custom.

Orchestrator → Research Agent (MCP :8081) → brave-search, wikipedia
             → Code Review Agent (MCP :8082) → github, filesystem
             → Writing Agent (MCP :8083) → document-gen

Your agent gets smarter over time

The autonomic Learning Engine fires automatically at end-of-session when complexity and validation scores clear their thresholds. New task patterns stage as delta proposals; once three distinct principals confirm the same pattern it auto-promotes into cortex.yaml. Blueprints capture workflow knowledge in versionable markdown, loaded into context on every run.
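
The distinct-principal rule can be pictured as a set per staged pattern. This is a sketch of that bookkeeping only, with a hypothetical DeltaStaging class; how Cortex stores proposals or identifies principals is not described here.

```python
from collections import defaultdict


class DeltaStaging:
    """Track which principals have confirmed each staged task pattern."""

    def __init__(self, required_principals: int = 3):
        self.required = required_principals
        self._confirmations = defaultdict(set)

    def confirm(self, pattern: str, principal: str) -> bool:
        """Record a confirmation; return True once the pattern may be promoted."""
        # A set means repeat confirmations from the same principal count once.
        self._confirmations[pattern].add(principal)
        return len(self._confirmations[pattern]) >= self.required
```

Counting distinct principals rather than raw confirmations keeps one enthusiastic user from promoting a pattern alone.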

What Cortex is not

  • Not a low-code builder. It's a Python library — drive it with cortex.yaml or the CortexBuilder API. Config replaces boilerplate, not code.
  • Not an LLM gateway. Bring your own API key.
  • Not a vector database. It calls MCP tools that do RAG — it doesn't implement retrieval itself.
  • Not a web framework. Cortex runs inside FastAPI/Django/Flask/Click.

The 60-second pitch

You describe your agent — in YAML or in Python. Cortex gives you:

  • Automatic task decomposition — LLM breaks requests into a typed dependency graph
  • Code-first option — build the agent in Python; @node functions become a static DAG
  • Parallel execution — independent tasks run simultaneously, not sequentially
  • Intent Gate — chat turns skip the full pipeline; only task-shaped turns decompose
  • MCP tool servers — connect any tool with three lines of YAML
  • 8 cloud LLM providers + local runtime — switch models without code changes
  • Response validation — every output scored; regressions caught automatically
  • Autonomic learning — signal-gated evolution; new patterns stage themselves
  • Blueprints — reusable workflow knowledge that makes the agent better over time
  • Streaming events — 18 typed event types for any UI (SSE, WebSocket, CLI)
  • 4 deployment targets — Docker, Python package, MCP server, Cortex Synapse chat UI
  • Visual setup wizard — configure everything from a browser, no docs required
  • Security built-in — input sanitisation, credential scrubbing, sandboxed code execution

Define once. Deploy anywhere. Let it learn.

🚀

Quick Start

Build, configure, and deploy your first Cortex agent in 5 minutes.

Quick Start (5 minutes)

1. Install

pip install cortex-agent-framework
cortex --help  # lists: setup, dev, dry-run, publish, spec, replay, delta, migrate, ants

2. Hello World (no external tools needed)

# cortex.yaml
agent:
  name: HelloAgent
  description: A minimal Cortex agent

llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY
    max_tokens: 2048

task_types:
  - name: answer
    description: Answer a user question
    output_format: md
    capability_hint: llm_synthesis

storage:
  base_path: ./cortex_storage

export ANTHROPIC_API_KEY=sk-ant-...
cortex dry-run "Explain gradient descent in two sentences"
cortex dev

Tip: web_search is a built-in capability. It works out of the box via DuckDuckGo with no API key; just add a task type with capability_hint: web_search.

3. Run the Setup Wizard

cortex setup  # → http://localhost:7799

Step | What you configure
Agent Identity | Name, description, interaction mode
LLM Provider | Model, API key; cloud or local (Ollama/LM Studio/vLLM)
Tool Servers | MCP integrations for external capabilities
Task Types | What your agent can do
Storage & Persistence | Memory / SQLite / Redis, retention, encryption
Adaptive Behaviour | Learning engine, validation, blueprints
Publish Mode | Docker, package, MCP, Chat UI

4. The core integration

import asyncio

from cortex.framework import CortexFramework


async def main():
    framework = CortexFramework("cortex.yaml")
    await framework.initialize()

    result = await framework.run_session(
        user_id="user_123",
        request="Analyse Q3 revenue trends",
        event_queue=asyncio.Queue(),
    )

    print(result.response)
    print(result.validation_report.composite_score)
    print(result.token_usage)
    print(result.duration_seconds)


# await needs a running event loop, so drive main() with asyncio.run
asyncio.run(main())

Usage Modes

Chat UI (FastAPI + SSE)

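import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

# ResultEvent and EventType are Cortex event types; import them from the
# framework (the exact import path is not shown in this guide).
# `framework` is an initialised CortexFramework, as in "The core integration".

app = FastAPI()


@app.post("/chat")
async def chat(body: dict):
    queue = asyncio.Queue()
    # Run the session in the background; events arrive on the queue
    asyncio.create_task(
        framework.run_session(
            user_id=body["user_id"],
            request=body["message"],
            event_queue=queue,
        )
    )
    async def stream():
        while True:
            event = await queue.get()
            payload = {"type": event.event_type.value}
            if isinstance(event, ResultEvent):
                payload["content"] = event.content
            yield f"data: {json.dumps(payload)}\n\n"
            # Stop streaming once the session finishes or fails
            if event.event_type in (EventType.SESSION_END, EventType.ERROR):
                break
    return StreamingResponse(stream(), media_type="text/event-stream")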

MCP Server (Agent-to-Agent)

cortex publish mcp --port 8081
# Other agents connect:
#   tool_servers:
#     research:
#       url: http://localhost:8081/mcp
#       transport: sse

CLI Tool

import asyncio

import click

from cortex.framework import CortexFramework


@click.command()
@click.argument("request")
def run(request):
    asyncio.run(_run(request))

async def _run(request):
    fw = CortexFramework("cortex.yaml")
    await fw.initialize()
    q = asyncio.Queue()
    result = await fw.run_session("cli_user", request, q)
    print(result.response)
    await fw.shutdown()


if __name__ == "__main__":
    run()

Background Worker

async def process_job(job: dict) -> str:
    q = asyncio.Queue()
    result = await framework.run_session(
        user_id=job["user_id"],
        request=job["prompt"],
        event_queue=q,
    )
    return result.response

Architecture

User Request
     │
     ▼
[Primary Agent]  ──── decomposes → task graph ────► [Task A]  [Task B]
                                                          │         │
                                                    [MCP Agent] [MCP Agent]
                                                          │         │
                                                    [Task C  depends on A+B]
                                                          │
                                                  [Primary Agent synthesises]
                                                          │
                                                  [Validation Agent scores]
                                                          │
                                                  [Learning Engine observes]
                                                          │
                                                    Final Response

Streaming Events

Event | What the UI does
session_start | Show "thinking..." indicator
intent_classified | Show chat vs task routing decision
task_blueprint | Render the full DAG before execution starts
task_start | Show progress ("Searching web...", "Analysing...")
task_tool_call | Show which MCP tool is being invoked
result (partial) | Stream text into the chat bubble
file_output | Show download link for agent-produced file
session_token_usage | Display cumulative token counters
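
A UI consumer for this table is essentially a lookup plus a stop condition. The sketch below uses plain dicts as stand-ins for Cortex event objects; the "session_end" and "error" string values and the action labels are assumptions made for the example.

```python
import asyncio

# Event type -> UI reaction, following the table above (labels are illustrative).
UI_ACTIONS = {
    "session_start": "show thinking indicator",
    "intent_classified": "show routing decision",
    "task_blueprint": "render DAG",
    "task_start": "show progress",
    "task_tool_call": "show tool name",
    "result": "append text to chat bubble",
    "file_output": "show download link",
    "session_token_usage": "update token counters",
}
TERMINAL = {"session_end", "error"}  # assumed names for the terminal events


async def drain(queue: asyncio.Queue) -> list:
    """Map queued events to UI actions until a terminal event arrives."""
    actions = []
    while True:
        event = await queue.get()
        kind = event["type"]
        if kind in TERMINAL:
            return actions
        actions.append(UI_ACTIONS.get(kind, "ignore"))


def demo() -> list:
    async def _run():
        queue = asyncio.Queue()
        for kind in ("session_start", "task_start", "result", "session_end"):
            queue.put_nowait({"type": kind})
        return await drain(queue)

    return asyncio.run(_run())
```

Unknown event types fall through to "ignore", so a consumer written this way keeps working if new event types are added.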

Features

Complete capability matrix — orchestration, MCP, providers, learning, security.

Core Orchestration

Feature | Description
Fan-out / fan-in | Dependency DAG; independent tasks run in parallel
Three execution modes | adaptive (LLM free-form), pinned (locked topology), scripted (Python handler)
Cycle detection | Task graph compiler rejects cyclic graphs before execution starts
Topological execution | Tasks run as soon as their dependencies complete
Intent Gate | Heuristic → LLM cascade routes chat turns directly; emits IntentClassifiedEvent
interaction_mode | interactive for chat/CLI, rpc for MCP/automation (rpc never blocks on clarifications)
Smart synthesis | Keyword-grep excerpts (Tier 1) + concurrent LLM summaries (Tier 2) before synthesis pass
Clarification support | Agent can pause and ask follow-up questions via ClarificationEvent
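
Fan-out / fan-in with cycle detection can be sketched with the standard library's graphlib. This illustrates the technique rather than Cortex's graph compiler; plan_waves and has_cycle are hypothetical names, and grouping tasks into waves is a simplification of "run as soon as dependencies complete".

```python
from graphlib import CycleError, TopologicalSorter


def plan_waves(depends_on: dict) -> list:
    """Group tasks into waves; every task in a wave can run in parallel.

    depends_on maps each task name to the tasks it depends on.
    Raises graphlib.CycleError before any wave is produced if the graph is cyclic.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # cycle check happens here, before execution
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


def has_cycle(depends_on: dict) -> bool:
    """Return True if the dependency graph cannot be ordered."""
    try:
        plan_waves(depends_on)
    except CycleError:
        return True
    return False
```

For the graph in the Architecture diagram, plan_waves({"task_a": [], "task_b": [], "task_c": ["task_a", "task_b"]}) yields two waves: A and B together, then C.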

LLM Providers (8 built-in)

Provider | Config value | Default env var
Anthropic | anthropic | ANTHROPIC_API_KEY
OpenAI | openai | OPENAI_API_KEY
Google Gemini | gemini | GEMINI_API_KEY
xAI Grok | grok | XAI_API_KEY
Mistral AI | mistral | MISTRAL_API_KEY
DeepSeek | deepseek | DEEPSEEK_API_KEY
AWS Bedrock | bedrock | AWS credentials
Azure AI | azure_ai | AZURE_AI_API_KEY
Local runtime | local | optional
Custom | custom | provide Python dotted path

Per-task model routing: override the default model for specific task types via task_types[n].llm_provider, or enable Adaptive Model Routing (AMR) to let the decomposer select the LLM based on task complexity.
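
The routing rule described here (tier picks the model, an explicit per-task override wins) fits in a few lines. The tier names come from the feature description; the model labels and the pick_model helper are placeholders, not Cortex identifiers.

```python
from typing import Optional

# Hypothetical tier map; real deployments would name actual provider entries.
TIER_TO_MODEL = {
    "low": "fast-model",
    "medium": "balanced-model",
    "high": "reasoning-model",
}


def pick_model(
    complexity: str,
    override: Optional[str] = None,
    default: str = "balanced-model",
) -> str:
    """Per-task override first, then the complexity tier, then the default."""
    if override:
        return override
    return TIER_TO_MODEL.get(complexity, default)
```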

Model Context Protocol (MCP)

Feature | Description
SSE transport | Connect to remote MCP servers over Server-Sent Events
stdio transport | Spawn MCP servers as subprocesses; full JSON-RPC discovery
streamable-HTTP | Full MCP 1.x streamable HTTP support
Publish as MCP server | Export your Cortex agent as a live MCP server for other agents to call

Streaming Events (18 types)

Event class | Key fields
StatusEvent | message, session_id, event_type, metadata
ResultEvent | content, partial, validation_score, metadata
ClarificationEvent | question, options, clarification_id
IntentClassifiedEvent | intent_mode, confidence, reasoning
TaskBlueprintEvent | tasks, waves (full DAG before execution)
TaskToolCallEvent | task_id, task_name, tool_name, tool_input
WorkspaceEvent | action, path, is_dir
FileOutputEvent | filename, mime_type, size_bytes
SessionTokenUsageEvent | input_tokens, output_tokens, cache tokens
LearningEvent | action, complexity_score, validation_score
LearningEventaction, complexity_score, validation_score

Quality & Validation

Feature | Description
Composite scoring | Every response scored on intent match, completeness, coherence
Configurable threshold | Set a minimum acceptable score (hard floor: 0.60)
Per-session report | Returned on SessionResult.validation_report
Model override | Run validation with a different model than task execution
Model overrideRun validation with a different model than task execution

Autonomic Learning

Feature | Description
Signal-driven gate | Fires automatically when TaskComplexityScorer + validation score clear thresholds. No consent prompt.
Draft blueprints | On first stage, a draft blueprint is seeded so guidance accumulates before the task is promoted
Distinct-principal accumulation | Promotion requires 3 distinct principals (configurable)
Auto-apply mode | On by default; deltas promote once confidence accumulates. Set auto_apply_delta: false for manual review.
Human-in-the-loop | cortex delta review · delta apply · delta rollback

Security

Feature | Description
Input sanitisation | Prompt injection mitigation on user inputs
Credential scrubbing | Redacts secrets from logs and event streams
WorkspaceBash | Workspace-scoped file/command execution with mandatory HITL before any mutating operation
Code sandbox | Code-execution tasks run in a sandboxed subprocess
Session ownership | Session resume gated by original user_id
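
Credential scrubbing is typically a redaction pass applied before text reaches a log or event stream. The patterns below are illustrative assumptions, not the framework's rule set, and a real scrubber would cover more secret formats.

```python
import re

# Illustrative patterns only: an "sk-" style key and simple key=value secrets.
_BARE_KEY = re.compile(r"sk-[A-Za-z0-9_-]{8,}")
_ASSIGNMENT = re.compile(
    r"((?:api[_-]?key|token|password)\s*[=:]\s*)\S+", re.IGNORECASE
)


def scrub(text: str) -> str:
    """Replace anything that looks like a secret with a fixed marker."""
    text = _BARE_KEY.sub("[REDACTED]", text)
    # Keep the field name and separator so the log line stays readable.
    return _ASSIGNMENT.sub(r"\1[REDACTED]", text)
```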

Developer Tooling

Tool | What it does
Setup wizard | Browser-based cortex.yaml generator at localhost:7799
Config Studio | cortex config-ui: browser UI to inspect/edit cortex.yaml, blueprints, staged deltas
Dry-run validation | cortex dry-run validates config and task graph without LLM calls
Hot-reload dev mode | cortex dev --watch applies config changes live
Session replay | cortex replay shows request, response, task outcomes, validation report
Mock LLM client | cortex.testing.MockLLMClient for unit tests without API calls
🎯

Use Cases

Real-world scenarios — every config drawn from the validated UAT suite.

Validated Industry Use Cases

These examples are drawn directly from the framework's User Acceptance Test suite — every config, task description, and assertion was executed against a live LLM and passed.

Education — Adaptive Tutoring

Multi-task pipeline (assess → explain → exercise → solution), code sandbox execution, multi-student session isolation, history persistence.

task_types:
  - name: assess_prior_knowledge
    description: Assess the student's prior knowledge
  - name: explain_concept
    description: Provide clear, age-appropriate explanation
  - name: generate_practice_exercise
    description: Create a practical coding exercise
  - name: write_solution_code
    description: Write a complete, runnable Python solution
    depends_on: [generate_practice_exercise]

Healthcare — Clinical Triage

Urgency classification for acute neurological symptoms, cardiac triage, diabetes crisis management. Validation gate enforces clinical quality floor (≥ 0.65).

validation:
  enabled: true
  threshold: 0.65
history:
  enabled: true
  retention_days: 365
learning:
  require_user_identity: true  # only from authenticated clinicians

Financial Analysis — Portfolio Risk

Four-task pipeline: risk assessment → VaR computation (Python sandbox) → rebalancing recommendations → executive report. Token usage accounting per session.

Software Engineering — Design → Implement → Test → Execute

BST implementation pipeline with code sandbox execution, CI failure analysis, automated fix recommendations, code store persistence.

code_sandbox:
  enabled: true
  timeout_seconds: 120
  allow_network: false
agent:
  concurrency:
    max_parallel_tasks: 2  # execute + test run in parallel

Legal & Compliance — Contract Risk

Clause extraction → risk identification (CRITICAL/HIGH/MEDIUM/LOW, GDPR detection) → negotiation redlines → risk summary. Session completion within defined SLA.

Cross-Domain Policy Analysis (Z5 Crown Jewel)

G20 policy analyst: 3 parallel domain analyses (Education, Finance, Healthcare) → Python AI Equity Index → government policy brief. All 5 tasks completed, learning engine fired, history written with full metadata.

agent:
  concurrency:
    max_parallel_tasks: 3  # three domain analyses run simultaneously

Architecture Patterns

If you need… | Usage mode
A chat UI for end users | Synapse UI (cortex publish ui)
An agent other agents call as a tool | MCP Server (cortex publish mcp)
A one-shot CLI tool for devs/ops | CLI (cortex dev or Click wrapper)
Batch processing of a job queue | Background worker (Celery/SQS + fw.run_session())
AI feature inside an existing web app | Embedded library (pip install cortex-agent-framework)
Multi-tenant production service | Docker + Redis storage
Specialist agents at different scales | Ant Colony (AntColony.hatch())
⚙️

Configuration

Every cortex.yaml field — agent, LLMs, tools, storage, validation, learning.

Top-level structure

agent:           # Agent identity, concurrency, timeouts, intent gate
llm_access:      # LLM provider routing
task_types:      # Vocabulary of work the agent can do
tool_servers:    # MCP tool server connections
storage:         # Persistence configuration
sqlite:          # SQLite backend settings
redis:           # Redis backend settings
history:         # Session history settings
validation:      # Quality validation settings
learning:        # Delta learning settings
ant_colony:      # Self-spawning specialist agent mesh
tool_forge:      # Runtime MCP server generation
workspace_bash:  # Workspace-scoped file/command execution with HITL
code_sandbox:    # Sandboxed code execution (polyglot)
ui:              # Built-in chat UI settings

Full annotated example

agent:
  name: MyAgent
  description: A helpful AI assistant
  interaction_mode: interactive   # "interactive" | "rpc"
  time:
    default_max_wait_seconds: 120
    default_task_timeout_seconds: 40
  concurrency:
    max_concurrent_sessions: 50
    max_concurrent_sessions_per_user: 3
    max_parallel_tasks: 5
    max_tasks_per_session: 20
  intent_gate:
    enabled: true
    heuristic_confidence_threshold: 0.7
    llm_provider: default
    timeout_seconds: 5.0

llm_access:
  default:
    provider: anthropic       # anthropic|openai|gemini|grok|mistral|deepseek|bedrock|azure_ai|local|custom
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY
    max_tokens: 4096
    temperature: 1.0
    thinking_budget_tokens: 0  # Extended thinking (Anthropic only)
    base_url: null             # For proxies / gateways

tool_servers:
  brave_search:
    transport: sse
    url: http://localhost:8051/sse
  filesystem:
    transport: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/workspace"]

task_types:
  - name: web_research
    description: Search the web for current information
    output_format: md          # text|md|json|file|html|csv|code
    capability_hint: web_search  # auto|llm_synthesis|web_search|bash|code_exec|app_control|browser|document_generation
    timeout_seconds: 60
  - name: analysis
    description: Analyse data and produce structured insights
    output_format: json
    capability_hint: llm_synthesis
    depends_on: [web_research]

storage:
  base_path: ./cortex_storage

sqlite:
  enabled: true
  path: ./cortex_storage/cortex.db
  wal_mode: true

# Or Redis for distributed deployments:
# redis:
#   enabled: true
#   url: ${REDIS_URL}
#   key_prefix: "cortex:prod:"

validation:
  threshold: 0.75            # Min quality score (floor: 0.60)

history:
  enabled: true
  retention_days: 90

learning:
  enabled: true
  validation_threshold: 0.75
  complexity_threshold: 0.6
  auto_apply_delta: true
  auto_apply_min_confidence: medium  # ≥ 3 distinct principals

workspace_bash:
  enabled: true
  hitl_enabled: true         # Cannot be disabled — enforced at runtime

app_control:                 # Launch + drive native desktop applications
  enabled: true
  hitl_enabled: true         # Approve every launch / script / screenshot
  max_vision_steps: 10       # Fallback screenshot loop cap

playwright_mcp:              # Built-in browser automation
  enabled: true
  browser: chromium          # chromium | firefox | webkit
  headless: false            # Persistent session — logins survive runs

interaction_mode

| Mode | Behaviour |
| --- | --- |
| interactive | Default. Chat UIs, CLI, dev mode. Intent Gate routes conversational turns directly. Clarifications allowed. |
| rpc | MCP / automation. Every turn forced to task path. No interactive clarifications, so automated callers never hang. Set automatically by cortex publish mcp. |
Override at runtime: the CORTEX_INTERACTION_MODE environment variable (for example, CORTEX_INTERACTION_MODE=rpc) takes precedence over the value in cortex.yaml.
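
The precedence rule can be sketched in a few lines of Python. This is an illustration only, not Cortex's internal code: resolve_interaction_mode and VALID_MODES are hypothetical names, and rejecting unknown values is an assumption.

```python
import os
from collections.abc import Mapping

# The two modes documented above.
VALID_MODES = {"interactive", "rpc"}


def resolve_interaction_mode(
    yaml_value: str = "interactive",
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the effective mode: the env var wins, else the cortex.yaml value."""
    mode = environ.get("CORTEX_INTERACTION_MODE", yaml_value)
    if mode not in VALID_MODES:
        # Assumption: unknown values are rejected rather than silently ignored.
        raise ValueError(f"unknown interaction_mode: {mode!r}")
    return mode
```

Calling resolve_interaction_mode("interactive", {"CORTEX_INTERACTION_MODE": "rpc"}) returns "rpc"; with an empty environment the YAML value is returned unchanged.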

Environment Variables

| Variable | Description |
| --- | --- |
| ANTHROPIC_API_KEY | Anthropic provider API key |
| OPENAI_API_KEY | OpenAI provider API key |
| GEMINI_API_KEY | Google Gemini provider API key |
| XAI_API_KEY | xAI Grok provider API key |
| MISTRAL_API_KEY | Mistral AI provider API key |
| DEEPSEEK_API_KEY | DeepSeek provider API key |
| CORTEX_CONFIG | Override default config path |
| CORTEX_LOG_LEVEL | Logging level (DEBUG, INFO, WARNING, ERROR) |
| CORTEX_INTERACTION_MODE | Override agent.interaction_mode at runtime |

⌨️

CLI Reference

Every cortex subcommand — setup, dev, dry-run, publish, replay, delta.

cortex --help
cortex <command> --help

cortex setup

cortex setup [--port 7799] [--no-browser]

Browser-based setup wizard at localhost:7799. Walks through agent identity → LLM provider → tool servers → task types → storage → publish mode. Writes validated cortex.yaml. Re-running loads existing settings; fields that would break existing data are locked.

cortex config-ui

cortex config-ui [--port 7801]

Config Studio — browser UI to inspect and edit cortex.yaml, blueprints, staged deltas, and session metadata.

cortex dev

cortex dev [--config cortex.yaml] [--watch]

Runs Cortex in development mode. --watch enables hot-reload: edits to cortex.yaml are picked up without restarting the process.

cortex dry-run

cortex dry-run [--config cortex.yaml] "REQUEST"

Validates config and compiles the task graph without making any LLM calls. Catches config errors, unreachable tool servers, and broken depends_on references before spending API credits. Use in CI to gate config changes.
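
To make the depends_on check concrete, here is a minimal standalone sketch of the same idea using Python's standard graphlib module. It is not the implementation behind cortex dry-run; check_task_graph is a hypothetical helper that takes the task_types list as already-parsed dictionaries.

```python
from graphlib import CycleError, TopologicalSorter


def check_task_graph(task_types: list[dict]) -> list[str]:
    """Validate depends_on references and return one valid execution order."""
    names = {task["name"] for task in task_types}
    graph: dict[str, set[str]] = {}
    for task in task_types:
        deps = task.get("depends_on", [])
        missing = [dep for dep in deps if dep not in names]
        if missing:
            raise ValueError(f"{task['name']}: unknown depends_on {missing}")
        graph[task["name"]] = set(deps)
    try:
        # static_order() yields each task only after all of its dependencies.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc
```

For the two task types in the configuration example (analysis depends on web_research), the function returns web_research first and analysis second. A typo in depends_on or a circular dependency raises ValueError before any task runs.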

cortex publish

cortex publish docker [--tag my-agent:latest] [--with-ui]
cortex publish package [--output-dir dist]
cortex publish mcp [--port 8080]
cortex publish ui [--port 8090]

publish mcp automatically sets CORTEX_INTERACTION_MODE=rpc. publish docker --with-ui bundles Cortex Synapse into the image.

cortex replay

cortex replay SESSION_ID --user-id USER_ID

Shows the recorded session's request, task decomposition, task outcomes, token usage, validation score, final response, and duration. Requires history.enabled: true.

cortex delta

cortex delta review              # show staged proposals
cortex delta apply [--min-confidence high]  # write to cortex.yaml
cortex delta reject              # reject a proposal
cortex delta rollback            # restore previous config from .bak

cortex ants

cortex ants list                 # show all live ANTs
cortex ants hatch --name NAME --capability CAP
cortex ants stop --name NAME
cortex ants stop-all
cortex ants status

Other commands

| Command | What it does |
| --- | --- |
| cortex spec [--format json] | Generate capability manifest (JSON or YAML) |
| cortex migrate | Validate config schema compatibility against target version |
| cortex --version | Show installed version |

📦

Deployment

Ship to production — Docker, Python package, MCP server, or chat UI.

Deployment Targets

| Mode | Consumer | When to use |
| --- | --- | --- |
| Docker | End users / services | Production microservice, multi-tenant backend |
| Package | Python developers | Embed in an existing Django/FastAPI app |
| MCP server | Other agents | Multi-agent composition, IDE integrations |
| Chat UI | End users (browser) | Quick demo, internal tool, user-facing chat |

Option A: Docker

cortex publish docker --tag my-agent:latest
docker build -f Dockerfile.cortex -t my-agent:latest .
docker run --rm -p 8090:8090 -e ANTHROPIC_API_KEY=your_key my-agent:latest

# With Cortex Synapse chat UI bundled:
cortex publish docker --with-ui --tag my-agent:latest

Production checklist:

  • Use Redis (not SQLite) for multi-replica deployments.
  • Pass API keys via -e KEY=val or a secret manager.
  • Set max_concurrent_sessions to match your instance size.
  • The UI server exposes /health for readiness probes.
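
A deploy script can wait on /health before routing traffic to a new container. The sketch below uses only the Python standard library. It assumes that an HTTP 200 from /health means "ready", because the response body format is not documented here; wait_until_healthy is a hypothetical helper, not part of Cortex.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll url until it answers HTTP 200 or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Server not up yet, or it returned an error status; retry.
        time.sleep(interval)
    return False
```

For the docker run command above, wait_until_healthy("http://localhost:8090/health") would be the natural call.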

Option B: Python Package

cortex publish package --output-dir dist
pip install dist/cortex_agent_framework-*.whl

Use when you want to embed Cortex in an existing Python app (Django, FastAPI, Flask) or ship a pre-configured agent to internal users. No separate service to operate.

Option C: MCP Server

cortex publish mcp --port 8080
# → http://localhost:8080/mcp

Exposes your agent as a live MCP tool server. Automatically sets CORTEX_INTERACTION_MODE=rpc so callers never hang on interactive clarifications. Connect from another agent:

tool_servers:
  my_agent:
    url: http://host:8080/mcp
    transport: sse

Option D: Cortex Synapse Chat UI

cortex publish ui --port 8090
# → http://localhost:8090

Single-page web frontend with text + file uploads, live task blueprint display, intent classification badges, workspace event streaming, token usage, full-text history search, and artifact ZIP download. Configure title, host, port, and auth under the ui: block.

Multi-Agent Deployment

Run multiple Cortex agents on one machine — each needs its own directory, its own cortex.yaml, and its own ports.

~/agents/
├── research-agent/
│   ├── cortex.yaml           # MCP port 8081, storage ./storage
│   └── storage/
├── code-review-agent/
│   ├── cortex.yaml           # MCP port 8082, storage ./storage
│   └── storage/
└── orchestrator/
    ├── cortex.yaml           # references 8081 + 8082 as tool_servers
    └── storage/

# Terminal 1
cd ~/agents/research-agent    && cortex publish mcp --port 8081
# Terminal 2
cd ~/agents/code-review-agent && cortex publish mcp --port 8082
# Terminal 3
cd ~/agents/orchestrator      && cortex dev

Never share a SQLite file between running agents. SQLite locks the DB file, so two agents pointing at the same sqlite.path will fail. Use Redis with a unique key_prefix per agent for shared storage.
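
A small pre-flight check can catch accidental sharing before any agent starts. The sketch below is a hypothetical helper, not a Cortex command. It assumes relative sqlite.path values are resolved against each agent's own directory, and it takes the values as plain strings so no YAML parser is needed.

```python
from collections import defaultdict
from pathlib import Path


def find_shared_sqlite_paths(agents: dict[str, tuple[str, str]]) -> dict[str, list[str]]:
    """Map each SQLite file used by more than one agent to the agents using it.

    agents maps an agent name to (agent directory, sqlite.path from its cortex.yaml).
    """
    by_path: dict[str, list[str]] = defaultdict(list)
    for name, (agent_dir, sqlite_path) in agents.items():
        # Joining with an absolute sqlite_path keeps the absolute path as-is.
        resolved = (Path(agent_dir) / sqlite_path).resolve()
        by_path[str(resolved)].append(name)
    return {path: names for path, names in by_path.items() if len(names) > 1}
```

An empty result means every agent has its own database file. Any entry in the result names the agents that would collide.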

Production with FastAPI + Redis

agent:
  name: ProductionAgent
  concurrency:
    max_concurrent_sessions: 100
    max_concurrent_sessions_per_user: 5

llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY

redis:
  enabled: true
  url: ${REDIS_URL}
  key_prefix: "cortex:prod:"
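
The ${REDIS_URL} placeholder keeps the connection string out of the file. How Cortex expands it is not shown here; the sketch below illustrates one plausible approach to ${VAR} substitution. expand_env is a hypothetical name, and raising an error when the variable is unset is an assumption.

```python
import os
import re
from collections.abc import Mapping

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, environ: Mapping[str, str] = os.environ) -> str:
    """Replace every ${NAME} in value with environ[NAME]."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in environ:
            # Assumption: fail loudly instead of substituting an empty string.
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return _VAR.sub(substitute, value)
```

Strings without a placeholder, such as the key_prefix value, pass through unchanged.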
💡

FAQ

Common gotchas and answers — installation, providers, multi-agent, intent gate.

General

What exactly is Cortex?

A Python library (cortex-agent-framework) that gives you a production-grade multi-step AI agent driven entirely by a cortex.yaml config file. It handles task decomposition, parallel tool execution, MCP integration, streaming, validation, and session persistence so you can focus on your use case instead of rebuilding agent plumbing.

How is this different from LangChain / LlamaIndex / CrewAI / AutoGen?

  • Configuration-first. Most frameworks require writing Python to define an agent. Cortex defines agents in YAML. Change behavior by editing config, not code.
  • Fan-out / fan-in as a core primitive. Parallel tool execution with a dependency DAG is first-class, not an advanced feature you build yourself.
  • MCP-native. Tool servers and agent-to-agent composition both use MCP end-to-end. No bespoke inter-agent protocol.
  • Opinionated production stack. Session management, validation scoring, delta learning, replay, hot-reload, and deployment targets all ship in the box.

Is Cortex production-ready?

Yes. It has concurrency limits, session persistence with WAL replay, quality validation, typed streaming events, OpenTelemetry hooks, and three storage backends. The test suite covers core modules with industry-validated acceptance tests.

What's the license?

MIT. Use it commercially, fork it, modify it, ship it.

Installation & Setup

The setup wizard won't open

  • Check the port isn't in use: lsof -i :7799
  • Pass --no-browser and open the URL manually
  • Use a different port: cortex setup --port 7800

Can I run Cortex without the setup wizard?

Yes. The wizard just generates a YAML file. You can write cortex.yaml by hand — see the Configuration tab for every field.

Running Multiple Agents

Can I run multiple Cortex agents on one machine?

Yes, that is the intended pattern. Each agent needs its own directory, its own cortex.yaml, and its own ports. The default ports and paths can all be overridden.

Can two agents share a SQLite database?

No. SQLite locks the DB file; two Cortex processes pointing at the same sqlite.path will fail intermittently. Give each agent its own storage.base_path, or use Redis with a unique key_prefix per agent.

LLM Providers

Can I mix providers in one agent?

Yes. Keep a default and add overrides under task_overrides:

llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
  task_overrides:
    cheap_summary:
      provider: deepseek
      model: deepseek-chat
    heavy_reasoning:
      provider: anthropic
      model: claude-opus-4-7
      thinking_budget_tokens: 5000
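
One way to read this configuration is as a merge: a task with an override takes the override's fields, and every other field or task falls back to default. The sketch below illustrates that reading. Whether Cortex inherits unspecified fields from default is an assumption, and resolve_llm_config is a hypothetical name.

```python
def resolve_llm_config(llm_access: dict, task_name: str) -> dict:
    """Return the effective LLM settings for one task.

    Assumption: override fields win, and missing fields are inherited from default.
    """
    default = llm_access["default"]
    override = llm_access.get("task_overrides", {}).get(task_name, {})
    return {**default, **override}


llm_access = {
    "default": {"provider": "anthropic", "model": "claude-sonnet-4-6"},
    "task_overrides": {
        "cheap_summary": {"provider": "deepseek", "model": "deepseek-chat"},
    },
}
```

Here resolve_llm_config(llm_access, "cheap_summary") selects the DeepSeek settings, while a task with no override gets the default entry.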

Can I point Cortex at a proxy or gateway (LiteLLM, OpenRouter)?

Yes. Set base_url to the gateway endpoint:

llm_access:
  default:
    provider: anthropic_compatible
    base_url: https://my-gateway.internal/v1
    api_key_env_var: GATEWAY_KEY
    model: claude-sonnet-4-6

Task Graph & Execution

What happens if one task fails?

The task is marked failed, the session continues, and the Primary Agent synthesises what it can from the successful tasks. SessionResult.task_completion reports which tasks succeeded, failed, or timed out.
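
The "mark it failed and keep going" behaviour can be illustrated with plain asyncio. This is a generic sketch, not Cortex's executor: run_all is a hypothetical function, and the three status strings simply mirror the categories named above.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def run_all(
    tasks: dict[str, Callable[[], Awaitable[object]]],
    timeout: float,
) -> dict[str, str]:
    """Run every task; record a status per task instead of aborting on failure."""

    async def run_one(name: str, fn: Callable[[], Awaitable[object]]) -> tuple[str, str]:
        try:
            await asyncio.wait_for(fn(), timeout)
            return name, "succeeded"
        except asyncio.TimeoutError:
            return name, "timed_out"
        except Exception:
            # Any other error is contained: sibling tasks keep running.
            return name, "failed"

    results = await asyncio.gather(*(run_one(name, fn) for name, fn in tasks.items()))
    return dict(results)
```

Because each task catches its own errors, one broken tool cannot cancel the others, and the caller always gets a complete status map to synthesise from.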

What's the max parallelism?

Controlled by agent.concurrency.max_parallel_tasks (default 5). Dependencies always take precedence — a task waits for its depends_on regardless of the parallel cap.
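
The interaction between the cap and depends_on can be sketched with an asyncio.Semaphore. This is an illustration under stated assumptions, not Cortex's scheduler: run_graph is a hypothetical function, and it expects an acyclic graph in which every dependency is also a key.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def run_graph(
    deps: dict[str, list[str]],
    work: Callable[[str], Awaitable[None]],
    max_parallel: int = 5,
) -> list[str]:
    """Run each task after its dependencies, with at most max_parallel running at once."""
    done = {name: asyncio.Event() for name in deps}
    slots = asyncio.Semaphore(max_parallel)
    finished: list[str] = []

    async def run(name: str) -> None:
        for dep in deps[name]:
            await done[dep].wait()  # Dependencies first, whatever the cap allows.
        async with slots:  # Only then compete for one of the parallel slots.
            await work(name)
        finished.append(name)
        done[name].set()

    await asyncio.gather(*(run(name) for name in deps))
    return finished
```

A task that is still waiting on a dependency holds no slot, so blocked tasks never starve the ones that are ready to run.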

Intent Gate & Interaction Modes

Why does "hi" no longer trigger a full task pipeline?

The Intent Gate classified it as a chat turn and sent it through PrimaryAgent.converse() — skipping scout, decomposition, execution, validation, and learning. Turn it off with agent.intent_gate.enabled: false if every turn should decompose.

What's the difference between interactive and rpc?

  • interactive — for chat UIs, CLIs, dev mode. Conversational turns skip the full pipeline. ClarificationEvents are emitted and a human answers.
  • rpc — for agents exposed as callables. Every turn forced to the task path. Interactive clarifications are suppressed. cortex publish mcp sets this automatically.
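
The routing rules from the two answers above fit in one small function. The sketch below is illustrative only: route_turn and toy_classifier are hypothetical, and the classifier is a stand-in for whatever the Intent Gate actually does.

```python
from collections.abc import Callable


def toy_classifier(text: str) -> str:
    """Stand-in classifier (an assumption): short greetings are chat, the rest is task."""
    greetings = {"hi", "hello", "hey", "thanks"}
    return "chat" if text.strip().lower().rstrip("!.?") in greetings else "task"


def route_turn(
    text: str,
    *,
    mode: str = "interactive",
    gate_enabled: bool = True,
    classify: Callable[[str], str] = toy_classifier,
) -> str:
    """Return "chat" for a direct conversational reply, or "task" for the full pipeline."""
    if mode == "rpc" or not gate_enabled:
        return "task"  # rpc mode and a disabled gate both force the task path.
    return "chat" if classify(text) == "chat" else "task"
```

With the defaults, "hi" is routed to chat. The same input with mode="rpc" or gate_enabled=False goes to the task path.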

Chat UI (Cortex Synapse)

How do I get the built-in chat UI?

cortex publish ui --port 8090
# → http://localhost:8090

Serves Cortex Synapse — a single-page web frontend with text + file uploads, live task blueprint display, intent classification badges, workspace event streaming, token usage, full-text history search, and artifact ZIP download. Configure title, host, port, and auth (none / token / basic) under the ui: block in cortex.yaml.