Cortex ships four deployment targets out of the box: Docker, Python package, MCP server, and Chat UI. Pick based on who’s calling your agent.
| Mode | Consumer | Transport | When to use |
|---|---|---|---|
| Docker | End users / services | HTTP to a running container | Production microservice, multi-tenant backend |
| Package | Python developers | import in-process | Embed in an existing Django/FastAPI app |
| MCP server | Other agents | MCP protocol tool call | Multi-agent composition, IDE integrations |
| Chat UI | End users (browser) | HTTP + SSE | Quick demo, internal tool, user-facing chat |
cortex publish docker --tag my-agent:latest
docker build -f Dockerfile.cortex -t my-agent:latest .
docker run --rm -p 8090:8090 -e ANTHROPIC_API_KEY=your_key my-agent:latest
Pass --with-ui to generate a Dockerfile that runs the built-in Cortex Synapse chat UI on port 8090:
cortex publish docker --with-ui --tag my-agent:latest
docker build -f Dockerfile.cortex -t my-agent:latest .
docker run --rm -p 8090:8090 -e ANTHROPIC_API_KEY=your_key my-agent:latest
# open http://localhost:8090
- Pass secrets with -e KEY=val or a secret manager; never bake them into the image.
- Set max_concurrent_sessions in cortex.yaml to match your instance size.
- Set CORTEX_LOG_LEVEL=INFO (or DEBUG for investigation) and forward container stdout to your log aggregator.
- Configure tracing through the standard OpenTelemetry environment variables (OTEL_EXPORTER_OTLP_ENDPOINT, etc.).
- The container serves /health; use it in your container orchestrator’s readiness probe.

# cortex.yaml
agent:
  name: ProductionAgent
concurrency:
  max_concurrent_sessions: 100
  max_concurrent_sessions_per_user: 5
llm_access:
  default:
    provider: anthropic
    model: claude-sonnet-4-6
    api_key_env_var: ANTHROPIC_API_KEY
redis:
  enabled: true
  url: ${REDIS_URL}
  key_prefix: "cortex:prod:"
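The /health endpoint mentioned above can also gate a deploy or smoke-test script. The sketch below polls it until it answers with HTTP 200. The helper names, the attempt count, and the delay are assumptions for illustration; the response body is not inspected because its format is not documented here.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of a GET on `url`, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 503 while starting).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_until_healthy(url, attempts=10, delay=1.0, probe=http_status, sleep=time.sleep):
    """Poll `url` until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Usage (hypothetical host and port):
#   ok = wait_until_healthy("http://localhost:8090/health")
```

The `probe` and `sleep` parameters are injectable so the loop can be unit-tested without a running container.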
cortex publish package --output-dir dist
pip install dist/cortex_agent_framework-*.whl
Use this when:
Once installed, import and call it directly:
from cortex.framework import CortexFramework
framework = CortexFramework("cortex.yaml")
await framework.initialize()
result = await framework.run_session(user_id="u1", request="Hello")
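The snippet above uses await at the top level, so it needs an event loop around it. A minimal sketch of one way to wrap it, assuming only the initialize() and run_session() calls shown above; run_once and run_once_blocking are hypothetical helper names, and the framework object is passed in rather than constructed here.

```python
import asyncio


async def run_once(framework, user_id: str, request: str):
    """Initialize the framework, run a single session, and return its result."""
    await framework.initialize()
    return await framework.run_session(user_id=user_id, request=request)


def run_once_blocking(framework, user_id: str, request: str):
    """Synchronous entry point for scripts; do not call from inside a running event loop."""
    return asyncio.run(run_once(framework, user_id, request))


# Usage, assuming the package is installed:
#   from cortex.framework import CortexFramework
#   result = run_once_blocking(CortexFramework("cortex.yaml"), "u1", "Hello")
```

Inside an async application such as FastAPI, await run_once directly instead of calling the blocking wrapper, since asyncio.run cannot be nested in a running loop.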
Then start it with:
export ANTHROPIC_API_KEY=your_key
cortex publish ui --config cortex.yaml
# open http://localhost:8090
No new deployment target to operate. Cortex is just a dependency.
cortex publish mcp --config cortex.yaml --port 8080
# MCP server running at http://localhost:8080/mcp
Runs the agent as a live aiohttp HTTP server. Any MCP client — another Cortex agent, Claude Desktop, an IDE, or a custom tool — can call it:
# consumer's cortex.yaml
tool_servers:
  my_specialist_agent:
    transport: sse
    url: http://host:8080/mcp
Or call it directly via REST (convenience alias /run):
curl -X POST http://localhost:8080/run \
-H 'Content-Type: application/json' \
-d '{"input": "Summarise the latest AI news"}'
# → {"output": "..."}
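The same call can be made from Python with only the standard library. This sketch mirrors the curl example: it posts a JSON body with an input field to /run and reads the output field from the reply. The helper names and the timeout are assumptions; error handling for non-200 replies is left out.

```python
import json
import urllib.request


def build_run_request(base_url: str, text: str) -> urllib.request.Request:
    """Build the POST /run request with the same JSON body as the curl call."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_run(base_url: str, text: str, timeout: float = 60.0) -> str:
    """Send the request and return the "output" field of the JSON reply."""
    request = build_run_request(base_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["output"]


# Usage, with the MCP server from above running locally:
#   print(call_run("http://localhost:8080", "Summarise the latest AI news"))
```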
Interaction mode: cortex publish mcp automatically sets CORTEX_INTERACTION_MODE=rpc so the agent never blocks on interactive clarifications — MCP clients cannot answer them.
Use this when:
cortex publish ui --config cortex.yaml
# Cortex chat UI: http://localhost:8090
Serves Cortex Synapse — a fully-featured single-page web frontend backed by your agent. Open http://localhost:8090 in your browser.
- File uploads: governed by the file_input MIME / size limits; mid-session uploads are also supported.
- Authentication: open (auth.mode: none), shared token, or HTTP Basic.

ui:
  enabled: true
  host: "0.0.0.0"
  port: 8090
  title: "My Agent"
  auth:
    mode: none            # none | token | basic
    # token: "s3cret"     # for mode: token
    # username: admin     # for mode: basic
    # password: changeme  # for mode: basic
Configure through the Chat UI section in the setup wizard (cortex setup) or by hand.
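With mode: basic, a headless client has to send credentials on every request. A sketch under the assumption that the server accepts the standard HTTP Basic Authorization header (RFC 7617); how the shared token is transmitted for mode: token is not stated here, so it is left out. The helper name is hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Return the standard HTTP Basic Authorization header for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Usage: merge the result into the headers of any HTTP client request, e.g.
#   headers = {"Content-Type": "application/json", **basic_auth_header("admin", "changeme")}
```

Basic credentials are only encoded, not encrypted, so this is best combined with TLS termination in front of the UI server.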
The UI server also exposes a REST API for headless / programmatic access:
| Endpoint | Method | Description |
|---|---|---|
| /api/session | POST | Start a new session |
| /api/session/{id}/events | GET | SSE stream of events |
| /api/session/{id}/clarify | POST | Answer a HITL clarification |
| /api/session/{id}/upload | POST | Upload additional files mid-session |
| /api/history | GET | List session history |
| /api/history/search?q=... | GET | Full-text search over sessions |
| /api/history/{sid} | GET | Session detail |
| /api/history/{sid}/files/{task}/{name} | GET | Download a task output file |
| /api/history/{sid}/artifacts/zip | GET | Download all outputs as ZIP |
| /api/history/{sid} | DELETE | Delete a session |
| /api/ants/{ant_id} | DELETE | Cancel a running ant task |
| /api/runtime/delta/action | POST | Promote or discard a learning delta |
| /api/services/{service}/launch | POST | Ensure config-ui or wizard is running |
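A headless client that reads /api/session/{id}/events has to decode the Server-Sent Events framing itself. The sketch below handles only the standard framing (field lines, comment lines, blank-line event boundaries). The event names and payload schema that Cortex emits are not documented here, so the parser passes them through as plain strings; parse_sse is a hypothetical helper, not part of Cortex.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE event from an iterable of decoded text lines."""
    fields: Dict[str, str] = {}
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the current event; events without data are dropped.
            if data:
                yield {**fields, "data": "\n".join(data)}
            fields, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the framing allows one optional space after the colon
        if name == "data":
            data.append(value)  # multiple data lines are joined with newlines
        else:
            fields[name] = value
```

To use it against a live stream, open the events URL with a streaming HTTP client and feed the parser decoded lines, for example `(line.decode("utf-8") for line in resp)` with a urllib response object.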
cortex publish docker --with-ui --tag my-agent:latest
docker build -f Dockerfile.cortex -t my-agent:latest .
docker run --rm -p 8090:8090 -e ANTHROPIC_API_KEY=your_key my-agent:latest
# open http://localhost:8090
The generated Dockerfile runs cortex publish ui as its entrypoint and exposes port 8090.
- Enable history (history.enabled: true) so conversations survive page reloads.
- Switch the auth mode from none to token or basic before exposing to the internet.
- Override the bind address and port with cortex publish ui --host 127.0.0.1 --port 9000.
- Browse to localhost even when the server binds to 0.0.0.0.

Cortex is designed for multi-agent composition. Any number of Cortex agents can run on one host or across a cluster; each just needs its own directory, its own cortex.yaml, and its own ports.
| Thing | Default | Per-agent override |
|---|---|---|
| Config file | ./cortex.yaml | --config PATH or CORTEX_CONFIG env var |
| Wizard port | 7799 | cortex setup --port 7800 |
| MCP publish port | 8080 | cortex publish mcp --port 8081 |
| Chat UI port | 8090 | Set ui.port or cortex publish ui --port 9000 |
| Storage base_path | ./cortex_storage | Set storage.base_path in each cortex.yaml |
| SQLite DB path | ./cortex_storage/cortex.db | Set sqlite.path; never share across running agents |
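Because every row above must be unique per running agent, a small pre-flight check can catch accidental sharing before anything starts. The sketch below is illustrative and not part of Cortex: the setting labels (mcp_port, sqlite_path) are hypothetical names chosen for the example, not cortex.yaml keys, and the caller is expected to collect the values from each agent's config.

```python
from collections import defaultdict


def find_collisions(agents):
    """Return {(setting, value): [agent names]} for values used by more than one agent.

    `agents` maps an agent name to a flat dict of the settings to compare,
    e.g. {"mcp_port": 8081, "sqlite_path": "/abs/path/cortex.db"}.
    """
    seen = defaultdict(list)
    for name, settings in agents.items():
        for key, value in settings.items():
            seen[(key, value)].append(name)
    return {key: names for key, names in seen.items() if len(names) > 1}
```

Resolve paths to absolute form before comparing: the defaults are relative (./cortex_storage/cortex.db), so two agents in different directories would otherwise look identical even though their files differ.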
~/agents/
├── research-agent/
│   ├── cortex.yaml        # MCP port 8081, storage ./storage
│   └── storage/
├── code-review-agent/
│   ├── cortex.yaml        # MCP port 8082, storage ./storage
│   └── storage/
└── orchestrator/
    ├── cortex.yaml        # references 8081 + 8082 as tool_servers
    └── storage/
1. Create each sub-agent (each in its own directory, with its own wizard port):
mkdir -p ~/agents/research-agent && cd ~/agents/research-agent
cortex setup --port 7799
mkdir -p ~/agents/code-review-agent && cd ~/agents/code-review-agent
cortex setup --port 7800
mkdir -p ~/agents/orchestrator && cd ~/agents/orchestrator
cortex setup --port 7801
2. In the orchestrator’s cortex.yaml, reference the sub-agents as tool servers and add matching task types:
tool_servers:
  research:
    transport: sse
    url: http://localhost:8081/mcp
  code_review:
    transport: sse
    url: http://localhost:8082/mcp
task_types:
  - name: research
    description: Delegate web research to ResearchAgent
    capability_hint: web_search
    output_format: md
  - name: review_code
    description: Delegate code review to CodeReviewAgent
    capability_hint: auto
    output_format: md
  - name: write_report
    description: Synthesise findings into a final report
    capability_hint: document_generation
    depends_on: [research, review_code]
3. Run all three (separate terminals, or systemd / supervisor / pm2 units):
# Terminal 1
cd ~/agents/research-agent && cortex publish mcp --port 8081
# Terminal 2
cd ~/agents/code-review-agent && cortex publish mcp --port 8082
# Terminal 3
cd ~/agents/orchestrator && cortex dev
4. Drive the orchestrator from your application code:
result = await framework.run_session(
    user_id="dev_1",
    request="Research vector DB benchmarks and review our benchmark script",
)
The orchestrator fans out research and review_code in parallel to the two sub-agents over MCP, waits for both, then runs write_report.
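The ordering described above follows from the depends_on entries: tasks with no unmet dependencies can run together, and a task runs once everything it depends on has finished. The sketch below illustrates that grouping with a plain layered topological sort. It is not Cortex's scheduler, and plan_waves is a hypothetical name.

```python
def plan_waves(tasks):
    """Group tasks into waves; each wave depends only on tasks in earlier waves.

    `tasks` maps a task name to the list of task names it depends on.
    """
    remaining = {name: set(deps) for name, deps in tasks.items()}
    unknown = {dep for deps in remaining.values() for dep in deps} - set(remaining)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")
    done, waves = set(), []
    while remaining:
        # Everything whose dependencies are all finished can run in parallel.
        ready = sorted(name for name, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves
```

Applied to the three task types in the orchestrator config, this puts research and review_code in the first wave and write_report alone in the second.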
When tool_forge, ant_colony, and code_sandbox are all enabled, an orchestrator can instruct the decomposer to generate a new MCP server from code during a session:
ant_colony:
  enabled: true
tool_forge:
  enabled: true
  persist_by_default: false   # session-scoped by default; set true to survive restarts
  spawn_timeout_seconds: 30
code_sandbox:
  enabled: true
A forge_mcp task generates a FastMCP server script, writes it to cortex_storage/ants/<task_name>/server.py, and registers it at the wave boundary — dependent tasks in later waves can use the new capability immediately. Forged servers are supervised by Ant Colony like any hand-hatched ant.
- Agents that share a sqlite.path will fail intermittently. Give each its own storage.base_path.
- Use a distinct Redis key_prefix per agent so sessions don’t collide.
- Keep one directory per agent, each with its own cortex.yaml, storage, and ports.
- Give each wizard its own port: cortex setup --port 7800, --port 7801, etc.
- Number ports sequentially: wizard 7799+N and MCP 8080+N make a mesh readable.
- cortex publish mcp holds the port until the process exits; run lsof -i :8081 to find a lingering PID.
- Use CORTEX_CONFIG for sticky shells. export CORTEX_CONFIG=~/agents/research-agent/cortex.yaml lets you run cortex dev from anywhere targeting that agent.
- The MCP endpoint is /mcp, not /sse. Consumer tool_servers entries should point at http://host:PORT/mcp.

| Need | How |
|---|---|
| One specialist is the bottleneck | Run multiple replicas of that sub-agent behind a load balancer |
| Agents span hosts | Point tool_servers.*.url at the remote hostname instead of localhost |
| Shared session store across replicas | Use Redis with a consistent key_prefix |
| Zero-downtime deploys | Publish each agent as a Docker image and roll them independently |
- Secrets supplied through environment variables, not written into cortex.yaml
- validation.threshold set appropriately for your use case
- learning.auto_apply_confidence: null (human-gated) unless you’ve measured the confidence model
- CORTEX_LOG_LEVEL=INFO in production, DEBUG only for investigation
- max_parallel_llm_calls left unset (auto-derives from provider+model and self-tunes via AdaptiveLLMGate) unless you need to pin it for a hard-rate-limited API
- cortex dry-run wired into CI so bad configs fail at build time
- Chat UI auth mode set to token or basic if exposed beyond localhost