Hub-and-spoke architecture for multi-agent AI orchestration — central MCP API server coordinates Docker-isolated worker agents with persistent SQLite state, vector memory, and Claude Code harness integration.

Agent Swarm follows a hub-and-spoke architecture where a central MCP API server coordinates communication between agents.

System Diagram

Core Components

MCP API Server

The MCP (Model Context Protocol) server is the central coordination point. It:

Exposes tools via the MCP protocol (both STDIO and HTTP transports)
Manages agent registration and task assignment
Stores all state in a SQLite database
Handles integrations (Slack, GitHub, GitLab, AgentMail, Linear webhooks)
Runs the task scheduler for recurring automation

The server runs on port 3013 by default and is implemented in src/http/ (modular route handlers) with tool definitions in src/server.ts.

It also runs three background subsystems:

Scheduler — Polls for scheduled tasks and creates them at the configured interval
Workflow Engine — Redesigned DAG executor with executor registry (8 types), checkpoint durability, webhook/schedule/manual triggers, per-step retry, and version history (see Workflows)
Heartbeat — A lightweight triage module that sweeps the swarm every 90 seconds (configurable via HEARTBEAT_INTERVAL_MS). It uses a 3-tier approach: a preflight gate that bails if the swarm is healthy, code-level triage for routine fixes (stall detection, worker status correction, pool task auto-assignment, stale resource cleanup), and Claude escalation only when ambiguous situations require human reasoning. The lead agent also triggers a heartbeat sweep on startup to immediately detect and recover any stalled tasks from before restart. See Environment Variables for heartbeat configuration.

Lead Agent

The lead agent is the coordinator. It:

Receives incoming tasks from external sources (Slack, GitHub, email)
Has an inbox for triaging incoming messages
Breaks down complex tasks and delegates to workers
Monitors worker progress and provides feedback
Communicates results back to the user
Can inject learnings into worker memories

Worker Agents

Workers are the execution layer. Each worker:

Runs in an isolated Docker container with a full development environment
Uses a configurable AI harness — Claude Code (default) or pi-mono — selected via HARNESS_PROVIDER. See Harness Configuration
Has access to git, Node.js, Python, Bun, and common CLI tools
Executes tasks assigned by the lead agent
Reports progress via the store-progress MCP tool
Can expose HTTP services on port 3000
Learns from each session and builds compounding knowledge

Docker Runtime

Each worker container includes:

Languages: Python 3, Node.js 22, Bun
Build tools: gcc, g++, make, cmake
Process manager: PM2 (for background services)
CLI tools: GitHub CLI (gh), GitLab CLI (glab), sqlite3
Agent tools: wts (git worktree manager), agent-fs (persistent shared filesystem)
Utilities: git, git-lfs, vim, nano, jq, curl, wget, ssh
Runtime user: Agent processes run as the non-root worker user without passwordless sudo

Data Flow

Task Creation

User sends a message (Slack DM, GitHub @mention, email, or API call)
MCP server receives the webhook/request
Lead agent's inbox receives the message
Lead agent triages and creates tasks for workers

Task Execution

Worker polls for or receives a task assignment
Worker starts a Claude Code session with the task context
Worker executes the task, using MCP tools for coordination
Worker reports progress via store-progress
On completion, output is saved and the lead is notified

Learning Loop

At session end, a summary model extracts key learnings
Learnings are embedded and stored in the memory system
On next task, relevant memories are retrieved and included in context
The lead can also inject learnings directly into workers

Project Structure

agent-swarm/
├── src/
│   ├── cli.tsx          # CLI entry point (Ink/React)
│   ├── http/            # Modular HTTP route handlers
│   │   ├── index.ts     # Handler registry & dispatch
│   │   ├── tasks.ts     # Task endpoints
│   │   ├── agents.ts    # Agent endpoints
│   │   ├── schedules.ts # Schedule endpoints
│   │   ├── core.ts      # Core endpoints (join, poll, progress)
│   │   ├── config.ts    # Config endpoints
│   │   ├── memory.ts    # Memory endpoints
│   │   ├── db-query.ts  # Read-only SQL query endpoint
│   │   ├── workflows.ts # Workflow CRUD + trigger endpoints
│   │   └── ...          # Additional route modules
│   ├── server.ts        # MCP server setup & tool registration
│   ├── tools/           # MCP tool implementations
│   ├── heartbeat/       # Lightweight swarm triage module
│   │   └── heartbeat.ts # 3-tier heartbeat (gate → code triage → escalation)
│   ├── be/              # Backend (database, business logic)
│   │   ├── db.ts        # SQLite database
│   │   └── migrations/  # Database migration system
│   ├── workflows/      # Workflow automation engine (DAG executor, triggers, nodes)
│   ├── artifact-sdk/    # Artifact SDK (serve files/apps via localtunnel)
│   ├── providers/       # AI provider adapters (formatCommand for provider-aware prompts)
│   │   ├── types.ts     # ProviderAdapter interface
│   │   ├── claude-adapter.ts  # Claude Code adapter
│   │   └── pi-mono-adapter.ts # Pi-mono adapter
│   ├── oauth/           # Generic OAuth module (PKCE, token management)
│   ├── gitlab/          # GitLab webhook handlers
│   ├── commands/        # CLI command implementations
│   │   ├── runner.ts    # Task runner (polls and spawns sessions)
│   │   ├── worker.ts    # Worker agent command
│   │   ├── lead.ts      # Lead agent command
│   │   └── artifact.ts  # Artifact serve/list/stop commands
│   └── hooks/           # Claude Code hooks
├── templates/           # Agent templates (official + community)
├── apps/
│   ├── ui/              # Dashboard UI (React + AG Grid)
│   ├── templates-ui/    # Templates registry (Next.js app)
│   └── evals/           # E2B-backed eval harness + UI
├── deploy/              # Deployment scripts
├── scripts/             # Utility scripts
├── docker-compose.example.yml
├── Dockerfile           # API server image
├── Dockerfile.worker    # Worker image
└── package.json

Technology Stack

Component	Technology
Runtime	Bun
API Protocol	MCP (Model Context Protocol)
Database	SQLite (via `bun:sqlite`)
AI Runtime	Claude Code (default) or pi-mono — selected via `HARNESS_PROVIDER`. See Harness Configuration
Containerization	Docker
Process Management	PM2
Memory Embeddings	OpenAI `text-embedding-3-small`
Dashboard	React + Vite
Schema Validation	Zod

Next Steps

Agent Identity & Configuration — How agents are personalized and evolve
Hook System — The six hooks that fire during each session
Memory System — How agents build compounding knowledge
Task Lifecycle — How tasks flow through the swarm
Deployment Guide — Production deployment options

Architecture Overview