Architecture Overview
How Agent Swarm is built and how the components fit together
Agent Swarm follows a hub-and-spoke architecture where a central MCP API server coordinates communication between agents.
System Diagram
Core Components
MCP API Server
The MCP (Model Context Protocol) server is the central coordination point. It:
- Exposes tools via the MCP protocol (both STDIO and HTTP transports)
- Manages agent registration and task assignment
- Stores all state in a SQLite database
- Handles integrations (Slack, GitHub, AgentMail webhooks)
- Runs the task scheduler for recurring automation
The server runs on port 3013 by default and is implemented in src/http/ (modular route handlers) with tool definitions in src/server.ts.
It also runs two background subsystems:
- Scheduler — Polls for scheduled tasks and creates them at the configured interval
- Heartbeat — A lightweight triage module that sweeps the swarm every 90 seconds (configurable via HEARTBEAT_INTERVAL_MS). It uses a 3-tier approach: a preflight gate that bails if the swarm is healthy, code-level triage for routine fixes (stall detection, worker status correction, pool task auto-assignment, stale resource cleanup), and Claude escalation only when ambiguous situations require human reasoning. See Environment Variables for heartbeat configuration.
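The 3-tier heartbeat described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/heartbeat/heartbeat.ts: the SwarmSnapshot shape, the runHeartbeat function, the fix strings, the stall threshold, and the escalation condition are all assumptions made for the example.

```typescript
// Hypothetical types: the real heartbeat module may model swarm state differently.
type WorkerStatus = "idle" | "busy" | "offline";

interface SwarmSnapshot {
  workers: { id: string; status: WorkerStatus; lastSeenMs: number }[];
  unassignedPoolTasks: string[];
}

type HeartbeatOutcome =
  | { tier: "gate"; action: "skip" }
  | { tier: "triage"; fixes: string[] }
  | { tier: "escalation"; reason: string };

// Assumed value for illustration; the docs do not state a stall threshold.
const STALL_THRESHOLD_MS = 5 * 60_000;

function runHeartbeat(snapshot: SwarmSnapshot, nowMs: number): HeartbeatOutcome {
  const stalled = snapshot.workers.filter(
    (w) => w.status === "busy" && nowMs - w.lastSeenMs > STALL_THRESHOLD_MS,
  );
  const idle = snapshot.workers.filter((w) => w.status === "idle");

  // Tier 1: preflight gate. Nothing stalled and nothing waiting, so bail out early.
  if (stalled.length === 0 && snapshot.unassignedPoolTasks.length === 0) {
    return { tier: "gate", action: "skip" };
  }

  // Tier 2: code-level triage for routine fixes.
  const fixes: string[] = [];
  for (const w of stalled) {
    fixes.push(`mark-stalled:${w.id}`);
  }
  const assignable = Math.min(idle.length, snapshot.unassignedPoolTasks.length);
  for (let i = 0; i < assignable; i++) {
    fixes.push(`assign:${snapshot.unassignedPoolTasks[i]}->${idle[i].id}`);
  }

  // Tier 3: escalate only when no routine fix applies (assumed condition:
  // pool tasks are waiting but no worker is idle or stalled).
  if (fixes.length === 0) {
    return { tier: "escalation", reason: "pool tasks waiting with no idle workers" };
  }
  return { tier: "triage", fixes };
}
```

The point of the ordering is cost: the gate is a cheap check that runs on every sweep, triage handles the common cases deterministically, and the expensive model call is reached only when the first two tiers produce nothing.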
Lead Agent
The lead agent is the coordinator. It:
- Receives incoming tasks from external sources (Slack, GitHub, email)
- Has an inbox for triaging incoming messages
- Breaks down complex tasks and delegates to workers
- Monitors worker progress and provides feedback
- Communicates results back to the user
- Can inject learnings into worker memories
Worker Agents
Workers are the execution layer. Each worker:
- Runs in an isolated Docker container with a full development environment
- Has access to git, Node.js, Python, Bun, and common CLI tools
- Executes tasks assigned by the lead agent
- Reports progress via the store-progress MCP tool
- Can expose HTTP services on port 3000
- Learns from each session and builds compounding knowledge
Docker Runtime
Each worker container includes:
- Languages: Python 3, Node.js 22, Bun
- Build tools: gcc, g++, make, cmake
- Process manager: PM2 (for background services)
- CLI tools: GitHub CLI (gh), sqlite3
- Agent tools: wts (git worktree manager)
- Utilities: git, git-lfs, vim, nano, jq, curl, wget, ssh
- Sudo access: Workers can install packages with sudo apt-get install
Data Flow
Task Creation
- User sends a message (Slack DM, GitHub @mention, email, or API call)
- MCP server receives the webhook/request
- Lead agent's inbox receives the message
- Lead agent triages and creates tasks for workers
Task Execution
- Worker polls for or receives a task assignment
- Worker starts a Claude Code session with the task context
- Worker executes the task, using MCP tools for coordination
- Worker reports progress via store-progress
- On completion, output is saved and the lead is notified
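One pass of the execution steps above might look like the sketch below. It assumes a simplified in-memory coordinator in place of the MCP server: InMemoryCoordinator, runWorkerOnce, and the Executor callback are hypothetical names, and the callback stands in for the Claude Code session that does the real work.

```typescript
type TaskStatus = "pending" | "in_progress" | "completed";

interface Task {
  id: string;
  description: string;
  status: TaskStatus;
  progress: string[];
  output?: string;
}

// Hypothetical in-memory stand-in for the MCP tools a worker would call.
class InMemoryCoordinator {
  readonly leadNotifications: string[] = [];
  private readonly tasks: Task[];

  constructor(tasks: Task[]) {
    this.tasks = tasks;
  }

  // Hand out the first pending task, if any, and mark it as started.
  pollTask(): Task | undefined {
    const task = this.tasks.find((t) => t.status === "pending");
    if (task) task.status = "in_progress";
    return task;
  }

  // Plays the role of the store-progress tool.
  storeProgress(taskId: string, note: string): void {
    this.get(taskId).progress.push(note);
  }

  // Save the output and notify the lead.
  complete(taskId: string, output: string): void {
    const task = this.get(taskId);
    task.status = "completed";
    task.output = output;
    this.leadNotifications.push(`task ${taskId} completed`);
  }

  private get(taskId: string): Task {
    const task = this.tasks.find((t) => t.id === taskId);
    if (!task) throw new Error(`unknown task: ${taskId}`);
    return task;
  }
}

// The executor represents the agent session; it reports progress via the callback.
type Executor = (task: Task, report: (note: string) => void) => string;

// Returns false when there was nothing to do, true after finishing one task.
function runWorkerOnce(coordinator: InMemoryCoordinator, execute: Executor): boolean {
  const task = coordinator.pollTask();
  if (!task) return false;
  const output = execute(task, (note) => coordinator.storeProgress(task.id, note));
  coordinator.complete(task.id, output);
  return true;
}
```

A real worker would call something like runWorkerOnce in a loop with a delay between empty polls; the single-pass form is used here so the flow is easy to follow and test.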
Learning Loop
- At session end, a summary model extracts key learnings
- Learnings are embedded and stored in the memory system
- On next task, relevant memories are retrieved and included in context
- The lead can also inject learnings directly into workers
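The retrieval step of the learning loop can be illustrated with a similarity search over stored embeddings. This sketch assumes cosine similarity and a simple top-k cutoff, neither of which is confirmed by this page; the Memory type and the retrieveRelevant function are hypothetical, and the embeddings are taken as already computed rather than fetched from an embedding API.

```typescript
interface Memory {
  text: string;
  embedding: number[]; // assumed to be precomputed by the embedding model
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank memories by similarity to the task embedding and keep the best topK.
function retrieveRelevant(query: number[], memories: Memory[], topK: number): Memory[] {
  return memories
    .map((memory) => ({ memory, score: cosineSimilarity(query, memory.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.memory);
}
```

The texts of the returned memories would then be added to the next task's context. Real embeddings have far more dimensions than the toy vectors a test would use, but the ranking logic is the same.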
Project Structure
agent-swarm/
├── src/
│ ├── cli.tsx # CLI entry point (Ink/React)
│ ├── http/ # Modular HTTP route handlers
│ │ ├── index.ts # Handler registry & dispatch
│ │ ├── tasks.ts # Task endpoints
│ │ ├── agents.ts # Agent endpoints
│ │ ├── epics.ts # Epic endpoints
│ │ ├── schedules.ts # Schedule endpoints
│ │ ├── core.ts # Core endpoints (join, poll, progress)
│ │ ├── config.ts # Config endpoints
│ │ ├── memory.ts # Memory endpoints
│ │ └── ... # Additional route modules
│ ├── server.ts # MCP server setup & tool registration
│ ├── tools/ # MCP tool implementations
│ ├── heartbeat/ # Lightweight swarm triage module
│ │ └── heartbeat.ts # 3-tier heartbeat (gate → code triage → escalation)
│ ├── be/ # Backend (database, business logic)
│ │ ├── db.ts # SQLite database
│ │ └── migrations/ # Database migration system
│ ├── artifact-sdk/ # Artifact SDK (serve files/apps via localtunnel)
│ ├── commands/ # CLI command implementations
│ │ ├── runner.ts # Task runner (polls and spawns sessions)
│ │ ├── worker.ts # Worker agent command
│ │ ├── lead.ts # Lead agent command
│ │ └── artifact.ts # Artifact serve/list/stop commands
│ └── hooks/ # Claude Code hooks
├── deploy/ # Deployment scripts
├── scripts/ # Utility scripts
├── new-ui/ # Dashboard UI (React + AG Grid)
├── docker-compose.example.yml
├── Dockerfile # API server image
├── Dockerfile.worker # Worker image
└── package.json
Technology Stack
| Component | Technology |
|---|---|
| Runtime | Bun |
| API Protocol | MCP (Model Context Protocol) |
| Database | SQLite (via bun:sqlite) |
| AI Runtime | Claude Code (headless) |
| Containerization | Docker |
| Process Management | PM2 |
| Memory Embeddings | OpenAI text-embedding-3-small |
| Dashboard | React + Vite |
| Schema Validation | Zod |
Next Steps
- Agent Identity & Configuration — How agents are personalized and evolve
- Hook System — The six hooks that fire during each session
- Memory System — How agents build compounding knowledge
- Task Lifecycle — How tasks flow through the swarm
- Deployment Guide — Production deployment options