Multica: The Managed Agents Platform That Runs Code on Your Machine, Not Theirs

In partnership with

SnackOnAI Engineering | Senior AI Systems Researcher | Technical Deep Dive | June 3, 2026

The standard narrative around "agentic platforms" is: give the platform your API keys, give the platform your codebase, and the platform runs agents for you. This is convenient and it is a significant security and privacy commitment. You are trusting a third-party cloud with your API keys, your code, and your credentials.

Multica's architecture inverts this. The Multica server handles workspaces, issue lists, comment threads, and task queues. It is a coordination hub with no AI execution capability. The AI execution happens on your machine, driven by a local daemon that polls the server, picks up tasks, invokes your locally installed AI coding tools (Claude Code, Codex, Cursor, and nine others), and posts results back. Your API keys are read by your machine. Your code is never uploaded.

The platform's design is closest to Linear or Jira, not to Devin or Copilot Workspace. The difference: Linear and Jira assign work to humans. Multica assigns work to agents.

Scope: Multica's three-component distributed architecture (server, daemon, AI coding tools), the skill system and ClawHavoc security incident, squads (multi-agent routing), the four task triggers, and autopilots. Not covered: the full provider comparison matrix, or the upcoming cloud runtime (currently waitlist-only).

What It Actually Does

Multica is a project management platform with agent assignment built in. The user experience is an issue tracker where some assignees are AI agents. The technical architecture is a three-component distributed system where AI execution is deliberately kept on the user's infrastructure.

The four ways to start agent work:

Trigger	How	Typical Use
Assign an issue	Pick an agent as assignee	Standard task delegation
@mention in comment	"@agent-name take a look"	One-off requests without reassigning
Direct chat	Chat window, not tied to an issue	Questions, drafts, quick tasks
Autopilot (scheduled)	Standing recurring instruction	"Standup summary every Monday"

The twelve supported AI coding tools (as of June 2026): Antigravity, Claude Code, Codex, Cursor, Copilot, Gemini, Hermes, Kimi, Kiro CLI, OpenCode, OpenClaw, Pi

Deployment options:

# Option 1: Multica Cloud (recommended start)
npm install -g @multica/cli
multica daemon   # ← starts on your machine, connects to Multica Cloud

# Option 2: Full self-host (Docker Compose)
# Postgres + server + web all on your infrastructure
docker compose up -d

# Option 3: Desktop app
# Ships with CLI built in, starts daemon on launch, no CLI setup

The Architecture, Unpacked

Focus on the daemon's role as the trust boundary. The server coordinates work. The daemon executes it. These two components are separated both architecturally and physically. The server can never instruct the daemon to exfiltrate credentials; it can only assign tasks.

The Code, Annotated

Snippet One: Daemon Startup and Runtime Registration

// Multica daemon: startup, tool detection, and polling loop (reconstructed from architecture docs)
// Source: multica-ai/multica (MIT), server/ + CLI_AND_DAEMON.md
// The daemon is written in Go, shipped as a single binary

package daemon

import (
    "time"
    "context"
)

// Runtime: one AI coding tool on one workspace
// If you have Claude Code + Codex on 2 workspaces = 4 runtimes registered
type Runtime struct {
    WorkspaceID string
    ToolName    string    // "claude-code", "codex", "cursor", etc.
    ToolPath    string    // path to the installed CLI binary
    DaemonID    string    // unique ID for this daemon instance
}

func StartDaemon(ctx context.Context, serverURL string, token string) error {
    // Step 1: Detect installed AI coding tools
    // ← Scans for installed CLIs: claude, codex, cursor, copilot, etc.
    //   Checks PATH + tool-specific locations
    tools := detectInstalledTools()

    // Step 2: Register each tool per workspace as an independent runtime
    // ← This is the key: each (tool, workspace) pair is a separate runtime
    //   The server can route tasks to specific tools in specific workspaces
    runtimes := registerRuntimes(tools, serverURL, token)

    // Step 3: Start heartbeat goroutine (every 15 seconds)
    // ← Heartbeat tells the server "this daemon is still alive and online"
    //   Server marks daemon offline if heartbeat stops → tasks won't be dispatched
    go func() {
        ticker := time.NewTicker(15 * time.Second)
        for range ticker.C {
            sendHeartbeat(runtimes, serverURL, token)
        }
    }()

    // Step 4: Task polling loop (every 3 seconds)
    // ← Poll-based architecture, not push/webhook
    //   This allows the daemon to be behind NAT/firewall without inbound ports
    //   The daemon reaches OUT to the server; the server never reaches IN to the daemon
    ticker := time.NewTicker(3 * time.Second)
    for {
        select {
        case <-ctx.Done():
            return nil
        case <-ticker.C:
            for _, runtime := range runtimes {
                // Check for queued tasks for this runtime
                tasks := fetchQueuedTasks(runtime, serverURL, token)
                for _, task := range tasks {
                    // ← Each task runs in its own goroutine
                    //   Multiple tasks can run in parallel if multiple runtimes are available
                    go executeTask(ctx, task, runtime)
                }
            }
        }
    }
}

func executeTask(ctx context.Context, task Task, runtime Runtime) {
    // Step 1: Mark task as dispatched
    updateTaskStatus(task.ID, "dispatched")

    // Step 2: Create isolated working directory
    // ← Each task gets its own directory: no cross-contamination between tasks
    workDir := createWorkDir(task.ID)

    // Step 3: Inject skills into tool-specific skill path
    // ← Different tools discover skills from different paths:
    //   Claude Code: ~/.claude/skills/
    //   Cursor:      .cursor/skills/
    //   Copilot:     .copilot/skills/
    //   Antigravity: .agents/skills/
    //   Gemini/Hermes/OpenClaw: .agent_context/skills/ (fallback, may not be read)
    // ← THIS is the critical detail: skills land in different places per tool
    injectSkills(task.AgentSkills, runtime.ToolName, workDir)

    // Step 4: Mark task as running and invoke the AI coding tool
    updateTaskStatus(task.ID, "running")
    result := invokeAITool(runtime, task, workDir)

    // Step 5: Report result
    if result.Success {
        updateTaskStatus(task.ID, "completed")
    } else {
        updateTaskStatus(task.ID, "failed")
        postComment(task.IssueID, result.ErrorSummary)
    }
}

The poll-based architecture (3-second interval, outbound only) is the design choice that enables the daemon to run behind NAT and firewalls without requiring inbound ports or webhooks. The daemon always initiates the connection. The server never needs to reach the daemon directly. This is a security-first design, not a latency optimization.

Snippet Two: Skills and Squads

# SKILL.md format (Anthropic Agent Skills open standard)
# Skills are knowledge packs attached to agents
# Compatible with Anthropic's official repository, ClawHub, and skills.sh

---
name: python-type-safety
description: Enforce strict typing, mypy compliance, and PEP 8 in Python projects
version: 1.0.0
---

# Python Type Safety Expert

## When to apply this skill
Apply when working on any Python file. This skill ensures all code you write
follows strict type annotation practices and passes mypy in strict mode.

## Core rules
1. Every function parameter and return type must be annotated
2. Use `from __future__ import annotations` for forward references
3. Avoid `Any` type — use Union, Optional, or TypeVar instead
4. Run `mypy --strict` before marking any task complete

## Common patterns
```python
# CORRECT: explicit return type, no implicit Any
def process_items(items: list[str]) -> dict[str, int]:
    return {item: len(item) for item in items}

# WRONG: missing return type annotation
def process_items(items):
    return {item: len(item) for item in items}


```bash
# Squads: multi-agent routing with a designated leader agent
# Source: multica.ai/docs/squads (MIT)

# Create a squad with a leader agent and specialized members
multica squad create \
  --name "Backend Squad" \
  --leader backend-lead-agent

# Add specialist agents with role descriptions
# ← Role descriptions help the leader decide who to delegate to
multica squad member add <squad-id> \
  --member-id <api-engineer-uuid> \
  --type agent \
  --role "Owns the REST API layer and OpenAPI spec"

multica squad member add <squad-id> \
  --member-id <db-engineer-uuid> \
  --type agent \
  --role "Owns migrations, query optimization, and PostgreSQL tuning"

multica squad member add <squad-id> \
  --member-id <security-engineer-uuid> \
  --type agent \
  --role "Owns auth flows, rate limiting, and OWASP review"

# When an issue is assigned to "Backend Squad":
# 1. Leader agent is triggered (not all members simultaneously)
# 2. Leader reads the issue + squad roster + role descriptions
# 3. Leader posts ONE delegation comment: "@db-engineer-agent, please handle this"
# 4. That @mention triggers a new task for the mentioned agent
# 5. Leader records its evaluation: why it chose that member
# 6. Leader stops — it does NOT implement the work itself

# ← THIS is the trick: the leader is a router, not a worker
# The squad's purpose is ROUTING, not adding capability
# "The squad doesn't add capability — it adds routing." — Multica docs

# Assign an issue to the squad (not an individual):
multica issue assign <issue-id> --assignee "Backend Squad"
# → Leader picks it up → delegates to the right specialist → leader stops

The "leader stops after delegating" design is the most important architectural decision in Squads. The leader does not implement the work. It reads the issue, picks the right member, posts the @mention, logs the decision, and exits. The specialist then runs as a normal agent. This keeps delegation and execution cleanly separated, which prevents leader agents from partially implementing tasks before delegating.

It In Action: End-to-End Worked Example

Scenario: Engineering team using a Backend Squad to handle a database performance issue.

Setup (one-time):

# Install daemon
npm install -g @multica/cli
multica daemon   # starts polling, connects to Multica Cloud

# Or self-host:
docker compose up -d  # Postgres + Multica server + web

Issue created:

Title: Slow query on /api/users/search endpoint
Description: The search endpoint is timing out at >50ms on production with
             1M+ users. Query plan shows full table scan despite user.email index.
Project: Backend
Priority: High

Assignment:

Assignee: Backend Squad
→ Leader agent (backend-lead-agent) triggered immediately

Leader agent execution (~30 seconds):

Leader reads: issue + squad roster + role descriptions

Leader reasoning:
  "This is a query optimization issue with a database index problem.
   The db-engineer-agent owns migrations, query optimization, and PostgreSQL tuning.
   This is squarely in their domain."

Leader posts comment:
  "@db-engineer-agent please investigate the slow query on /api/users/search.
   The query plan shows a full table scan despite the user.email index.
   Check index definition, query structure, and EXPLAIN ANALYZE output."

Leader records evaluation:
  multica squad activity <issue-id> action --reason "Routed to db-engineer: query optimization issue matching their domain"

Leader stops.

db-engineer-agent execution (~4 minutes):

Daemon picks up task from @mention trigger
Invokes: Claude Code (configured runtime for db-engineer-agent)
Skills injected: postgresql-expert + query-optimization (from workspace skill library)
Working directory: /tmp/multica-tasks/<task-id>/

Agent actions:
  1. Reads codebase: finds src/db/queries/users.py
  2. Runs EXPLAIN ANALYZE via db connection
  3. Diagnoses: index exists but query uses ILIKE → prevents index use
  4. Fix: adds pg_trgm extension + GIN index for trigram search
  5. Generates migration: 0042_add_users_search_gin_index.sql
  6. Runs tests: all pass
  7. Posts progress comments to issue throughout

Task completes.

Issue timeline:

09:14 Issue created
09:14 Assigned to Backend Squad
09:14 backend-lead-agent: "@db-engineer-agent please investigate..."
09:14 Task dispatched to db-engineer-agent
09:18 db-engineer-agent: "Found the issue: ILIKE prevents index use. Generating GIN index migration."
09:18 db-engineer-agent: "PR opened: #284 — adds pg_trgm GIN index for trigram search"
09:18 Task completed

Time from assignment to PR: 4 minutes
Human involvement: zero until PR review

What the human sees in real time via WebSocket:

Issue status changes: Backlog → In Progress → Done
Comment thread showing agent decisions and progress
Link to PR #284 in GitHub integration
Squad leader's delegation reasoning in activity timeline

Why This Design Works, and What It Trades Away

The daemon's poll-based architecture is the correct design for a platform that wants to run agent tasks on user machines without requiring inbound connectivity. A webhook or push-based model would require the user to expose a port or configure a tunnel. A poll-based model requires only outbound HTTP from the user's machine, which works behind any NAT or firewall. The 3-second polling interval is fast enough for interactive task assignment while cheap enough to run indefinitely without meaningful resource usage.

The "no AI execution on the server" constraint is the trust architecture that makes Multica suitable for teams with code confidentiality requirements. The server can be compromised without exposing API keys or code. The daemon can be compromised without exposing the server's data. These are separate trust boundaries with separate attack surfaces.

Squads address a real problem in multi-agent systems: which agent should handle this? Without squads, the person assigning work must know which agent to assign. With squads, the leader agent handles that routing decision. The role descriptions attached to squad members are the information the leader uses to make this decision. This is the correct abstraction: humans define specialists and roles, agents handle routing.

The skill system's adoption of the Anthropic Agent Skills open standard is the right interoperability choice. Skills written for Claude Code work in Multica without modification. Skills from ClawHub work without adaptation. The standard creates a portable knowledge format that is not locked to any single platform.

What Multica trades away:

Cloud execution (currently). The local daemon model means you need a machine running continuously to handle agent tasks. The cloud runtime (currently waitlist-only) will eliminate this requirement but adds the server-side execution trust question that the local daemon specifically avoids. This is a genuine architectural tension with no clean resolution: local execution preserves privacy but requires always-on machines; cloud execution is convenient but changes the trust model.

Latency per task pickup. The 3-second polling interval means there is up to a 3-second delay between issue assignment and task pickup. For synchronous workflows expecting immediate response, this is friction. For async development workflows (assign overnight, review in the morning), it is irrelevant.

Skill security is not solved. The ClawHavoc incident (February 2026) demonstrated that a malicious skill imported from ClawHub can instruct the AI coding tool to exfiltrate API keys. ClawHub added VirusTotal scanning after the incident. The docs are explicit: automated scans are not a substitute for your own review. Third-party skills are executable code attached to an agent running with access to your machine.

Technical Moats

The daemon as a trust boundary. Building a platform where AI execution happens on the user's machine, not the platform's servers, requires a daemon that is reliable, secure, and simple to install. The Multica daemon is a single Go binary with no runtime dependencies. It polls, invokes CLI tools, and reports results. Its simplicity is its security: there is very little surface area for the daemon to do something unexpected. Replicating this requires the same principled restraint: not adding features that require the daemon to store state or make outbound calls beyond the Multica server.

The squad routing model. The leader-delegates-and-stops pattern is correct but non-obvious. Most multi-agent system designs have the coordinator also participate in implementation. Multica's leader agent is a pure router. This requires designing the squad system to explicitly prevent leaders from executing task content, which is a constraint that must be enforced in the prompt engineering and the task dispatch logic simultaneously. Getting this right required real iteration on the squad prompt design.

Twelve tool integrations. Each AI coding tool has different invocation patterns, different skill file locations, different configuration formats, and different output parsing requirements. Maintaining correct integration with twelve tools is ongoing work. The MCP support being provider-specific (only 7 of 12 tools consume mcp_config) is evidence that even a well-resourced team finds full twelve-tool coverage non-trivial.

Insights

Insight One: Multica is closer to a build system than to a coding agent. Build systems (Make, Gradle, Bazel) take a specification of what to build, manage dependencies, and execute work on local machines. Multica takes a specification of what to implement (an issue), manages agent routing (squads), and executes work on local machines. The distinction matters for how you think about reliability: build system reliability is about infrastructure correctness, not agent quality. Multica's reliability is similarly bounded by infrastructure correctness (daemon availability, server uptime) before it is bounded by agent quality (how good Claude Code or Codex is).

Insight Two: The "no AI on our servers" design is a strong selling point for enterprise customers and a genuine architectural constraint for the product. The cloud runtime (coming soon) will enable teams without always-on machines to use Multica, but it will necessarily change the trust model. Any team that adopted Multica specifically because agents run locally will need to re-evaluate when cloud runtimes launch. The docs are transparent about this tension, but the product team has not yet publicly resolved it. How they handle the trust model for cloud execution will determine whether Multica can serve both security-sensitive and convenience-first teams simultaneously.

Surprising Takeaway

Multica ships with a mobile iOS app. The full agent management workflow (assign issues, @mention agents, chat, monitor task status) is available on iPhone. Most team that think of "AI coding agents" as a desktop-first or server-first workflow have not considered that the issue assignment and monitoring workflow is fundamentally mobile-compatible. You do not need to be at a computer to assign a task to an agent, monitor its progress via WebSocket push notifications, and review the PR it opened. The agent runs on your dev machine's daemon; you can track everything from your phone. This changes the async workflow model: an engineering manager can assign issues to agents from their phone during commute and review PRs when they get to their desk.

TL;DR For Engineers

Multica (multica-ai/multica, MIT, 19.1k stars, 2,557 commits) is a task collaboration platform where agents are assigned issues the same way humans are. Three-component architecture: Multica server (data + WebSocket, no AI execution) ↔ daemon (Go binary, polls every 3s, runs on your machine) ↔ 12 AI coding tools (Claude Code, Codex, Cursor, and 9 others). Your API keys and code never leave your machine.
Four task triggers: assign issue (start immediately), @mention (comment-level), direct chat (not tied to issue), autopilot (scheduled recurring). Self-host with Docker Compose or use Multica Cloud (daemon still runs locally; cloud runtime coming, waitlist-only).
Skills: Anthropic Agent Skills open standard (SKILL.md), compatible with ClawHub marketplace. Critical: skill files land in tool-specific paths (~/.claude/skills/ for Claude Code, .cursor/skills/ for Cursor). Three tools (Gemini, Hermes, OpenClaw) use a fallback path that may not be read by the tool.
Squads: named group of agents + humans, one leader agent. Assign to squad → leader routes to specialist via @mention → leader stops. The squad adds routing, not capability.
Security note: ClawHavoc incident (February 2026) showed malicious ClawHub skills can instruct agents to exfiltrate API keys. Review every third-party skill before importing. VirusTotal scanning exists but is not sufficient.

The Issue Tracker Was Missing an Agent Tab

Multica's bet is that the missing piece in software development workflows is not better AI coding models, it is a management layer for the agents those models power. Issue tracking, comment threading, skill libraries, multi-agent routing, scheduled autopilots, and a trust architecture that keeps your code on your machine are the infrastructure that converts a capable coding agent from a one-shot tool into an ongoing teammate.

The local daemon model is the right architecture for the current trust environment. The skill security surface is the current weakest point. The squads routing system is the most architecturally interesting component. The mobile app is the most underappreciated.

References

Multica GitHub Repository, MIT, 19.1k stars
Multica Documentation
Multica Skills Documentation
Multica Squads Documentation
ReAct: Synergizing Reasoning and Acting in Language Models, arXiv:2210.03629 — the reasoning+acting loop underlying the agents Multica manages
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation, arXiv:2308.08155 — multi-agent coordination research; Squads implement a structured version of this pattern
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?, arXiv:2310.06770 — the benchmark that validated the class of agents Multica orchestrates
ClawHub — the skill marketplace; see security notes on third-party skills

Multica (multica-ai/multica, MIT, 19.1k stars, 2,557 commits) is a task collaboration platform where AI agents are assigned issues like human teammates, executing via a local Go daemon (polls every 3s, heartbeat every 15s) that invokes 12 locally installed AI coding tools (Claude Code, Codex, Cursor, and 9 others) while keeping API keys and code directories on the user's machine. Core features: four task triggers (assign, @mention, chat, autopilot), a skill system compatible with the Anthropic Agent Skills open standard with ClawHub marketplace access (security note: ClawHavoc incident February 2026), and Squads (leader agent routes issues to specialists via @mention and stops, adding routing without adding capability). Self-hostable via Docker Compose; cloud runtimes on waitlist.

Sponsored Ad

If you enjoy practical AI insights, check out SnackOnAI and support the newsletter by subscribing, sharing, and exploring our sponsored ad — it helps us keep building and delivering value 🚀

22 ChatGPT Agents Built for Every Marketing Job

Most marketers use ChatGPT to do general research and then call it an AI strategy. The ones outperforming them are deploying specialized agents built for specific jobs.

We put together 22 plug-and-play ChatGPT marketing agents that handle the work eating your week, each with built-in instructions and structured outputs ready to go in under 5 minutes.

Subscribe to Marketing Against the Grain and get all 22 free.

Inside you'll find:

Competitive intelligence agent that visits competitor websites and builds detailed comparison matrices automatically
Customer feedback analyzer that ranks improvement opportunities by business impact
Social listening specialist that monitors brand mentions and flags reputation risks before they escalate
Campaign optimization agents that handle attribution analysis and surface what is actually driving results

Your competitors are already running agents like these.

Get 22 ChatGPT Marketing Agents free when you subscribe to Marketing Against the Grain today.

Get The Guide