SnackOnAI Engineering | Senior AI Systems Researcher | Technical Deep Dive | May 7, 2026
The dominant framing in AI agent research treats "multi-agent" as a capability problem: how do you make agents reason better, plan longer, use tools more reliably? TinyAGI treats it as an organizational problem: how do you run a team of specialized AI workers, 24 hours a day, across every channel your users might contact you on, with exactly the same reliability and coordination properties you would expect from a human team?
The answer is a TypeScript monorepo, a SQLite message queue with dead-letter management, isolated agent workspaces per Claude Code or Codex session, and a fan-out routing protocol implemented in markdown.
TinyAGI (3.2k GitHub stars, 465 forks, MIT, formerly TinyClaw) is the agent teams orchestrator built for the "one-person company" use case: one person, multiple specialized AI agents, running 24/7, processing messages from Discord, WhatsApp, Telegram, and a web portal simultaneously.
This newsletter dissects TinyAGI as a piece of systems engineering: what the SQLite queue provides that a naive queue does not, why isolated workspaces rather than shared context, how the [@agent: message] routing syntax maps to actual code paths, and what the "one person company" design constraint reveals about the tradeoffs other frameworks make.
Scope: TinyAGI's architecture (README, AGENTS.md, package structure), the SQLite queue design, workspace isolation, the fan-out and chain routing protocol, the TinyOffice web portal, multi-channel support, and the plugin system. Not covered: the broader LLM agent research landscape beyond contextual comparison, or TinyAGI's roadmap.
What It Actually Does
TinyAGI runs multiple teams of specialized AI agents simultaneously, each operating in an isolated filesystem workspace, communicating via a SQLite-backed message queue, accessible through any messaging channel. The project describes its goal as "the agent teams orchestrator for One Person Company."
Concretely: you configure agents (a coder, a writer, an assistant, or any custom role), assign them AI providers (Claude, Codex, or any OpenAI/Anthropic-compatible endpoint), and run TinyAGI in tmux for always-on operation. Users interact via Discord, WhatsApp, Telegram, or the TinyOffice web portal. Messages route to the appropriate agent by @mention syntax or fall back to a default agent.
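A sketch of the per-agent configuration shape this implies; the field names beyond provider and model are illustrative assumptions, not the project's verbatim settings.json schema:

// Hypothetical per-agent configuration (mirrors the AgentConfig used in the snippets below).
interface AgentConfig {
  provider: 'claude' | 'codex'; // which CLI the orchestrator shells out to
  model: string;                // e.g., 'claude-sonnet-4-6'
  apiKey: string;               // injected into the subprocess environment
}

const agents: Record<string, AgentConfig> = {
  tinyagi: { provider: 'claude', model: 'claude-sonnet-4-6', apiKey: '...' }, // default fallback
  coder:   { provider: 'claude', model: 'claude-sonnet-4-6', apiKey: '...' },
  writer:  { provider: 'codex',  model: 'codex-default',     apiKey: '...' }, // placeholder model id
};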
Repository structure (monorepo):
tinyagi/
├── packages/
│ ├── core/ # Shared types, config, SQLite queue, agent invocation
│ ├── main/ # Queue processor entry point
│ ├── teams/ # Team conversation orchestration, chain execution
│ ├── server/ # REST + SSE API server
│ ├── channels/ # Discord, Telegram, WhatsApp channel clients
│ ├── cli/ # CLI commands
│ └── visualizer/ # TUI dashboard and chatroom viewer
├── tinyoffice/ # Next.js web portal
└── .tinyagi/ # Runtime data: SQLite db, logs, channel state, chats
Agent workspaces (per agent, isolated):
~/tinyagi-workspace/
├── tinyagi/ # default agent workspace
├── coder/ # coder agent workspace
└── writer/ # writer agent workspace
Each workspace is the working_directory when that agent's LLM process runs. Claude Code runs with claude -c -p "message" from within the agent's workspace directory. Codex runs with codex exec resume --last --json "message". The workspace isolation means each agent's conversation history, file modifications, and context are completely separate.
The Architecture, Unpacked

The coordination primitive is the SQLite queue. All channels enqueue into one table. The queue processor dispatches to agents in parallel. Responses enqueue back. The entire system's reliability and durability properties come from SQLite's atomic transactions, not from the agent processes themselves.
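To make the loop concrete, here is a minimal sketch of one polling pass, assuming the AgentMessageQueue API reconstructed in Snippet One below and the invokeAgent helper from Snippet Two; the enqueueResponse helper is an illustrative assumption, not verbatim source:

// Hypothetical single polling pass of the queue processor (sketch, not source).
async function processQueueOnce(
  queue: AgentMessageQueue,
  configs: Map<string, AgentConfig>,
): Promise<void> {
  await Promise.all([...configs.keys()].map(async (agentId) => {
    const msg = queue.receiveMessage(agentId); // atomic receive-and-lock (Snippet One)
    if (!msg) return;
    try {
      const reply = await invokeAgent(agentId, msg.content, configs.get(agentId)!);
      queue.enqueueResponse(msg.id, agentId, reply); // assumed helper writing the responses table
      queue.completeMessage(msg.id);                 // processing → completed
    } catch {
      // Leave status='processing'; resetStalledMessages() requeues it,
      // and after MAX_RECEIVE_COUNT attempts it dead-letters.
    }
  }));
}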
The Code, Annotated
Snippet One: SQLite Queue Design (Atomic, Durable, Dead-Letter)
// packages/core/src/queue/queue.ts (reconstructed from architecture docs + README)
// The SQLite queue is the reliability foundation of TinyAGI.
// Why SQLite instead of Redis or an in-memory queue?
// ← Durability: survives process crashes without lost messages
// ← Simplicity: no separate database process to manage
// ← Atomicity: transactions prevent partial state on concurrent access
// ← The bottleneck is LLM API calls (seconds), not queue ops (microseconds)
import Database from 'better-sqlite3';
import { randomUUID } from 'node:crypto';
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

interface Message {
  id: string;
  agent_id: string;
  channel: string;
  sender: string;
  content: string;
  status: 'pending' | 'processing' | 'completed' | 'dead';
  receive_count: number;
  created_at: number;
  updated_at: number;
}

class AgentMessageQueue {
  private db: Database.Database;
  private readonly MAX_RECEIVE_COUNT = 3; // after 3 failures → dead letter

  constructor() {
    const dir = path.join(os.homedir(), '.tinyagi');
    fs.mkdirSync(dir, { recursive: true }); // first run: ~/.tinyagi may not exist yet
    this.db = new Database(path.join(dir, 'tinyagi.db'));
    // ← WAL mode: critical for concurrent access
    //   Multiple agents reading + the queue processor writing simultaneously
    //   would block in rollback-journal mode. WAL allows concurrent reads.
    this.db.pragma('journal_mode = WAL');
    this.db.pragma('foreign_keys = ON');
    this.initSchema();
  }

  private initSchema(): void {
    // ← Status as a state machine: pending → processing → completed
    //   OR pending → processing → pending (retry) × N → dead
    //   Dead-lettered messages are kept for inspection, not deleted.
    this.db.exec(`
      CREATE TABLE IF NOT EXISTS messages (
        id TEXT PRIMARY KEY,
        agent_id TEXT NOT NULL,
        channel TEXT NOT NULL,
        sender TEXT NOT NULL,
        content TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending',
        receive_count INTEGER NOT NULL DEFAULT 0,
        created_at INTEGER NOT NULL,
        updated_at INTEGER NOT NULL
      );
      CREATE INDEX IF NOT EXISTS idx_messages_status_agent
        ON messages(status, agent_id); -- ← critical for queue processor polling
      CREATE TABLE IF NOT EXISTS responses (
        id TEXT PRIMARY KEY,
        message_id TEXT NOT NULL REFERENCES messages(id),
        agent_id TEXT NOT NULL,
        content TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending',
        created_at INTEGER NOT NULL
      );
    `);
  }

  enqueueMessage(agentId: string, channel: string, sender: string, content: string): string {
    const id = randomUUID();
    const now = Date.now();
    // ← Atomic insert: either fully written or not at all
    //   No partial state if the process crashes mid-write
    this.db.prepare(`
      INSERT INTO messages (id, agent_id, channel, sender, content, status, receive_count, created_at, updated_at)
      VALUES (?, ?, ?, ?, ?, 'pending', 0, ?, ?)
    `).run(id, agentId, channel, sender, content, now, now);
    return id;
  }

  receiveMessage(agentId: string): Message | null {
    // ← THIS is the trick: the entire receive-and-lock is one atomic transaction.
    //   Two queue processors can't pick up the same message because the
    //   SELECT and the status UPDATE happen in the same transaction.
    const receive = this.db.transaction((): Message | null => {
      const msg = this.db.prepare(`
        SELECT * FROM messages
        WHERE agent_id = ? AND status = 'pending'
        ORDER BY created_at ASC
        LIMIT 1
      `).get(agentId) as Message | undefined;
      if (!msg) return null;
      // Check if this message has exceeded the retry limit → dead letter
      if (msg.receive_count >= this.MAX_RECEIVE_COUNT) {
        this.db.prepare(`UPDATE messages SET status = 'dead', updated_at = ? WHERE id = ?`)
          .run(Date.now(), msg.id);
        return null; // ← dead-lettered: caller doesn't process it
      }
      // Lock for processing
      this.db.prepare(`
        UPDATE messages
        SET status = 'processing', receive_count = receive_count + 1, updated_at = ?
        WHERE id = ?
      `).run(Date.now(), msg.id);
      return { ...msg, status: 'processing', receive_count: msg.receive_count + 1 };
    });
    return receive();
  }

  completeMessage(id: string): void {
    this.db.prepare(`UPDATE messages SET status = 'completed', updated_at = ? WHERE id = ?`)
      .run(Date.now(), id);
  }

  // If an agent crashes mid-processing, this resets stuck messages
  resetStalledMessages(stalledThresholdMs: number = 300_000): void {
    const cutoff = Date.now() - stalledThresholdMs;
    this.db.prepare(`
      UPDATE messages SET status = 'pending', updated_at = ?
      WHERE status = 'processing' AND updated_at < ?
    `).run(Date.now(), cutoff);
  }

  deadLetters(agentId?: string): Message[] {
    const query = agentId
      ? `SELECT * FROM messages WHERE status = 'dead' AND agent_id = ?`
      : `SELECT * FROM messages WHERE status = 'dead'`;
    return (agentId ? this.db.prepare(query).all(agentId) : this.db.prepare(query).all()) as Message[];
  }
}
The receive transaction is the entire correctness guarantee. Without the atomic lock (SELECT + UPDATE in one transaction), two agents could both pick up the same message. SQLite serializes transactions, preventing this without any external coordination mechanism.
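From the caller's side, the message lifecycle under that design looks like this (a usage sketch, not verbatim source):

const queue = new AgentMessageQueue();

// Channel client enqueues; the queue processor later receives, processes, completes.
const id = queue.enqueueMessage('coder', 'discord', 'user#1234', 'fix the login bug');

const msg = queue.receiveMessage('coder');   // status: pending → processing (atomic)
if (msg) {
  // ... invoke the agent; on success:
  queue.completeMessage(msg.id);             // processing → completed
}

// Run periodically: messages orphaned mid-processing go back to pending.
// After MAX_RECEIVE_COUNT receives, receiveMessage dead-letters instead of returning.
queue.resetStalledMessages();
console.log(queue.deadLetters('coder'));     // inspect, don't lose, failed messages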
Snippet Two: The Fan-Out Routing Protocol and Agent Invocation
// packages/teams/src/router.ts + packages/core/src/agent.ts (reconstructed)
// The routing protocol is implemented in two layers:
//   1. Tag parsing: [@agent_id: message] → {agentId, message, sharedContext}
//   2. Agent invocation: run the appropriate CLI command in the agent's workspace
import { execa } from 'execa';
import os from 'node:os';
import path from 'node:path';

interface AgentConfig {
  provider: 'claude' | 'codex';
  model: string;  // e.g., claude-sonnet-4-6
  apiKey: string;
}

interface RouteTarget {
  agentId: string;
  message: string;
  sharedContext?: string;
}

// Parse routing tags from a message
// Examples:
//   "[@coder: fix the bug]" → [{agentId: 'coder', message: 'fix the bug'}]
//   "[@coder: fix X] [@writer: document Y]" → two targets, invoked in PARALLEL
//   "Sprint ends Friday.\n[@coder: also list PRs] [@reviewer: also flag blockers]"
//     → shared context 'Sprint ends Friday.' delivered to both, plus each gets its own tag
function parseRoutingTags(rawMessage: string): RouteTarget[] {
  // Also matches comma-separated IDs inside one tag: [@coder,writer,tester: message]
  const tagPattern = /\[@([\w-]+(?:\s*,\s*[\w-]+)*):\s*(.*?)\]/gs;
  const targets: RouteTarget[] = [];
  // Extract shared context: text OUTSIDE the [@agent: ...] tags
  const sharedContext = rawMessage.replace(tagPattern, '').trim();
  // Defensive reset of lastIndex before reusing the global regex with exec()
  tagPattern.lastIndex = 0;
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(rawMessage)) !== null) {
    const [, rawIds, message] = match;
    // Handle comma-separated agent IDs: [@coder,writer,tester: message]
    // ← All get the same message — broadcast mode
    const agentIds = rawIds.split(',').map(id => id.trim());
    for (const agentId of agentIds) {
      targets.push({
        agentId,
        message: message.trim(),
        sharedContext: sharedContext || undefined,
      });
    }
  }
  // If no tags found, route to the default agent
  if (targets.length === 0) {
    targets.push({ agentId: 'tinyagi', message: rawMessage });
  }
  return targets;
}

// Invoke an agent: run the LLM CLI in the agent's isolated workspace
async function invokeAgent(
  agentId: string,
  message: string,
  config: AgentConfig,
): Promise<string> {
  const workspacePath = path.join(os.homedir(), 'tinyagi-workspace', agentId);
  // ← Each agent invocation runs in its OWN workspace directory
  //   cd coder/ before running claude → Claude Code's file operations are isolated
  //   No context bleed: coder agent's conversation history is separate from writer's
  const options = {
    cwd: workspacePath,
    env: { ...process.env, ANTHROPIC_API_KEY: config.apiKey },
  };
  if (config.provider === 'claude') {
    // -c: continue conversation (persistent session)
    // -p: prompt (non-interactive, stdout output only)
    // --dangerously-skip-permissions: allow file I/O without confirmation
    const result = await execa('claude', [
      '--dangerously-skip-permissions',
      '--model', config.model, // e.g., claude-sonnet-4-6
      '-c',                    // ← continue from last session
      '-p', message,           // ← the actual message/task
    ], options);
    return result.stdout;
  } else if (config.provider === 'codex') {
    // resume --last: continue the most recent Codex session
    const result = await execa('codex', [
      'exec', 'resume', '--last',
      '--model', config.model,
      '--skip-git-repo-check',
      '--dangerously-bypass-approvals-and-sandbox',
      '--json',
      message,
    ], options);
    // NOTE: assumes --json yields a single JSON object with an `output` field;
    // this reconstruction should be checked against the Codex CLI's actual output shape.
    return JSON.parse(result.stdout).output;
  }
  throw new Error(`Unsupported provider: ${config.provider}`);
}

// Fan-out dispatcher: invoke all targets in parallel
async function dispatchToAgents(
  rawMessage: string,
  configs: Map<string, AgentConfig>,
): Promise<Map<string, string>> {
  const targets = parseRoutingTags(rawMessage);
  // ← Fan-out: all targets dispatched simultaneously
  //   A message to [@coder] and [@writer] starts BOTH agent sessions at the same time
  //   Neither waits for the other → parallel execution
  const results = await Promise.all(
    targets.map(async ({ agentId, message, sharedContext }) => {
      const config = configs.get(agentId) ?? configs.get('tinyagi')!;
      const fullMessage = sharedContext
        ? `${sharedContext}\n\n${message}`
        : message;
      // Catch per target: one failing agent must not reject the whole fan-out
      const response = await invokeAgent(agentId, fullMessage, config)
        .catch((err: Error) => `[agent error: ${err.message}]`);
      return [agentId, response] as const;
    })
  );
  return new Map(results);
}
Promise.all over multiple invokeAgent calls is the fan-out implementation. The parallel execution happens in the Node.js event loop, with each agent subprocess (claude or codex) running independently. Because each invocation catches its own errors, if one agent fails the others complete normally and their responses are still delivered.
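A usage sketch, under the same reconstruction assumptions:

const configs = new Map<string, AgentConfig>([
  ['tinyagi', { provider: 'claude', model: 'claude-sonnet-4-6', apiKey: '...' }],
  ['coder',   { provider: 'claude', model: 'claude-sonnet-4-6', apiKey: '...' }],
  ['writer',  { provider: 'claude', model: 'claude-sonnet-4-6', apiKey: '...' }],
]);

const replies = await dispatchToAgents(
  'Sprint ends Friday.\n[@coder: list open PRs] [@writer: draft the changelog]',
  configs,
);
// Both subprocesses ran concurrently; each saw 'Sprint ends Friday.' as shared context.
console.log(replies.get('coder'), replies.get('writer'));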
TinyAGI in Action: An End-to-End Worked Example
Scenario: a solo developer runs TinyAGI with three agents (coder, reviewer, tester) and sends a Discord message: "Sprint ends Friday. Report your status and blockers."
Input (Discord message):
Sprint ends Friday, 3 open bugs.
Reply with: (1) status (2) blockers (3) next step.
[@coder: Also list any PRs you have open.]
[@reviewer: Also flag any PRs waiting on you.]
[@tester: Also report test coverage for the auth module.]
Step 1: Channel → Queue
Discord channel client receives the message.
enqueueMessage():
→ 3 queue entries created (one per @tag):
{id: 'msg-a1', agent_id: 'coder', content: '[shared context]\n[coder-specific]', status: 'pending'}
{id: 'msg-b2', agent_id: 'reviewer', content: '[shared context]\n[reviewer-specific]', status: 'pending'}
{id: 'msg-c3', agent_id: 'tester', content: '[shared context]\n[tester-specific]', status: 'pending'}
SQLite write: ~0.1ms per enqueue (negligible vs LLM latency)
Step 2: Queue Processor → Parallel Dispatch
Queue processor polls every 500ms.
receiveMessage('coder'): msg-a1 → status: 'processing'
receiveMessage('reviewer'): msg-b2 → status: 'processing'
receiveMessage('tester'): msg-c3 → status: 'processing'
All three invocations launched in parallel via Promise.all:
Agent coder:
cd ~/tinyagi-workspace/coder/
claude --dangerously-skip-permissions --model claude-sonnet-4-6 -c -p "[message]"
→ Claude Code opens coder's conversation context
→ Reads coder's workspace files (git log, open PRs, recent commits)
→ Response: "Status: 2 bugs fixed. Blockers: waiting on API docs. PRs: #47, #51"
Agent reviewer:
cd ~/tinyagi-workspace/reviewer/
claude --dangerously-skip-permissions --model claude-sonnet-4-6 -c -p "[message]"
→ Uses reviewer's isolated context (different from coder's)
→ Response: "Status: reviewing PR #47. Blockers: need unit tests on auth changes."
Agent tester:
cd ~/tinyagi-workspace/tester/
claude --dangerously-skip-permissions --model claude-sonnet-4-6 -c -p "[message]"
→ Checks test files in tester workspace
→ Response: "Auth module coverage: 73%. 2 missing edge cases identified."
All three running concurrently: total LLM latency ≈ max(coder, reviewer, tester), not the sum
Typical: 8-15 seconds per agent call → fan-out cuts wall-clock time roughly 3x vs running three agents sequentially
Step 3: Responses → Delivery
All three responses enqueued to responses table.
Response delivery:
msg-a1 response → Discord reply in thread (or DM): coder's status
msg-b2 response → Discord reply: reviewer's status
msg-c3 response → Discord reply: tester's status
Total latency: ~10-15 seconds from Discord message to all three replies
(bounded by slowest agent, not sum of all three)
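Delivery is the mirror image of enqueue. A sketch of the drain loop, assuming the responses table from Snippet One and a hypothetical ChannelClient interface (the real clients live in packages/channels):

import Database from 'better-sqlite3';

// Hypothetical channel abstraction; actual clients live in packages/channels.
interface ChannelClient {
  send(recipient: string, text: string): Promise<void>;
}

// Drain pending rows from the responses table back to the originating channel.
async function deliverResponses(
  db: Database.Database,
  channels: Map<string, ChannelClient>, // keys: 'discord' | 'telegram' | 'whatsapp' | ...
): Promise<void> {
  const rows = db.prepare(`
    SELECT r.id, r.content, m.channel, m.sender
    FROM responses r JOIN messages m ON m.id = r.message_id
    WHERE r.status = 'pending'
  `).all() as { id: string; content: string; channel: string; sender: string }[];
  for (const row of rows) {
    await channels.get(row.channel)?.send(row.sender, row.content); // reply to sender
    db.prepare(`UPDATE responses SET status = 'completed' WHERE id = ?`).run(row.id);
  }
}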
Step 4: Chain execution (alternative flow)
For sequential handoff: "@coder: draft the fix. When done, @reviewer: review it"
Agent router detects chain instruction inside coder's response or message.
coder runs → response triggers reviewer invocation
Latency: sequential, not parallel (coder response time + reviewer response time)
Use when: second agent depends on first agent's output
Use fan-out when: agents are independent
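A minimal sketch of that sequential flow, reusing the reconstructed invokeAgent from Snippet Two (the chain-detection step itself is specified in AGENTS.md and executed by the LLM, so the trigger logic is not shown):

// Sequential handoff: reviewer starts only after coder's response arrives.
async function runChain(
  task: string,
  configs: Map<string, AgentConfig>,
): Promise<string> {
  const draft = await invokeAgent('coder', task, configs.get('coder')!);
  // The first agent's output becomes the second agent's input context.
  // Total latency = coder time + reviewer time, unlike max() for fan-out.
  return invokeAgent(
    'reviewer',
    `Review the following work from @coder:\n\n${draft}`,
    configs.get('reviewer')!,
  );
}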
Why This Design Works, and What It Trades Away
The SQLite queue as the coordination primitive is the correct choice for a single-machine personal-scale orchestrator. SQLite with WAL mode provides concurrent readers and serialized writes. The dead-letter mechanism means message processing failures are durable and inspectable, not silently lost. The entire queue fits in a file in ~/.tinyagi/. No separate database process. No Redis. No network partition to handle. The bottleneck is LLM API calls (seconds), not queue operations (microseconds). SQLite is more than adequate for this workload.
The isolated workspace design is the correct choice for preventing context contamination. When the coder agent's working_directory is ~/tinyagi-workspace/coder/, Claude Code's file read/write operations, its conversation history (-c flag), and its CLAUDE.md configuration all live in that directory. The writer agent's workspace is completely separate. A coding agent that has spent weeks working on a TypeScript project carries that context forward. A writing agent maintains its own editorial style history. Mixing these contexts in a single workspace would degrade both agents' specialized performance.
The [@agent: message] syntax implemented in markdown (AGENTS.md) is the correct approach for the "one person company" user model. The user does not want to learn an API or a DSL. They want to talk to their team in natural language. Routing via @mention is a UX pattern every developer already knows from GitHub, Slack, and Discord. The syntax maps directly to code (tag parsing, Promise.all fan-out) while remaining human-readable and editable in AGENTS.md.
What TinyAGI trades away:
Horizontal scalability. SQLite is single-file and single-machine. For multi-machine deployments, SQLite's WAL mode is insufficient. The README acknowledges this: "For high-scale deployments, run multiple TinyAGI instances behind a load balancer, each connecting to a shared SQLite database with WAL mode enabled. Use the plugin system to implement Redis as an external queue for horizontal scaling." This is a documented limitation, not an oversight. The target use case, one person's agent team on one machine, does not require horizontal scaling.
Sophisticated agent memory and RAG. TinyAGI relies on the LLM's context window (via -c for persistent conversation) as its primary memory mechanism. There is no built-in vector store or long-term memory retrieval. For tasks requiring recall of information from months ago that exceeds context window limits, TinyAGI does not provide a built-in solution.
Agent-to-agent observability. When the coder agent hands off to the reviewer via chain execution, there is no built-in tracing, span recording, or structured logging of the inter-agent communication. The TUI visualizer shows which agents are active, but debugging a multi-hop chain failure requires reading log files.
Technical Moats
The file-as-agent-config approach. TinyAGI configures agents via markdown files (AGENTS.md, SOUL.md, heartbeat.md) that are copied into each agent's workspace directory. This is not just documentation; it is the agent's system prompt and operating instructions. The AGENTS.md file that ships with TinyAGI contains the routing syntax, the team collaboration protocol, and the agent's personality guidelines. Editing it in a text editor is the configuration mechanism. This is the correct UX for a personal tool where the user is the developer.
Provider normalization across Claude and Codex. TinyAGI invokes Claude Code and OpenAI Codex as subprocess CLIs rather than API clients. This means a Claude agent and a Codex agent can collaborate on the same team, with the orchestration layer normalizing their different response formats. A user can run a Claude-powered research agent that hands off to a Codex-powered coding agent without any code changes. The provider field in settings.json is the only configuration difference.
Always-on via tmux, not daemon processes. The 24/7 operation model uses tmux rather than a custom daemon or systemd service. This is the right choice for a personal tool: tmux sessions are visible, attachable, debuggable, and familiar to developers. A crashing agent subprocess restarts via the queue processor's stalled-message reset mechanism, not via a complex restart supervisor. Operational simplicity beats theoretical elegance at this scale.
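Operationally, that restart mechanism is a single timer over the Snippet One queue (a sketch; the actual cadence values are assumptions):

// One timer keeps the queue healthy: every minute, messages stuck in
// 'processing' for over 5 minutes go back to 'pending' for retry.
setInterval(() => queue.resetStalledMessages(300_000), 60_000);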
Insights
Insight One: TinyAGI is not a framework for building AI products. It is infrastructure for replacing human assistants, and the design choices reflect that goal in every layer.
Most multi-agent frameworks (LangGraph, CrewAI, AutoGen) are evaluated on their ability to complete complex reasoning tasks, solve benchmarks, or build demos. TinyAGI is evaluated on whether it is reliably available when a user sends a WhatsApp message at 2am, whether the coder agent remembers the context from last week's discussion, and whether the message queue survives a network interruption without losing work. These are operational reliability requirements, not capability requirements. The SQLite queue, the tmux deployment, the dead-letter management, and the isolated workspace persistence all exist to serve operational reliability, not to maximize reasoning performance.
Insight Two: The "one person company" design constraint is actually a stronger forcing function for good architectural decisions than "enterprise scalability" requirements, because it forces every abstraction to justify its complexity against a single concrete user's daily workflow.
Multi-agent frameworks built for enterprise often accumulate complexity: sophisticated orchestration graphs, complex state machines, observability stacks, custom protocols. TinyAGI's design constraint (one person, one machine, available 24/7, chat-based) eliminates most of this complexity by default. SQLite instead of a distributed queue: it runs on the user's laptop. Markdown-file configuration instead of a DSL: the user can read it. tmux instead of a process manager: the user can attach. The complexity that remains is the complexity that earns its place. Most agent frameworks would benefit from applying a similar "one concrete user" constraint before adding features.
Takeaway
TinyAGI's most important file is AGENTS.md, a markdown file that serves simultaneously as the routing protocol specification, the agent team operating manual, the user-editable system prompt, and the onboarding document for new team members, both human and AI. The file is the protocol.
When a new agent is created via tinyagi agent add, a copy of AGENTS.md is placed in that agent's workspace directory. The agent reads AGENTS.md at the start of every session. The routing syntax ([@agent: message], fan-out, chain execution, shared context) is not enforced by code validation. It is enforced by the LLM reading the markdown specification and following it. The protocol lives in a file that any team member, human or AI, can read, understand, and propose changes to. This is a deliberately different design choice from frameworks that encode protocol in code and require developers to understand source files to modify behavior.
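A sketch of what tinyagi agent add plausibly does with the workspace, under the file-as-config model described above (the paths and helper name are assumptions):

import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

// Hypothetical: create the isolated workspace and seed it with the protocol file.
function addAgentWorkspace(agentId: string): string {
  const workspace = path.join(os.homedir(), 'tinyagi-workspace', agentId);
  fs.mkdirSync(workspace, { recursive: true });
  // The copied AGENTS.md *is* the routing protocol the agent will follow:
  // the LLM reads it at session start; no code-level validation enforces it.
  fs.copyFileSync(
    path.join(os.homedir(), '.tinyagi', 'AGENTS.md'), // assumed source location
    path.join(workspace, 'AGENTS.md'),
  );
  return workspace;
}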
TL;DR For Engineers
- TinyAGI runs teams of specialized AI agents (Claude Code, Codex, or any compatible provider) in isolated filesystem workspaces, coordinated via a SQLite message queue with atomic transactions, retry logic, and dead-letter management. Always-on via tmux.
- Message routing uses [@agent_id: message] tags parsed from user messages. A single tag routes to that agent alone; multiple tags trigger Promise.all fan-out (parallel). Text outside tags is delivered as shared context to all mentioned agents.
- SQLite with WAL mode is the correct queue choice: durable, file-backed, concurrent readers, single writer. LLM API calls (seconds) are the bottleneck, not queue ops (microseconds). No Redis required for single-machine deployment.
- Isolated workspaces (one directory per agent, claude -c for persistent conversation) prevent context contamination between agents. The coder's weeks of accumulated TypeScript context do not bleed into the writer's editorial context.
- AGENTS.md is the configuration and the protocol specification simultaneously. Placed in each agent's workspace, it defines routing syntax, team operating model, and agent personality. The protocol is a markdown file that both humans and agents can read and modify.
The One-Person Company Has a Tech Stack Now
TinyAGI's engineering contribution is not a novel algorithm or a new research result. It is the correct application of proven primitives (SQLite, tmux, TypeScript monorepo, subprocess CLI invocation) to the specific requirements of running AI agent teams as persistent personal infrastructure. The result is a system that is reliable, inspectable, debuggable, and operable by a single developer without DevOps overhead.
The implicit claim in "agent teams orchestrator for One Person Company" is that the coordination overhead of managing specialized AI workers should be lower than the coordination overhead of managing specialized human workers. TinyAGI's architecture makes that overhead concrete: a SQLite file, a handful of markdown configuration files, and a process running in tmux. The claim is warranted.
References
TinyAGI GitHub Repository, 3.2k stars, 465 forks, MIT license
TinyAGI docs/AGENTS.md, multi-agent configuration docs
TinyClaw repository (original name), same architecture, prior branding
Building a Durable Message Queue on SQLite for AI Agent Orchestration, DEV Community, March 2026
A Survey on Large Language Model based Autonomous Agents, arXiv:2308.11432, Wang et al., 2023
TinyOffice web portal, browser-based dashboard for TinyAGI