Theoretical New Coding Agent
After analyzing 16 coding agents across this field guide, what features are common to all? What takes up the most code? And what should you absolutely include if you're building a new custom coding agent from scratch?
Methodology
This analysis is based on code inspection of all 16 agents in this field guide: Pochi, Neovate Code, Mux, Crush, Kimi CLI, Qwen Code, OpenHands, Claude Code, DeerFlow, Hermes Agent, Codex CLI, Pi Mono, Wintermolt, Zaica, Open Claude Code 2.0, and the now-archived hermes-agent.
For each agent, I identified: core architecture, tool implementations, provider integration patterns, context management approaches, permission systems, and extensibility mechanisms. The goal: extract patterns that appear across multiple agents — these are likely essential, not optional.
The Universal Agent Loop
Every coding agent implements some variant of this fundamental loop:
Command or prompt from user
Send messages + tools to model, get response
Run bash, read files, write edits, etc.
Collect tool results, inject as messages
Loop until task complete or max iterations
The variations come in how each step is implemented:
- Async generator (Open Claude Code) — yields events, recursive continuation
- Callback-based (Claude Code, Crush) — handlers for stream events
- Thread-based (Zaica, Wintermolt) — worker threads, message passing
- Graph-based (DeerFlow) — LangGraph nodes and edges
Essential Features — Must Include
These features appear in every coding agent in this set. Omitting them would make the agent incomplete.
| Feature | Why Essential | How Implemented Across Agents |
|---|---|---|
| File Read | Agents cannot function without reading code | All 16 agents — varies from simple file I/O to PDF/binary detection |
| File Write/Edit | The primary output mechanism | All 16 agents — Write (clobbers) vs Edit (diff/patch) distinction |
| Glob | Find files without knowing paths | All 16 agents — pattern matching, some with sorting |
| Grep | Search code without reading everything | All 16 agents — regex, context lines, line numbers |
| Bash/Shell | Run tests, build, git — the operating system bridge | All 16 agents — timeout handling is critical |
| LLM Integration | The brain — cannot be omitted | All 16 agents — provider abstraction varies |
| Message History | LLMs are stateless; history is memory | All 16 agents — array of message objects |
| Permission Prompts | Safety for dangerous operations | 15/16 agents — only Pi Mono omits (deliberate choice) |
| Session Save/Resume | Long tasks span multiple sessions | 15/16 agents — file-based or JSONL |
| MCP Client | Extensibility via Model Context Protocol | 13/16 agents — increasingly standard |
Near-Universal Features — Include Unless Deliberately Omitting
These appear in 12-15 of 16 agents. Include them unless you have a specific philosophical reason not to.
| Feature | Frequency | Notes |
|---|---|---|
| Sub-agent Spawning | 14/16 | Most have an Agent tool; some use thread/process pools |
| Skills System | 13/16 | Markdown-based instructions triggered by /command |
| Slash Commands | 13/16 | REPL shortcuts (/help, /model, /compact, etc.) |
| Context Compaction | 13/16 | When tokens get high — summarization or truncation |
| Todo/Task Management | 13/16 | Track goals and sub-tasks |
| Multi-Provider Support | 12/16 | Not Claude Code or Open Claude Code which are Anthropic-first |
| Token Tracking | 12/16 | Estimate tokens for compaction, cost monitoring |
| HTTP/Fetch | 12/16 | Web search, API calls, documentation retrieval |
| Loop Detection | 12/16 | Detect when agent is stuck repeating |
Code Distribution — What Takes Up The Most Space
Based on file sizes and line counts across these agents, here's approximately where the code goes:
Tool Implementations
35-50% of total codeBash, Read, Write, Edit, Glob, Grep — each is non-trivial
Permission/Sandbox Systems
15-25% of total codePath validation, injection detection, sandbox execution
UI/REPL/Terminal
10-20% of total codeSlash commands, rendering, input handling
Provider Integration
10-15% of total codePer-provider code, auth, streaming, retries
Context Management
5-10% of total codeCompaction, summarization, token counting
Session/State
3-8% of total codeSave, resume, checkpointing
Tool code is dominated by Bash
In most agents, Bash is the single largest tool implementation — typically 3-5KB for the basic implementation, but often 10KB+ with background jobs, timeout escalation (SIGTERM → SIGKILL), output limits, and injection detection.
Common Architectural Patterns
Across all these agents, certain patterns repeat. These are tried-and-true approaches that multiple agents have validated.
1. Registry Pattern for Tools
Tools are registered in a central registry with name-based lookup:
tools.get('Bash'). Each tool exposes name, description,
inputSchema, validateInput(), and call(). This makes tools discoverable
and allows dynamic tool sets per agent.
Used by: Open Claude Code, Pochi, Pi Mono, Qwen Code
2. Middleware Chain
A fixed-order chain where each request/response passes through middleware layers: logging, permissions, error handling, compaction, loop detection. Each middleware can short-circuit the chain.
Used by: DeerFlow (14 layers), Mux, some Crush components
3. Event-Driven Architecture
Async generators or event emitters where the agent yields events: stream start, tokens, tool calls, errors, compaction. UI listens to same stream as backend.
Used by: Open Claude Code (13 events), DeerFlow, some Claude Code components
4. Provider Abstraction
A provider interface that normalizes across OpenAI, Anthropic, Google, etc.: chat(), streaming(), token counting. Each provider implements the interface, configuration selects which to use.
Used by: Mux, Qwen Code, Pochi, Pi Mono, Wintermolt
Unique Innovations — Consider Including
These features appear in only 1-3 agents but represent genuinely good ideas worth considering.
| Feature | Agent(s) | Why It's Worth Including |
|---|---|---|
| Two-tier Compaction | Open Claude Code | Micro-compaction (truncate old tool results) + full compaction (summarize) gives fine-grained control |
| Git Worktree Isolation | Open Claude Code | Subagents get isolated git branches — cleaner parallelization |
| File Checkpointing + Undo | Open Claude Code | Native undo support — rollback dangerous edits |
| Deferred Tool Loading | DeerFlow | Only load MCP tools when needed — reduces context pollution |
| Micro-batching | Pochi | Batch concurrent read operations, run stateful ops serially — better parallelism |
| File State Cache | Pochi | Detect "file unchanged since last read" — save tokens on repeated reads |
| Large Output Persistence | Pochi | Offload big tool results to disk — avoid context bloat |
| Chain Workflows | Zaica | Multi-step pipelines (scout → planner → coder → reviewer) — structured automation |
| MCP Bidirectional | Codex CLI, Wintermolt | Agent as MCP server for other agents — composability |
| Skill Self-Evolution | DeerFlow | Agent creates/improves skills during session — learn from experience |
Minimum Viable Coding Agent
If you wanted to build the smallest possible functional coding agent, here's what you'd need:
Core Files (~10 files, ~2000 lines)
- Agent Loop — async function that: sends messages to LLM, parses response, executes tools, loops
- Read Tool — file reading with line numbers, binary detection
- Write Tool — file writing with mkdir -p
- Edit Tool — string replacement, uniqueness check
- Glob Tool — pattern matching
- Grep Tool — regex search with ripgrep
- Bash Tool — shell execution with timeout (60s default)
- LLM Provider — abstraction + OpenAI or Anthropic implementation
- Message History — array of messages, add tool results
- Simple REPL — input loop, output streaming
That's actually Pi Mono's starting point
Pi Mono deliberately ships exactly this minimal kernel — no MCP, no permissions, no sub-agents. Everything else is an extension. Their 438 files is after adding all the extensions.
Agents That Add Then Remove Code
An interesting observation from this analysis cycle: which agents seem to be in a state of flux, adding features then pruning them?
| Agent | Pattern Observed | Example |
|---|---|---|
| DeerFlow | Citation system added, then removed | Removed SafeCitationContent, inline-citation.tsx, citation core — replaced with simple MarkdownContent |
| Pochi | New features added alongside refactoring | Added batch-utils, file-state-cache, tool-result-persistence while refactoring tool selection |
| Pi Mono | Rapid iteration, many provider changes | Added extensive anthropic/openai test coverage, thinking levels, cache retention control |
DeerFlow's citation removal is particularly notable — they had a full citation parsing system (SafeCitationContent, use-parsed-citations, citation utilities) and removed it entirely. This suggests they found citations weren't worth the complexity for their use case.
Interesting Insights
Permission Systems Are Under-Engineered
Most agents have simple permission prompts but lack sophisticated policy engines. Only Claude Code, Neovate Code, and Codex CLI have serious command banning, injection detection, and path validation. This is low-hanging fruit for improvement.
Compaction Is Solved But Under-Implemented
Most agents implement compaction but inconsistently. Only Open Claude Code, DeerFlow, and Claude Code itself have nuanced compaction (micro vs full). Many just truncate at a fixed token count.
MCP Is Becoming Standard
MCP went from rare to common in 12 months. Now 13/16 agents support it. If building a new agent, MCP support is expected, not optional.
The Zig Agents Go Wide
Wintermolt and Zaica take opposite approaches: Wintermolt has 7 operating modes, cron scheduling, Tailscale, camera, browser control. Zaica has chain workflows, reactive state, Wyhash. Both are small and focused but on very different problems.
Open Claude Code Is A Technical Marvel
The async generator architecture with 13 event types, two-tier compaction, git worktree isolation, file checkpointing, and 7-type hook system — implemented with 1,581 passing tests — is genuinely impressive as an independent implementation.
DeerFlow Has The Best Middleware
The 14-layer middleware chain with @Next/@Prev anchors for custom positioning, circular dependency detection, and guaranteed ordering is the most sophisticated extensibility mechanism in any agent here.
Recommendations for New Agent Builders
- Start with Pi Mono's minimal kernel approach. Ship a working core, then add extensions. A 10-file agent that works is better than a 500-file agent that doesn't.
- Use tool registry pattern. Tools should be discoverable, composable, and swappable. Don't hardcode tool calls in the loop.
- Implement two-tier compaction from day one. Micro-compaction (truncate old tool results to 100 chars) is cheap to implement and saves tokens constantly.
- Build permission as middleware. Don't mix permission logic with tool execution — wrap the tool call, check permissions, proceed or deny.
- Support MCP from the start. It's the standard for extensibility. stdio transport is simplest to implement first.
- Don't skimp on Bash. It's the tool users care about most. Handle timeouts, background jobs, output limits, and at least basic injection detection.
- Plan for checkpointing. File before-write checkpoints enable undo. Even a simple backup-copy before edit catches most disasters.
- Use existing patterns. The agent loop, tool registry, middleware chain, and provider abstraction have been validated across 16 agents. Don't innovate on architecture — replicate what works.
Bottom Line
Building a coding agent from scratch? Focus your energy on:
- Bash tool — most code, most user impact
- Permission system — safety-critical, often under-engineered
- Two-tier compaction — token efficiency is existential for long tasks
- MCP integration — extensibility standard, not optional
Everything else — skills, slash commands, sub-agents, multi-provider — can be added later. The core loop + 6 tools + LLM provider + compaction is a working agent on day one.