AI Coding Guides Deep Dives
Cross-Agent Analysis • 16 Agents • Pattern Extraction

Theoretical New Coding Agent

After analyzing 16 coding agents across this field guide, what features are common to all? What takes up the most code? And what should you absolutely include if you're building a new custom coding agent from scratch?

(Alright, ad over. Back to the serious technical analysis.)

Methodology

This analysis is based on code inspection of all 16 agents in this field guide: Pochi, Neovate Code, Mux, Crush, Kimi CLI, Qwen Code, OpenHands, Claude Code, DeerFlow, Hermes Agent, Codex CLI, Pi Mono, Wintermolt, Zaica, Open Claude Code 2.0, and the now-archived hermes-agent.

For each agent, I identified: core architecture, tool implementations, provider integration patterns, context management approaches, permission systems, and extensibility mechanisms. The goal: extract patterns that appear across multiple agents — these are likely essential, not optional.

The Universal Agent Loop

Every coding agent implements some variant of this fundamental loop:

1. User Input: a command or prompt from the user
2. LLM Inference: send messages + tools to the model, get a response
3. Tool Execution: run bash, read files, write edits, etc.
4. Observation: collect tool results, inject them as messages
5. Repeat: loop until the task completes or max iterations is reached
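The loop above can be sketched in a few lines. This is an illustrative Python skeleton, not any specific agent's code — `llm` and `tools` are hypothetical objects standing in for a provider client and a tool registry:

```python
def agent_loop(llm, tools, messages, max_iterations=25):
    """Minimal agent loop: infer, execute tools, observe, repeat."""
    for _ in range(max_iterations):
        response = llm.chat(messages, tools=tools.schemas())   # step 2: inference
        messages.append(response.as_message())
        if not response.tool_calls:                            # no tools requested:
            return response.text                               # task is complete
        for call in response.tool_calls:                       # step 3: execution
            result = tools.get(call.name).call(call.arguments)
            messages.append({"role": "tool",                   # step 4: observation
                             "tool_call_id": call.id,
                             "content": result})
    return "max iterations reached"                            # step 5: loop bound
```

Real agents wrap every step shown here — streaming around inference, permissions around execution, compaction around the message list — but the skeleton is the same everywhere.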

The variations come in how each step is implemented.

Essential Features — Must Include

These features appear in every coding agent in this set. Omitting them would make the agent incomplete.

| Feature | Why Essential | How Implemented Across Agents |
|---|---|---|
| File Read | Agents cannot function without reading code | All 16 agents — varies from simple file I/O to PDF/binary detection |
| File Write/Edit | The primary output mechanism | All 16 agents — Write (clobbers) vs Edit (diff/patch) distinction |
| Glob | Find files without knowing paths | All 16 agents — pattern matching, some with sorting |
| Grep | Search code without reading everything | All 16 agents — regex, context lines, line numbers |
| Bash/Shell | Run tests, build, git — the operating system bridge | All 16 agents — timeout handling is critical |
| LLM Integration | The brain — cannot be omitted | All 16 agents — provider abstraction varies |
| Message History | LLMs are stateless; history is memory | All 16 agents — array of message objects |
| Permission Prompts | Safety for dangerous operations | 15/16 agents — only Pi Mono omits (a deliberate choice) |
| Session Save/Resume | Long tasks span multiple sessions | 15/16 agents — file-based or JSONL |
| MCP Client | Extensibility via Model Context Protocol | 13/16 agents — increasingly standard |

Near-Universal Features — Include Unless Deliberately Omitting

These appear in 12-15 of 16 agents. Include them unless you have a specific philosophical reason not to.

| Feature | Frequency | Notes |
|---|---|---|
| Sub-agent Spawning | 14/16 | Most have an Agent tool; some use thread/process pools |
| Skills System | 13/16 | Markdown-based instructions triggered by /command |
| Slash Commands | 13/16 | REPL shortcuts (/help, /model, /compact, etc.) |
| Context Compaction | 13/16 | When tokens get high — summarization or truncation |
| Todo/Task Management | 13/16 | Track goals and sub-tasks |
| Multi-Provider Support | 12/16 | Not Claude Code or Open Claude Code, which are Anthropic-first |
| Token Tracking | 12/16 | Estimate tokens for compaction, cost monitoring |
| HTTP/Fetch | 12/16 | Web search, API calls, documentation retrieval |
| Loop Detection | 12/16 | Detect when the agent is stuck repeating |

Code Distribution — What Takes Up The Most Space

Based on file sizes and line counts across these agents, here's approximately where the code goes:

| Area | Share of Total Code | What It Covers |
|---|---|---|
| Tool Implementations | 35-50% | Bash, Read, Write, Edit, Glob, Grep — each is non-trivial |
| Permission/Sandbox Systems | 15-25% | Path validation, injection detection, sandbox execution |
| UI/REPL/Terminal | 10-20% | Slash commands, rendering, input handling |
| Provider Integration | 10-15% | Per-provider code, auth, streaming, retries |
| Context Management | 5-10% | Compaction, summarization, token counting |
| Session/State | 3-8% | Save, resume, checkpointing |

⚠️ Tool code is dominated by Bash

In most agents, Bash is the single largest tool implementation — typically 3-5KB for a basic version, and often 10KB+ once you add background jobs, timeout escalation (SIGTERM → SIGKILL), output limits, and injection detection.
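The escalation pattern itself is small. Here is a minimal Python sketch assuming a POSIX shell — the timeout, grace period, and output cap are illustrative defaults, not any particular agent's values:

```python
import subprocess

def run_bash(command, timeout=60.0, grace=5.0, max_output=30_000):
    """Run a shell command with SIGTERM -> SIGKILL timeout escalation
    and a cap on captured output (a sketch of the pattern, not any
    specific agent's implementation)."""
    proc = subprocess.Popen(command, shell=True, text=True,
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            start_new_session=True)  # detach into its own session
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()                      # polite: SIGTERM
        try:
            out, _ = proc.communicate(timeout=grace)
        except subprocess.TimeoutExpired:
            proc.kill()                       # forceful: SIGKILL
            out, _ = proc.communicate()
        out = (out or "") + "\n[command timed out]"
    if len(out) > max_output:                 # output limit: protect the context
        out = out[:max_output] + "\n[output truncated]"
    return out, proc.returncode
```

Background jobs and injection detection sit on top of this core, which is why Bash implementations keep growing.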

Common Architectural Patterns

Across all these agents, certain patterns repeat. These are tried-and-true approaches that multiple agents have validated.

1. Registry Pattern for Tools

Tools are registered in a central registry with name-based lookup: tools.get('Bash'). Each tool exposes name, description, inputSchema, validateInput(), and call(). This makes tools discoverable and allows dynamic tool sets per agent.

Used by: Open Claude Code, Pochi, Pi Mono, Qwen Code
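A minimal sketch of the pattern in Python — `EchoTool` and the exact field names are illustrative, since each agent defines its own schema shape:

```python
class Tool:
    """Base interface: name, description, input schema, validate, call."""
    name = ""
    description = ""
    input_schema = {}
    def validate_input(self, args):
        return True
    def call(self, args):
        raise NotImplementedError

class ToolRegistry:
    """Central registry with name-based lookup: registry.get('Bash')."""
    def __init__(self):
        self._tools = {}
    def register(self, tool):
        self._tools[tool.name] = tool
    def get(self, name):
        return self._tools[name]
    def schemas(self):
        # What gets sent to the LLM as the available tool set
        return [{"name": t.name, "description": t.description,
                 "input_schema": t.input_schema} for t in self._tools.values()]

class EchoTool(Tool):  # hypothetical example tool
    name = "Echo"
    description = "Return its input unchanged"
    input_schema = {"type": "object", "properties": {"text": {"type": "string"}}}
    def call(self, args):
        return args["text"]
```

Because the loop only ever sees the registry, swapping in a different tool set per agent (or per sub-agent) is a one-line change.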

2. Middleware Chain

A fixed-order chain where each request/response passes through middleware layers: logging, permissions, error handling, compaction, loop detection. Each middleware can short-circuit the chain.

Used by: DeerFlow (14 layers), Mux, some Crush components
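A sketch of the idea in Python, with hypothetical logging and permission layers — each middleware receives the next one and can short-circuit by returning without calling it:

```python
def chain(middlewares, handler):
    """Compose middlewares around a final handler, preserving order."""
    def build(i):
        if i == len(middlewares):
            return handler
        nxt = build(i + 1)
        return lambda request: middlewares[i](request, nxt)
    return build(0)

# Hypothetical layers: logging, then permissions, then the actual tool call.
def logging_mw(request, next_):
    request.setdefault("log", []).append(f"calling {request['tool']}")
    return next_(request)

def permission_mw(request, next_):
    if request["tool"] == "Bash" and not request.get("approved"):
        return {"error": "permission denied"}   # short-circuit the chain
    return next_(request)
```

A usage example: `pipeline = chain([logging_mw, permission_mw], execute_tool)` — the fixed ordering means permissions always run after logging and before execution, with no tool able to bypass them.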

3. Event-Driven Architecture

Async generators or event emitters where the agent yields events: stream start, tokens, tool calls, errors, compaction. UI listens to same stream as backend.

Used by: Open Claude Code (13 events), DeerFlow, some Claude Code components
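In Python this maps naturally onto an async generator. The event names below are illustrative, not Open Claude Code's actual 13 event types:

```python
import asyncio

async def run_agent(prompt):
    """Sketch: the agent yields typed events; the UI consumes the same
    stream as any backend consumer (event names are illustrative)."""
    yield {"type": "stream_start", "prompt": prompt}
    for token in ["Hello", " world"]:          # stand-in for streamed model tokens
        yield {"type": "token", "text": token}
    yield {"type": "tool_call", "name": "Read", "args": {"path": "README.md"}}
    yield {"type": "stream_end"}

async def collect(prompt):
    # A consumer: a real UI would render each event as it arrives
    return [event["type"] async for event in run_agent(prompt)]
```

The payoff is that adding a new consumer (logging, a web UI, a test harness) never touches the agent core — it just iterates the same stream.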

4. Provider Abstraction

A provider interface that normalizes across OpenAI, Anthropic, Google, etc.: chat(), streaming(), token counting. Each provider implements the interface, configuration selects which to use.

Used by: Mux, Qwen Code, Pochi, Pi Mono, Wintermolt
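A sketch of the interface in Python — `FakeProvider` stands in for a real SDK wrapper, and the chars/4 token heuristic is a common fallback when no tokenizer is available, not any specific agent's counter:

```python
from abc import ABC, abstractmethod

class Provider(ABC):
    """Normalizing interface over OpenAI, Anthropic, Google, etc."""
    @abstractmethod
    def chat(self, messages, tools=None): ...
    @abstractmethod
    def count_tokens(self, messages): ...

class FakeProvider(Provider):
    """Stand-in implementation; a real one wraps a vendor SDK."""
    def chat(self, messages, tools=None):
        return {"role": "assistant", "content": "ok"}
    def count_tokens(self, messages):
        # crude chars/4 heuristic used as a tokenizer-free fallback
        return sum(len(str(m.get("content", ""))) // 4 for m in messages)

PROVIDERS = {"fake": FakeProvider}   # configuration selects the provider

def make_provider(name) -> Provider:
    return PROVIDERS[name]()
```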

Unique Innovations — Consider Including

These features appear in only 1-3 agents but represent genuinely good ideas worth considering.

| Feature | Agent(s) | Why It's Worth Including |
|---|---|---|
| Two-tier Compaction | Open Claude Code | Micro-compaction (truncate old tool results) + full compaction (summarize) gives fine-grained control |
| Git Worktree Isolation | Open Claude Code | Subagents get isolated git branches — cleaner parallelization |
| File Checkpointing + Undo | Open Claude Code | Native undo support — rollback dangerous edits |
| Deferred Tool Loading | DeerFlow | Only load MCP tools when needed — reduces context pollution |
| Micro-batching | Pochi | Batch concurrent read operations, run stateful ops serially — better parallelism |
| File State Cache | Pochi | Detect "file unchanged since last read" — save tokens on repeated reads |
| Large Output Persistence | Pochi | Offload big tool results to disk — avoid context bloat |
| Chain Workflows | Zaica | Multi-step pipelines (scout → planner → coder → reviewer) — structured automation |
| MCP Bidirectional | Codex CLI, Wintermolt | Agent as MCP server for other agents — composability |
| Skill Self-Evolution | DeerFlow | Agent creates/improves skills during session — learn from experience |

Minimum Viable Coding Agent

If you wanted to build the smallest possible functional coding agent, here's what you'd need:

Core Files (~10 files, ~2000 lines)

  1. Agent Loop — async function that: sends messages to LLM, parses response, executes tools, loops
  2. Read Tool — file reading with line numbers, binary detection
  3. Write Tool — file writing with mkdir -p
  4. Edit Tool — string replacement, uniqueness check
  5. Glob Tool — pattern matching
  6. Grep Tool — regex search with ripgrep
  7. Bash Tool — shell execution with timeout (60s default)
  8. LLM Provider — abstraction + OpenAI or Anthropic implementation
  9. Message History — array of messages, add tool results
  10. Simple REPL — input loop, output streaming
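The subtle part of item 4 is the uniqueness check. A sketch of the core logic (real Edit tools also handle replace-all modes and whitespace normalization):

```python
def edit_file_text(content, old, new):
    """String-replacement edit with a uniqueness check: refuse the edit
    unless `old` appears exactly once, so the model cannot silently
    change the wrong occurrence."""
    count = content.count(old)
    if count == 0:
        raise ValueError("old_string not found in file")
    if count > 1:
        raise ValueError(f"old_string appears {count} times; add more context")
    return content.replace(old, new, 1)
```

This one check is why Edit is safer than Write for targeted changes: an ambiguous match becomes a visible error instead of a silent wrong edit.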
💡 This minimal kernel is Pi Mono's starting point

Pi Mono deliberately ships exactly this kernel — no MCP, no permissions, no sub-agents. Everything else is an extension. Its 438-file count comes only after adding all of those extensions.

Agents That Add Then Remove Code

One interesting question from this analysis cycle: which agents are in a state of flux, adding features and then pruning them?

| Agent | Pattern Observed | Example |
|---|---|---|
| DeerFlow | Citation system added, then removed | Removed SafeCitationContent, inline-citation.tsx, citation core — replaced with simple MarkdownContent |
| Pochi | New features added alongside refactoring | Added batch-utils, file-state-cache, tool-result-persistence while refactoring tool selection |
| Pi Mono | Rapid iteration, many provider changes | Added extensive anthropic/openai test coverage, thinking levels, cache retention control |

DeerFlow's citation removal is particularly notable — they had a full citation parsing system (SafeCitationContent, use-parsed-citations, citation utilities) and removed it entirely. This suggests they found citations weren't worth the complexity for their use case.

Interesting Insights

Permission Systems Are Under-Engineered

Most agents have simple permission prompts but lack sophisticated policy engines. Only Claude Code, Neovate Code, and Codex CLI have serious command banning, injection detection, and path validation. This is low-hanging fruit for improvement.

Compaction Is Solved But Under-Implemented

Most agents implement compaction but inconsistently. Only Open Claude Code, DeerFlow, and Claude Code itself have nuanced compaction (micro vs full). Many just truncate at a fixed token count.
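Micro-compaction in particular is cheap to sketch: truncate the content of old tool-result messages while leaving recent ones intact. The thresholds below are illustrative, not any agent's real values:

```python
def micro_compact(messages, keep_recent=3, max_len=100):
    """Micro-compaction sketch: truncate old tool-result messages,
    leaving the most recent `keep_recent` tool results untouched.
    Returns a new list; the original messages are not mutated."""
    tool_indexes = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    old = set(tool_indexes[:-keep_recent]) if keep_recent else set(tool_indexes)
    out = []
    for i, m in enumerate(messages):
        if i in old and len(m["content"]) > max_len:
            m = {**m, "content": m["content"][:max_len] + "...[truncated]"}
        out.append(m)
    return out
```

Full compaction (summarizing the whole history with an LLM call) sits on top of this, fired only when micro-compaction is no longer enough.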

MCP Is Becoming Standard

MCP went from rare to common in 12 months. Now 13/16 agents support it. If building a new agent, MCP support is expected, not optional.

The Zig Agents Go Wide

Wintermolt and Zaica take opposite approaches: Wintermolt has 7 operating modes, cron scheduling, Tailscale, camera, browser control. Zaica has chain workflows, reactive state, Wyhash. Both are small and focused but on very different problems.

Open Claude Code Is A Technical Marvel

The async generator architecture with 13 event types, two-tier compaction, git worktree isolation, file checkpointing, and 7-type hook system — implemented with 1,581 passing tests — is genuinely impressive as an independent implementation.

DeerFlow Has The Best Middleware

The 14-layer middleware chain with @Next/@Prev anchors for custom positioning, circular dependency detection, and guaranteed ordering is the most sophisticated extensibility mechanism in any agent here.

Recommendations for New Agent Builders

  1. Start with Pi Mono's minimal kernel approach. Ship a working core, then add extensions. A 10-file agent that works is better than a 500-file agent that doesn't.
  2. Use tool registry pattern. Tools should be discoverable, composable, and swappable. Don't hardcode tool calls in the loop.
  3. Implement two-tier compaction from day one. Micro-compaction (truncate old tool results to 100 chars) is cheap to implement and saves tokens constantly.
  4. Build permission as middleware. Don't mix permission logic with tool execution — wrap the tool call, check permissions, proceed or deny.
  5. Support MCP from the start. It's the standard for extensibility. stdio transport is simplest to implement first.
  6. Don't skimp on Bash. It's the tool users care about most. Handle timeouts, background jobs, output limits, and at least basic injection detection.
  7. Plan for checkpointing. File before-write checkpoints enable undo. Even a simple backup-copy before edit catches most disasters.
  8. Use existing patterns. The agent loop, tool registry, middleware chain, and provider abstraction have been validated across 16 agents. Don't innovate on architecture — replicate what works.
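Recommendation 7's before-write checkpoint fits in a few lines. A minimal sketch (real agents also scope backups per session and garbage-collect them):

```python
import pathlib
import shutil
import tempfile

class Checkpoints:
    """Before-write checkpoints enabling undo: copy a file aside
    before each edit, restore the latest copy on demand."""
    def __init__(self):
        self._backups = {}   # file path -> stack of backup paths
        self._dir = pathlib.Path(tempfile.mkdtemp(prefix="ckpt-"))

    def before_write(self, path):
        path = pathlib.Path(path)
        if path.exists():
            n = len(self._backups.get(str(path), []))
            backup = self._dir / f"{path.name}.{n}"
            shutil.copy2(path, backup)   # preserve contents and metadata
            self._backups.setdefault(str(path), []).append(backup)

    def undo(self, path):
        backups = self._backups.get(str(path))
        if backups:
            shutil.copy2(backups.pop(), path)
```

Even this naive backup-copy version catches most disasters; git-based checkpointing (as in Open Claude Code) adds branch isolation on top.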

Bottom Line

Building a coding agent from scratch? Focus your energy on:

  1. Bash tool — most code, most user impact
  2. Permission system — safety-critical, often under-engineered
  3. Two-tier compaction — token efficiency is existential for long tasks
  4. MCP integration — extensibility standard, not optional

Everything else — skills, slash commands, sub-agents, multi-provider — can be added later. The core loop + 6 tools + LLM provider + compaction is a working agent on day one.