Head-to-Head • Source-Level • 1,894 TS vs 1,418 Rust

Claude Code vs Codex CLI: What Each Has That the Other Doesn't

Both are frontier lab coding agents. Both ship with deep tool catalogs, permission systems, TUIs, MCP clients, subagent spawning, and IDE extensions. But their source-level architectures reveal two fundamentally different engineering philosophies.


The tale of the tape

Dimension	Claude Code (Anthropic)	Codex CLI (OpenAI)
Language	TypeScript (Bun runtime)	Rust (native binary, zero runtime)
Source files	1,894 .ts/.tsx files	1,418 .rs files (plus 222 config/schema/template files; ~3,805 files total in the repository)
Modules/crates	36 source directories, flat module graph	87 Cargo workspace crates
Lines of code	~512,000+	~350,000 (estimated from Rust average)
TUI framework	React + Ink (custom Ink renderer, 48 files) + Yoga layout	Ratatui (forked patches, 80 files)
License	Leaked snapshot (proprietary)	Apache-2.0 (open source)
Model default	Claude Sonnet/Opus (Anthropic SDK)	OpenAI Responses API (configurable)
Config format	JSON + settings DB + migrations	TOML with layered stack (global/project/CLI overrides)
Feature flags	`bun:bundle` dead-code elimination + GrowthBook	Cargo features + compile-time conditional compilation
Permission system	Mode-based (default/plan/auto/bypass) + wildcard allow/deny rules + 2-stage bash classifier	execpolicy rule DSL + request_permissions tool + per-tool MCP approval overrides
Sandboxing	Upstream proxy (CCR), ptrace blocking, CONNECT-to-WebSocket relay (remote-only)	macOS Seatbelt, Linux bubblewrap/Landlock, Windows restricted tokens (local)
MCP	Client only (exposes CLI tools via `--mcp-server`)	Client AND server (bidirectional — `codex mcp-server` is a full MCP tool)
Subagents	AgentTool (spawn/fork/resume), coordinator mode, team swarms	Dual API (v1 + v2): spawn/wait/send/close/resume/list/followup, CSV-driven batch jobs
Skills	SKILL.md with frontmatter, reference files, conditional activation, 17 bundled skills	Embedded system skills via `include_dir!`, fingerprint-based caching, skill-creator
Memory	MEMORY.md (200-line/25KB cap), team memory, auto-memory, aging, relevance finding	SQLite state DB, rollout JSONL, memory trace building from session files
IDE integration	Bridge (WebSocket, 32 files) for VS Code/JetBrains, JWT auth, direct connect	App server (JSON-RPC, 25 files) for VS Code/Cursor/Windsurf, stdio + WebSocket
Commands	101+ slash commands	~20 subcommands via Clap (exec, login, mcp, sandbox, sessions, etc.)
Tools	~40+ built-in tools (43 tool directories)	~20+ built-in tools (codex-tools crate, 47 files)
Voice	Voice mode (GrowthBook-gated, OAuth-required, streaming STT)	Realtime WebRTC (macOS-only, AVFoundation, SDP offer/answer)
Vim mode	Full state machine (INSERT/NORMAL, motions, operators, text objects, dot-repeat)	Not present
Local models	No dedicated local model support	Ollama + LM Studio dedicated crates with model pulling, downloading, pre-loading
Feedback/telemetry	GrowthBook analytics, Datadog, Sentry feedback (ring buffer, 4MB)	OpenTelemetry OTLP, feedback to Sentry (ring buffer, 4MB), analytics events
Collaboration	Coordinator mode, team swarms, teammate views, remote sessions over WebSocket	Collaboration mode templates (plan/default/execute/pair_programming)
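
The "layered stack" in the Config format row can be read as a merge in which later layers override earlier ones. The sketch below illustrates that idea with plain dictionaries; the key names are hypothetical, TOML parsing is omitted, and Codex's actual precedence rules may differ in detail.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` wins, merging nested tables."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(*layers: dict) -> dict:
    """Merge layers in order: global first, then project, then CLI overrides."""
    result: dict = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Hypothetical keys, for illustration only.
global_cfg = {"model": "default-model", "sandbox": {"network": False, "mode": "read-only"}}
project_cfg = {"sandbox": {"mode": "workspace-write"}}
cli_cfg = {"model": "other-model"}

effective = resolve_config(global_cfg, project_cfg, cli_cfg)
```

Nested tables are merged key by key, so a project layer can change one sandbox setting without restating the rest.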

What Claude Code has that Codex doesn't

Full vim mode

Claude Code ships a complete vim state machine in five files (vim/): INSERT/NORMAL modes, motions (h/j/k/l, w/b/e, f/F/t/T), operators (delete, change, yank), text objects (iw, aw, i", a"), and dot-repeat. Codex has no vim mode at all.

Terminal power users
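
To make "state machine" concrete, here is a toy single-line editor with NORMAL/INSERT modes, one operator (d), one motion (w), and dot-repeat. It is a minimal sketch, not Claude Code's implementation; the class and method names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VimLine:
    """Toy single-line vim-style editor (illustrative only)."""
    text: str = ""
    cursor: int = 0
    mode: str = "NORMAL"
    pending_op: Optional[str] = None               # operator waiting for a motion
    last_change: Optional[Tuple[str, str]] = None  # (operator, motion) for "."

    def _next_word(self) -> int:
        """Index where the `w` motion lands: start of next word or end of line."""
        i = self.cursor
        while i < len(self.text) and not self.text[i].isspace():
            i += 1
        while i < len(self.text) and self.text[i].isspace():
            i += 1
        return i

    def _apply(self, op: str, motion: str) -> None:
        # Only `dw` is implemented; unknown combinations are ignored.
        if op == "d" and motion == "w":
            end = self._next_word()
            self.text = self.text[:self.cursor] + self.text[end:]
            self.last_change = (op, motion)

    def feed(self, key: str) -> None:
        if self.mode == "INSERT":
            if key == "<Esc>":
                self.mode = "NORMAL"
                self.cursor = max(0, self.cursor - 1)
            else:
                self.text = self.text[:self.cursor] + key + self.text[self.cursor:]
                self.cursor += 1
            return
        last = max(0, len(self.text) - 1)
        if self.pending_op is not None:        # operator-pending state
            op, self.pending_op = self.pending_op, None
            self._apply(op, key)
        elif key == "i":
            self.mode = "INSERT"
        elif key == "d":
            self.pending_op = "d"
        elif key == "h":
            self.cursor = max(0, self.cursor - 1)
        elif key == "l":
            self.cursor = min(last, self.cursor + 1)
        elif key == "w":
            self.cursor = min(last, self._next_word())
        elif key == "." and self.last_change is not None:
            self._apply(*self.last_change)     # dot-repeat replays the last change
```

Feeding d, w, and then . to a line containing "one two three" deletes the first two words in turn, which is the operator-plus-motion and dot-repeat behavior the paragraph describes.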

101+ slash commands

Claude Code's commands/ directory holds 101 entries — everything from /compact and /mcp to /bughunter, /teleport, /thinkback, /autofix-pr, /chrome, and /ultraplan. Codex has ~20 subcommands via Clap, focusing on core operations rather than power-user extras.

Command depth

LSP as a first-class built-in tool

Claude Code's LSPTool is a full Language Server Protocol integration exposed as a built-in tool (feature-gated via ENABLE_LSP_TOOL). Codex has no LSP tool — it relies on its own file search, grep, and apply_patch instead.

IDE intelligence

Web search and web fetch

Claude Code includes WebSearchTool and WebFetchTool as built-in tools for fetching URL content and performing web searches. Codex has no built-in web search — it relies on the model's training data and local file access.

Web-connected

Notebook editing

Claude Code ships a NotebookEditTool for Jupyter notebook editing. Codex has no notebook-specific tooling.

Git worktrees

Claude Code has EnterWorktreeTool and ExitWorktreeTool for git worktree isolation, letting the agent work on a branch without disturbing the main working directory. Codex has no worktree support.

Rich plugin/skill ecosystem

Claude Code has 17 bundled skills (batch, claudeApi, debug, loop, remember, scheduleRemoteAgents, simplify, skillify, stuck, verify, etc.) plus a plugin scaffolding system for user-toggleable features. Skills support reference files with secure extraction (O_NOFOLLOW, O_EXCL, 0o600), conditional activation based on file paths, and argument substitution. Codex has a simpler embedded skills system with fingerprint-based installation.

Extensibility
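
The three flags named above map directly onto POSIX open(2) options. The sketch below shows what such an extraction step might look like, assuming a POSIX system; the function name is hypothetical and this is not Claude Code's actual code.

```python
import os
import tempfile


def write_reference_file(path: str, data: bytes) -> None:
    """Create `path` exclusively, owner-only, without following a symlink.

    O_EXCL   -> fail if anything already exists at `path`
    O_NOFOLLOW -> refuse to open `path` if its final component is a symlink
    0o600    -> read/write for the owner only (before umask is applied)
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_NOFOLLOW", 0)
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "reference.md")
    write_reference_file(target, b"# reference\n")
    try:
        write_reference_file(target, b"overwrite attempt")
    except FileExistsError:
        print("second write refused: file already exists")
```

Creating the file exclusively means a pre-planted file or symlink at the target path causes an error instead of being silently overwritten or followed.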

Config migrations

Claude Code ships 11 idempotent config migrations — model migrations (Fennec→Opus, Sonnet 4.5→4.6), permission migrations, MCP server migrations, and auto-update migrations. This is a production pattern for managing a large user base across versions. Codex has no migration system.
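
"Idempotent" here means a migration can run on every startup and leave an already-migrated config untouched. A minimal sketch of the pattern follows; the config keys and migration functions are hypothetical, and only the model names come from the list above.

```python
def migrate_model(config: dict) -> dict:
    """Rename retired model identifiers; a no-op once the new name is stored."""
    renames = {"fennec": "opus", "sonnet-4.5": "sonnet-4.6"}  # illustrative keys
    model = config.get("model")
    if isinstance(model, str) and model in renames:
        return {**config, "model": renames[model]}
    return config


def migrate_allowed_tools(config: dict) -> dict:
    """Move a hypothetical legacy `allowedTools` list into `permissions.allow`."""
    if "allowedTools" not in config:
        return config  # already migrated
    updated = dict(config)
    legacy = updated.pop("allowedTools")
    permissions = dict(updated.get("permissions", {}))
    permissions["allow"] = sorted(set(permissions.get("allow", [])) | set(legacy))
    updated["permissions"] = permissions
    return updated


MIGRATIONS = [migrate_model, migrate_allowed_tools]


def run_migrations(config: dict) -> dict:
    """Apply every migration in order; safe to call on each startup."""
    for migration in MIGRATIONS:
        config = migration(config)
    return config
```

Because each step checks for the old shape before acting, running the chain twice yields the same result as running it once, so no version counter is strictly required.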

Feature-gated internal tools

Claude Code gates dozens of tools behind bun:bundle flags: REPLTool (VM-isolated execution), SleepTool, CronTools, MonitorTool, SendUserFileTool, PushNotificationTool, SubscribePRTool, WebBrowserTool, SnipTool, ListPeersTool, WorkflowTool, OverflowTestTool, CtxInspectTool, TerminalCaptureTool, PowerShellTool. These are Anthropic-internal features not available in the public build, but they show the codebase's extensibility ceiling.

GrowthBook feature flag platform

Claude Code uses GrowthBook for remote feature flagging, enabling Anthropic to roll out features gradually, A/B test, and kill-switch features without shipping new binaries. Codex uses compile-time Cargo features instead: flags are fixed when the binary is built, so there is no runtime flag lookup, but changing one requires shipping a new release.
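
The remote-flag side of that trade-off can be sketched as a deterministic percentage rollout with a kill switch. This is a generic illustration, not GrowthBook's hashing scheme or API; the rule format and function names are assumptions.

```python
import hashlib


def bucket(user_id: str, flag: str) -> float:
    """Map (flag, user) to a stable value in [0, 1) using SHA-256."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def is_enabled(flag: str, user_id: str, rules: dict) -> bool:
    """Evaluate a flag against remotely fetched rules (hypothetical format)."""
    rule = rules.get(flag)
    if rule is None or rule.get("killed", False):
        return False  # unknown flags and kill-switched flags are always off
    return bucket(user_id, flag) < rule.get("rollout", 0.0)


# Rules like these would be fetched at runtime, so changing them needs no new build.
rules = {
    "voice_mode": {"rollout": 0.25},               # 25% of users
    "web_browser": {"rollout": 1.0, "killed": True},
}
```

The same user always lands in the same bucket, so raising the rollout value from 0.25 to 0.5 only adds users and never removes anyone already enabled.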

What Codex has that Claude Code doesn't

Three platform-specific sandboxes

Codex has three separate sandbox implementations: macOS Seatbelt (SBPL policies), Linux bubblewrap/Landlock (with vendored bwrap fallback), and Windows restricted tokens (ACL manipulation, capability SIDs, private desktops, DPAPI encryption). Claude Code only has upstream proxy hardening for its remote (CCR) infrastructure — no local process sandbox.

Local security

Bidirectional MCP

Codex is both an MCP client (codex-mcp crate) and an MCP server (codex-mcp-server crate, run via codex mcp-server). This means Codex can consume external tools and be consumed as a tool by other agents. Claude Code is primarily an MCP client: its --mcp-server mode exposes only a limited subset of tools (Bash, Read, Edit), not the full agent.

Agent ecosystem

Execution policy DSL

Codex's execpolicy crate implements a rule-based domain-specific language for command allow/deny evaluation: prefix patterns with wildcards, network rules by host/protocol, program-scoped rules, and runtime policy amendment. Claude Code uses a simpler mode-based permission system with allow/deny rules and a 2-stage bash classifier.
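
Prefix patterns with wildcards and runtime amendment can be illustrated with a small evaluator. The rule format below is invented for this sketch and is not the execpolicy crate's actual syntax; it assumes deny beats allow and that unmatched commands fall back to prompting the user.

```python
import shlex
from fnmatch import fnmatchcase

# (verdict, token pattern); each pattern token may use shell-style wildcards.
RULES = [
    ("allow", ["git", "status"]),
    ("allow", ["cargo", "*"]),
    ("deny", ["git", "push", "--force*"]),
    ("deny", ["rm", "-rf", "*"]),
]

PRIORITY = {"prompt": 0, "allow": 1, "deny": 2}


def matches(pattern: list, argv: list) -> bool:
    """True if `pattern` matches the leading tokens of `argv`."""
    if len(argv) < len(pattern):
        return False
    return all(fnmatchcase(arg, pat) for pat, arg in zip(pattern, argv))


def evaluate(command: str, rules: list = RULES) -> str:
    """Return 'allow', 'deny', or 'prompt' (no rule matched)."""
    argv = shlex.split(command)
    decision = "prompt"
    for verdict, pattern in rules:
        if matches(pattern, argv) and PRIORITY[verdict] > PRIORITY[decision]:
            decision = verdict
    return decision


# Runtime amendment: extend the rule list after the user approves a command family.
amended = RULES + [("allow", ["curl", "*"])]
```

Matching on tokenized arguments instead of raw strings avoids a rule for `git status` accidentally matching `git status-like-name`.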

Dedicated local model support

Codex ships two dedicated crates for local models: codex-ollama (model pulling with progress, version checking, Responses API compatibility verification) and codex-lmstudio (model fetching, downloading, pre-loading). Claude Code has no local model integration at all.

Offline-first

Agent jobs: CSV-driven batch workflows

Codex has spawn_agents_on_csv_tool and create_report_agent_job_result_tool — a higher-level abstraction for spawning multiple agents from CSV input and collecting structured results. Claude Code has coordinator mode and team swarms but no equivalent batch-job primitive.
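
The batch-job shape described above (one agent per CSV row, structured results collected at the end) can be sketched as follows. The worker here is a stand-in that fabricates a result without calling any model; the column names and result fields are hypothetical.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def run_agent(row: dict) -> dict:
    """Stand-in for spawning a real subagent on one CSV row."""
    return {"id": row["id"], "status": "ok", "summary": f"reviewed {row['path']}"}


def _safe(worker, row: dict) -> dict:
    """Convert a worker failure into a structured error result."""
    try:
        return worker(row)
    except Exception as exc:  # one bad row should not abort the batch
        return {"id": row.get("id"), "status": "error", "summary": str(exc)}


def run_batch(csv_text: str, worker=run_agent, max_workers: int = 4) -> list:
    """Run `worker` once per CSV row and return results in row order."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda row: _safe(worker, row), rows))


jobs = "id,path\n1,src/a.rs\n2,src/b.rs\n"
results = run_batch(jobs)
```

Returning a structured record per row, including failures, is what lets a report step aggregate the batch afterwards.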

Strict correctness guarantees

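Codex's Cargo.toml denies unwrap_used, expect_used, needless_borrow, and 20+ other clippy lints across all 87 crates. Clippy enforces these at build time (they are lint policy, not part of the Rust compiler itself), so a violation fails the check before code ships. Claude Code is TypeScript: it has Biome linting and TypeScript strictness, but no comparable build-time check that forces fallible results to be handled explicitly. The lints narrow the sources of panics rather than eliminating them entirely.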

Engineering rigor

Headless exec with JSONL output

Codex's exec crate provides two output modes: human-readable terminal output and JSONL output for CI/automation pipelines. Each event (agent messages, tool calls, MCP calls, todos, web search) is a structured JSONL record. Claude Code has structuredIO.ts for SDK mode but no dedicated headless binary.
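
A CI step consuming such a stream only needs to parse one JSON object per line. The sketch below assumes a hypothetical event shape with `type` and `text` fields; the real field names in Codex's JSONL output may differ.

```python
import json
from collections import Counter


def summarize(lines):
    """Count events by type and keep the text of the last agent message."""
    counts = Counter()
    last_message = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        event = json.loads(line)
        counts[event["type"]] += 1
        if event["type"] == "agent_message":
            last_message = event.get("text")
    return counts, last_message


# Hypothetical sample stream, one JSON record per line.
sample = [
    '{"type": "tool_call", "name": "shell", "command": "cargo test"}',
    '{"type": "agent_message", "text": "Running the test suite."}',
    "",
    '{"type": "tool_call", "name": "apply_patch"}',
    '{"type": "agent_message", "text": "All tests pass."}',
]

counts, final_message = summarize(sample)
```

Because each line is a complete record, a pipeline can process events as they arrive instead of waiting for the run to finish.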

Sandbox CLI for testing

Codex exposes codex sandbox {macos,linux,windows} so developers can test sandbox behavior without involving the agent loop. Claude Code has no equivalent sandbox-debugging CLI.

SQLite state with partitioned logs

Codex's state crate manages two SQLite databases: state.db (threads, agent jobs, backfill, memories, remote control) and logs.db with 10MB partition budgets and automatic log rotation. Claude Code uses a simpler settings DB without partitioned log management.
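
A per-partition byte budget with rotation can be approximated by deleting the oldest rows once a partition exceeds its limit. The schema and function names below are assumptions made for illustration, not the layout of Codex's logs.db.

```python
import sqlite3

PARTITION_BUDGET = 10 * 1024 * 1024  # 10MB, matching the figure quoted above


def open_logs(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a log database with a hypothetical schema."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS logs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "part TEXT NOT NULL, "
        "body TEXT NOT NULL)"
    )
    return db


def append_log(db, part: str, body: str, budget: int = PARTITION_BUDGET) -> None:
    """Insert a record, then evict the oldest rows until `part` fits its budget."""
    with db:  # one transaction: insert plus any evictions
        db.execute("INSERT INTO logs (part, body) VALUES (?, ?)", (part, body))
        while True:
            (used,) = db.execute(
                "SELECT COALESCE(SUM(LENGTH(CAST(body AS BLOB))), 0) "
                "FROM logs WHERE part = ?",
                (part,),
            ).fetchone()
            if used <= budget:
                break
            db.execute(
                "DELETE FROM logs WHERE id = "
                "(SELECT MIN(id) FROM logs WHERE part = ?)",
                (part,),
            )
```

Budgeting per partition means one noisy session cannot evict another session's logs.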

Open source

Codex is Apache-2.0 licensed on GitHub. You can read, fork, and contribute to the entire 3,805-file codebase. Claude Code is a leaked snapshot discovered via source maps — proprietary software, not open source.

Transparency

What both have (but implement differently)

Capability	Claude Code	Codex CLI
TUI	React + Ink (custom renderer, 48 files) + Yoga layout engine + 144 component files	Ratatui (80 source files) with frames, styles, tooltips, theme picker
Subagents	AgentTool (spawn/fork/resume/memory/color), coordinator mode system prompt, team swarms	Dual API v1/v2 with 7 lifecycle tools (spawn/wait/send/close/resume/list/followup), CSV batch jobs
IDE bridge	Bridge (32 files, WebSocket, JWT auth, VS Code/JetBrains, direct connect)	App server (25 files, JSON-RPC over stdio/WebSocket, VS Code/Cursor/Windsurf)
Memory	MEMORY.md (200-line cap), team memory, auto-memory, aging, relevance finding, memory extraction	SQLite state, rollout JSONL, memory trace building, structured queries
Skills	SKILL.md with frontmatter, reference files, conditional activation, 17 bundled skills	Embedded via `include_dir!`, fingerprint-based install, skill-creator
MCP client	Full MCP SDK integration, tool/resource discovery, elicitations, channel permissions, approval UI	codex-mcp crate, connection manager, tool discovery, deferred loading, OAuth scope resolution
Voice	Voice mode (GrowthBook-gated, OAuth-required, streaming STT, keyterms)	Realtime WebRTC (macOS-only, AVFoundation, SDP offer/answer, audio level monitoring)
Feedback	Sentry feedback (ring buffer, 4MB), GrowthBook analytics, Datadog telemetry	Sentry feedback (ring buffer, 4MB), OpenTelemetry OTLP, analytics events
File search	native-ts file-index bindings, GrepTool (ripgrep), GlobTool (picomatch)	file-search crate (nucleo matcher + ignore walker, gitignore-aware, multi-threaded)
Permissions	Modes (default/plan/auto/bypass) + wildcard rules + 2-stage bash classifier (XML parsing)	execpolicy DSL + request_permissions tool + per-tool MCP approval overrides
Compaction	compact/ service (context compression integrated with query engine)	compact.rs + compact_remote.rs (inline summarization + remote API-based compaction)
Update checking	Auto-updater with GrowthBook-gated rollout	TUI update prompts with self-update checking

Architecture philosophy comparison

Claude Code: Ship fast, feature-gate everything

Claude Code is a TypeScript monorepo built on Bun with React+Ink for the terminal UI. It uses bun:bundle for dead-code elimination and GrowthBook for remote feature flags. The architecture is designed for rapid iteration: 101+ slash commands, 40+ tools, 17 bundled skills, dozens of feature gates, and an internal-only feature pipeline that most users never see.

The codebase is shaped around product velocity — ship features, gate them behind flags, migrate configs forward, and let GrowthBook decide who gets what. The leaked snapshot reveals a team comfortable with complexity: 512,000+ lines of TypeScript, 144 UI components, 85 React hooks, and 36 source directories.

Codex CLI: Build right, structure for correctness

Codex is a Rust workspace of 87 crates with a strict clippy lint policy that bans unwrap_used and expect_used everywhere, so fallible results must be handled explicitly. The architecture is designed for auditability and safety: narrow crate boundaries, protocol types in a dedicated crate, platform-specific sandbox implementations, and a headless execution mode for CI pipelines.

The codebase is shaped around engineering discipline — fat LTO, symbol stripping, edition 2024, 22 utility crates with single-responsibility modules, and compile-time guarantees instead of runtime feature flags. Where Claude Code ships 101 slash commands, Codex ships ~20 focused subcommands. Where Claude Code uses GrowthBook, Codex uses Cargo features.

The feature ceiling gap

One of the most revealing differences is what each codebase is capable of versus what it currently ships.

Claude Code's internal feature pipeline

The leaked snapshot shows dozens of feature-gated tools not available in the public build: a web browser tool, cron scheduling, push notifications, GitHub PR subscriptions, terminal capture, context inspection, overflow testing, and a full REPL with VM isolation. These suggest an internal development pipeline where features are built, tested, and gradually exposed. The public CLI is a subset of what the codebase can do.

Codex's open ceiling

Codex is fully open source — what you see is what you get. There are no hidden feature flags, no internal-only tools gated behind USER_TYPE === 'ant' checks, and no remote feature flag server. The v8-poc crate is labeled as a proof-of-concept, and collaboration mode templates are embedded but the full user-facing flow isn't yet documented. The ceiling is transparent because the entire codebase is visible.

Verdict

Choose Claude Code if you want...

The deepest built-in tool catalog (LSP, notebooks, web search, worktrees)
101+ slash commands for power-user workflows
Full vim mode with state machine completeness
17 bundled skills with conditional activation and reference files
GrowthBook feature flags for gradual rollout capabilities
Config migrations for managing large user bases across versions
The most extensible plugin/skill architecture

Choose Codex CLI if you want...

Three platform-specific sandboxes for local process isolation
Bidirectional MCP (consume tools and be consumed as one)
Compile-time correctness guarantees (Rust + strict clippy)
Dedicated local model support (Ollama + LM Studio)
Headless JSONL output for CI/automation pipelines
Execution policy DSL for fine-grained command control
Apache-2.0 open source license — fully auditable
Agent jobs: CSV-driven batch multi-agent workflows


The real answer

They are two sides of the same coin. Claude Code is the product-maximizing agent — ship everything, gate it behind flags, iterate fast. Codex is the engineering-maximizing agent — structure for correctness, open-source everything, let the compiler catch your mistakes. Both are deeply integrated product runtimes. Both have subagent spawning, MCP clients, IDE bridges, and memory systems. The difference is not what they can do — it's how they choose to build it.