OpenHands: The Platform-Shaped Agent
OpenHands is the hardest repo to score fairly — the local snapshot is explicitly described as incomplete, with the modern V1 agent core having moved to a separate Software Agent SDK repository. But what remains is still architecturally fascinating, including one of the most ingenious retry strategies in this set.
The important caveat
The V1 agent core is not in this repo
OpenHands' own documentation states that the newer V1 agent core moved to a separate Software Agent SDK repository. What remains locally includes the platform architecture, sandbox infrastructure, app/server code, and the legacy CodeAct agent — useful for understanding the platform shape, but not the full current product story.
That said, the local snapshot still reveals significant architectural decisions that differentiate OpenHands from every other agent in this set.
The ingenious temperature-bumping retry
The single most interesting thing in this repo is the retry logic in openhands/llm/retry_mixin.py. It uses the tenacity library with a documented, intentional quirk:
LLMNoResponseError at temperature 0 → bump to 1.0
When the model returns no response at all (empty stream, no tokens) and the temperature is set to 0, OpenHands automatically sets temperature = 1.0 on the next retry attempt.
The reasoning is explicit in the code comments: a fully deterministic model (temp=0) that returns nothing is stuck in a degenerate fixed point. Adding randomness breaks the loop. This is one of the more thoughtful LLM retry patterns in the set — it adapts the request rather than just retrying identically.
```python
# Intentional: on LLMNoResponseError at temp=0,
# set temperature = 1.0 on next retry.
# Rationale: deterministic model returning nothing
# is in a degenerate fixed point. Randomness breaks it.
```
This is the kind of production scar tissue you only get from running an agent at scale. Most agents just retry the same request and hope for a different result. OpenHands recognizes that identical requests to a deterministic model produce identical outputs — so it changes the model's behavior parameters.
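The shape of the pattern is easy to sketch. The real implementation lives in openhands/llm/retry_mixin.py and is built on tenacity; this stdlib-only version is illustrative, and the names LLMNoResponseError, call_llm, and MAX_ATTEMPTS are assumptions, not the repo's actual identifiers.

```python
# Sketch of the temperature-bumping retry (illustrative names, not the
# actual OpenHands implementation, which uses tenacity).

class LLMNoResponseError(Exception):
    """Raised when the model returns an empty response."""

MAX_ATTEMPTS = 3

def complete_with_retry(call_llm, prompt, temperature=0.0):
    for attempt in range(MAX_ATTEMPTS):
        try:
            return call_llm(prompt, temperature=temperature)
        except LLMNoResponseError:
            if attempt == MAX_ATTEMPTS - 1:
                raise
            # A deterministic model (temp=0) that returned nothing will
            # return nothing again; add randomness to escape the fixed point.
            if temperature == 0.0:
                temperature = 1.0
```

The key move is that the retry loop mutates the request parameters, not just the attempt counter.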
CondensationRequestTool — the agent requests its own compression
OpenHands defines a CondensationRequestTool that the agent itself can invoke to request history condensation. This is unusual: most agents have the runtime decide when to compress context. In OpenHands, the agent can notice it's running low on context and ask for compression.
This is a more agent-centric design philosophy: the LLM is trusted to know its own context state and make informed decisions about when to compact history. It requires the agent to understand the tradeoff (losing detail for more working room), but gives it autonomy over its own cognitive budget.
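A rough sketch of the pattern: the runtime exposes a no-argument tool, and invoking it triggers a condensation pass over older history. The schema, the request_condensation name, and the condense_history helper here are all illustrative assumptions, not OpenHands' actual API.

```python
# Illustrative sketch of agent-requested condensation; names and the
# summarization strategy are assumptions for demonstration only.

CONDENSATION_TOOL = {
    "name": "request_condensation",
    "description": (
        "Ask the runtime to summarize older conversation history, "
        "trading detail for more working context."
    ),
    "parameters": {"type": "object", "properties": {}},
}

def condense_history(events, keep_last=4):
    """Replace all but the most recent events with a one-line summary."""
    if len(events) <= keep_last:
        return events
    dropped = events[:-keep_last]
    summary = f"[condensed: {len(dropped)} earlier events summarized]"
    return [summary] + events[-keep_last:]
```

The point is who calls condense_history: here it runs because the model invoked the tool, not because the runtime hit a token threshold.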
9 Jinja2 prompt templates with XML sections
OpenHands has the most modular prompt system in this set: 9 Jinja2 .j2 templates in openhands/agenthub/codeact_agent/prompts/, assembled at runtime. The main template uses named XML sections:
| Section | Purpose |
|---|---|
| `<ROLE>` | Defines the agent's identity and capabilities |
| `<EFFICIENCY>` | Guidelines for efficient behavior |
| `<FILE_SYSTEM_GUIDELINES>` | File operation best practices |
| `<CODE_QUALITY>` | Code standards and testing expectations |
| `<VERSION_CONTROL>` | Git workflow expectations |
| `<PULL_REQUESTS>` | PR creation and review guidelines |
| `<PROBLEM_SOLVING_WORKFLOW>` | Systematic problem-solving approach |
| `<SECURITY>` | Security practices and risk awareness |
| `<EXTERNAL_SERVICES>` | Integration with external APIs and services |
| `<ENVIRONMENT_SETUP>` | Environment configuration and dependencies |
The security section includes {% include 'security_risk_assessment.j2' %} — a composable sub-template, not inline text. This is Jinja2's template composition at work, allowing security guidelines to be maintained separately from the main prompt.
The long-horizon variant (system_prompt_long_horizon.j2) extends the base to add <TASK_MANAGEMENT> and <TASK_TRACKING_PERSISTENCE> for the task_tracker tool — designed for multi-step, multi-session tasks that span hours or days.
Additional templates: in_context_learning_example.j2, microagent_info.j2, additional_info.j2, system_prompt_interactive.j2, system_prompt_tech_philosophy.j2.
fn_call_converter — LEGACY V0, removal April 1, 2026
The fn_call_converter.py file is marked LEGACY V0, removal April 1, 2026. It converts between JSON function-calling and XML for models that don't support native tool calls:
```
<function=name>
<parameter=key>value</parameter>
</function>
```
It uses </function as a stream-stop word for incremental parsing. The refine_prompt() function automatically replaces 'bash' with 'powershell' on Windows — an automatic platform adaptation that no other agent formalizes at the prompt conversion level.
Sandbox and Docker architecture
OpenHands is the most security-conscious agent in this set when it comes to execution isolation. Rather than running commands on the host machine with guards and blocklists, it provisions Docker containers as isolated execution environments:
Container-per-session model
Each agent session gets its own Docker container. Files, processes, and network access are all sandboxed. The container is torn down when the session ends. This is the strongest isolation model in this set.
File synchronization
The sandbox manager syncs file changes between the container and the host workspace. The agent works inside the container, but file edits are reflected back to the user's workspace in real time.
All tool calls carry a security_risk attribute validated against a RISK_LEVELS dictionary. This is defense in depth: even inside a sandbox, the agent's actions are classified and auditable.
CodeAct agent architecture
The CodeAct agent is OpenHands' primary agent loop. It follows the "code as action" paradigm: the agent writes and executes Python/bash code as its primary action mechanism, rather than calling predefined tools. This is more flexible than tool-based agents — the agent can write arbitrary code to solve problems — but requires stronger sandbox isolation.
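The loop shape can be sketched in a toy form: the model emits a code string, the runtime executes it, and the captured output becomes the next observation. Everything here is illustrative — real OpenHands executes inside an isolated Docker container, not via exec(), and model_step stands in for an LLM call.

```python
# Toy sketch of the code-as-action loop. Using exec() on untrusted code is
# exactly what the Docker sandbox exists to avoid; this is for shape only.
import io
import contextlib

def execute_code(code: str) -> str:
    """Run the agent's code and capture stdout as the observation."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()

def codeact_loop(model_step, max_steps=5):
    observation = ""
    for _ in range(max_steps):
        action = model_step(observation)   # model returns code, or None when done
        if action is None:
            break
        observation = execute_code(action)
    return observation
```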
The local snapshot still contains the CodeAct agent's action definitions, observation types, and the agent hub structure. The modern V1 implementation lives in the separate Software Agent SDK, but the conceptual architecture is visible here.
Model support
OpenHands historically supported broad model flexibility via LiteLLM-style compatibility. The local repo shows:
- OpenAI — GPT-4, GPT-4o, o-series models
- Anthropic — Claude Sonnet, Opus, Haiku
- Google — Gemini Pro, Flash
- Open-source — Any LiteLLM-compatible model
- Local models — Ollama, vLLM, and other local inference servers
The fn_call_converter enables models that don't support native function calling to still participate as agents through the XML tool format.
App/server architecture
OpenHands includes a full web application with:
- Next.js frontend — React-based web UI for interacting with agents
- FastAPI backend — Python API server managing agent sessions
- WebSocket communication — Real-time streaming of agent actions and observations
- Session management — Persistent agent sessions that survive browser refresh
This makes OpenHands more of a platform than a CLI tool. You can run it as a self-hosted service with multiple users, each with their own agent sessions and sandboxed environments.
Where OpenHands is weaker
Hard to judge from local code alone
The most important modern agent core is not fully in this repo snapshot. The V1 agent SDK lives elsewhere, so any assessment based on the local code is inherently incomplete.
Heavier infrastructure requirements
Docker-based sandboxing means you need Docker running. This is fine for a self-hosted platform but rules out quick "just install and run" usage that CLI agents like Crush or Claude Code support.
Bottom line
OpenHands is the most platform-shaped agent in this set. It's not a CLI tool — it's a self-hostable service with Docker sandboxing, a web UI, session management, and a composable prompt template system.
The temperature-bumping retry strategy alone is worth studying: it's the kind of production scar tissue that separates serious agent operators from weekend wrappers. The CondensationRequestTool, which lets the agent request its own context compression, is an agent-centric design philosophy that trusts the LLM with cognitive budget decisions.
The caveat is that the local repo is a partial snapshot. The V1 agent core moved to a separate SDK repository, so this represents the platform architecture more than the current agent implementation.