142K people starred an agent that rewrites its own code

ResearchAudio.io · Issue 24

142K people starred an agent that rewrites its own code

An honest teardown of Hermes Agent: the closed learning loop, the cross-platform gateway, and the single architectural decision that hasn't bitten yet but is the same one that took OpenClaw down.

TL;DR

What it is: Hermes Agent is Nous Research's open-source AI agent. MIT-licensed. Shipped February 25, 2026. Star growth: 0 to 95K in 7 weeks, ~110K by late April, ~142K by mid-May. Fastest-growing open-source agent of 2026.

Why it won: A closed learning loop. The agent writes its own skill files after solving a task, loads them on demand on the next, and improves them in use. No other framework attempts this in the same shape.

Where it bites: The gateway process is a single node, no built-in replication. The skill registry is mutable and persisted. Both are by design. OpenClaw shipped the same attack surface and ate 9 CVEs in 4 days in March.

The reading: Hermes is the most architecturally interesting open-source agent of the year and also the one carrying the biggest unpaid security bill. Zero disclosed CVEs as of April 2026. The clock is running.

Use it for: Solo dev workflows that compound. Personal cross-platform assistant. Self-hosted research agent. Avoid for: high-availability production, audit-strict environments, anything where a poisoned skill is a six-figure incident.

~142K

stars in 11 weeks

gateway process

disclosed CVEs (so far)

Nous Research shipped Hermes Agent on February 25, 2026. Seven weeks later it crossed 95,000 GitHub stars. Today it's somewhere north of 140,000. The fastest growth of any open-source AI agent framework on record, with no marketing campaign behind it.

The growth is organic because the architecture does something no competitor attempts: a closed learning loop where the agent watches what you do, codifies it into reusable skills, improves them during use, and recalls them across sessions. Most frameworks treat an agent as a stateless request handler. Hermes treats it as a system that deepens its understanding of you over time.

That is why developers starred it. The interesting question is whether the same design choice that won the stars also carries an unpaid security bill that's going to come due.

Why it won 142K stars in 11 weeks

Seven concrete decisions explain the star velocity. Each removes a friction point that other frameworks left intact.

Decision	What it removes
Model router	`hermes model` switches between 400+ models with zero code changes. No more provider lock-in.
Self-improving skills	Agent writes skill files after solving complex tasks. Loads them next time. No prompt-engineering treadmill.
Cross-platform gateway	One process serves Telegram, Discord, Slack, WhatsApp, Signal, plus 15 more. Stop managing five separate bots.
Serverless hibernation	Modal and Daytona backends sleep when idle, wake on demand. Runs on a $5 VPS, scales to near-zero cost when unused.
Seven deployment backends	Local, Docker, SSH, Singularity, Modal, Daytona, Vercel Sandbox. Fits any infrastructure constraint.
MIT license	No attribution friction, no copyleft surprise. Enterprises adopt without legal review.
Nous Research backing	Independent research lab, not a product company. Credibility by association with the Hermes, Nomos, Psyche model families.

Figure 1: the seven decisions behind the star velocity. Source: GitHub README plus Nous Research docs.

The model router is the quiet killer feature. Most teams pick a framework, build on one provider, and discover six months later that the switching cost is the entire integration layer. Hermes makes switching a one-line command. hermes model. Done.

The architecture nobody is showing you

Star topology with a hub. Every message flows through the gateway. Every skill lookup, every cron trigger, every subagent spawn. The README calls this "one gateway process" as if it's a feature. An infrastructure engineer reads it as a single point of failure.

Surfaces

CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu, WeCom, QQ Bot, Google Chat, plus 5 more

↓

Gateway process · the critical path

Routes messages across every platform. Single node, no built-in replication. v0.13 added auto-resume after restart, but a crashed gateway still drops every connected channel.

↓

Memory

SQLite + FTS5. Cross-session recall via search. Honcho dialectic user modeling.

Skills

Agent-generated markdown files. Self-improving in use. agentskills.io standard.

Cron

Natural language job specs. Unattended execution. Cross-platform delivery.

↓

Subagents

Parallel workers. RPC tool access. Zero-context-cost pipelines.

Model router

400+ models via OpenRouter, Nous Portal, local endpoints, custom OpenAI-compatible.

Deployment

Local, Docker, SSH, Singularity, Modal, Daytona, Vercel Sandbox.

Figure 2: star topology. The gateway is the sole component without redundancy.

How the skills system actually works

Most agents ship a static tool list. You define your tools, the agent calls them, the list never changes until you push new code. Hermes inverts this.

Observation. The user asks for a multi-step task. The agent executes via the standard tool-calling loop and records the full trajectory.

Generation. After success, the agent writes a skill file: a structured definition (markdown per the agentskills.io standard) containing the tool sequence, parameter patterns, guard constraints, and error handling rules.

Adaptation. On the next invocation, the agent compares the current request against prior skill definitions. It tightens parameter bounds based on failures. It adds error handling based on exceptions it caught.

Persistence. The skill is stored in ~/.hermes/skills/. It survives restart. agentskills.io compatibility makes it portable to other runtimes that adopt the standard.

Recall. When the same user returns in a new session, FTS5 search loads relevant prior skills. The agent does not start from zero. It starts from the last successful interaction. v0.12's autonomous Curator now grades and prunes the skill library on a 7-day cron so it doesn't drift unbounded.

The unpaid bill: if an attacker injects a malicious prompt during observation, the agent records that trajectory and generates a poisoned skill during generation. The skill is persisted and recalled across sessions, giving the attack durability. This is not a bug. It is the skills system working exactly as designed.

OpenClaw shipped the same attack surface. In March 2026 it ate 9 CVEs in four days, one at CVSS 9.9, plus a supply-chain audit that found 341 malicious skills (~12%) in its public marketplace. Hermes has had zero disclosed CVEs as of April. The clock is running on whether design discipline or sheer scrutiny pays the difference.

Hermes vs OpenClaw, the actual numbers

Metric	Hermes Agent	OpenClaw
Stars	~142K (May 2026)	~370K (May 2026)
Launch	Feb 25, 2026	Late 2025
Star growth	0 to 95K in 7 weeks	Slower per-week pace
Language	Python	TypeScript
Public CVEs	0 (as of April 2026)	9 in 4 days (Mar 2026)
Skill marketplace	agentskills.io standard, no central hub yet	ClawHub, 341 malicious skills found in audit
License	MIT	Not specified

Figure 3: Hermes versus OpenClaw. Source: GitHub data, TokenMix.ai review (April 17, 2026), Medium analysis (April 2026).

OpenClaw wins on ecosystem breadth. Hermes wins on learning depth and (so far) security posture. The CVE delta is the one nobody can quite explain yet. Same skill-poisoning attack surface. Same multi-platform gateway. Different security outcomes by a factor of nine.

Where Hermes works, where it collapses

Strengths

· Solo devs learning codebase patterns across weeks

· Research teams collecting tool-call trajectories for training data

· Small teams on $5 VPS needing one cross-platform bot

· Anyone switching LLM providers frequently

· Cron jobs specified in natural language

· Projects that need to be portable across 7 infrastructure backends

Failure modes

· Production HA: single gateway, no replication

· Audit: no immutable skill registry, skills mutate in place

· Security: prompt injection during observation poisons future sessions

· Scale: gateway bottleneck under high concurrency

· Model abstraction: tool behavior still varies subtly by provider

What this means for builders

For researchers. Batch trajectory generation and the Atropos RL environments are genuinely useful for training data collection. Run it in an isolated environment. Audit the CVE history quarterly before processing sensitive datasets.

For infrastructure engineers. Seven deployment backends is a flexibility win. The single-gateway design is a reliability loss. If you need production uptime, you'll build gateway replication yourself or accept scheduled downtime during restarts. v0.13's session auto-resume helps but isn't HA.

For product teams. Honcho dialectic user modeling plus cross-session recall is the most production-ready personalization layer in open-source agents today. The tradeoff is skill drift. What the agent learned last month may not reflect your current requirements. v0.12's autonomous Curator helps; it doesn't eliminate the problem.

For security teams. Treat auto-generated skills as executable code from a partially-trusted source. The OpenClaw incident gives you the threat model in advance. Sandbox the runtime. Don't connect it to internal databases until your own audit cycle is in place.

Reference card

hermes agent · reference card

────────────────────

architecture: Star topology, one gateway (no HA)

core loop:   User → Gateway → Memory/Skills/Cron → Model

key stats:    ~142K stars · 11 weeks · 0 CVEs

              MIT license · Python · Nous Research

what works:

 · Auto-generates skills from experience

 · 400+ model router with zero code changes

 · 20+ messaging platforms from one gateway

 · Natural language cron scheduling

 · 7 deployment backends, serverless hibernation

what breaks:

 · Single gateway, no redundancy

 · Mutable skill registry, poisonable surface

 · No immutable skill audit trail

 · OpenClaw's CVE story is a cautionary precedent

use if:    Solo dev, researcher, small team on VPS

avoid if:  HA requirement, strict audit, security-first

compare to: OpenClaw (~370K stars, TypeScript, 9 CVEs)

Figure 4: TLDR reference card.

Reader challenge

1. If an agent auto-generates skills from observation, what proof do you have that a skill wasn't injected by a prompt at generation time.

2. The gateway is a single point of failure. Is architectural simplicity worth the redundancy tradeoff for an open-source framework, or is HA replication the price of growing past 200K stars.

3. Hermes and OpenClaw shipped the same skill-poisoning attack surface. One has 9 disclosed CVEs. The other has zero. Is that design discipline, scrutiny asymmetry, or just timing.

Reply with your take. I'm collecting answers for next issue.

"Lines of code are vanity. Deletion is sanity. Stars are vanity. CVEs are sanity."

ResearchAudio.io

Next issue

What happens when you let an agent run a cron job for 48 hours unattended. I tested it. The logs are not what you expect.

ResearchAudio.io · what should an AI engineer building with frontier models know this week

Sources: repo · official docs · TokenMix review (Apr 2026) · agentskills.io

Agent that rewrites its own code

142K people starred an agent that rewrites its own code

Why it won 142K stars in 11 weeks

The architecture nobody is showing you

How the skills system actually works

Hermes vs OpenClaw, the actual numbers

Where Hermes works, where it collapses

Reference card

Keep Reading

Quick Links

Stay Updated