I Gave My AI a Soul and a Phone Number

I can text an AI from my phone at 3am, ask it to dig through a repo, draft a pull request, and check on a deploy, and it will be awake. It remembers the last conversation we had. It remembers how I like my diffs. It is getting better at working with me every single week, without me re-explaining anything.

That is not a chatbot app. It is not a coding assistant. It is something I have started thinking of as a different category entirely.

I call it Sammy. It runs on a framework called Hermes, it lives in Docker on a server, and I reach it through Telegram. This post is about how it is built, what makes it different, and why I think an always-on, self-learning agent is a real step forward and not just another wrapper around an API.

What Sammy actually is

Sammy is a personal AI agent built on Hermes, an open-source agent framework from Nous Research. The important word is agent, not chatbot. It plans, uses tools, runs code in a sandbox, and asks me for approval before it does anything risky.

The setup has three moving parts:

The Hermes runtime, running in Docker on a server. This is the brain. It does the planning, the tool calls, and the actual work.
A Cloudflare Gateway Worker, which is a thin relay. Telegram sends webhook requests to it, it checks a secret header to confirm the request really comes from Telegram on behalf of my bot, and only then does it pass the message to the runtime.
A dashboard worker plus a database, which tracks task state, approvals, and an audit log so I can see what Sammy did and when.
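
To make the gateway's header check concrete, here is a minimal sketch. It is written in Python for readability, not as Worker code, and it is not Sammy's actual implementation. It assumes the secret is the one Telegram sends in the X-Telegram-Bot-Api-Secret-Token header when a secret_token is set on the webhook. The relay and forward names are hypothetical.

```python
import hmac

# Header Telegram attaches to webhook calls when a secret_token was set via setWebhook.
# Assumption: this is the "secret header" the gateway checks.
SECRET_HEADER = "X-Telegram-Bot-Api-Secret-Token"


def is_from_my_bot(headers: dict, expected_secret: str) -> bool:
    """Return True only if the request carries the expected webhook secret."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    received = normalized.get(SECRET_HEADER.lower(), "")
    # Constant-time comparison avoids leaking the secret through timing differences.
    return hmac.compare_digest(received.encode(), expected_secret.encode())


def relay(headers: dict, body: bytes, expected_secret: str, forward) -> int:
    """Hypothetical gateway step: reject unverified requests, forward the rest."""
    if not is_from_my_bot(headers, expected_secret):
        return 403  # never reaches the runtime
    forward(body)   # hand the update to the agent runtime
    return 200
```

The point of keeping the relay this thin is that the runtime never has to be exposed directly. Anything without the right header is dropped at the edge.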

The flow is simple to picture:

iPhone -> Telegram -> Gateway Worker -> Hermes runtime (Docker) -> sandbox + tools

[Figure: always-on agent architecture. iPhone to Telegram to Cloudflare Gateway Worker to Hermes runtime in Docker to sandbox and tools, with an audit trail to a dashboard and database.]

I open Telegram like I would text a person. I say something like "create a branch and draft a PR for this bug." Sammy goes and does it. If it needs to push code, deploy, or delete something, it stops and asks me yes or no first. Everything heavy runs inside a Docker sandbox, not on the host, so a bad command cannot wander off and break my machine.
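
The approval step is easy to picture in code. This is a rough sketch under my own assumptions, not how Hermes implements it: the ApprovalGate class, the action names, and the ask_user callback (standing in for a yes/no prompt over Telegram) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable

# Hypothetical action names, standing in for "push code, deploy, or delete something".
RISKY_ACTIONS = {"push", "deploy", "delete"}


@dataclass
class ApprovalGate:
    # Stand-in for "send me a yes/no question and wait for the answer".
    ask_user: Callable[[str], bool]
    audit_log: list = field(default_factory=list)

    def run(self, action: str, detail: str, execute: Callable[[], str]) -> str:
        approved = True
        if action in RISKY_ACTIONS:
            # Risky work pauses here until a human answers.
            approved = self.ask_user(f"Allow {action}: {detail}? (yes/no)")
        result = execute() if approved else "skipped: not approved"
        # Every attempt is recorded, whether or not it ran.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "detail": detail,
            "approved": approved,
            "result": result,
        })
        return result
```

Safe actions pass straight through, risky ones block on a human answer, and both end up in the audit log either way.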

That is the boring infrastructure version. The interesting part is what sits on top of it.

What a "soul" is, and why it matters

When I first heard about this structure from my friend Michael Shutt, the part that stuck with me was the idea of a soul.

In Hermes, the soul is a file. Literally SOUL.md. It is the agent's permanent identity and behavior contract. Mine tells Sammy to be direct and security-conscious, to prefer small reviewable diffs over big refactors, to run shell commands in the sandbox unless I explicitly approve otherwise, and to ask before anything destructive or expensive.

The key insight is that the soul is separate from memory.

Memory is what the agent learns about me over time. The soul is who the agent is, full stop. It does not drift. It does not get diluted as the conversation history grows. No matter how long a session runs or how much context piles up, the soul stays fixed. It is the difference between a coworker having a bad day and a coworker forgetting their own job.
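
One way to picture that separation is in how a prompt might be assembled. The sketch below is an illustration of the idea, not Hermes's actual code: build_context and its character budget are assumptions I am making for the example. The soul always goes in whole, and only memory gets trimmed when space runs out.

```python
def build_context(soul: str, memory_notes: list, budget_chars: int) -> str:
    """Soul is always included in full; memory fills whatever budget remains."""
    remaining = budget_chars - len(soul)
    kept = []
    # Walk newest-first so that the oldest notes are dropped first when space runs out.
    for note in reversed(memory_notes):
        cost = len(note) + 1  # +1 for the joining newline
        if cost > remaining:
            break
        kept.append(note)
        remaining -= cost
    kept.reverse()  # restore chronological order
    return "\n".join([soul, *kept])
```

However much memory accumulates, the identity text at the top never shrinks. Only the notes compete for room.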

If you have read my post on owning your memory, this will feel familiar. I am obsessed with the line between stable instructions and accumulated knowledge. The soul is the cleanest version of that idea I have seen. Stable identity in one file. Everything it learns lives somewhere else.

The real unlock: it actually learns

Here is the part that makes this feel like a step change instead of a nicer interface.

Sammy gets better the longer it runs.

Hermes has what its creators call a closed learning loop. It keeps persistent memory across every session. It builds a model of who I am, what my projects are, and how I like to work. And when it does a complex task more than once, it can write itself a reusable skill document so it does not have to figure the same thing out from scratch next time.

Concretely, that means a few files and a database that persist on the server:

A memory file for durable facts and decisions.
A user profile that captures my preferences and stack.
A searchable history of past sessions.
Skills the agent writes for itself as it learns my workflows.
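
To show how the last item could work mechanically, here is a small sketch of "write a skill once a task has been done more than once." It is my own illustration, not the Hermes implementation: SkillStore, the skills directory, the counts.json file, and the threshold of two are all assumed names and values.

```python
from __future__ import annotations

import json
from pathlib import Path


class SkillStore:
    """Writes a reusable skill note once the same kind of task has been seen twice."""

    def __init__(self, root: Path, threshold: int = 2):
        self.skills_dir = root / "skills"
        self.skills_dir.mkdir(parents=True, exist_ok=True)
        # Counts live on disk so they survive restarts of the agent process.
        self.counts_file = root / "counts.json"
        self.threshold = threshold

    def record(self, task_kind: str, steps: list[str]) -> Path | None:
        counts = {}
        if self.counts_file.exists():
            counts = json.loads(self.counts_file.read_text(encoding="utf-8"))
        counts[task_kind] = counts.get(task_kind, 0) + 1
        self.counts_file.write_text(json.dumps(counts), encoding="utf-8")

        path = self.skills_dir / f"{task_kind}.md"
        if counts[task_kind] < self.threshold or path.exists():
            return None  # too early, or the skill is already written
        lines = [f"# Skill: {task_kind}", ""]
        lines += [f"{i}. {step}" for i, step in enumerate(steps, 1)]
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        return path
```

Because both the counts and the skill files sit on disk, a restarted agent picks up exactly where the previous process left off.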

[Figure: fresh instance versus six-month instance. Both share a fixed SOUL.md identity, but the six-month instance has far richer persistent memory and a dense web of self-written skills that compounds over time.]

The result is that a six-month-old instance of this thing is materially different from a fresh one. It is not a smarter model. It is the same model with a compounding context about me.

Compare that to how most of us work today. By default, every new Cursor chat starts from zero. Every new Claude Code session reads a static CLAUDE.md and, unless you deliberately resume it, loses the conversation the moment you close the window. You are the memory. You carry the context from session to session in your head, and you pay the re-explanation tax every single time.

Sammy does not make me carry that. It keeps it.

Why this goes beyond Claude Code, Codex, and Cursor

I want to be fair here, because I use Claude Code, Codex, and Cursor every day and they are excellent. This is not a takedown.

But they are coding copilots. They are built to live inside an editor and help you write code in a session. They are very good at that. They are not built to remember why a change mattered three weeks later, or to keep working after you close your laptop, or to ping you on your phone when a job finishes.

The cleanest way I have heard it put is this: a coding agent writes the patch. A persistent agent remembers why the patch matters, schedules the follow-up, and tells you about it where you actually live.

Those are different jobs. For now, Cursor, Codex, and Claude Code are tethered to the editor session and reset between sessions by design. Sammy lives outside the IDE, persists by design, and compounds by design. The future probably uses both. A coding agent for the repo. A persistent agent for the workflow around the repo.

Always-on is not the same as a chatbot app

This is the distinction people miss, so it is worth being blunt about it.

A chatbot app is a thing you open. You launch it, you prompt it, you read the answer, you close it. It is a tool that sits still until you pick it up. Nothing happens when you are not looking.

An always-on agent is a thing that runs. Sammy is a Docker service with an unless-stopped restart policy, so it comes back after a crash or a reboot unless I stop it myself. It is awake whether or not I am. It can be reached from my phone, and it can work while my laptop is shut.

That sounds like a small difference. It is not. A tool that only exists when you are actively using it can never do the most valuable things: monitor, react, follow up, and report back on its own schedule. The moment an agent can run without you babysitting it, the question changes from "what can I ask it" to "what should it be doing while I am asleep."

Use cases that actually push things forward

Once you have an agent that persists and runs on its own, the use cases stop being about answering questions and start being about owning workflows. A few I either run or am building toward:

Morning briefing. Every morning, read my inbox and calendar, then send me the three things that actually need attention as a Telegram message. No app to open. It comes to me.
Monitoring and alerts. Watch a deploy, a site, or a job, and text me only when something needs a decision. Silence is the feature.
Long-running research. Hand it a real research task, let it grind for an hour, and have it report back when it has something. I do not sit and watch a spinner.
Approval-gated ops from my phone. Draft a PR, stage a deploy, and wait for my yes or no over Telegram before it does the risky part. I get the leverage of automation with a human in the loop.
Capturing what I learn. Pipe insights straight into my Obsidian vault so the knowledge lands somewhere durable instead of dying in a chat thread.
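
The "silence is the feature" idea from the monitoring item reduces to a very small loop. This is a sketch with hypothetical names (watch, notify, and the "ok" status string are assumptions for illustration): it messages me only when the observed status changes and says nothing while things stay the same.

```python
from typing import Callable, Iterable


def watch(statuses: Iterable[str], notify: Callable[[str], None], healthy: str = "ok") -> None:
    """Send a message only when the status changes; stay silent otherwise."""
    last = healthy  # assume things start out healthy
    for status in statuses:
        if status != last:
            notify(f"Status changed: {last} -> {status}")
        last = status
```

In a real setup the statuses would come from polling a deploy or a site, and notify would send a Telegram message. Here both are left as plain Python callables.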

None of those are "ask a model a question." They are "give a colleague a standing responsibility."

The takeaway

The shift I keep coming back to is from sessions to a system.

A session is something you start and end. A system persists, learns, and runs. Most AI tools today, even the great ones, are still session-shaped. You are the part that remembers. You are the part that is always on.

Sammy moves that burden into the agent. It has a fixed identity, a compounding memory, and a body that runs on a server I control. It is not magic, and it took real setup to get there. But the first time it texted me a finished task that I had forgotten I asked for, something clicked.

The next era of useful AI is not a smarter chatbot. It is an agent that does not reset to zero every time you walk away.