The secret to making AI agents work.
It isn't a better prompt. It's a compounding system of skills, data, and code wrapped around an interchangeable model — and once you've built one, the curve is real. Fat skills · Fat code · Thin harness — the chant that runs through everything.
A Pema Chödrön text, 162 pages, mirrored into 30,000 words of life-mapped commentary in forty minutes. The first proof that this was different.
Last month I was reading Pema Chödrön's When Things Fall Apart. 162 pages, twenty-two chapters on Buddhist approaches to suffering, groundlessness, and letting go. A friend recommended it during a hard period.
I asked my AI to do a book mirror.
What that means concretely: the system extracted all twenty-two chapters, and then, for each chapter, ran a sub-agent that did two things simultaneously — summarized the author's ideas, and mapped every idea to my actual life. Not generic "this applies to leaders" pablum. Specific mapping. It knows my family history — immigrant parents, dad from Hong Kong and Singapore, mom from Burma. It knows my professional context. It knows what I've been reading, what I've been thinking about at 2 a.m., what my therapists and I are working on.
The output was a 30,000-word brain page. Each chapter rendered as two columns: what Pema says, and how it maps to what I'm actually living through. The chapter on groundlessness connected to a specific founder conversation I'd had the week before. The chapter on fear mapped to patterns my therapist had identified. The chapter on letting go referenced a late-night session where I'd written about the creative freedom I'd found this year.
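The per-chapter fan-out described above can be sketched roughly like this. It is a minimal illustration, not the actual book-mirror skill: `MirrorEntry`, `mirror_chapter`, and `book_mirror` are hypothetical names, and the sub-agent is stubbed with string formatting where a real one would call a model.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class MirrorEntry:
    chapter: str
    summary: str       # left column: what the author says
    life_mapping: str  # right column: how it maps to the reader's life


def mirror_chapter(title: str, text: str, personal_context: str) -> MirrorEntry:
    """Hypothetical sub-agent. Both columns are stubbed so the sketch runs;
    a real sub-agent would ask a model for the summary and the mapping."""
    summary = f"Summary of {title} ({len(text.split())} words)"
    mapping = f"How {title} relates to: {personal_context}"
    return MirrorEntry(title, summary, mapping)


def book_mirror(chapters: dict[str, str], personal_context: str) -> list[MirrorEntry]:
    """Run one sub-agent per chapter in parallel, preserving chapter order."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [
            pool.submit(mirror_chapter, title, text, personal_context)
            for title, text in chapters.items()
        ]
        # Collect in submission order so the page reads front to back.
        return [future.result() for future in futures]


chapters = {"Chapter 1": "text about fear", "Chapter 2": "text about groundlessness"}
page = book_mirror(chapters, "notes from last week's founder conversation")
for entry in page:
    print(f"{entry.chapter} | {entry.summary} | {entry.life_mapping}")
```

The two-field entry mirrors the two-column layout. The parallel submit is what keeps the wall-clock time close to the slowest chapter rather than the sum of all of them.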
The whole thing took about forty minutes.
I've done this with over twenty books now. Each one gets richer because the brain gets richer. The second mirror knew about the first. The twentieth knew about all nineteen.
Every fix gets baked into the skill. Every mirror since has been clean. This is what skillification means in practice.
The first book mirror I did was terrible. Version one had three factual errors about my family. It said my parents were divorced when they weren't. Said I grew up in Hong Kong when I was born in Canada. Basic stuff that could have damaged trust if I'd shared it.
So I added a mandatory fact-check step. Every mirror now runs cross-modal evaluation against known facts in the brain before it ships. Then I upgraded to deep retrieval with brain tool use. The original version was good at synthesis but weak on specificity.
V1: Good at finding patterns. Trusts the model's reading of "me" too far. Ships without a check. The three factual errors slipped through because nothing in the pipeline was looking for them.
V2: Add a mandatory fact-check against known facts in the brain before anything ships. Then have three different models score the output on different dimensions (precision, missing context, generic-ness) and reject anything that fails. The errors disappear.
V3 does per-section brain searches. Every right-column entry cites actual brain pages. When the book talks about dealing with difficult conversations, V3 doesn't synthesize general principles. It pulls from my meeting notes with specific founders. The Thursday I spent with my brother James. The IM chat with my college roommate when I was nineteen. It's uncanny.
This is what skillification (using /skillify in GBrain) means in practice. I took the first manual attempt, extracted the repeatable pattern, wrote a tested skill file with triggers and edge cases, and every fix compounded across all future book mirrors.
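The fact-check and scoring gate added in the second version might look something like the sketch below. Everything here is an assumption for illustration: `KNOWN_FACTS`, `SCORERS`, `gate`, and the 0.7 threshold are hypothetical, and the three scorers are stubs standing in for three different models. Scores are normalized so that higher is better on every dimension.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for facts stored in the brain: key -> known value.
KNOWN_FACTS = {"birthplace": "Canada", "parents_divorced": "no"}

# Stubs for three different models, one per dimension (precision, missing
# context, generic-ness). Each returns a score in [0, 1], higher is better.
SCORERS: dict[str, Callable[[str], float]] = {
    "precision": lambda draft: 0.9,
    "context_coverage": lambda draft: 0.8,
    "specificity": lambda draft: 0.2 if "applies to leaders" in draft.lower() else 0.9,
}


@dataclass
class Verdict:
    passed: bool
    reasons: list[str]


def fact_check(claims: dict[str, str], known: dict[str, str]) -> list[str]:
    """Return one message per claim that contradicts a known fact."""
    return [
        f"{key}: draft says {value!r}, brain says {known[key]!r}"
        for key, value in claims.items()
        if key in known and known[key] != value
    ]


def gate(draft: str, claims: dict[str, str], threshold: float = 0.7) -> Verdict:
    """Reject the draft if any fact is wrong or any dimension scores low."""
    reasons = fact_check(claims, KNOWN_FACTS)
    for dimension, scorer in SCORERS.items():
        score = scorer(draft)
        if score < threshold:
            reasons.append(f"{dimension}: scored {score:.1f}, below {threshold:.1f}")
    return Verdict(passed=not reasons, reasons=reasons)


bad = gate("This applies to leaders.", {"birthplace": "Hong Kong"})
good = gate("Chapter 3 maps to Thursday's meeting notes.", {"birthplace": "Canada"})
print(bad.passed, bad.reasons)
print(good.passed, good.reasons)
```

The point of the design is that the gate sits inside the skill, so a draft with a wrong birthplace can't ship no matter how good the synthesis is.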
The skill remembers. No more "forgot to mention this edge case in my prompt."
On why skills beat prompts
The system that runs my life was assembled from skills. And those skills were themselves created by a skill. This is where it gets recursive.
The system that runs my life was never built as a monolith. It was assembled from skills, and those skills were themselves created by a skill.
When I encounter a workflow I'm going to repeat, I say skillify this and it examines what just happened, extracts the repeatable pattern, writes a tested skill file with triggers and edge cases, and registers it in the resolver. The book-mirror pipeline was skillified from the first time I did it manually. The meeting-prep workflow was skillified after I noticed I was doing the same steps before every call.
Skills compose. Book-mirror calls brain-ops for storage, enrich for context, cross-modal-eval for quality, pdf-generation for output. Each skill is focused on one thing. They chain. Improve one skill and every workflow that uses it gets better automatically.
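One way to picture the composition is a registry of small functions looked up by name at run time. This is a sketch under assumptions, not how GBrain implements it: the `skill` decorator, `run_chain`, and the meeting-prep chain are hypothetical, and the skill bodies are stubs.

```python
from typing import Callable

Skill = Callable[[dict], dict]
REGISTRY: dict[str, Skill] = {}


def skill(name: str):
    """Register a function under a skill name. Re-registering replaces it."""
    def register(fn: Skill) -> Skill:
        REGISTRY[name] = fn
        return fn
    return register


def run_chain(names: list[str], state: dict) -> dict:
    """Run skills in order. Each one is looked up by name at call time."""
    for name in names:
        state = REGISTRY[name](state)
    return state


@skill("enrich")
def enrich(state: dict) -> dict:
    return {**state, "context": f"basic context for {state['subject']}"}


@skill("cross-modal-eval")
def cross_modal_eval(state: dict) -> dict:
    return {**state, "checked": "context" in state}


@skill("brain-ops")
def brain_ops(state: dict) -> dict:
    return {**state, "stored": True}


BOOK_MIRROR = ["enrich", "cross-modal-eval", "brain-ops"]
MEETING_PREP = ["enrich", "brain-ops"]  # assumed chain, for illustration only

before = run_chain(BOOK_MIRROR, {"subject": "a book"})


@skill("enrich")  # improve one skill by replacing its registered version
def enrich_v2(state: dict) -> dict:
    return {**state, "context": f"cited context for {state['subject']}"}


after_book = run_chain(BOOK_MIRROR, {"subject": "a book"})
after_meeting = run_chain(MEETING_PREP, {"subject": "a founder"})
print(before["context"], "->", after_book["context"])
print(after_meeting["context"])
```

Because both workflows resolve `enrich` by name, replacing it once changes the output of each of them without touching either chain.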
skillify. The meta-skill writes the new skill. The new skill is now available for the next time. The next time, it does what you did, but better. You add a fix. The fix lives in the skill, not in your head. Every future run inherits it.
Demis Hassabis came to YC for a fireside chat. Sebastian Mallaby's biography of him had just come out. Two minutes of system prep.
I asked the system to prep me for a Demis Hassabis fireside.
In under two minutes, it pulled the following:
This wasn't just a better Google search. This was preparation that used my accumulated context about Demis, my own positions, and the strategic goals for the conversation. The system prepped not just facts, but angles.
Every person, every meeting, every book, every idea gets a page. A personal Wikipedia, continuously updated by an AI that was at the meeting, read the email, watched the talk, ingested the PDF.
I maintain a structured knowledge base with about 100,000 pages. Every person I meet gets a page with a timeline, a state section (what's currently true), open threads, and a score. Every meeting gets a transcript, a structured summary, and something I call entity propagation: after every meeting, the system walks through every person and company mentioned and updates their brain pages with what was discussed.
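Entity propagation, as described above, amounts to walking the entities mentioned in a meeting and appending to each one's page. A minimal sketch, assuming the mentions have already been extracted from the transcript; `Meeting`, `propagate`, and the example names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Meeting:
    day: date
    title: str
    # Entity name -> what was said about it. A real system would extract
    # this from the transcript with a model; here it is given directly.
    mentions: dict[str, str]


def propagate(meeting: Meeting, brain: dict[str, list[str]]) -> list[str]:
    """Append a dated note to the page of every entity the meeting mentions."""
    touched = []
    for entity, note in meeting.mentions.items():
        brain[entity].append(f"{meeting.day.isoformat()} [{meeting.title}] {note}")
        touched.append(entity)
    return touched


# The brain is modeled as entity name -> list of timeline lines.
brain: dict[str, list[str]] = defaultdict(list)
meeting = Meeting(
    day=date(2025, 1, 15),
    title="Office hours",
    mentions={
        "Jane Example": "asked about hiring a first engineer",
        "ExampleCo": "closing a seed round",
    },
)
updated = propagate(meeting, brain)
print(updated)
print(brain["ExampleCo"])
```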
The schema is simple.
Compiled truth: The current best understanding of this entity. Rewritten as new information arrives. Always reflects what's true now. This is what the LLM reads when this person comes up anywhere.
Timeline: Events in chronological order. Nothing is ever deleted. Meetings, mentions, news, document references. The history is the audit trail; the compiled truth above is summarized from these entries.
Source material — full transcripts, scraped articles, screenshots, attached files. The compiled truth above is summarized from these. Garbage in, generic out.
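The three layers above could be modeled as a small data structure. This is a sketch of the idea rather than GBrain's actual schema: `Page`, `TimelineEntry`, and `record` are hypothetical names, and the source paths are made up.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class TimelineEntry:
    day: date
    text: str
    source: str  # pointer to the raw material this entry was drawn from


@dataclass
class Page:
    title: str
    compiled_truth: str = ""  # rewritten whenever new information arrives
    _entries: list[TimelineEntry] = field(default_factory=list)

    def record(self, entry: TimelineEntry, compiled_truth: str) -> None:
        """Append to the history and replace the current summary.
        There is deliberately no method that deletes an entry."""
        self._entries.append(entry)
        self.compiled_truth = compiled_truth

    @property
    def timeline(self) -> tuple[TimelineEntry, ...]:
        """Entries in chronological order, returned as a read-only tuple."""
        return tuple(sorted(self._entries, key=lambda e: e.day))

    @property
    def sources(self) -> tuple[str, ...]:
        return tuple(e.source for e in self.timeline)


page = Page("Jane Example")
page.record(
    TimelineEntry(date(2025, 2, 1), "Second meeting", "meetings/2025-02-01.md"),
    "Raising a seed round.",
)
page.record(
    TimelineEntry(date(2025, 1, 15), "First meeting", "meetings/2025-01-15.md"),
    "Raising a seed round. Met twice.",
)
print(page.compiled_truth)
print([e.text for e in page.timeline])
```

Entries can arrive out of order, as they do here, and the timeline still reads chronologically. Only the compiled truth is ever overwritten.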
Here's an example of how this compounds. I meet a founder at office hours. The system creates or updates their person page, their company page, cross-references the meeting notes, checks if I've met them before (and surfaces what we discussed last time), checks their application data, pulls their latest metrics, and identifies if any of my portfolio companies or contacts are relevant to their problem. By the time I walk into the next meeting with them, the system has a full context pack ready.
A filing cabinet stores things. A nervous system connects them.
On what the brain actually is
Thin harness. Fat skills. Fat data. Fat code. Interchangeable models. All open-sourced.
The harness is just a router. The skills are the prompts. The data is where the compound value lives. The models are interchangeable.
Thin harness: OpenClaw or Hermes Agent. Receives messages, figures out which skill applies, dispatches. A few thousand lines of routing logic. Knows nothing about books or meetings or founders. It just routes.
Fat skills: 100+ self-contained markdown files. One specific task each. They chain. They compose. Improve one and every workflow that uses it gets better. When someone asks how I "prompt" my AI, the answer is: I don't. The skills are the prompts.
Fat data: 100,000 pages of structured knowledge in the brain repo. Every person, company, meeting, book, article, idea: all linked, all searchable, all growing every day. The compound value lives here.
Fat code: Feeders. Transcription, OCR, social archival, calendar sync, API integrations. 100+ cron jobs a day check the channels I pay attention to: email, Slack, social, calendars.
The models are interchangeable. Opus 4.7 1M for precision. GPT-5.5 for recall and exhaustive extraction. DeepSeek V4-Pro for creative work and third perspectives. Groq + Llama for speed. The skill decides which model to call for which task. The harness doesn't care. "Which AI model is best" is the wrong question — the model is the engine, everything else is the car.
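The split between harness, skill, and model can be illustrated with a toy router. It is a sketch under assumptions and does not reflect how OpenClaw or Hermes Agent work internally: `Harness`, `Skill`, `call_model`, the trigger phrases, and the model labels are all hypothetical, and the model call is a stub.

```python
from dataclasses import dataclass
from typing import Callable


def call_model(model: str, prompt: str) -> str:
    """Stub for a model call. A real version would hit a provider API."""
    return f"[{model}] {prompt}"


@dataclass
class Skill:
    name: str
    triggers: tuple[str, ...]        # phrases that select this skill
    model: str                       # the skill, not the harness, picks this
    run: Callable[[str, str], str]   # (message, model) -> reply


class Harness:
    """Knows nothing about books or meetings. It only matches and dispatches."""

    def __init__(self) -> None:
        self.skills: list[Skill] = []

    def register(self, skill: Skill) -> None:
        self.skills.append(skill)

    def dispatch(self, message: str) -> str:
        text = message.lower()
        for skill in self.skills:
            if any(trigger in text for trigger in skill.triggers):
                return skill.run(message, skill.model)
        return "no skill matched"


harness = Harness()
harness.register(Skill(
    "book-mirror", ("book mirror",), "precise-model",
    lambda msg, model: call_model(model, f"mirror: {msg}"),
))
harness.register(Skill(
    "meeting-prep", ("prep me",), "fast-model",
    lambda msg, model: call_model(model, f"prep: {msg}"),
))

print(harness.dispatch("Do a book mirror of this PDF"))
print(harness.dispatch("Prep me for Thursday's fireside"))
```

Swapping a model here means editing one field on one skill. The harness never sees the change, which is the sense in which the models are interchangeable.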
Each skill encodes operational knowledge that would take a new human assistant months to learn. A slice of the ones that ship with GBrain:
Extracts chapters. A sub-agent per chapter — summarizes the author's ideas in one column, maps them to your life in the other. Cross-modal eval before shipping.
Transcript → structured summary → entity propagation. Walks every person and company mentioned and updates their brain pages. The propagation is the real value.
Person → brain page. Five sources, merged into one page with career arc, contact info, meeting history, relationship context. Cited on every claim.
Video, audio, PDF, screenshots, GitHub repos. Transcribes, extracts entities, files to the right brain location. The constant intake pipe.
Brain-augmented web research. Searches the web, but checks what the brain already knows first — tells you what's actually new vs. what you've already captured.
The meta-skill. Examines what just happened. Extracts the repeatable pattern. Writes a tested skill file with triggers and edge cases. Registers it in the resolver. The recursion engine.
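The brain-augmented research skill in the list above checks what the brain already knows before reporting web results. A deliberately naive sketch of that split follows. It assumes findings and brain claims are plain strings and matches them exactly after normalization, where a real system would presumably need semantic matching. `split_findings` and `normalize` are hypothetical names.

```python
def normalize(claim: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(claim.lower().split())


def split_findings(
    web_findings: list[str], brain_claims: list[str]
) -> tuple[list[str], list[str]]:
    """Split web findings into (new, already captured), keeping input order."""
    known = {normalize(claim) for claim in brain_claims}
    new: list[str] = []
    already: list[str] = []
    for finding in web_findings:
        (already if normalize(finding) in known else new).append(finding)
    return new, already


brain_claims = ["ExampleCo raised a seed round"]
web_findings = [
    "ExampleCo  raised a Seed round",   # same claim, different spacing and case
    "ExampleCo hired a head of sales",
]
new, already = split_findings(web_findings, brain_claims)
print("new:", new)
print("already captured:", already)
```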
The 2 a.m. builder and the curve that's actually real. 10× every two months.
People ask me about productivity. I don't think about it that way. What I think about is compounding.
Every meeting I take adds to the brain. Every book enriches the next book. Every skill makes the next workflow faster. Every person page I update makes the next meeting prep sharper. The system today is 10× what it was two months ago, and two months from now it'll be 10× again.
When I'm still up at 2 a.m. coding (and I am, regularly, because AI gave me back the joy of building), I'm not just writing software. I'm adding to a system that gets better every hour. 100+ cron jobs run 24/7. The meeting ingestion runs automatically. The email triage runs every ten minutes. The knowledge graph enriches itself from every conversation.
This is not a writing tool. It's not a search engine. It's not a chatbot. It's a second brain that actually works — not as a metaphor, but as a running system with 100,000 pages, 100+ skills, 100+ daily cron jobs, and the accumulated context of every professional relationship, meeting, book, and idea I've engaged with in the last year.
The future belongs to individuals who build compounding AI systems, not to those who use corporate-owned centralized tools.
The thesis
Don't begin by planning your skill architecture. Begin by doing a thing. The loop turns one-off work into compounding infrastructure.
OpenClaw, Hermes Agent, or build your own from scratch with Pi. Keep it thin — the harness is just the router. Host it on a spare computer at home with Tailscale, or use Render or Railway in the cloud.
I got inspired by Karpathy's LLM Wiki, implemented it in OpenClaw, and extended it into GBrain. One command to install. A git repo where every person, meeting, article, and idea gets a page. Benchmarks at 97.6% recall on LongMemEval with no LLM in the retrieval loop, and ships 39 installable skills.
Don't start by planning your skill architecture. Start by doing a thing. Write a report. Research a person. Analyze your portfolio. Build a prediction model for your sports bets. Whatever you actually care about. Do it with your agent, iterate until it's good, then run skillify to extract the pattern into a reusable skill. That loop turns one-off work into compounding infrastructure.
The skill will be mediocre at first. That's the point. Use it, read what it produces, and when something is off, run cross-modal eval — send the output through multiple models and have them score each other on the dimensions you care about. That's how I caught the factual errors in book-mirror. The fix got baked into the skill, and every mirror since has been clean.
In six months you'll have something no chatbot can replicate, because the value isn't in the model — it's in what you've taught the system about your specific life, work, and judgment.
The LLM on its own is just an engine. You can build your own car. The first thing I built with this system was terrible. The hundredth was something I'd trust with my calendar, my inbox, my meeting prep, and my reading list. The compound curve is real.