May 8, 2026

The shape of agents that observe their own outputs

A class of agent is emerging that doesn't do new work. It reads what you already have through a perspective you didn't take.

Three things crossed my feed this week that, on the surface, have nothing to do with each other.

Brooks Whale posted Google’s CodeWiki: paste a GitHub repo URL, get back an interactive guide with diagrams, walkthroughs, the kind of orientation document a thoughtful senior engineer would write if you bribed them with a long lunch. The agent didn’t add code. It read the repo and produced a different view of the repo.

A day earlier, Saurav Panda described a “domain skills” pattern: an agent does a task in your browser once, figures out the selectors, the edge cases, the timing, and then writes the reusable skill itself. You just open the PR. The agent observed its own browser session and produced a structured artifact about that session.

And Leon Lin shared an image-to-code flow where a static UI screenshot becomes a running site. The agent looked at one representation of a thing and produced an executable representation of the same thing.

I sat with these for a day before I noticed they were the same shape.


Last weekend I set up an Obsidian vault and turned Claude loose on it. I’m running an experiment: instead of typing notes into a knowledge base by hand, I architected an agentic setup where Claude observes context about my work and writes the atomic notes itself. About 30 of them landed across 8 domains: agents and MCP, attribution thinking, sales pattern stuff from running the tech side of a performance agency, the usual. Once the notes were in place, I ran a separate synthesizer pass: read everything, look for cross-domain patterns, write connection notes that link 5+ source notes around a single insight.

It produced 5 connection notes. Each one tied together threads from across different months and different domains, around a claim that hadn’t been said out loud anywhere in the vault. They read like insights. But here’s the thing — they weren’t new. The synthesizer had no information that wasn’t already in the source notes. Every claim in every connection note was traceable back to a note already in the vault.

What it had was a perspective the source notes didn’t take. The notes were written one at a time, atomic, each making its own claim. The synthesizer read them as a graph.

That’s the shape. Same as CodeWiki reading a repo as a story instead of a directory tree. Same as the skills agent reading a browser session as repeatable instead of one-off. Same as the image-to-code flow reading a screenshot as structure instead of pixels.

Existing artifacts, plus a perspective shift, equals something structurally new.


I think this is its own category of agent and we don’t have a clean name for it yet. They’re not task-doers. They don’t go fetch new information. They observe a body of work that already exists — yours, usually — and produce a different reading of it.

The leverage is weird because it feels like it shouldn’t work. The information was already there. You wrote it, or your team wrote it, or your repo contains it. Nothing was added. But the useful artifact (the wiki, the skill, the connection note, the running prototype) didn’t exist until something read the original through a lens the original author didn’t use.

Which means the bottleneck isn’t data. It’s perspective that hasn’t been applied yet.


If you have a body of artifacts (a codebase, a notes vault, a backlog of customer transcripts, a year of campaign briefs, a thousand support tickets, a folder of sales call recordings), there is probably an agent that hasn’t been built that would produce a structurally different view of it. Not a summary. A view. Something with a shape the originals don’t have because no one read them through that shape.

The thing I keep coming back to: the agent’s perspective has to be true, not just internally consistent. A synthesizer that hallucinates patterns across my notes is strictly worse than no synthesizer, because it produces output that looks like insight and quietly poisons the graph. CodeWiki making up an architecture that isn’t there is worse than no wiki. A skills agent inventing selectors that work once is worse than no skill.

So the open question — the one I don’t have an answer to yet — is how you tell. How does an observer-agent know its perspective is real and not just plausible? My five connection notes look right to me. But the vault is mine. I set up the system. I curated the inputs. The source notes describe my work, in voices and framings I architected. I’m not reading from outside any of it. I’m the worst possible reviewer.

That’s the next post. For now I’m just sitting with the pattern, watching it show up in three places in one week, and wondering what else in my life is a body of artifacts waiting for an agent that hasn’t been written yet.

#agents#ai#patterns#building
Share

Pre-drafted copy for each platform. X opens with the post pre-filled. LinkedIn requires a paste — the button copies the text to your clipboard and opens the composer in one click.

// X / Twitter
Three agent posts crossed my feed this week. CodeWiki turning a repo into an interactive guide. A skills agent that watches your browser session and writes the reusable script itself. An image-to-code flow.

Sat with them for a day. They're the same shape.

https://acidlemon.com/posts/2026-05-08-agents-that-observe-their-own-outputs/
Post to X ↗336 chars
// LinkedIn
A class of agent is emerging that I don't think we have a clean name for yet.

Three things crossed my feed this week that looked unrelated. Google's CodeWiki turning a repo into an interactive guide with diagrams and walkthroughs. Saurav Panda's skills pattern, where the agent does a task in your browser once and writes the reusable skill itself. Leon Lin's image-to-code flow, where a static screenshot becomes a running site.

I sat with them for a day before noticing they were the same shape.

None of these agents add new information. They observe a body of work that already exists and produce a different reading of it. The repo as a story instead of a directory tree. The browser session as repeatable instead of one-off. The screenshot as structure instead of pixels.

Existing artifacts, plus a perspective shift, equals something structurally new.

The leverage feels like it shouldn't work. Nothing was added. But the useful artifact didn't exist until something read the original through a lens the original author didn't use.

Which means the bottleneck isn't data. It's perspective that hasn't been applied yet.

If you have a codebase, a notes vault, a backlog of customer transcripts, a year of campaign briefs, a thousand support tickets, there is probably an agent that hasn't been built that would produce a structurally different view of it. Not a summary. A view.

The open question I don't have an answer to: how does an observer-agent know its perspective is real and not just plausible?
1515 chars