Purpose-Built Over Autonomous: What Killing Paperclip Taught Us About AI Agents

by Alien Brain Trust AI Learning

The lesson from shutting down Paperclip wasn’t “don’t use agents.” It was “don’t start with autonomous agents.”

If you missed the last post: we ran a multi-agent system called Paperclip — autonomous AI agents with real infrastructure access, designed to operate the business without us in the loop. We shut it down last week. Too much overhead spent correcting drift, too expensive to run at scale, and fundamentally doing the wrong job.

Here’s what we replaced it with, and the framework we’re using going forward.

The Training That Changed My Thinking

One of the clearest things I learned in my AI training: start slow with agents. Don’t try to scale before you understand the failure modes.

That sounds obvious in hindsight. In practice, it’s easy to skip. The demos are impressive. The potential is real. You wire things together, it works, and you add more scope. Then more. Then you have a system that’s doing a lot — and when it goes wrong, you can’t tell where it went wrong or why.

Paperclip was a case study in exactly that. We scaled agent autonomy before we understood where the judgment calls were. When agents go wrong, they don’t fail dramatically. They just drift — and drift at speed creates compounding work.

The Framework We’re Using Now

One agent. One job. One approval gate.

That’s it. Before adding any agent to our workflow, we answer three questions:

  1. What is the single job this agent does? If the answer is more than one sentence, the scope is too broad.
  2. Where does it stop and wait for a human? There has to be a checkpoint. Agents that run to completion without a review step are Paperclip.
  3. What does failure look like, and is it recoverable? An agent that creates a PR on a wrong branch is recoverable. An agent that sends an email to the wrong person is not.
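The three questions above can be sketched as a simple gate check. This is a minimal illustration, not our actual tooling — `AgentSpec` and `passes_gate_check` are hypothetical names, and the single-sentence heuristic is deliberately crude:

```python
from dataclasses import dataclass

@dataclass
class AgentSpec:
    """One agent, one job, one approval gate."""
    job: str                   # single-sentence description of the one job
    approval_gate: str         # where the agent stops and waits for a human
    failure_recoverable: bool  # can a human cheaply undo a bad run?

def passes_gate_check(spec: AgentSpec) -> bool:
    # Question 1: one job. More than one sentence (or an "and") means
    # the scope is too broad.
    if spec.job.count(".") > 1 or " and " in spec.job.lower():
        return False
    # Question 2: there must be a named checkpoint.
    if not spec.approval_gate.strip():
        return False
    # Question 3: unrecoverable failure modes are disqualifying.
    return spec.failure_recoverable
```

Run against our content bot, this passes (`AgentSpec("Draft a blog post from a topic.", "PR review", True)`); an autonomous email-sender with no checkpoint fails on question 2 before question 3 even comes up.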

What Our Agent Stack Looks Like Now

Course QA bot (EC2, daily cron): Runs test scenarios against our AI course content, scores responses, sends a Telegram report. No decisions. No follow-up actions. Runs, reports, stops. If scores drop, a human decides what to do about it.
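The scoring-and-reporting core of a bot like this is small precisely because it makes no decisions. A minimal sketch, assuming the real thing works something like the following (function names and the keyword-match scoring are illustrative, not our production code):

```python
def score_response(response: str, required_phrases: list[str]) -> float:
    """Fraction of required phrases present in the response (0.0 to 1.0)."""
    if not required_phrases:
        return 1.0
    hits = sum(1 for p in required_phrases if p.lower() in response.lower())
    return hits / len(required_phrases)

def build_report(results: dict[str, float], threshold: float = 0.8) -> str:
    """Plain-text report for Telegram. Flags low scores; takes no action."""
    lines = []
    for name, score in sorted(results.items()):
        flag = " <-- review" if score < threshold else ""
        lines.append(f"{name}: {score:.2f}{flag}")
    return "\n".join(lines)
```

The important property is what’s absent: no retry logic, no auto-remediation, no follow-up calls. The report ends the bot’s job.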

Content bot (Telegram command): /draft <topic or URL> → drafts a blog post using Sonnet, creates a PR. Stops there. Human reviews the PR, approves or closes it, then triggers publish if ready. The bot doesn’t decide what to write or when to publish.
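The hard stop is easiest to see in code. A minimal sketch of the command handler, with the drafting and PR-creation steps injected as callables (all names hypothetical) — the point is that no publish step exists anywhere in this code path:

```python
def handle_draft_command(topic: str, draft_fn, open_pr_fn) -> dict:
    """Handle /draft: produce a draft, open a PR, then stop.

    draft_fn(topic) returns the post body; open_pr_fn(title=..., body=...)
    returns a PR URL. Publishing is intentionally not reachable from here.
    """
    body = draft_fn(topic)
    pr_url = open_pr_fn(title=f"Draft: {topic}", body=body)
    # Hard stop: the bot's job ends at PR creation. A human reviews the PR
    # and separately triggers publish if it's worth shipping.
    return {"status": "awaiting_human_review", "pr": pr_url}
```

Making the stop structural (the handler literally cannot publish) is what keeps the failure mode at "close a bad PR" rather than "retract a bad post."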

Linear integration (GitHub Actions): On commit, updates the relevant Linear ticket. One job, zero autonomy, fully deterministic.
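"Fully deterministic" here means the core is a pure function: extract the ticket ID from the commit message, then call the Linear API with it. The extraction step can be sketched like this (the `TEAM-123` ID pattern is an assumption about how our tickets are named):

```python
import re

# Linear-style ticket IDs: a short uppercase team key, a dash, a number.
TICKET_RE = re.compile(r"\b([A-Z]{2,5}-\d+)\b")

def ticket_ids(commit_message: str) -> list[str]:
    """Extract ticket IDs referenced in a commit message."""
    return TICKET_RE.findall(commit_message)
```

Given the same commit, this produces the same ticket list every time — there is no judgment call for drift to creep into.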

Notice what these have in common: they all do something a human would otherwise have to do manually, and they all stop at a point where a human makes a judgment call.

Why Starting Slow Wins

The failure mode of a narrow agent is small and visible. The failure mode of an autonomous agent is large and invisible until it compounds.

When our content bot drafts a bad post, we see a bad PR and close it. Five minutes.

When Paperclip made a sequence of slightly wrong decisions over a week, we’d spend a Saturday unwinding it.

Scope creep in agent systems is also insidious. “It can write content — maybe it can also decide what to post and when.” That one extension turned a useful tool into a system that needed constant supervision. Every time you expand agent scope, you’re adding a new surface where drift can start.

Start with the minimum viable scope. Prove it. Then — maybe — expand.

The Honest Cost Calculation

Autonomous agents sound like leverage. The real calculation is:

Time saved by agent automation vs. Time spent reviewing, correcting, and auditing agent output

For narrow agents doing deterministic jobs: the math is almost always positive.

For autonomous agents making judgment calls: the math depends entirely on how well-aligned they are. In practice, alignment degrades with scope. More autonomy = more drift = more audit time.
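The calculation fits in one line. A toy sketch with made-up numbers, purely to show the shape of the math:

```python
def net_leverage_hours(hours_saved: float, review_hours: float,
                       correction_hours: float, audit_hours: float) -> float:
    """Net weekly leverage: positive means the agent pays for itself."""
    return hours_saved - (review_hours + correction_hours + audit_hours)

# Narrow agent: saves 5 hours/week, costs 30 minutes of PR review.
narrow = net_leverage_hours(5.0, 0.5, 0.0, 0.0)      # +4.5

# Autonomous agent: saves 10 hours/week, but judgment calls mean
# review, correction, and audit overhead that outgrow the savings.
autonomous = net_leverage_hours(10.0, 6.0, 5.0, 3.0)  # -4.0
```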

Paperclip tipped us into negative leverage. Our current agents are solidly positive.

What We’d Do Differently

If we rebuilt Paperclip today, we’d start with a single agent doing one job — probably content drafting — with a hard stop at PR creation. We’d run that for a month, understand the failure modes, and extend from there.

We wouldn’t wire together CEO/CTO/CFO agents in the first week. We wouldn’t give them access to take actions without checkpoints. We’d treat every expansion of agent scope the same way we’d treat expanding a user’s permissions in an IAM system: justify it, scope it tightly, and audit what it does.

That’s the boring answer. It’s also the right one.

The Takeaway

Purpose-built agents that extend your capability beat autonomous agents that try to replace your judgment — at least at this stage of the technology and this stage of the business.

The goal isn’t agents that run the company. It’s agents that make you faster at running the company. Those are different jobs, and confusing them is expensive.

Paperclip was a useful experiment. The lesson cost us a few months and some infrastructure bills. It could have cost more.

Start slow. Know where the approval gate is. Earn the expanded scope.

Tags: #ai-agents #automation #implementation #workflows #building-in-public
