Our AI C-Suite: What We Learned Giving Agents Real Jobs

March 29, 2026 • by Alien Brain Trust • AI Learning

We don’t have employees. We have Jensen Huang (CEO), Elon Musk (CTO), Steve Jobs (CMO), and Ben Horowitz (CFO/COO) — all running as AI agents inside Paperclip, our self-hosted AI platform. They pick up issues, write code, push branches, draft content, and track the financials.

Here’s what actually works and what we had to fix.

Why Personas Matter

The first version of our agents had generic role descriptions. They worked, technically — tasks got done, code got pushed. But the output felt flat. No voice. No judgment on edge cases. No sense that the agent understood why something mattered, only what to do.

Switching to high-conviction personas changed the output quality noticeably.

The Jensen Huang CEO agent doesn’t just report status: he frames everything against the freedom formula ($10k ABT + $5k STR = financial independence). The Elon Musk CTO agent applies first principles before touching infrastructure, asking “does this actually need to exist?” The Steve Jobs CMO agent refuses to ship content that doesn’t meet brand standards, full stop. The Ben Horowitz CFO/COO agent holds both financial tracking and operational execution, which reflects how a lean startup actually works.

The persona isn’t cosplay. It’s a forcing function for the agent to apply a consistent decision framework instead of just completing tasks in isolation.

The Workflow Rules We Had to Learn the Hard Way

Rule 1: Agents create PRs. The founder reviews and merges. Never the other way around.

We kept finding issues where the agent’s comment said “Jared should create a PR from this branch.” That’s backwards. The human’s job is judgment — approve or reject. The agent’s job is execution — build it, open the PR, explain what it does. We added this explicitly to CLAUDE.md and every agent’s system prompt.

Rule 2: Always rebase before creating a PR.

We killed five PRs in one session because they were created from branches that diverged from main weeks earlier. One of them would have deleted three GitHub Actions workflow files on merge — we caught it by checking the diff. Now the rule is non-negotiable: git pull --rebase origin main before any PR is created.
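The diff check that caught the workflow deletions can be automated. Below is a minimal sketch, not our actual tooling: it assumes you feed it the output of git diff --name-status origin/main...HEAD, and the function name deleted_workflows is hypothetical. It flags any file under .github/workflows/ that the branch would delete.

```python
from typing import List

# Standard location of GitHub Actions workflow files in a repository.
WORKFLOW_PREFIX = ".github/workflows/"


def deleted_workflows(name_status: str) -> List[str]:
    """Return workflow files marked as deleted in `git diff --name-status` output.

    Hypothetical helper: each input line is expected to look like
    "<status><TAB><path>", for example "D\t.github/workflows/ci.yml".
    """
    deleted = []
    for line in name_status.splitlines():
        parts = line.split("\t")
        # Only plain deletions ("D") have exactly two fields; renames have three.
        if len(parts) == 2 and parts[0] == "D" and parts[1].startswith(WORKFLOW_PREFIX):
            deleted.append(parts[1])
    return deleted
```

An agent could run this on its own branch after rebasing and refuse to open the PR when the returned list is not empty, leaving the judgment call to the human reviewer.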

Rule 3: Feedback goes on the Paperclip issue, not the GitHub PR.

Agents monitor Paperclip for work. They don’t get notified about GitHub PR comments, so feedback left on the PR never reaches them. The workflow is: read the diff on GitHub, write your feedback on the Paperclip issue, and let the agent pick it up and update the branch. The PR updates automatically when the branch changes.

Rule 4: The CMO has a mandatory human approval gate.

Steve Jobs won’t publish anything on his own. Every piece of content, whether a LinkedIn post or a blog draft, requires explicit approval before it ships. This came from experience: we had a content writer pushing LinkedIn posts that had never been reviewed, whose publish dates had already passed, and that were drafted before the new brand standards existed. We closed eight PRs in one cleanup session. The approval gate is now enforced in the system prompt.
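Our gate lives in the system prompt, but the same rule can also be expressed as a check in a publishing script. The sketch below is illustrative only: the Draft class, its fields, and can_publish are hypothetical names, not part of Paperclip or our repo. It encodes the two failures we hit: content that was never approved, and content whose publish date had already passed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Draft:
    """Hypothetical record for a piece of content awaiting publication."""
    title: str
    publish_on: date
    approved_by: Optional[str] = None  # stays None until a human signs off


def can_publish(draft: Draft, today: date) -> bool:
    """Allow publishing only with explicit approval and a date that has not passed."""
    if draft.approved_by is None:
        return False
    return draft.publish_on >= today
```

A check like this fails closed: a draft with no recorded approver is never published, no matter what else is true about it.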

What Still Breaks

Agent assignment requires the UI. Our Claude Code API key doesn’t have the tasks:assign permission in Paperclip, so every time we create an issue via the API, we have to assign it manually in the UI. This is friction we want to fix; it’s a Paperclip feature request.

Agent workspaces need manual git pulls. Each agent runs in its own workspace directory on the EC2 instance (/data/paperclip/appdata/workspace/[agent]/). When we update system prompts in the repo, the agents don’t automatically pick them up — we have to SSH in and run git pull in each workspace. We’re working on making this automatic.
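One way the manual step could be scripted is sketched below. It assumes every directory under the workspace root is a git checkout of the repo, and it uses git pull --ff-only so a workspace with local changes fails loudly instead of merging. The function names pull_commands and pull_all are hypothetical; this is not the automation we ended up with.

```python
import subprocess
from pathlib import Path
from typing import List

# Workspace root from the post; each agent has its own subdirectory.
WORKSPACE_ROOT = Path("/data/paperclip/appdata/workspace")


def pull_commands(root: Path) -> List[List[str]]:
    """Build one `git pull --ff-only` command per agent workspace under root."""
    commands = []
    for workspace in sorted(root.iterdir()):
        # Skip anything that is not a git checkout.
        if (workspace / ".git").exists():
            commands.append(["git", "-C", str(workspace), "pull", "--ff-only"])
    return commands


def pull_all(root: Path = WORKSPACE_ROOT) -> None:
    """Run the pulls in sequence; a failure in any workspace stops the run."""
    for command in pull_commands(root):
        subprocess.run(command, check=True)
```

Calling pull_all() from a cron job or a post-merge hook on the EC2 instance would replace the SSH round trip. Building the command list separately from running it makes a dry run easy: print pull_commands(WORKSPACE_ROOT) first and inspect it.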

Content quality requires active curation. Agents are productive, but they’ll generate volume if you let them. We ended up with 20+ stale branches, 18 duplicate/outdated issues, and 5 open PRs that needed closing. The board hygiene work was manual. Jensen’s weekly triage pass is now a standing task.

The Board After Cleanup

Before this week’s cleanup: 52 issues, 5 open PRs, multiple stale branches.

After: 20 active issues, 2 open PRs (both meaningful), zero stale branches. Every issue has a clear owner and is either actionable now or blocked on a known dependency.

The cleanup itself took about half a day. The discipline to prevent the next accumulation is in the agent guidelines and CLAUDE.md — not in our memory.

What We’d Tell Someone Starting This

Start with fewer agents than you think you need. We had six roles running before we had the workflow rules figured out, and the noise outweighed the output. Get one agent working well end-to-end — clear persona, real accountability, proper PR hygiene — before you add the next.

Write the workflow rules before something breaks. The “agents create PRs” rule, the rebase rule, and the approval gate all came from incidents. They’re obvious in retrospect, so write them down first.

Your agents are only as good as your issue quality. Vague issues produce vague output. “Fix the blog” is not a ticket. “Fix auto-publish-blog.yml bash syntax error on line 50 — <= comparison fails in conditional expression” is a ticket Elon can close in 20 minutes.

This is week 4 of building ABT in public. The infrastructure is stable. The agents are running. Next up: picking the right workshop offer before we build the sales machine around the wrong thing.
