The First Night Running a Self-Hosted AI Company
It was supposed to take two hours.
We had the EC2 instance running. Docker was up. The Paperclip AI server was listening on port 3100. The plan was simple: bootstrap the CEO agent, give it the corporate context, let it start working.
Six hours later, at midnight, the CEO agent was finally alive.
Here’s the honest account of what that first night actually looked like — every error, every wrong turn, and what we learned from each one.
Hour One: The Permission Gauntlet
Once the server was up, the first task was to run pnpm paperclipai auth bootstrap-ceo, which generates the admin invite URL. Simple enough.
First attempt:
Error response from daemon: container is not running
The server container had crashed. The logs showed it had been OOM-killed: Node.js plus Postgres exceeded the 2GB of RAM on our t4g.small instance, and there was no swap file.
Fixed that. Added 2GB swap. Brought the containers back up.
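A minimal sketch of the swap fix, for anyone following along. The /swapfile path, the use of dd, and the DRY_RUN wrapper are assumptions added for illustration; by default the function only prints the commands it would run.

```shell
# Sketch: create and enable a swap file.
# ASSUMPTIONS: the /swapfile path and the dd approach are illustrative choices.

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

make_swap() {
  file="${1:-/swapfile}"
  size_mb="${2:-2048}"          # 2048 MB = 2GB
  run sudo dd if=/dev/zero of="$file" bs=1M count="$size_mb"
  run sudo chmod 600 "$file"    # swap files should not be readable by other users
  run sudo mkswap "$file"
  run sudo swapon "$file"
}

# Usage:
#   make_swap                               # dry run, prints the four commands
#   DRY_RUN=0 make_swap /swapfile 2048      # actually creates and enables swap
```

Swap enabled this way does not survive a reboot unless the file is also listed in /etc/fstab.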
Second attempt:
permission denied while trying to connect to the Docker daemon
The SSM session runs as ssm-user, which is not in the docker group. Add sudo.
Third attempt — server container was running, command executed, printed an invite URL. We opened a browser to localhost:3100.
Invite not available. This invite may be expired, revoked, or already used.
Hour Two: The Race Against the Clock
The bootstrap-ceo invite URL expires in seconds. Not minutes — seconds. You have to have the browser open, the port forward active, and the SSM session running the command, and click the URL before the clock runs out.
We didn’t have the port forward running. We opened it in a new terminal. Ran bootstrap-ceo again. Fumbled with copying the URL. Expired.
Ran it again. Port forward window crashed. Expired.
Ran it again. Port forward was up, URL appeared, clicked it immediately.
Instance setup required. No instance admin exists yet.
The onboarding hadn’t run. We’d lost our Paperclip database in the named volume incident and never re-ran onboarding. The admin bootstrap couldn’t complete because the instance config didn’t exist yet.
Hour Three: Onboarding First, Bootstrap Second
Running pnpm paperclipai onboard brought up the interactive TUI. We selected quickstart, named the instance ABT-Corp, watched it write config.json to /data/paperclip/appdata/instances/default/.
Then: bootstrap-ceo again. Port forward running. URL appeared. Clicked immediately.
Logged in. The Paperclip UI was live.
Then the agent setup screen: Create your first agent.
We named it CEO, selected Claude Code as the adapter, set the working directory to /paperclip/workspace.
Probe failed:
EACCES: permission denied, mkdir '/data'
The server was running as root, and the Claude CLI refuses to run with --dangerously-skip-permissions as root. It is a hard check, not configurable.
Add user: "1000:1000" to the docker-compose.yml. Restart.
Probe failed again:
EACCES: permission denied, mkdir '/app/data'
The container needed /app/data mounted to the persistent volume. Added the volume mount. Restarted.
Working directory is valid: /paperclip/workspace
Command is executable: claude
Claude CLI is installed, but login is required.
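The two Compose changes from this hour can be sketched as an override file rather than an edit to docker-compose.yml itself (Compose merges docker-compose.override.yml automatically when it sits next to the main file). This is a swapped-in technique, not what we did that night. The service name server and the host path /data/paperclip/appdata are assumptions; substitute whatever your compose file actually uses.

```shell
# Sketch: write a Compose override that runs the service as uid/gid 1000
# and mounts a host directory at /app/data.
# ASSUMPTIONS: the service name "server" and the host path are guesses.
write_override() {
  dir="${1:-.}"
  service="${2:-server}"                      # hypothetical service name
  host_data="${3:-/data/paperclip/appdata}"   # assumed host path
  cat > "$dir/docker-compose.override.yml" <<EOF
services:
  $service:
    user: "1000:1000"
    volumes:
      - $host_data:/app/data
EOF
}

# Usage, from the directory that holds docker-compose.yml:
#   write_override . server /data/paperclip/appdata
#   sudo docker compose up -d
```

Keeping the changes in an override file leaves the upstream docker-compose.yml untouched, which makes later updates easier to diff.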
Hour Four: Logging In From Inside a Container
Claude CLI needed to authenticate. Running claude login inside the container launched the first-run setup TUI — theme selection, light/dark mode, the works — then prompted for login method.
We selected Claude subscription login. It generated a URL. We opened it in the browser. Authenticated. The CLI confirmed login.
Back in Paperclip: probe passed. Green checkmark. The CEO agent was configured.
Then the first task: bootstrap the org, read the corporate context, hire the team.
EACCES: permission denied, mkdir '/app/data'
Same error, different cause. The /app/data mount was there, but the directory behind it was owned by root, so the agent (uid 1000) couldn’t write to it.
sudo chown -R 1000:1000 /data/paperclip/appdata
Restarted. Task ran. The CEO agent was reading 00-Corporate/mission-vision.md and 00-Corporate/90-day-plan.md. It was writing files. It was opening PRs.
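A check along these lines would have surfaced the ownership problem before the first task ran. It is a sketch, not part of Paperclip: the helper name not_owned_by is made up, and it simply lists anything under a directory that is not owned by the given uid.

```shell
# Sketch: list every path under a directory whose owner is NOT the given uid.
# Empty output means the whole tree is owned by that uid.
# (not_owned_by is a hypothetical helper name, not a Paperclip command.)
not_owned_by() {
  find "$1" ! -user "$2" -print
}

# Usage on the host, before restarting the containers:
#   sudo sh -c 'find /data/paperclip/appdata ! -user 1000 -print'
# or, with the function loaded in a shell that can read the tree:
#   not_owned_by /data/paperclip/appdata 1000
```

If the command prints nothing, the recursive chown covered everything; any path it prints is one the agent will not be able to write to.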
Hour Five: Watching It Work
The moment the CEO agent started running was genuinely strange.
In one terminal, the SSM session to the instance. In another, the port forward keeping localhost:3100 alive. In the browser, the Paperclip UI showing the CEO agent’s task running — tools being called, files being read, API calls being made.
We watched it read the 90-day plan. Read the revenue model. Read the freedom formula. Then start building out the org: hiring agents, creating projects, seeding issues. All via the Paperclip API, no human intervention.
The agent wrote a financial baseline document. Created a workshop sales infrastructure plan. Opened five pull requests in one night.
The content writer agent — set up an hour later — queued LinkedIn drafts for four weeks.
What the First Night Actually Cost
Time: about six hours, most of it debugging permission issues and Docker configuration.
Real mistakes: two, both avoidable in hindsight.
The first was running the containers without a swap file on a 2GB RAM instance. We knew Node.js was memory-hungry. We should have added swap before starting.
The second was not testing the full bootstrap flow — invite URL, port forward, browser, timing — before we needed it to work. Discovering that the invite expires in seconds while you’re fumbling with terminal windows is not a good way to learn that.
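One way to rehearse that flow is a small preflight on the laptop that waits until the forwarded port answers before anyone runs bootstrap-ceo. This is a sketch under assumptions: the helper names are made up, it assumes curl is installed locally, and it assumes the forward is an SSM port-forwarding session on port 3100.

```shell
# Sketch: wait for the port forward to be up before generating the invite.
# ASSUMPTIONS: helper names are hypothetical; curl is available on the laptop.

# Retry a check command once per second until it succeeds or attempts run out.
wait_for() {
  attempts="$1"; shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then return 0; fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Succeeds once something answers HTTP on the forwarded local port.
port_forward_up() {
  curl -s -o /dev/null --max-time 2 "http://localhost:${1:-3100}/"
}

# Usage:
#   1. Start the forward in its own terminal, for example:
#      aws ssm start-session --target <instance-id> \
#        --document-name AWS-StartPortForwardingSession \
#        --parameters '{"portNumber":["3100"],"localPortNumber":["3100"]}'
#   2. wait_for 30 port_forward_up 3100 && echo "forward is up, run bootstrap-ceo now"
```

The point of the ordering is that the only step left on the clock is clicking the URL.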
What the First Night Actually Taught
The errors weren’t random. They followed a pattern.
Every permission error was a mismatch between root and non-root: a user outside the docker group, a container running as root where it shouldn’t, a root-owned directory that uid 1000 had to write to. Every configuration error was a default that made sense for development but broke in our specific deployment. Every timing issue was a workflow that assumed interactive use, not scripted setup.
Once we understood the pattern, the fixes were obvious. user: "1000:1000". Recursive chown. Swap file. Port forward before bootstrap-ceo.
None of this is in the Paperclip documentation — or in most self-hosting documentation — because these issues are specific to the combination of ARM64, t4g.small, SSM access, and running as non-root. You find them by running into them.
That’s why we’re writing it down.
The Morning After
By the time we closed the laptop, the CEO agent had opened five PRs and the content writer had queued four weeks of LinkedIn drafts.
We reviewed the PRs the next morning. Most needed minor adjustments — context gaps, slightly wrong file paths — but the structure was right. The financial baseline was detailed. The workshop sales plan was actionable.
The first night was six hours of debugging. The second day was one hour of PR review.
That ratio improves every week.