Stop Losing Dev Sessions to Broken Setup: Build a Session Startup Script

by Alien Brain Trust AI Learning

Here’s a workflow problem nobody talks about when they’re selling you on AI-powered development.

You sit down to work. You open Claude. You start a task. Twenty minutes in, something breaks. An agent can’t authenticate. A port isn’t open. A secret is missing. You stop the task, start debugging the environment instead of the problem, and by the time everything is working again you’ve lost the thread of what you were trying to do.

This was happening to us constantly.


What Was Actually Breaking

We run Paperclip — our AI agent platform — on an EC2 instance in AWS. Claude Code runs locally on Windows. Every working session requires:

  • AWS authentication (SSO via IAM Identity Center)
  • An SSM port forward to EC2 (localhost:3100 → Paperclip UI)
  • All SSM secrets present and readable by the agents
  • Git repo in a clean, known state

Any one of these failing mid-session costs time. But the worst part isn’t the time fixing it — it’s the context loss. You’re in the middle of something with an agent, you hit an auth error, you spend 15 minutes rotating a key or restarting a tunnel, and then you’re back at the prompt trying to remember where you were.

The secrets piece was the most frustrating. We’d start a session, assign a task to an agent, and 10 minutes later get a cryptic error that traced back to a missing SSM parameter — something that was there yesterday but got rotated, deleted, or never migrated from a plain-text environment variable in the first place. We knew we had gaps in the inventory. We just didn’t know which one was the problem until an agent hit it.

And then there was the root account issue. We discovered, mid-session, that we’d been running AWS CLI commands as the management account root: our IAM user credentials had been rotated, and the CLI had fallen back to the SSO session. Nothing catastrophic. But also not acceptable.


The Fix: Make Setup a First-Class Part of the Workflow

The answer wasn’t better debugging skills. It was making the environment check automatic, fast, and part of the ritual of starting work.

We built two scripts:

  • dev-session.ps1 — the session launcher, runs at the start of every working session
  • check-secrets.ps1 — the secrets health check, called by the launcher

Here’s what dev-session.ps1 does, in order:

Step 1: Confirm AWS Auth

$identity = aws sts get-caller-identity --output json 2>$null | ConvertFrom-Json
if (-not $identity) {
    Write-Status "Auth" "not authenticated: run 'aws sso login' first" Red
    return
}
if ($identity.Arn -match ":root$") {
    Write-Status "Auth" "root session active — check your credentials" Yellow
} else {
    Write-Status "Auth" $identity.Arn Green
}

This runs first, before anything else. If you’re not authenticated, it prompts you to log in before wasting time on the rest. If you’re authenticated as root, it flags it in yellow — you can still proceed, but you see it clearly. That one check alone would have caught our root incident before any infrastructure commands ran.

Step 2: Open the Port Forward (and Keep It Open)

The SSM port forward to Paperclip drops periodically. Without automation, that means manually restarting it, sometimes multiple times a session. The script checks whether localhost:3100 is already responding — if not, it launches a separate PowerShell window with an auto-reconnect loop:

while ($true) {
    aws ssm start-session --target $INSTANCE_ID --region $REGION `
      --document-name AWS-StartPortForwardingSession `
      --parameters portNumber=3100,localPortNumber=3100
    Start-Sleep 3  # reconnect on drop
}

It runs in the background, restarts automatically if it drops, and you never think about it again.
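The “already responding” check ahead of that loop is a one-liner. A minimal sketch, assuming the reconnect loop lives in a separate script (the port-forward.ps1 name here is a placeholder):

$up = Test-NetConnection -ComputerName localhost -Port 3100 `
    -InformationLevel Quiet -WarningAction SilentlyContinue
if (-not $up) {
    # Run the auto-reconnect loop in its own window so it outlives this script
    Start-Process powershell -ArgumentList '-NoExit', '-File', "$PSScriptRoot\port-forward.ps1"
}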

Step 3: Git Repo State

Quick sanity check — current branch, unpushed commits, dirty files. Sounds minor but this has saved us from starting agent work on the wrong branch more than once.

[ Repo ]
  Branch               bot/kickstarter-campaign-may-2026
  Unpushed             3 commit(s)
  Uncommitted          clean
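The underlying commands are plain git plumbing. A sketch of the check (formatting simplified; Write-Host stands in for our status helper):

$branch = git rev-parse --abbrev-ref HEAD
$ahead  = (git log '@{u}..' --oneline 2>$null | Measure-Object).Count   # 0 if branch has no upstream
$dirty  = (git status --porcelain | Measure-Object).Count

Write-Host ("  {0,-20} {1}" -f "Branch", $branch)
Write-Host ("  {0,-20} {1} commit(s)" -f "Unpushed", $ahead)
Write-Host ("  {0,-20} {1}" -f "Uncommitted", $(if ($dirty -eq 0) { "clean" } else { "$dirty file(s)" }))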

Step 4: Secrets Health Check

This is the one that matters most for our workflow. check-secrets.ps1 loops through every SSM parameter in our inventory and confirms it exists and is readable:

$params = @(
    @{ Name = "/paperclip-app/anthropic-key";       Label = "Paperclip Anthropic key" },
    @{ Name = "/paperclip-app/better-auth-secret";  Label = "Paperclip Better Auth secret" },
    @{ Name = "/paperclip-app/github-pat";          Label = "Paperclip GitHub PAT" },
    @{ Name = "/paperclip-app/claude-code-api-key"; Label = "Claude Code API key" },
    @{ Name = "/abt/replicate/api-token";           Label = "Replicate API token" },
    # ... all keys in inventory
)

foreach ($p in $params) {
    $result = aws ssm get-parameter --name $p.Name --with-decryption --region $REGION 2>&1  # capture output so the value never hits the terminal
    if ($LASTEXITCODE -eq 0) {
        Write-Host ("  {0,-40} OK" -f $p.Label) -ForegroundColor Green
    } else {
        Write-Host ("  {0,-40} MISSING" -f $p.Label) -ForegroundColor Red
    }
}

Critical detail: it never prints values. Only the parameter name and OK or MISSING. This is non-negotiable — the whole point of SSM SecureString is that the value doesn’t appear in logs, terminals, or session history. We’re checking existence, not exposing the secret.

The output looks like this every morning:

[ Secrets Health Check ]
  Paperclip Anthropic key                  OK
  Paperclip Better Auth secret             OK
  Paperclip GitHub PAT                     OK
  Claude Code API key                      OK
  Replicate API token                      OK
  Test bot Anthropic key                   MISSING

  One or more secrets missing. See SECRETS-RUNBOOK.md to fix.

If something shows MISSING, you fix it before the session starts. Before Claude opens. Before any agent gets assigned a task. The problem is surfaced at the moment you can actually do something about it, not 20 minutes in when an agent is mid-task and failing in a confusing way.
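The runbook pointer at the bottom is just a counter check. A sketch, assuming the loop above increments a $missing variable on each failed lookup:

if ($missing -gt 0) {
    Write-Host "`n  One or more secrets missing. See SECRETS-RUNBOOK.md to fix." -ForegroundColor Red
    exit 1   # non-zero exit lets dev-session.ps1 stop before Claude opens
}
Write-Host "`n  All secrets present." -ForegroundColor Green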


Why This Changed the Workflow

The script takes about 15 seconds to run. What it replaces is anywhere from 15 minutes to an hour of reactive debugging when something inevitably breaks mid-session.

More importantly, it changes the mental model of starting work. Instead of assuming the environment is fine until something fails, you verify it’s fine before you start. That’s a security posture shift as much as a productivity one.

We also found it surfaced gaps we didn’t know we had. The first time we ran the full health check, two parameters showed MISSING — a test bot key that had been rotated and never updated in SSM, and an API token that had only ever lived in a GitHub Secret and never been migrated. We’d been lucky those agents hadn’t needed them in a while.

The inventory was incomplete. The health check made that visible.


The Pattern Is Simple, Even If You’re Not on AWS

If you’re building with AI agents, you probably have some version of this problem — environment state you assume is fine until it isn’t. The specific implementation doesn’t matter. The pattern does:

  1. Check auth before anything else — and make root/elevated access visually obvious
  2. Automate the tunnel/connection so it’s not a manual step you forget
  3. Run a secrets inventory check — by name only, never by value
  4. Make it fast enough that you actually run it — 15 seconds is fine, 2 minutes isn’t
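Stitched together, the whole pattern is a few dozen lines regardless of stack. A generic skeleton (every function name here is a placeholder for your own checks):

function Start-DevSession {
    # 1. Auth first; refuse to continue, and make elevated access loud
    if (-not (Test-SessionAuth)) { throw "Not authenticated. Log in and re-run." }

    # 2. Idempotent tunnel: no-op if the connection is already up
    Ensure-Tunnel

    # 3. Repo state: branch, unpushed commits, dirty files
    Show-RepoState

    # 4. Secrets by name only; stop here if anything is missing
    if (-not (Test-SecretsInventory)) { throw "Secrets missing. See the runbook." }

    Write-Host "Environment verified." -ForegroundColor Green
}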

The alternative is continuing to lose sessions to setup problems that were preventable.


The dev session startup script and the secrets health check are both in our repo. If you’re building something similar and want to see the full implementation, reach out at j@alienbraintrust.ai.

Tags: #workflow #developer-experience #aws #building-in-public #ai-agents #productivity
