Approval Gates in AI Pipelines: Why We Added One

by Alien Brain Trust AI Learning

The pipeline worked. It fetched a topic, generated a draft, formatted the frontmatter, committed to the repo, and triggered a deploy. End to end, zero human involvement. I watched it run the first time and felt the particular satisfaction of automation doing exactly what it was designed to do.

Then I read the output.

The post was technically correct. It hit the word count. The frontmatter validated. It even had decent structure. But the voice was off — not by a lot, but enough that I wouldn’t have published it. And under the old design, it would have already been live.

That was the week I added an approval gate to our AI content pipeline. Here’s what I built, why it works, and what I’d do differently if I were starting over.

The Original Design and Where It Failed

The initial pipeline was built for speed. The logic was: if the AI agent produces valid output and the formatting checks pass, ship it. Trust the process.

This is a reasonable assumption for deterministic systems. A Terraform template that passes validate and plan without errors will generally apply correctly. A script that passes unit tests will generally run. Automation works when the output space is bounded and measurable.

LLM output is neither.

The content pipeline was generating posts that were structurally correct but semantically wrong for the brand. Wrong tone. Occasionally wrong facts — not hallucinations exactly, but framing choices that didn’t match how I’d characterize a risk. A post about supply chain attacks that read like it was written for a general tech audience instead of a security leader at a regulated company.

None of these failures would have triggered an automated check. There was no metric for “sounds like Jared.” The pipeline couldn’t know what it didn’t know.

In 25 years of security work, I’ve seen this pattern repeatedly in automated controls: a system that measures what it can measure, then assumes everything it can’t measure is fine. That assumption is where incidents live.

What an Approval Gate Actually Means

An approval gate in an AI pipeline is a deliberate pause before consequential action — a checkpoint where a human (or a more constrained deterministic system) reviews output before it propagates.

In security terms, this is the same logic as a four-eyes principle for privileged access changes, or a change advisory board review before production deployments. You’re not saying the automated process is wrong. You’re acknowledging that the automated process has a confidence envelope, and you want a human in the loop when output leaves that envelope and touches something real.

For a content pipeline, “something real” is the published post. For an AI agent managing infrastructure, it’s a live environment. For an AI system in a regulated workflow, it might be a customer communication or a compliance artifact.

The gate itself is simple. In our implementation:

  1. The agent generates the draft and commits it to the repo with published: false
  2. A Telegram notification fires with the draft title and a /publish command
  3. I review the draft in the repo
  4. If it’s good, I run /publish — which flips the flag, triggers the build, and deploys
  5. If it needs changes, I edit directly and then publish, or I reject and let the agent know what to fix

Nothing deploys without that /publish step. The agent cannot set published: true. That capability is scoped out of its tool access entirely — not just instructed away, but architecturally blocked.
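A minimal sketch of what the flag-flipping part of /publish might look like, assuming posts are Markdown files with a `published: false` line in the frontmatter. The function name `publish` and the regex are illustrative assumptions, not the actual implementation, and the build and deploy trigger is left out.

```python
import re
from pathlib import Path

# Matches a whole "published: false" line; [ \t]* avoids swallowing newlines.
DRAFT_FLAG = re.compile(r"^published:[ \t]*false[ \t]*$", re.MULTILINE)


def publish(post_path: Path) -> bool:
    """Flip the first `published: false` line to `published: true`.

    Returns True if the flag was flipped, False if no draft flag was found
    (for example, the post is already published).
    Hypothetical helper: runs in the human-triggered process, never in the agent.
    """
    text = post_path.read_text(encoding="utf-8")
    new_text, count = DRAFT_FLAG.subn("published: true", text, count=1)
    if count == 0:
        return False
    post_path.write_text(new_text, encoding="utf-8")
    return True
```

The point of keeping this in its own function is that it can live in a separate process with separate credentials, so the agent's code path never imports or calls it.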

The Decision That Made It Work: Scope the Tool, Not the Instruction

Here’s the part that matters most for anyone building AI pipelines with real consequences.

My first attempt at an approval gate was instruction-based. I told the agent: “Always set published: false. Never set published: true.” And it followed that instruction reliably, until it didn’t. A different prompt context, a slightly different task framing, and the agent interpreted the instruction differently.

Instruction-based constraints on LLMs are soft constraints. They work until they don’t. In security terms, this is a policy with no enforcement mechanism behind it: an administrative control where a technical one is needed. It’s not a control, it’s a hope.

The fix was architectural. I removed the ability to write published: true from the agent’s tool scope entirely. The file template it writes from only contains published: false. There is no pathway in the codebase where the agent can flip that flag. The only thing that can publish a post is the /publish command, which runs as a separate process with separate credentials, triggered by me.

This is the principle of least privilege applied to AI agents. Don’t tell the agent not to do the dangerous thing — remove the agent’s ability to do the dangerous thing.
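One way to picture "scope the tool" is a draft-writing tool whose template hard-codes the flag. This is a sketch under assumptions: `write_draft`, `DRAFT_TEMPLATE`, and `AGENT_TOOLS` are hypothetical names, and the frontmatter fields are illustrative.

```python
import re
from pathlib import Path

# The only template the agent can write from. The flag is not a parameter.
DRAFT_TEMPLATE = """---
title: {title}
published: false
---
{body}
"""

SLUG_PATTERN = re.compile(r"[a-z0-9-]+")


def write_draft(drafts_dir: Path, slug: str, title: str, body: str) -> Path:
    """Write a draft post. There is no argument that can set published: true."""
    if not SLUG_PATTERN.fullmatch(slug):
        raise ValueError(f"invalid slug: {slug!r}")
    # Collapse whitespace so a title cannot smuggle in an extra frontmatter line.
    safe_title = " ".join(title.split())
    path = drafts_dir / f"{slug}.md"
    path.write_text(
        DRAFT_TEMPLATE.format(title=safe_title, body=body), encoding="utf-8"
    )
    return path


# The agent's entire tool scope (hypothetical registry): no publish tool exists here.
AGENT_TOOLS = {"write_draft": write_draft}
```

Whatever the prompt says, the agent can only call what is in the registry, and nothing in the registry accepts a publish flag.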

The same logic applies to any AI pipeline touching production systems:

  • An agent that manages AWS resources shouldn’t have permissions to delete S3 buckets, even if you’ve told it not to
  • An agent that reads database records shouldn’t have write credentials, even if its instructions say read-only
  • An agent that drafts customer emails shouldn’t have SMTP access, even if the prompt says “draft only”

The approval gate is the process control. Least-privilege tooling is the technical control. You need both.
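The database bullet above can be made concrete with a small example. This sketch uses SQLite as a stand-in (the post does not say which database is involved) and opens the file in read-only mode, so a write fails at the connection level regardless of what the agent was told. The helper name `open_read_only` is an assumption.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open an existing SQLite database so that writes are refused.

    The mode=ro URI parameter makes SQLite open the file read-only;
    INSERT/UPDATE/DELETE then raise sqlite3.OperationalError.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Handing the agent a connection from `open_read_only` means "read-only" is a property of the credential, not of the prompt. On a server database, the equivalent would be a role granted only read privileges.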

What the Pipeline Looks Like Now

[Topic Input]
      ↓
[Draft Generation Agent]
      ↓
[Frontmatter Validation] ← deterministic check, fails loudly
      ↓
[Git Commit: published: false]
      ↓
[Telegram Notification to Jared]
      ↓
[Human Review]
      ↓
[/publish command] ← only path to published: true
      ↓
[Build + Deploy]

The deterministic validation layer deserves a mention. Before the draft reaches me, a lightweight check runs against the frontmatter schema — required fields present, character counts within range, tags from the approved vocabulary. This catches mechanical failures automatically so I’m only reviewing content quality, not formatting errors.

That division of labor matters. Deterministic checks catch deterministic failures. Human review catches judgment failures. Don’t conflate the two or you end up with humans doing work machines should do, or machines doing work that requires judgment.
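A deterministic frontmatter check along these lines could look like the sketch below. It assumes the frontmatter has already been parsed into a dict. The field names, character limits, and tag vocabulary are placeholders, not the real schema.

```python
# All values below are illustrative assumptions, not the pipeline's real schema.
REQUIRED_FIELDS = ("title", "description", "tags", "published")
MAX_TITLE_CHARS = 70
MAX_DESCRIPTION_CHARS = 160
APPROVED_TAGS = {"automation", "implementation", "enterprise-ai", "case-study"}


class FrontmatterError(ValueError):
    """Raised when a draft's frontmatter fails the mechanical checks."""


def validate_frontmatter(fm: dict) -> None:
    """Fail loudly, listing every problem found, or return None if clean."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in fm:
            problems.append(f"missing required field: {field}")
    if len(str(fm.get("title", ""))) > MAX_TITLE_CHARS:
        problems.append(f"title longer than {MAX_TITLE_CHARS} characters")
    if len(str(fm.get("description", ""))) > MAX_DESCRIPTION_CHARS:
        problems.append(f"description longer than {MAX_DESCRIPTION_CHARS} characters")
    tags = fm.get("tags", [])
    if not isinstance(tags, list):
        problems.append("tags must be a list")
    else:
        unknown = sorted(set(tags) - APPROVED_TAGS)
        if unknown:
            problems.append(f"unapproved tags: {unknown}")
    # Drafts must never arrive at review already marked as published.
    if fm.get("published") is not False:
        problems.append("published must be false for a draft")
    if problems:
        raise FrontmatterError("; ".join(problems))
```

Collecting every problem before raising means one failed run reports all the mechanical issues at once, which keeps them out of the human review step.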

What I’d Change at the Start

If I were designing this pipeline from scratch with what I know now:

Start with the gate, not the automation. I built the automation first and added the gate after a failure. The correct order is to define what requires human approval before you write the first line of agent code. Map the consequence boundary first.

Write the approval workflow before the generation workflow. The /publish command existed as a concept before the agent was complete. But the actual implementation was retrofitted. Retrofitting approval gates into existing pipelines is harder than designing them in — the same way it’s harder to add access controls to a running application than to build them in from the start.

Log the rejections. I added rejection logging after the first few cycles. When I reject a draft or make significant edits before publishing, that gets recorded. Those records are training signal — not for the model, but for improving the system prompt and the agent’s content guidelines. The failure log is your improvement backlog.
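Rejection logging does not need much machinery. The sketch below appends one JSON line per review outcome and tallies reasons afterward. The file format, field names, and the `log_review` and `reasons_by_frequency` helpers are assumptions for illustration.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path

VALID_OUTCOMES = {"rejected", "edited"}


def log_review(log_path: Path, slug: str, outcome: str, reason: str) -> dict:
    """Append one review record as a JSON line and return it."""
    if outcome not in VALID_OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome!r}")
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "slug": slug,
        "outcome": outcome,
        "reason": reason,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def reasons_by_frequency(log_path: Path) -> list[tuple[str, int]]:
    """Count logged reasons, most common first: the improvement backlog."""
    counts = Counter()
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                counts[json.loads(line)["reason"]] += 1
    return counts.most_common()
```

Using a short, consistent reason label (such as "wrong audience" or "tone") makes the tally meaningful when deciding which prompt or guideline change to make first.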

Key Takeaways

  • Approval gates in AI pipelines are not optional overhead — they are the control that keeps automated systems from propagating errors into consequential outputs.
  • Instruction-based constraints are soft controls. For anything that matters, implement architectural constraints: scope tools, restrict permissions, remove capability rather than just instructing against it.
  • Deterministic validation and human review solve different problems. Use both. Don’t use one as a substitute for the other.
  • Design the gate before you build the automation. Define your consequence boundary first, then design the pipeline around it.
  • Rejection logs are improvement data. Track what the agent gets wrong so you can fix the underlying cause, not just the symptom.

The pipeline runs reliably now. It still generates drafts I wouldn’t publish as-is — probably 30% of the time something needs adjustment before it goes live. That’s expected. The gate isn’t there because the agent fails often. It’s there because when it fails, I want to catch it before it matters.

That’s what controls are for.

Tags: #building-and-learning #automation #implementation #enterprise-ai #case-study
