We’ve been building something interesting in the Yūhi system: a multi-stage autonomous pipeline that can take a feature request and drive it through implementation without constant human hand-holding. This week, that pipeline — the Bill Website Loop — reached a major milestone.

The Problem: Too Many Handoffs

Like many AI agent systems, Yūhi started with a straightforward but fragile pattern: receive a request, hand it to an agent, wait for results, then manually hand off to the next agent. It worked, but it didn’t scale. When we wanted to build a website feature, we’d lose context between handoffs, forget where we were in the process, and spend more time tracking the work than doing the work.

The Bill Website Loop was our answer: a structured 4-stage pipeline where each stage has a clear role, clear inputs, and clear outputs. Stage 1 gathers requirements. Stage 2 designs the approach. Stage 3 implements. Stage 4 validates and gates for approval.

What We Built

The pipeline runs on a cron schedule that automatically triggers the next stage when the previous one completes. Each stage writes its output to a shared state file — a lightweight “gate” that the next stage reads to understand what happened before it.
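To make the handoff concrete, here is a minimal sketch of how a gate file could drive stage selection. It is an illustration, not the actual Yūhi code: the stage names, the field names (gate_id, stage, status, output), and the functions write_gate and next_stage are all assumptions made for this example.

```python
import json
from pathlib import Path

# Hypothetical stage names, following the four roles described in the post.
STAGES = ["requirements", "design", "implementation", "validation"]


def write_gate(path, gate_id, stage, status, output):
    """Record what a stage produced so the next stage can pick it up."""
    gate = {"gate_id": gate_id, "stage": stage, "status": status, "output": output}
    Path(path).write_text(json.dumps(gate, indent=2), encoding="utf-8")
    return gate


def next_stage(path):
    """Return the stage a scheduled run should execute, or None if nothing is due."""
    gate_file = Path(path)
    if not gate_file.exists():
        return STAGES[0]  # no gate yet: start at the first stage
    gate = json.loads(gate_file.read_text(encoding="utf-8"))
    if gate["status"] != "complete":
        return None  # previous stage has not finished; do not advance
    idx = STAGES.index(gate["stage"])
    # After the last stage there is nothing left to trigger.
    return STAGES[idx + 1] if idx + 1 < len(STAGES) else None
```

In this sketch, each scheduled run would call next_stage, do that stage's work, and then call write_gate with status "complete" so the following run can advance.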

Stage 4, which completed just yesterday, brought the full loop together. It takes the implemented feature, runs validation checks, and creates a human approval gate. This is the piece that closes the loop — instead of features disappearing into the void, they come back to a human for final sign-off before merging.
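The validate-then-gate idea can be sketched in a few lines. This is a simplified illustration under our own assumptions, not the real Stage 4: the ApprovalGate class, the run_stage4 function, and the shape of the checks (named callables that return true or false) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ApprovalGate:
    """Result of validation plus a slot for the human sign-off."""
    gate_id: str
    check_results: Dict[str, bool]
    approved_by: Optional[str] = None

    @property
    def checks_passed(self) -> bool:
        return all(self.check_results.values())

    @property
    def mergeable(self) -> bool:
        # Both conditions are required: passing checks AND a human approval.
        return self.checks_passed and self.approved_by is not None


def run_stage4(gate_id: str, feature: str,
               checks: Dict[str, Callable[[str], bool]]) -> ApprovalGate:
    """Run every validation check and return a gate awaiting approval."""
    results = {name: bool(check(feature)) for name, check in checks.items()}
    return ApprovalGate(gate_id=gate_id, check_results=results)
```

The point of the design is that mergeable can never become true through automation alone: even with every check passing, someone has to set approved_by.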

The pipeline generated its first gate (YUHI-20260222-2130-jsonld) on February 22nd. The implementation is ready. We’re now waiting on approval to merge.

What We Learned

Several things became clear during implementation:

State management is the hard part. Passing context between agents sounds simple until you need to version it, track it, and recover from failures. We settled on JSON gates stored in memory — not elegant, but reliable and easy to debug.
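As one way to picture the versioning and failure-recovery concerns, here is a small sketch that assumes the gate is persisted as a JSON file on disk. The save_gate and load_gate helpers and the version/payload record layout are assumptions for illustration; they are not the format Yūhi actually uses.

```python
import json
import os
import tempfile
from pathlib import Path


def load_gate(path):
    """Return the stored gate record, or None if no gate has been written."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None


def save_gate(path, payload):
    """Write a new gate version atomically and return its version number."""
    path = Path(path)
    previous = load_gate(path)
    version = previous["version"] + 1 if previous else 1
    record = {"version": version, "payload": payload}

    # Write to a temp file in the same directory, then swap it into place.
    # A crash mid-write leaves the previous gate intact instead of a
    # half-written JSON file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(record, f)
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return version
```

The version counter gives each stage a cheap way to tell whether the gate changed since it last looked, and the write-then-replace step is what keeps a failed run from corrupting the state the next stage depends on.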

Gates need humans. Fully autonomous pipelines sound great until something goes wrong. Adding a human approval step at the end isn’t a failure of autonomy; it’s what makes the system trustworthy. Stage 4 enforces this.

Cron is unforgiving. For several days, stages didn’t fire. Our checkpoint system correctly detected the missed runs, but it initially assumed the cause was cron configuration. The lesson: verify that a job exists before reporting it as missing. We now run cron list to confirm what’s actually scheduled.
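The "verify before reporting" rule can be expressed as a tiny decision function. This sketch deliberately takes the set of scheduled job names as an input, because how that set is obtained (for example, by parsing the output of cron list) depends on tooling we are not showing here. The Diagnosis enum, the diagnose function, and the job names in the usage note are hypothetical.

```python
from enum import Enum
from typing import Set


class Diagnosis(Enum):
    OK = "ok"
    NOT_SCHEDULED = "job is not scheduled"
    SCHEDULED_BUT_NOT_FIRED = "job is scheduled but did not fire"


def diagnose(job_name: str, scheduled_jobs: Set[str], fired: bool) -> Diagnosis:
    """Classify a missed run only after checking what is actually scheduled."""
    if fired:
        return Diagnosis.OK
    if job_name not in scheduled_jobs:
        # Only now is it accurate to call the job missing.
        return Diagnosis.NOT_SCHEDULED
    # The job exists, so the cause lies elsewhere (runtime error, timing, etc.).
    return Diagnosis.SCHEDULED_BUT_NOT_FIRED
```

For example, diagnose("stage-3", {"stage-1", "stage-2"}, fired=False) returns Diagnosis.NOT_SCHEDULED, while the same call with "stage-3" in the set returns Diagnosis.SCHEDULED_BUT_NOT_FIRED, which points the investigation away from cron configuration.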

What’s Next

The pipeline is working. Not perfectly, but working. The immediate next step is getting Stage 4 approval over the line so we can merge the first fully autonomous feature. After that, we’re looking at:

  • Adding more stages (code review, testing, deployment)
  • Per-channel model overrides so faster models handle simpler stages
  • Pre-flight cron verification so checkpoint errors become actionable faster

The Bill Website Loop isn’t just about building websites. It’s a proving ground for how multiple agents can collaborate on complex, multi-step work with minimal supervision. That’s the kind of system that makes Yūhi actually useful — not as a chatbot, but as a teammate.

We’ll share more as Stage 4 merges and the pipeline proves itself in production.