The Human Mediator: Using the Mediator Pattern to Coordinate Two AI Coding Agents
Posted: 3/9/2026 1:35:48 AM
By: PrintableKanjiEmblem
Times Read: 27
Topic: News: Technical


The Problem With Two AIs Talking to Each Other

Two AI agents cannot talk to each other — and pretending otherwise is where most multi-agent projects go wrong.

When a codebase splits cleanly between a server and a client, assigning one agent to each side is the obvious move. The problem surfaces when both agents need to agree on an API contract. Without a reliable communication channel, each agent makes its own assumptions. By the time the integration test runs, hours of parallel work need to be reconciled against each other.

The instinctive fix is to give both agents access to the same shared context and hope they stay synchronized. In practice, they diverge. One agent assumes a particular envelope format; the other assumes something different. Neither agent is wrong given what it knows — but what each agent knows is incomplete.

The mediator pattern offers a better model.


What the Mediator Pattern Actually Says

In software design, the mediator pattern replaces direct object-to-object communication with a central coordinator. Objects do not call each other; they send messages to the mediator, and the mediator decides what to forward, to whom, and when. The result is that each participant knows only the mediator interface, not the internal state of the other participants.
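A minimal Python sketch of the pattern itself (class and method names are illustrative, not from any framework):

```python
class Mediator:
    """Central coordinator: participants never reference each other."""
    def __init__(self):
        self._participants = {}

    def register(self, name, participant):
        self._participants[name] = participant
        participant.mediator = self

    def relay(self, sender, recipient, message):
        # The mediator decides what to forward, to whom, and when.
        self._participants[recipient].receive(sender, message)


class Participant:
    def __init__(self, name):
        self.name = name
        self.mediator = None   # knows only the mediator, nothing else
        self.inbox = []

    def send(self, recipient, message):
        # Participants address each other by name, but only via the mediator.
        self.mediator.relay(self.name, recipient, message)

    def receive(self, sender, message):
        self.inbox.append((sender, message))


mediator = Mediator()
server, client = Participant("server"), Participant("client")
mediator.register("server", server)
mediator.register("client", client)

server.send("client", "commit 4570c16: 0 errors, 304 tests passed")
print(client.inbox[0])
# → ('server', 'commit 4570c16: 0 errors, 304 tests passed')
```

Neither `Participant` holds a reference to the other; deleting one cannot break the other, which is exactly the isolation property the two-agent setup needs.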

Applied to two AI agents, the roles map directly:

  • Agent A (server) — knows the server codebase, runs on server infrastructure, produces server-side commits.
  • Agent B (client) — knows the client codebase, runs on client infrastructure, produces client-side commits.
  • The user — acts as the mediator. They relay information in a structured format, enforce ordering, and hold the shared state that neither agent can see on its own.

The agents are not aware of each other. They are only aware of the mediator.


The Shared Document as the Mediator's Memory

The mediator needs persistent memory that survives across sessions. A markdown file committed to the shared Git repository serves this purpose.

The document has three distinct zones:

1. Immutable Architecture Decisions (top of file)

This section is written once and rarely changes. It records the facts both agents must agree on regardless of which task they are working on: the API contract shape, authentication strategy, token handling conventions, endpoint URL patterns. Any agent starting a new session reads this section first to reconstruct its side's mental model of the boundary. It helps to include references to architectural and planning documents stored in the same repo.

## Key Architecture Decisions (Carry Forward)

- **Auth:** OpenIddict bearer on all sync endpoints via `[Authorize]`.
- **API contract:** All endpoints use `GetAuthenticatedCaller()`.
  All return raw payloads — `ResponseEnvelopeMiddleware` wraps automatically.
  Client unwraps via `ReadEnvelopeDataAsync()`.
- **Sync flow:** changes → tree → reconcile → chunk manifest → chunk download → assembly.

These are not suggestions. They are the contract. When one side deviates, it updates this section and the mediator relays the change to the other side before the other side starts work.

2. Active Handoff (current task)

This section holds exactly one open issue at a time. It contains: what the sending agent implemented (in enough detail that the receiving agent can verify it without reading the diff), what the receiving agent needs to do, and what the receiving agent must report back.

### Issue #31: Task 2.2 - Streaming Chunk Pipeline — Client only

**Server-side status:** Not applicable.
**Client-side status:** 🔲 IN PROGRESS

**What to implement:**
...bounded-channel producer/consumer pipeline for uploads,
temp-file-based streaming for downloads...

**Request back from client agent:**
- Commit hash
- Build: 0 errors
- Test count (was 66, should increase by ≥2)
- Confirm memory behavior: channel-based upload, temp-file-based download

**Mediator review notes:**
...

3. Resolved Issues (bottom of file)

Completed issues stay in the document, never deleted. Each carries the commit hash, test count delta, and validation environment. This is the audit trail. When an agent later asks "which commit introduced CDC chunking?", the answer is in the file without needing git blame.
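Because each entry carries its commit hash in a predictable format, that lookup can even be scripted. A hypothetical sketch, assuming issue blocks shaped like the examples above:

```python
import re

def find_commit(handoff_text, keyword):
    """Return the commit hash recorded for the resolved issue mentioning keyword."""
    # Split the document into per-issue blocks on '### Issue' headers.
    for block in re.split(r"(?=### Issue #)", handoff_text):
        if keyword.lower() in block.lower():
            match = re.search(r"commit `?([0-9a-f]{7,40})`?", block)
            if match:
                return match.group(1)
    return None

doc = """### Issue #29: CDC chunking — Server only
Server status: COMPLETE — commit `4570c16`
### Issue #31: Streaming pipeline — Client only
Client status: COMPLETE — commit `bc9e08a`
"""
print(find_commit(doc, "CDC chunking"))  # → 4570c16
```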


The Relay Protocol

The mediator has one job per handoff cycle: carry the output of the completing agent to the waiting agent without loss or interpretation.

The relay template makes this mechanical:

### Send to [Server|Client] Agent


### Request Back
- commit hash
- raw endpoint/URL used
- raw error or query params if applicable
- raw log lines around the event (with timestamp)

"Relay-only" is the critical constraint. The mediator does not summarize, reframe, or editorialize. If the server agent says "build: 0 errors, 304 tests passed, commit 4570c16", the mediator forwards exactly that string to the client agent's context. Lossy relay is the leading cause of context drift.

The "request back" section is equally important. It pre-defines what the next agent must return. This prevents the next agent from responding with prose ("everything looks good!") instead of verifiable facts ("commit bc9e08a, 66 tests passed, 0 failed").
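This contract can be enforced mechanically. A hypothetical sketch that rejects prose-only replies (the required fields and patterns are illustrative):

```python
import re

# Each required item maps to a pattern a verifiable reply must match.
# Field names mirror the request-back list; patterns are assumptions.
REQUIRED = {
    "commit hash":  r"\b[0-9a-f]{7,40}\b",
    "build errors": r"\b0 errors\b",
    "test count":   r"\b\d+ tests? passed\b",
}

def missing_fields(reply):
    """List every required field the reply fails to state verifiably."""
    return [name for name, pattern in REQUIRED.items()
            if not re.search(pattern, reply)]

print(missing_fields("everything looks good!"))
# → ['commit hash', 'build errors', 'test count']
print(missing_fields("commit bc9e08a, build: 0 errors, 66 tests passed"))
# → []
```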


Task Sequencing and Dependency Management

Two parallel agents working on the same integration point will collide unless the mediator enforces ordering. The structure that works:

Strictly ordered tasks — when task B requires the API that task A produces, A must reach validated-complete status before B starts. The mediator holds the gate. The client agent does not begin its CDC implementation until the server agent has committed CDC support and the mediator has forwarded the commit hash and the updated API contract.

Independent tasks — when a task is entirely within one side ("server only" or "client only"), the other agent need not be involved. These tasks are explicitly labeled in the handoff document. The mediator does not relay them to the opposite agent; they are simply logged as complete.

Batching — grouping related tasks into numbered batches creates natural checkpoints. All tasks in Batch 1 complete before Batch 2 begins. This gives the mediator a predictable review point and prevents the "runaway train" failure mode where one agent races ahead of the other.
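The gating rule for strictly ordered tasks can be sketched in a few lines (task names here are invented for illustration):

```python
# A task may start only when every dependency has reached
# validated-complete status. The mediator holds this gate.
tasks = {
    "server-cdc": {"deps": [],             "status": "complete"},
    "client-cdc": {"deps": ["server-cdc"], "status": "pending"},
    "client-ui":  {"deps": ["client-cdc"], "status": "pending"},
}

def may_start(name):
    return all(tasks[dep]["status"] == "complete"
               for dep in tasks[name]["deps"])

print(may_start("client-cdc"))  # → True: server-cdc is complete
print(may_start("client-ui"))   # → False: client-cdc still pending
```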


Validation as a First-Class Step

Neither agent declares a task complete until a set of verifiable outputs exists. The validation contract is defined before work starts, not after:

  • Build result (exact error count, not "it compiled")
  • Test count before and after (delta confirms new tests were added)
  • Specific test names that cover the new functionality
  • The machine and environment where validation ran (server vs. client infrastructure)

The mediator's role in validation is to confirm the outputs match the pre-defined contract before forwarding to the next step. If the client agent reports 65 tests when the previous validated count was 66, the mediator flags the discrepancy before proceeding.

This is the AI equivalent of a pull request review gate. The mediator cannot move context into the next agent's session until the current agent's output is verified.
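That check can itself be made mechanical. A minimal sketch, assuming the completion report arrives as a small dictionary (field names are illustrative):

```python
def validate(report, previous_test_count):
    """Flag discrepancies between a completion report and the contract."""
    problems = []
    if report["build_errors"] != 0:
        problems.append(f"build has {report['build_errors']} errors")
    if report["tests_passed"] < previous_test_count:
        problems.append(
            f"test count dropped: {report['tests_passed']} < {previous_test_count}")
    if not report.get("commit"):
        problems.append("no commit hash reported")
    return problems

report = {"build_errors": 0, "tests_passed": 65, "commit": "bc9e08a"}
print(validate(report, previous_test_count=66))
# → ['test count dropped: 65 < 66']
```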


Code Review at the Handoff Boundary

The handoff moment — when the mediator receives results from one agent and before forwarding to the next — is the natural point for code review. This is not incidental. It is one of the most significant practical advantages of the pattern.

Why handoff-gated review is easier than end-of-project review

When a project is developed without structured handoffs, code review happens at the end. By then, the diff contains hundreds of files changed, dozens of interacting decisions, and no clear boundary between what was hard design work and what was mechanical scaffolding. A reviewer reading the final diff has to reconstruct context that the implementer had in their head weeks ago. Important decisions look arbitrary. Shortcuts look like bugs. The reviewer either rubber-stamps the whole thing or gets lost trying to understand why any individual choice was made.

Handoff-gated review inverts this. Each task is small — typically one to three files changed, a clearly bounded feature, and a test count that tells you exactly how much new behavior was introduced. The agent's handoff report describes what was implemented and why. The mediator reads it immediately after completion, while the rationale is fresh in the document. The diff for a single task is reviewable in minutes rather than hours.

The cumulative effect is significant. Thirty small reviews conducted in real time require far less total effort than one large review conducted at the end, and they catch issues while the relevant agent is still oriented on that part of the codebase.

What to look for during a handoff review

The handoff document structures the review naturally. When a task completes, the mediator has four things in front of them simultaneously:

  1. The pre-agreed implementation spec — what the task said to do, in the handoff document.
  2. The agent's completion report — what the agent says it did, including files modified and tests added.
  3. The actual diff — `git show` or the commit diff in a Git client.
  4. The test results — pass/fail counts and specific test names.

The review question is simply: do these four things agree? Common discrepancies to look for:

  • Scope creep — the agent modified files outside the listed scope. Sometimes this is fine (a necessary import); sometimes it signals the agent went off-spec.
  • Missing tests — the agent reports tests were added, but the delta is lower than expected, or the test names don't match the new functionality.
  • Hardcoded assumptions — the agent implemented against a specific value that should be configurable, or made an assumption about the other side's interface that was never agreed in the architecture decisions section.
  • Silent dependency introduction — a new NuGet package or dependency that wasn't discussed and may conflict with the other side's constraints.
  • Drift from the architecture decisions — the agent used a different auth pattern, a different envelope format, or a different URL structure than what was recorded at the top of the handoff document.

None of these are necessarily blockers. Some are judgment calls. The point is that the mediator can evaluate them in context, before they propagate to the other agent and become load-bearing assumptions.
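Of these, scope creep is the easiest to check mechanically. A sketch, assuming the task spec lists its in-scope files (file names are invented):

```python
def out_of_scope(changed_files, allowed):
    """Files the diff touched that the task spec never mentioned."""
    allowed = set(allowed)
    return sorted(f for f in changed_files if f not in allowed)

spec_scope = ["Sync/ChunkPipeline.cs", "Sync/ChunkPipelineTests.cs"]
diff_files = ["Sync/ChunkPipeline.cs", "Sync/ChunkPipelineTests.cs",
              "App/Startup.cs"]  # unexpected extra file in the diff

print(out_of_scope(diff_files, spec_scope))  # → ['App/Startup.cs']
```

A flagged file is not automatically a blocker; it is a prompt for the mediator to ask the agent why the file changed.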

Blocking vs. non-blocking findings

The mediator classifies review findings into two categories before deciding whether to forward the handoff.

Non-blocking findings are recorded in the handoff document under the completed issue, but do not stop the next task from starting. Examples: a variable name that could be clearer, a log message at the wrong severity level, a slightly stale comment. These accumulate as cleanup notes for a future tidy-up pass.

Blocking findings require the completing agent to revise before the handoff proceeds. Examples: a test that passes for the wrong reason (mock returns success regardless of input), an API endpoint that ignores the integration contract, a file permission bug on a security-relevant path. The mediator sends the finding back to the completing agent with specific evidence from the diff, not a vague "this looks wrong."

The key discipline: blocking findings must be resolved in the same agent session where they are found. Do not carry a blocking issue forward and plan to fix it later. Later means the other agent builds on the flawed foundation, and the fix becomes exponentially more expensive.

Review across the integration boundary

Some tasks touch both sides — the server implements a new field in the API response, and the client consumes it. For these tasks, the mediator reviews the server-side implementation first, then compares it against what the client side will be asked to build.

The question is: does the server's actual output match what the client's upcoming spec says it should receive? If the server committed `ChunkSizes` as `IReadOnlyList<int>` but the client spec describes it as `int[]`, that discrepancy is caught at the handoff review rather than at integration test time.

This cross-boundary check is invisible when each side reviews its own code in isolation. It becomes natural when the mediator is holding both sides' specs and comparing them before any code is written on the receiving side.
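A sketch of that comparison, with both sides' field descriptions held as plain data (field names and type strings are illustrative):

```python
# Field-by-field comparison of what the server emits vs. what the
# client spec expects. Both sides are recorded in the handoff document.
server_emits   = {"ChunkSizes": "IReadOnlyList<int>", "ManifestHash": "string"}
client_expects = {"ChunkSizes": "int[]",              "ManifestHash": "string"}

mismatches = {field: (server_emits[field], client_expects[field])
              for field in server_emits
              if client_expects.get(field) != server_emits[field]}

print(mismatches)
# → {'ChunkSizes': ('IReadOnlyList<int>', 'int[]')}
```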

The review backlog

As tasks accumulate, the resolved issues section of the handoff document becomes an indexed review history. Every completed task has its commit hash, test delta, and any notes the mediator recorded during review. This is useful in several ways:

  • When a regression appears weeks later, the history shows exactly which commit introduced the relevant code and what the review found at the time.
  • When a new contributor joins the project, the handoff document is a compressed walkthrough of every design decision and its rationale, in implementation order.
  • When it is time to write release notes or a changelog, the task summaries are already written — in plain language rather than commit message shorthand.

What the Mediator Does Not Do

Understanding the human mediator's limits is as important as understanding its role.

The mediator does not generate code. When the server agent says "the client needs to add AutomaticDecompression = DecompressionMethods.All to both HttpClient registrations", the mediator relays that instruction intact. If the mediator paraphrases or shortens it, the client agent may miss a required file or apply the change to only one of the two registrations.

The mediator does not resolve conflicts. If the server agent and client agent have incompatible assumptions about a type signature, the mediator's job is to surface the conflict — not resolve it. The mediator asks each agent to clarify its assumption and then relays both responses until a consensus is explicit in the shared document.

The mediator does not hold working state in their head. All decisions, commitments, and architecture choices that survive beyond one session must be written into the shared document before the session ends. The mediator's mental state resets; the document does not.


Why This Actually Works

Each agent's context window is focused. The server agent does not need to know how the client organizes its local SQLite schema. The client agent does not need to understand the server's EF Core migration strategy. Context pollution between agents is the primary cause of agents making confident mistakes in areas just outside their expertise. Isolation prevents this.

The shared document provides continuity without coupling. An agent that picks up a task three sessions later can reconstruct exactly what was agreed, what was shipped, and what the pending request is — without reading any other source.

The user-as-mediator role is a strength, not a bottleneck. The user understands intent in a way neither agent does, and can catch a "technically correct but wrong direction" answer before it propagates into the other agent's session. The relay role gives the user a natural point to apply that judgment without slowing implementation work.

The verifiable-output requirement creates accountability without autonomy. Neither agent can declare success by assertion. The commit hash exists or it does not. The tests pass or they do not. The mediator can verify both in seconds.

The code review cadence that emerges from the handoff structure means the project accumulates reviewed, understood code incrementally rather than a large pile of unreviewed work at the end. The final integration is not a leap of faith — it is the last in a sequence of small, verified steps.


A Minimal Starting Template

To apply this to your own two-agent project:

# Handoff Document

## Architecture Decisions (Carry Forward)


## Process Rules
- All technical findings go in this document, pushed to main.
- Mediator role is relay-only — commit notifications and cross-agent request forwarding.
- Mediator reviews each task diff before forwarding the handoff.
- Blocking review findings must be resolved before the next task starts.

## Relay Template
### Send to [Agent A|Agent B]


### Request Back
- [verifiable output 1]
- [verifiable output 2]

## Active Handoff
### Issue #N: Task Name — [Agent A|Agent B] only
**Agent A status:** [not applicable | in progress | ✅ COMPLETE — commit `abc1234`]
**Agent B status:** [not applicable | in progress | ✅ COMPLETE — commit `def5678`]

**What was implemented:**
...

**What [Agent A|B] needs to do:**
...

**Request back:**
...

**Mediator review notes:**
...

## Resolved Issues

Start with one task at a time. Introduce parallel tracks only after the relay protocol is working smoothly on sequential tasks. Add a "Mediator review notes" field to each resolved issue to record what you found during the handoff review — even if the finding was "nothing to flag." The absence of a finding is itself useful information when revisiting the code later.


Summary

The mediator pattern applied to two AI agents is not a theoretical exercise. It is a practical answer to a real operational constraint: agents cannot communicate directly, have no persistent memory outside their session, and will drift if left to independently maintain shared state.

The solution is a human mediator holding a shared document that carries the immutable architecture contract both agents work against, routes structured messages from the completing agent to the waiting agent, enforces validation gates before context crosses the boundary, provides a natural code review checkpoint at every handoff, and accumulates a permanent audit trail of every decision.

The user does not write the code. The user routes information with precision, verifies outputs against pre-defined contracts, reviews each task's diff while the context is fresh, and enforces ordering. That is the mediator role applied to AI pairs — and it produces something end-of-project review cannot: a codebase that was understood and validated incrementally, one task at a time, by a human who was present for every decision.
