4 Decisions in 7 Hours: When My AI Agents Aligned
Two AI agents wrote decisions to the same Convex table for the first time on May 12, 2026. Four rows landed in seven hours. Here is the schema, the gate, and why it matters when you run more than one agent on the same work.
Why this matters
This is the principle post for cross-agent decision history. The implementation spec lives in the bud-decision-push-spec codex node; the Mark 50 v1 ratification is in bud-as-codex-os-peer.
On May 12, 2026, at 5:32 UTC, Bud wrote its first decision to a table Claude Code reads.
Seven hours later there were four. None of them were “hello world” test rows. They were real auto-blog-publish decisions: gate verdicts, publish-yes-or-no calls, policy-violation checks. Decisions that had been sitting in Bud’s chat history at bud.app/chats/<id> for months, invisible to every other agent I run.
The change was small: one Convex mutation called decisions:appendEvent, three event types defined in a spec node, and a one-day clean-observation graduation window (I overrode the codex’s default 7-day window because the smoke was clean and the blast radius was tiny). The effect was disproportionately large: for the first time, my two AI agents could see each other’s reasoning.
This post is about why that matters, what the schema looks like, and what it costs.
TL;DR
Mark 50 v1 graduated Bud (automation agent) from read-only peer to read-write peer of a shared decision ledger. Schema: one Convex mutation, three event types, one optional agent field. Cost: an operator graduation gate. Payoff: cross-agent HISTORY attacks can now cite “Bud decided X on date Y” when Claude Code is about to repeat a decision.
Two AI agents, two siloed memories, one operator paying the integration tax
Bud runs my LinkedIn automation, Auto Blog Calendar, and scheduled browser jobs. Claude Code runs my dev work, codex maintenance, and content drafting. They are both extensions of the same operator (me), making decisions on the same body of work (my content + my products), but their decision-histories live in completely different storage.
Bud’s history sits in chat sessions at bud.app/chats/<chat-id>. Each chat captures one decision-thread, with no cross-thread index. Claude Code’s history sits in ~/.claude/decisions.jsonl (an append-only flat file) plus, since 2026-05-04, a Convex deployment for vector search.
Until 2026-05-12, the two histories did not touch. The result: I, the operator, became the only bridge between them. If Bud’s Auto Blog Calendar made a publish-vs-surface-only decision at 7:00 AM, and Claude Code drafted a similar post at 11:00 AM, Claude Code had no way to know what Bud had already adjudicated. I had to remember. Multiplied across 7+ Bud automations and dozens of Claude Code sessions per week, the remembering becomes a real tax: minutes lost reconstructing decisions, occasional contradictions, and the slow erosion of the “single shared memory” that makes a multi-agent setup useful.
This is the integration tax that motivated Mark 50.
Three concrete symptoms of the tax (look for these in your own multi-agent setup):
- Repeated decisions across agents. Agent A made a “publish to LinkedIn no” call last week; Agent B asks the same question this week unaware.
- Conflicting verdicts. Agent A says “Hold this for editorial review”; Agent B (later) says “Ship it” because it never saw A’s hold.
- Operator-as-cache. You catch yourself saying “let me look up what Bud already decided” before letting Claude Code proceed.
If you nod at any of these, you have the cross-agent silo. The fix is structural, not behavioral.
“Just keep them separate” is a fine answer until the agents make decisions in the same domain
The reflex when agents drift is non-overlapping lanes: Bud does LinkedIn, Claude Code does dev, ChatGPT does research, lanes never cross. Clean for a week. The problem: real work overlaps faster than lane definitions keep up. Overlap is where cross-agent contradictions surface, and where the shared ledger earns its complexity.
My own lanes show the overlap. Bud’s LinkedIn automation makes decisions about which blog posts to surface; Claude Code drafts those blog posts. The publish-decision flows back to Claude Code’s content strategy. Bud’s Auto Blog Calendar runs a 9-gate pre-flight that catches voice-DNA violations and forbidden phrases; Claude Code generates the drafts that pass through those gates. The gate verdicts ARE the most valuable cross-agent signal, and they were the most siloed.
Other “just keep them separate” failure modes I tried:
- Manual export-import. Operator copy-pastes Bud’s decision summary into Claude Code’s context every Monday. Works for a week; fails on month 2 when the rhythm breaks.
- Email-as-message-queue. Bud emails decision logs to an inbox Claude Code parses. Works until parsing breaks on a schema change.
- One agent calls the other’s API. Tight coupling; surface-level integration that breaks when either agent’s UI shifts.
- No coordination. Tax compounds. Quietly. Until you notice you’ve been re-deciding the same call for three months.
The cross-agent decision-ledger pattern fixes this at the storage layer, not the messaging layer. Both agents write to the same append-only table. Read access is universal. The shared substrate (the Convex table) does the integration; the agents do their work.
The mechanism: one Convex mutation, three event types, one optional agent field
The code change is smaller than the integration tax it removes. One Convex mutation accepts events from any agent that knows the schema. Three event types cover the high-value signals. One optional agent field carries provenance. That is the whole load-bearing surface.
Bud’s container has a bridge script at .bud/convex-bridge.sh that wraps npx convex run calls. Mark 50 v1 added one mutation invocation:
```shell
npx convex run --inline-query decisions:appendEvent '{
  "ts": "2026-05-12T05:32:00Z",
  "agent": "bud",
  "event_type": "pre-flight-verdict",
  "session_id": "<bud-chat-id>",
  "automation": "auto-blog-calendar",
  "hearing": "draft passes 9 of 9 gates",
  "verdict": "PASS",
  "decision_id": "dc-bud-20260512T053200-abcd"
}'
```
Three event types in the Mark 50 v1 spec:
| Event type | When fired | Why it matters cross-agent |
|---|---|---|
| pre-flight-verdict | Every Bud automation’s gate-stack outcome (PASS / FAIL with reasons) | Highest-value for red-team HISTORY attacks; Claude Code can cite Bud’s prior gate decisions |
| publish-decision | Every Bud auto-publish vs surface-only choice | Closes the cross-agent loop on content decisions |
| policy-violation | Fires only if a Bud automation bypasses surface-only rules or auto-publishes despite failed gate | Logging the absence is the integrity proof; should ideally never fire |
A fourth event type, recommendation-with-rationale, was added 2026-05-11 (v1.1) when the gate-pass plus reasoning pattern surfaced as common.
The optional agent field on each row is the new schema element. Backward-compat default: missing agent reads as claude-code. New rows always include it explicitly. This is the load-bearing piece — without it, red-team HISTORY attacks cannot distinguish “Claude Code already decided X” from “Bud already decided X.” The provenance question matters more than I expected before shipping the field.
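The backward-compat rule above can be sketched as a small read-side helper. This is a minimal illustration, not the actual Convex code: the `DecisionRow` shape, the `resolveAgent` and `rowsByAgent` names, and the legacy sample row are assumptions made for the example; only the field names and the claude-code default come from the post.

```typescript
// Hypothetical row shape; field names follow the example payload in this post.
type Agent = "bud" | "claude-code";

interface DecisionRow {
  ts: string;
  event_type: string;
  decision_id: string;
  agent?: Agent; // optional: rows written before the field existed omit it
}

// Backward-compat rule from the post: a missing agent reads as claude-code.
function resolveAgent(row: DecisionRow): Agent {
  return row.agent ?? "claude-code";
}

// Filter rows by provenance, treating legacy rows as claude-code.
function rowsByAgent(rows: DecisionRow[], agent: Agent): DecisionRow[] {
  return rows.filter((row) => resolveAgent(row) === agent);
}

// Illustrative data: the first row is an invented legacy row with no agent field.
const sampleRows: DecisionRow[] = [
  { ts: "2026-05-10T09:00:00Z", event_type: "pre-flight-verdict", decision_id: "dc-legacy-example" },
  { ts: "2026-05-12T05:32:00Z", event_type: "pre-flight-verdict", decision_id: "dc-bud-20260512T053200-abcd", agent: "bud" },
];

console.log(sampleRows.map(resolveAgent));
console.log(rowsByAgent(sampleRows, "bud").length);
```

Resolving the default in one helper keeps every reader consistent; scattering `row.agent ?? "claude-code"` across queries is where provenance tends to drift.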
The proof: 4 rows in 7 hours, 0 policy-violations, same-day ratification
The first 4 Bud-written rows landed in the Convex decisions table between 2026-05-12 05:32 UTC and 2026-05-12 12:16 UTC. Zero policy-violation rows fired in the window. I ratified Mark 50 v1 the same day, overriding the codex’s default 7-day clean-observation window with a one-day-clean judgment call.
| Time (UTC) | Event type | Automation | Verdict |
|---|---|---|---|
| 05:32 | pre-flight-verdict | Auto Blog Calendar | PASS (9 of 9 gates) |
| 07:18 | publish-decision | Auto Blog Calendar | publish-to-main (post passed gate stack) |
| 11:04 | pre-flight-verdict | Auto Blog Calendar | FAIL on gate 7 (voice-DNA blocklist matched a forbidden phrase) |
| 12:16 | publish-decision | Auto Blog Calendar | surface-only (failed gate held the publish) |
Zero policy-violation rows. That’s the integrity proof: every publish-decision was preceded by a pre-flight-verdict that justified the call. Bud did not bypass its own gate stack.
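The integrity claim can be checked mechanically. The sketch below is one possible audit, not the check Bud actually runs: it assumes each `publish-decision` must follow a fresh `pre-flight-verdict` for the same automation, and that a non-PASS gate only permits `surface-only`. The `LedgerRow` shape, the `findUnjustifiedPublishes` name, and the simplified `verdict` strings are assumptions for illustration.

```typescript
// Hypothetical row shape, reduced to the fields the audit needs.
interface LedgerRow {
  ts: string; // ISO-8601 UTC, same format on every row
  event_type: "pre-flight-verdict" | "publish-decision" | "policy-violation";
  automation: string;
  verdict: string;
}

// Returns publish-decisions that were not justified by a preceding gate verdict.
function findUnjustifiedPublishes(rows: LedgerRow[]): LedgerRow[] {
  // Same-format ISO timestamps sort correctly as plain strings.
  const ordered = [...rows].sort((a, b) => (a.ts < b.ts ? -1 : a.ts > b.ts ? 1 : 0));
  const pendingGate = new Map<string, string>(); // automation -> latest unused gate verdict
  const flagged: LedgerRow[] = [];
  for (const row of ordered) {
    if (row.event_type === "pre-flight-verdict") {
      pendingGate.set(row.automation, row.verdict);
    } else if (row.event_type === "publish-decision") {
      const gate = pendingGate.get(row.automation);
      const bypassed =
        gate === undefined || (gate !== "PASS" && row.verdict !== "surface-only");
      if (bypassed) flagged.push(row);
      pendingGate.delete(row.automation); // assumption: each publish call needs its own verdict
    }
  }
  return flagged;
}

// The four rows from the table above, with verdicts shortened.
const mark50Rows: LedgerRow[] = [
  { ts: "2026-05-12T05:32:00Z", event_type: "pre-flight-verdict", automation: "auto-blog-calendar", verdict: "PASS" },
  { ts: "2026-05-12T07:18:00Z", event_type: "publish-decision", automation: "auto-blog-calendar", verdict: "publish-to-main" },
  { ts: "2026-05-12T11:04:00Z", event_type: "pre-flight-verdict", automation: "auto-blog-calendar", verdict: "FAIL" },
  { ts: "2026-05-12T12:16:00Z", event_type: "publish-decision", automation: "auto-blog-calendar", verdict: "surface-only" },
];

console.log(findUnjustifiedPublishes(mark50Rows).length);
```

An audit like this runs on the read side, so it works as an independent cross-check on whether a `policy-violation` row should have fired.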
I ratified Mark 50 v1 the same day. The codex protocol’s default 7-day clean-observation window was overridden by my one-day-clean-low-blast-radius judgment call. Logged as dc-20260512T<...>-mark50-graduation-ratify in the shared ledger. The override itself is now a precedent for future Mark graduations: 1-day-clean is a valid ratification path when the smoke is clean AND the blast radius is small AND the rollback cost is low.
What 4 rows in 7 hours does not prove:
- Long-horizon stability. The first week could be clean; week 4 could surface concurrency bugs around two agents writing to the same row in the same minute. v2.2 has a single-active-agent-per-session assumption I have not yet stress-tested.
- Cross-agent HISTORY attack utility. I have NOT yet had a red-team attack cite a Bud-written row from a Claude Code session. The wiring is in place; the proof of useful retrieval is still pending.
- Schema evolution. The optional agent field is good enough for v1; if I add 2+ more agents (Pi resumes; an external agent integrates), the schema may need an agent_capability_tier field or similar.
These are honest gaps. The ratification is for “the pattern works on day 1,” not “the pattern is production-hardened for all futures.”
What this implies for any multi-agent setup
If you run more than one AI agent on the same work, the cross-agent decision-history surface is probably the highest-impact substrate move you have not made yet. Two implications follow from the Mark 50 v1 ratification + the 4-row baseline: provenance matters more than storage, and the graduation order (decisions before nodes) reduces blast radius at every step.
One: provenance is the hard part, not storage. Convex (or Postgres or SQLite) is fine as the storage layer. The hard part is the agent field discipline: every write tags itself, every read filters or surfaces by tag, every red-team / calibration query distinguishes between agents. Without provenance, cross-agent rows become noise.
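To make the read-side discipline concrete, here is a rough sketch of how an agent might surface other agents' prior calls as citations of the “Bud decided X on date Y” form. The `TaggedRow` shape, the `citeOtherAgents` name, and the second sample row are hypothetical; a real query would run against the shared table rather than an in-memory array.

```typescript
// Hypothetical row shape with provenance already resolved to a concrete agent.
interface TaggedRow {
  ts: string;
  agent: string;
  automation: string;
  event_type: string;
  verdict: string;
}

// Build one citation per row written by any agent other than the caller.
function citeOtherAgents(rows: TaggedRow[], self: string, automation: string): string[] {
  return rows
    .filter((row) => row.agent !== self && row.automation === automation)
    .map((row) => `${row.agent} decided ${row.verdict} (${row.event_type}) on ${row.ts.slice(0, 10)}`);
}

// Illustrative data: the claude-code row is invented to show the filter.
const taggedLedger: TaggedRow[] = [
  { ts: "2026-05-12T05:32:00Z", agent: "bud", automation: "auto-blog-calendar", event_type: "pre-flight-verdict", verdict: "PASS" },
  { ts: "2026-05-12T13:00:00Z", agent: "claude-code", automation: "auto-blog-calendar", event_type: "publish-decision", verdict: "surface-only" },
];

console.log(citeOtherAgents(taggedLedger, "claude-code", "auto-blog-calendar"));
```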
Two: graduate decisions before nodes. Bud writes decisions but not nodes today. The decision-history is volatile enough that low-trust appends are acceptable; the codex node graph is high-trust enough that low-trust appends would erode authority. The graduation order is: decisions first (Mark 50), nodes next (Mark 51 candidate), full peer surface last (Mark 52 candidate). This sequence reduces blast radius at each step.
Both implications are codified in the harness-self-evolution-framework codex node. If you’re building a multi-agent CODEX-OS-style substrate, that framework is the recipe.
FAQ: Cross-agent decision ledgers
Five questions I have answered most often about Mark 50 v1, each self-contained. AI engines extract these capsules; human readers skim them; both audiences are served by the same shape. The shipping pattern below is what worked for chudi.dev’s stack; your storage layer may differ but the discipline ports.
Why Convex and not Postgres or SQLite?
Convex’s reactive subscriptions made the cross-agent read path one line: useQuery(api.decisions.recent) in any subscribing client. Reactive subscriptions matter when two agents are appending live; without them, the second agent reads stale state until polling catches up. Convex also has vector search built in, which the broader CODEX-OS substrate uses for semantic walks.
Do I need three event types or can I use one with a type field?
You can use one event type with a discriminator field. I split into three because the read-side queries are different shapes per type (pre-flight-verdict needs the gate-name + reasons; publish-decision needs the target + rationale; policy-violation is exceptional and triggers operator review). Separate event types made the queries cleaner without adding meaningful write complexity.
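For readers weighing the single-type alternative, a discriminated union is one way to keep per-type shapes while storing everything under one `event_type` field. This is a sketch under assumptions: the payload field names (`gate`, `reasons`, `target`, `rationale`, `detail`) are inferred from the description above and are not the actual Mark 50 schema.

```typescript
// Hypothetical payload shapes, one per event type, discriminated by event_type.
type DecisionEvent =
  | { event_type: "pre-flight-verdict"; verdict: "PASS" | "FAIL"; gate?: string; reasons: string[] }
  | { event_type: "publish-decision"; target: "publish-to-main" | "surface-only"; rationale: string }
  | { event_type: "policy-violation"; detail: string };

// The compiler narrows the payload inside each case, so each branch
// can only touch the fields that exist for that event type.
function summarize(event: DecisionEvent): string {
  switch (event.event_type) {
    case "pre-flight-verdict":
      return event.verdict === "PASS"
        ? "gate PASS"
        : `gate FAIL at ${event.gate ?? "unknown gate"}: ${event.reasons.join("; ")}`;
    case "publish-decision":
      return `${event.target}: ${event.rationale}`;
    case "policy-violation":
      return `OPERATOR REVIEW: ${event.detail}`;
  }
}

console.log(
  summarize({
    event_type: "pre-flight-verdict",
    verdict: "FAIL",
    gate: "gate 7",
    reasons: ["voice-DNA blocklist matched a forbidden phrase"],
  }),
);
```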
What happens when Bud and Claude Code try to write to the same row?
They don’t, by design. Append-only schema means each write creates a new row with a fresh decision_id. Concurrent writes produce two rows, not a conflict. The thing to watch for is two agents observing different orderings of the same writes if the reactive query subscribes during a write burst; Convex handles this with monotonic-timestamp ordering.
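The append-only behavior is easy to model in memory. The sketch below is not Convex code: `AppendOnlyLedger` is a hypothetical stand-in, and the `decision_id` layout is inferred from the single example ID in this post (`dc-<agent>-<compact timestamp>-<suffix>`), with a counter as an assumed suffix scheme.

```typescript
interface AppendInput {
  ts: string; // ISO-8601 UTC, e.g. "2026-05-12T05:32:00Z"
  agent: string;
  verdict: string;
}

interface StoredRow extends AppendInput {
  decision_id: string;
}

// In-memory stand-in for an append-only table: rows are only ever added.
class AppendOnlyLedger {
  private rows: StoredRow[] = [];
  private counter = 0;

  append(input: AppendInput): StoredRow {
    // "2026-05-12T05:32:00Z" -> "20260512T053200"
    const compactTs = input.ts.replace(/[-:]/g, "").replace(/Z$/, "");
    // Assumed suffix scheme: a zero-padded base-36 counter keeps IDs unique.
    const suffix = ("0000" + (this.counter++).toString(36)).slice(-4);
    const row: StoredRow = { ...input, decision_id: `dc-${input.agent}-${compactTs}-${suffix}` };
    this.rows.push(row);
    return row;
  }

  all(): readonly StoredRow[] {
    return this.rows;
  }
}

// Two agents write in the same second: the result is two rows, not a conflict.
const demoLedger = new AppendOnlyLedger();
demoLedger.append({ ts: "2026-05-12T05:32:00Z", agent: "bud", verdict: "PASS" });
demoLedger.append({ ts: "2026-05-12T05:32:00Z", agent: "claude-code", verdict: "surface-only" });

console.log(demoLedger.all().map((row) => row.decision_id));
```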
Is the agent: bud field required or optional?
Optional for backward compat with pre-2026-05-11 rows (which default to claude-code on read). Required by convention for all NEW writes. When bud-decision-push-spec graduates from inferred to verified, the field becomes required by schema; the migration is a one-line check in the Convex mutation.
How do I do this without a Convex deployment?
Append-only JSONL files work for single-host setups. Use one file path that both agents can reach on the same host, and have each agent open it with O_APPEND so every write lands at the current end of the file; no concurrency drama. You lose reactive reads (need polling) and vector search (need a separate index). For a single-operator setup, the JSONL fallback is fine; for a multi-host or reactive-UI setup, Convex (or similar) earns its complexity.
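A minimal sketch of the JSONL fallback's two halves, serialization and polling, follows. It deliberately leaves file I/O out so it stays self-contained; in Node.js the write would typically go through `fs.appendFileSync`, which opens the file in append mode by default. The `toJsonlLine` and `readNewRows` names are hypothetical.

```typescript
type JsonRow = Record<string, unknown>;

// One event per line. JSON.stringify without an indent argument never emits
// raw newlines, so each line is exactly one row.
function toJsonlLine(event: JsonRow): string {
  return JSON.stringify(event) + "\n";
}

// Polling reader: parse only complete lines past the ones already consumed.
function readNewRows(fileText: string, linesSeen: number): { rows: JsonRow[]; linesSeen: number } {
  // The last piece after split is either empty or a partially written line; skip it.
  const complete = fileText.split("\n").slice(0, -1);
  const rows = complete.slice(linesSeen).map((line) => JSON.parse(line) as JsonRow);
  return { rows, linesSeen: complete.length };
}

// Simulated file contents: the string stands in for the shared JSONL file.
let fileText = toJsonlLine({ agent: "bud", verdict: "PASS" });
const firstPoll = readNewRows(fileText, 0);

fileText += toJsonlLine({ agent: "claude-code", verdict: "surface-only" });
const secondPoll = readNewRows(fileText, firstPoll.linesSeen);

console.log(firstPoll.rows.length, secondPoll.rows.map((row) => row.agent));
```

Tracking a line count (or byte offset) per reader is what replaces the reactive subscription: each poll picks up only what the other agent appended since the last one.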
What to do next
Three actions, ordered by timing for a multi-agent setup. The first runs in under a minute and surfaces today’s silos. The second is a one-line schema change before the second agent lands. The third is a multi-week graduation arc that needs real cross-agent traffic to test. Pick what matches your state.
- Audit your decision-history surfaces. Where do each agent’s decisions land today? Bud chats, Claude Code’s ~/.claude/decisions.jsonl, ChatGPT chat history, Cursor chat history, your project management tool. Each is a silo until you say otherwise.
- Add an agent field to your schema NOW, even if you only run one agent today. The cost of one optional field today is zero; the cost of retrofitting provenance once you add a second agent is high. Future-proofing is cheap when the schema is new.
- Read the Bud-as-CODEX-OS-Peer ratification (predecessor post on the underlying entity-substrate) and consider running citability.dev’s free scan if you want to see whether your AI-agent-callable surface is wired before you graduate to multi-agent writes.
The harder version of this work, beyond this post, is the harness-self-evolution-framework that ratifies Mark bumps before they ship. That deserves its own post.
Drafted from codex node bud-as-codex-os-peer (confidence: verified, ratified 2026-05-12) on 2026-05-15 via /codex-to-blog-draft v2.2. voice-dna.json not yet populated (D5 of chudi-dev-autoblogging-phase-1-plan); voice fidelity is approximate. All hook + results data sourced from the verified codex node body (zero [FILL] anchors required this time).