I Spent $2,770 on Bug Losses, 0 on Profit. The Bot Stays.
Seven strategies tested. Zero passed the MTC gate. $2,770 in bug losses (separate from strategy P&L, which was zero). On May 12, 2026 I ratified Path D Hybrid: the bot stays running in dry-run as a research substrate.
Why this matters
Three months, seven strategies, zero gate-passing live deployments.
In February 2026 the polyphemus trading bot shipped sharp_move expecting an edge. The MTC gate caught a P9 violation (lookahead bias) before any capital touched the strategy. In March, cheap_side went 0-for-16 at p=0.5. In April, flat_regime_rtds was flagged as a single-regime artifact. By May 12, 2026, seven strategies had been formally tested. All failed.
Bug losses alone cost roughly $2,770 between February and April. Separate from strategy P&L. Strategy P&L was zero. The framing question on 2026-05-12 was: shut polyphemus down, double down, or something in between?
I picked the in-between. This post is about why.
TL;DR
I ratified Path D Hybrid on 2026-05-12: polyphemus is a RESEARCH SUBSTRATE, not a profit center. No further live-prep engineering time for 90 days. The bot continues running in dry-run mode generating telemetry. 80% of operator attention shifts to revenue-gate-passing lanes (citability.dev, chudi.dev content, freelance pipeline). Kill criteria for re-entering live-prep are explicit, not vibes.
When the gate keeps catching every strategy, the gate is doing its job, but the project is the wrong project
Polyphemus is not broken. The MTC pre-deploy gate IS catching real edge problems. Every “FAIL” in the table below was a strategy that, had it gone live, would have lost money. The gate prevents that. The substrate works. The strategies do not.
The deeper signal here is one most trading operators avoid: this kind of project can be valuable without producing profit. The R&D pipeline that catches “no edge” before capital deploys is itself the asset, separate from any specific strategy. But pretending the substrate IS the profit produces a different kind of failure: months of work that feels like “almost there” while the actual revenue lane sits unworked.
The honest framing: 3 months of polyphemus produced 0 live strategies AND a sharpened R&D discipline. Both are true. Only one of them pays rent.
Three patterns I now recognize as “the gate working, the project failing”:
- Seven failure modes, each in a different gate-class. Sharp_move tripped P9 lookahead. Cheap_side tripped sample-size + Wilson CI. Accumulator tripped DSR floor. The gate is not biased; it is finding real problems at every step.
- Bug losses dominate strategy losses. $2,770 in bugs vs $0 in strategy P&L means the failure mode is engineering rigor, not edge discovery.
- Every “next strategy” reframes a prior one. Sharp_move v3 was an attempt to rescue sharp_move v2. Accumulator was an attempt to rescue cheap_side. The pivot pattern, not the strategy itself, became the bottleneck.
When you nod at all three, the project has hit the substrate-is-the-asset gate. Path D names that explicitly.
“Tune the gate, force a pass” is exactly the failure mode the gate exists to prevent
The reflex when the gate keeps firing is to lower the DSR floor, drop the k=3 penalty, or narrow assets to whichever performed best in sample. Each tweak relaxes the test until the next strategy “passes.” This is the post-hoc reasoning that produces regime artifacts: tuning bypasses the discipline and silently destroys the asset.
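To make the cost of "one more try" concrete, here is a minimal sketch of the Deflated Sharpe Ratio following the published Bailey and López de Prado formula. The post does not say how the MTC gate computes DSR, so treat this as an illustration rather than the gate's code. The function names, the `sharpe_variance` parameter, and the input numbers in the usage lines are all assumptions.

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
_N = NormalDist()  # standard normal distribution


def expected_max_sharpe(n_trials: int, sharpe_variance: float) -> float:
    """Sharpe the best of n_trials zero-edge strategies is expected to show."""
    if n_trials < 2:
        return 0.0  # a single trial carries no selection bias
    return sqrt(sharpe_variance) * (
        (1 - EULER_GAMMA) * _N.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * _N.inv_cdf(1 - 1 / (n_trials * e))
    )


def deflated_sharpe(sharpe, n_obs, n_trials, sharpe_variance,
                    skew=0.0, kurtosis=3.0):
    """Probability that the true Sharpe beats the selection-bias benchmark.

    sharpe is per-period (not annualized); n_obs is the number of returns.
    """
    sr0 = expected_max_sharpe(n_trials, sharpe_variance)
    denom = sqrt(1 - skew * sharpe + (kurtosis - 1) / 4 * sharpe ** 2)
    return _N.cdf((sharpe - sr0) * sqrt(n_obs - 1) / denom)


# Hypothetical inputs: same observed Sharpe, one attempt versus seven.
one_try = deflated_sharpe(0.1, n_obs=250, n_trials=1, sharpe_variance=0.01)
seven_tries = deflated_sharpe(0.1, n_obs=250, n_trials=7, sharpe_variance=0.01)
```

With the observed Sharpe held fixed, `seven_tries` comes out lower than `one_try`: every extra attempt raises the bar the result has to clear. Lowering the floor after the fact removes exactly that correction.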
Other tempting failure modes I considered and rejected on 2026-05-12:
- Sunk-cost extension. “We spent 3 months; another 3 months might break through.” Maybe, but not by default. Re-entry requires an explicit kill-criteria trigger, not time. Three months of “maybe one more month” is how trading projects bleed slowly.
- Cope-narrating no-profit as content. Polyphemus learnings DO feed content (this post is evidence), but they do not fund anything on their own. Polyphemus pays for itself in research maturity, not dollars. Path D is honest about the dollar gap.
- Switch markets. “Crypto failed; let me try equities.” Every market switch is a fresh start on rigor, gate calibration, and edge discovery. The cost of three months to rebuild infrastructure on a new market is comparable to the cost of just running Path D on the existing one.
- Outsource it. Pay a quant consultant to find an edge. Quant consultants at the price range a sub-$1M-AUM operator can afford are doing the same parameter tuning the operator could do, on the same gate-violating premises. The output is “found it” claims that fail my own MTC gate on cross-check.
- Shut it down entirely. This is option B. Costs: lose the calibrated codex (3 months of judgment encoded), lose the bug-discipline learnings, lose the dry-run telemetry feeding content. Benefits: 100% of operator attention shifts to revenue lanes. Path D recovers most of the benefits (80% attention shift) while preserving most of the substrate value (codex + bug discipline + dry-run telemetry stay).
Path D refuses the false choice between “double down” and “shut down.” Most “rational” framings would force the operator into one or the other. Path D names two truths simultaneously: no live-trading profit AND a calibrated research substrate, both real, both honest.
The 7-strategy track record (and what each gate caught)
Below is the actual kill list. Each verdict cites the specific gate that fired and the documentation row that records the verdict. Read it once for the failure pattern, twice for the gate-class coverage: seven strategies, seven different gate-classes. The MTC gate is not biased toward any one failure mode; it finds diverse real problems before capital deploys.
| Strategy | Gate verdict | What the gate caught | Source |
|---|---|---|---|
| signal_bot | NO-GO | 4 of 5 fails; alpha decay | project_mtc_gate_verdicts memory |
| pair_arb (legacy) | NO-GO | DSR violation (Deflated Sharpe Ratio below floor) | project_mtc_gate_verdicts |
| sharp_move | NO EDGE per v3 | 57% WR with -$0.07/share expectancy; live n=7 = -$1.36 cumulative; halted | polyphemus/sharp-move-cycle-2-verdict |
| cheap_side | KILL | 0-for-16 at predicted p=0.5; Wilson lower bound exposed | MEMORY.md |
| flat_regime_rtds | P9 violation | Single 18-hour regime episode masquerading as a strategy | project_mtc_gate_verdicts |
| binance_momentum | P9 lookahead | Wilson lower bound barely above breakeven on 2nd half | project_mtc_gate_verdicts |
| accumulator | FAIL R4 | DSR=0.000, Sharpe 0.383 below 0.5 floor on n=922 v2_probabilistic cycles | accumulator-v2-calibration-2026-05-12 |
This is a clinical list, not a sob story. Each row is one strategy that the discipline correctly rejected. Capital was never at risk for any of these post-gate. The $2,770 in bug losses was pre-gate, on infrastructure failures, not strategy failures.
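The cheap_side row is the easiest one to check by hand. Below is a minimal sketch of a Wilson score interval applied to 0 wins in 16 trials. The function name and the 95% level (`z=1.96`) are assumptions; the post does not state which confidence level the gate uses.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the edges.
    return max(0.0, center - half), min(1.0, center + half)


low, high = wilson_interval(0, 16)
# Chance of 16 straight losses if the true win rate really were 0.5.
p_all_losses = 0.5 ** 16
```

Under these assumptions the upper bound lands near 0.19, well below the predicted 0.5, and sixteen straight losses at a true 0.5 win rate would happen about once in 65,536 runs.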
```
# What the gate output looks like for one of the kills
$ mtc-gate accumulator-v2 --window 30d --capital 100
R1 (alpha decay): PASS
R2 (regime stability): PASS
R3 (sample size): PASS (n=922)
R4 (DSR): FAIL (DSR=0.000, Sharpe=0.383, floor=0.5)
R5 (deploy safety): n/a (gated by R4)
VERDICT: NO-GO
Reason: DSR floor violation. Sharpe ratio below 0.5 threshold.
Action: KILL or RECALIBRATE.
```
That output is the asset. The strategy is not.
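The shape of that verdict can be sketched in a few lines. This is a hypothetical reconstruction, not the real mtc-gate source: the `Check` class and `run_gate` function are invented for illustration. It also assumes that any failed check gates every later one, which the output above only shows for R5 behind R4.

```python
from dataclasses import dataclass


@dataclass
class Check:
    name: str
    passed: bool
    detail: str = ""


def _fmt(status: str, detail: str) -> str:
    return f"{status} ({detail})" if detail else status


def run_gate(checks: list[Check]) -> tuple[str, list[str]]:
    """Evaluate checks in order; everything after the first failure is skipped."""
    lines: list[str] = []
    failed_by = None
    for i, check in enumerate(checks, start=1):
        if failed_by is not None:
            lines.append(f"R{i} ({check.name}): n/a (gated by {failed_by})")
        elif check.passed:
            lines.append(f"R{i} ({check.name}): {_fmt('PASS', check.detail)}")
        else:
            failed_by = f"R{i}"
            lines.append(f"R{i} ({check.name}): {_fmt('FAIL', check.detail)}")
    # GO requires every check to pass; 4 of 5 is still NO-GO.
    verdict = "GO" if failed_by is None else "NO-GO"
    return verdict, lines


verdict, lines = run_gate([
    Check("alpha decay", True),
    Check("regime stability", True),
    Check("sample size", True, "n=922"),
    Check("DSR", False, "DSR=0.000, Sharpe=0.383, floor=0.5"),
    Check("deploy safety", True),
])
```

The design choice worth copying is that there is no partial credit: the verdict is a single all-or-nothing value, so a near-miss cannot be argued into a pass.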
What stays operational (the substrate)
Path D ratification did NOT shut polyphemus down. The substrate continues running in dry-run mode, generating telemetry without consuming live capital. The codex nodes, the MTC gate calibration, the bug-discipline lessons, and the v2 calibration tooling all remain operational, producing learnings that feed content even when no strategy passes.
- lagbot@polyphemus runs dry-run accumulator on SOL/XRP 5-minute candles.
- The 2026-04-30 silent-stop failure mode has its fix in place (see silent-stop-failure-mode codex node).
- A watchdog cron monitors instances every 5 minutes for inactivity.
- The v2 calibration tooling continues running on demand.
- Codex nodes, scripts, databases preserved as research artifacts.
- All MTC gate logic + bug-discipline lessons captured in the codex graph.
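For readers who want the watchdog idea without the rest of the stack, here is a minimal sketch of an inactivity check that a cron job could call every 5 minutes. The heartbeat-file approach, the function names, and the 5-minute idle threshold are assumptions; the post does not describe how the real watchdog detects inactivity.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

MAX_IDLE = timedelta(minutes=5)  # assumed threshold, matching the cron interval


def is_stale(last_activity: datetime, now: datetime,
             max_idle: timedelta = MAX_IDLE) -> bool:
    """True when the instance has been silent for longer than max_idle."""
    return now - last_activity > max_idle


def last_activity_of(heartbeat: Path) -> datetime:
    """Read a heartbeat file's modification time as a UTC timestamp."""
    return datetime.fromtimestamp(heartbeat.stat().st_mtime, tz=timezone.utc)


# Example with fixed timestamps so the result does not depend on the clock.
now = datetime(2026, 5, 12, 12, 0, tzinfo=timezone.utc)
silent = is_stale(now - timedelta(minutes=6), now)
healthy = is_stale(now - timedelta(minutes=4), now)
```

Keeping `is_stale` free of clock and filesystem access makes the silent-stop condition testable on its own; only `last_activity_of` touches the disk.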
What stops: live-prep engineering time. No new strategy authoring. No parameter tuning. No “let me just try X.” The bot runs. The operator does not work on it.
The substrate generates telemetry without consuming live capital. That telemetry feeds content niches (AI-visibility, trial-and-error rigor, falsifiable-prediction discipline) that ARE producing revenue indirectly. This post is one such telemetry-to-content artifact.
Kill criteria for re-entering polyphemus live-prep
The 90-day pause has explicit re-entry triggers. Operator-callable, not automatic. Re-entry requires ANY one of the three conditions below; without one of them, polyphemus stays in dry-run through 2026-08-12 at minimum. Pre-committing the triggers is the load-bearing discipline that keeps sunk-cost from quietly winning.
- New strategy archetype proposed AND walked through /domain-entry-audit showing a genuine edge hypothesis, not parameter tuning on a previously failed strategy. The archetype must name a market inefficiency that prior strategies did not address.
- A pre-committed shadow experiment passes MTC with all 5 R-checks (R1 through R5), not 4 of 5 like prior near-misses. The pre-commitment is the load-bearing part: the test was specified BEFORE the data came in.
- Revenue lanes hit a hard ceiling where polyphemus becomes the marginal best use of operator time. Today, citability.dev’s $147/$497/$1997 service tiers are unsaturated; chudi.dev’s content pipeline is mid-build (this post is part of it); freelance leads are coming in. Polyphemus is not the marginal best lane today.
Without one of these triggers, polyphemus stays in dry-run. The next checkpoint is 2026-08-12 (90-day quarterly review). At that checkpoint, the operator re-applies the same kill criteria.
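Writing the triggers down as code is one way to keep them from drifting. This is a minimal sketch; the class, field, and function names are invented for illustration, and the only values taken from the post are the three conditions and the 2026-08-12 checkpoint.

```python
from dataclasses import dataclass
from datetime import date

NEXT_CHECKPOINT = date(2026, 8, 12)  # quarterly review date from the post


@dataclass(frozen=True)
class ReentryTriggers:
    new_archetype_audited: bool     # new archetype went through /domain-entry-audit
    shadow_passed_all_five: bool    # pre-committed shadow experiment, R1-R5 all PASS
    revenue_lanes_at_ceiling: bool  # polyphemus is the marginal best use of time

    def any_fired(self) -> bool:
        return any((self.new_archetype_audited,
                    self.shadow_passed_all_five,
                    self.revenue_lanes_at_ceiling))


def decide(triggers: ReentryTriggers, operator_calls_it: bool) -> str:
    """A trigger is necessary but not sufficient: the operator must still call it."""
    if triggers.any_fired() and operator_calls_it:
        return "re-enter live-prep"
    return f"stay in dry-run; re-check on {NEXT_CHECKPOINT.isoformat()}"


today = ReentryTriggers(False, False, False)
status = decide(today, operator_calls_it=True)
```

The frozen dataclass is deliberate: the trigger set is fixed at ratification, so adding a fourth "reason" later requires an explicit code change rather than a mood.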
FAQ: Path D Hybrid and trading substrate decisions
Five questions I get most often about Path D, each with a self-contained answer. The answers mirror the data above so AI engines can extract them directly; human readers can skim them as a recap. The TL;DR is in the first H2; these are the operator-resolution edge cases.
Why not just shut polyphemus down entirely?
Shutting down loses the calibrated codex (3 months of MTC gate refinement, falsifiable-prediction discipline, bug-discipline incidents). The substrate generates dry-run telemetry that feeds content niches valued by chudi.dev’s audience. Path D keeps 80% of the substrate value while still freeing 80% of operator attention for revenue lanes; full shutdown would give up that retained substrate value to free only the remaining 20% of attention.
Is the $2,770 bug-loss number net of strategy P&L?
No. The $2,770 is bug losses only, attributable to infrastructure failures during sharp_move + pair_arb live runs in February and March. Strategy P&L on the live runs was approximately zero (small wins offset by small losses; gate caught the structural NO-EDGE verdicts before larger capital deployed). Total dollar cost of polyphemus including bug losses: $2,770. Total dollar profit: $0.
Why is the MTC gate’s 5-of-5 requirement non-negotiable?
4 of 5 is exactly where regime artifacts hide. Sharp_move passed 4 of 5 checks in v2 before live testing exposed the missing check (alpha decay across the live n=7 sample). The 5-of-5 floor is the minimum bar where false-positive strategies have nowhere to hide; lowering it would mean the next sharp_move analog deploys live capital and loses it. This follows from polyphemus P9 and the falsifiable-prediction discipline.
Does Path D count as “giving up”?
No. Giving up loses the substrate. Path D preserves the substrate AND honors the dollar gap. Three months ago, the project was shipping un-gated changes that cost $1,323 in February-March 2026. Three months later, the MTC gate plus calibrated codex plus deploy discipline plus dry-run honesty catch failures BEFORE capital flows. The substrate prevents the next $1,323 even when no strategy passes today. That is the asset.
What does the operator do with the 80% attention shift?
Revenue-gate-passing lanes: citability.dev (DR 25 with live audit pipeline, $147 to $1997 tier service offerings), chudi.dev (this post is a Pillar 3 entry; the content pipeline is producing 1+ posts per week from codex nodes), freelance pipeline (Reddit marketplace + guest-post workflows), chudi-products (Stripe completion pending). Each of these has a clearer path to dollars in the next 90 days than another polyphemus strategy iteration.
What to do next
Three actions, ordered fastest-to-slowest. If you run a trading project that has been running on hope, the first action runs in under an hour and surfaces whether you have the Path D pattern yourself. The third is a multi-month discipline build. Pick what matches your state.
- Separate strategy P&L from bug P&L in your own books. If bug losses dominate, the failure mode is engineering rigor, not edge discovery. Different problem; different fix. Fixing bugs takes weeks; finding edge takes months.
- Build the kill criteria BEFORE the next strategy. Triggers for re-entry should be explicit, not “I’ll know when I see it.” Without pre-committed triggers, sunk cost wins every internal debate.
- Find the content lane the failure feeds. The polyphemus track record above is genuine “expert content” because it is honest about failure. If you have a 7-strategy kill list of your own, that IS publishable content. The discipline is publishable; the strategies do not need to be.
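The first action can be sketched as a tagged ledger. The `Fill` record, the `cause` tag, and the sample entries are illustrative assumptions, not polyphemus data; the only point is that every loss gets attributed to either the strategy or the infrastructure before any totals are read.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Fill:
    strategy: str
    pnl: float
    cause: str  # "strategy" for a normal outcome, "bug" for an infrastructure failure


def split_pnl(fills: list[Fill]) -> dict[str, float]:
    """Sum P&L separately for each cause tag."""
    totals: dict[str, float] = defaultdict(float)
    for fill in fills:
        totals[fill.cause] += fill.pnl
    return dict(totals)


def bugs_dominate(totals: dict[str, float]) -> bool:
    """True when bug losses are larger in magnitude than strategy losses."""
    bug_loss = -min(totals.get("bug", 0.0), 0.0)
    strategy_loss = -min(totals.get("strategy", 0.0), 0.0)
    return bug_loss > strategy_loss


# Made-up entries for illustration only.
totals = split_pnl([
    Fill("demo_a", -40.0, "bug"),
    Fill("demo_a", 5.0, "strategy"),
    Fill("demo_b", -5.0, "strategy"),
    Fill("demo_b", -60.0, "bug"),
])
engineering_problem = bugs_dominate(totals)
```

If `engineering_problem` comes back true on your own books, the fix is deploy discipline, not another strategy.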
The harder version of this work is the 90-day quarterly review discipline that says “no” again next time, when the substrate looks like it might be turning. That deserves its own post.
Drafted from codex node polyphemus-path-d-ratification (confidence: inferred + ratified 2026-05-12) on 2026-05-15 via /codex-to-blog-draft v2.2. voice-dna.json not yet populated (D5 of chudi-dev-autoblogging-phase-1-plan); voice fidelity is approximate. All hook + results + gate data sourced from the verified codex node body (zero [FILL] anchors required this time).