Build a Binance-Polymarket Signal Pipeline in 2026

Chudi Nnorukam | Oct 15, 2025 | Updated Mar 11, 2026 | 13 min read

Build a Binance-to-Polymarket trading pipeline: real-time signal detection, deduplication, and sub-100ms order execution for momentum trading.

Why this matters

The pipeline has four stages: Binance aggTrade WebSocket stream, a rolling window momentum detector, a signal guard that prevents duplicate entries, and a CLOB order executor. Calibration matters — 0.3% in 60 seconds is the threshold that generates 3-8 signals per day without trading noise. Below that you drown in false triggers.

Prediction markets are slow to reprice. Binance spot is fast. The pipeline between them is where the edge lives.

This post covers the full signal pipeline for a Polymarket latency arbitrage bot: from Binance WebSocket stream to CLOB order placement, including the filtering, deduplication, and async architecture that prevents duplicate entries and missed signals. If you want the full system architecture first, start with how I built the Polymarket trading bot — this post is a deep-dive on Stages 1 through 3 of that architecture.

TL;DR

  • Stream Binance aggTrade via WebSocket — tick-level data, not candles
  • Rolling 60-second window detects >0.3% momentum
  • Signal guard suppresses re-entry on the same direction
  • Async executor places CLOB maker order within 100-200ms of signal detection
  • Run on Amsterdam VPS: 5-12ms to Polymarket CLOB in London

Why This Pipeline Exists

Polymarket lists binary markets on 5-minute BTC price movements: “Will BTC be higher in 5 minutes?” Market makers reprice YES/NO probabilities based on Binance spot. When BTC moves sharply, there is a 30-90 second lag before Polymarket odds fully reflect the move.

The pipeline exploits that lag by:

  1. Detecting the move on Binance first
  2. Placing a maker order on Polymarket before market makers reprice
  3. Collecting the spread between entry price and resolved probability — the expected value math that makes this profitable is straightforward once you have a calibrated p_true

Stage 1: Binance WebSocket Stream

The signal source is Binance’s aggTrade stream — not klines (candlesticks). Here is why that distinction matters:

  • aggTrade: fires on every aggregated trade (fills from one taker order at the same price are merged into one message), sub-second latency
  • klines/1m: a candle closes once per minute, so closed-candle data is 0-60 seconds stale

For a strategy where the edge window is 30-90 seconds, a worst-case 60 seconds of staleness is 67-200% of the edge itself. You need tick data.

import asyncio
import websockets
import json
from collections import deque
from dataclasses import dataclass
from typing import Optional

@dataclass
class Tick:
    timestamp: float  # unix seconds
    price: float

class BinanceStream:
    SYMBOL = "btcusdt"
    URL = f"wss://stream.binance.com:9443/ws/{SYMBOL}@aggTrade"

    def __init__(self, on_tick):
        self._on_tick = on_tick
        self._running = False

    async def run(self):
        self._running = True
        backoff = 1
        while self._running:
            try:
                async with websockets.connect(self.URL) as ws:
                    backoff = 1  # reset on successful connect
                    async for raw in ws:
                        msg = json.loads(raw)
                        tick = Tick(
                            timestamp=msg["T"] / 1000,
                            price=float(msg["p"])
                        )
                        await self._on_tick(tick)
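            except Exception:
                # Any failure (dropped socket, timeout, bad frame) lands here:
                # wait, then let the while loop reconnect
                await asyncio.sleep(backoff)
                backoff = min(backoff * 2, 30)  # exponential backoff, max 30s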

    def stop(self):
        self._running = False

The reconnect loop with exponential backoff is not optional. Binance WebSocket connections drop 2-3 times per 24-hour session in my experience, sometimes silently — no error event, no close frame, just a dead socket. Without reconnect logic, your bot goes silent and you don’t notice until you check P&L and find it hasn’t traded in 6 hours.
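One way to catch the silent case is to treat a long gap between messages as a failure. The sketch below is an assumption on my part, not part of the stream class above: `iter_with_timeout` is a hypothetical helper that wraps any async iterator and raises `asyncio.TimeoutError` when nothing arrives within the timeout, which an `except Exception` reconnect loop would then handle like any other drop.

```python
import asyncio
from typing import AsyncIterator, TypeVar

T = TypeVar("T")

async def iter_with_timeout(source: AsyncIterator[T], timeout: float) -> AsyncIterator[T]:
    """Yield items from `source`; raise asyncio.TimeoutError if none arrives in `timeout` seconds."""
    iterator = source.__aiter__()
    while True:
        try:
            # Bound the wait for the next message instead of waiting forever
            item = await asyncio.wait_for(iterator.__anext__(), timeout)
        except StopAsyncIteration:
            return  # source finished normally
        yield item
```

Inside the receive loop this would read `async for raw in iter_with_timeout(ws, 10):`, where the 10-second value is illustrative. BTC aggTrade traffic is normally far denser than that, so pick a gap that is clearly abnormal for the symbol you stream.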

The exponential backoff caps at 30 seconds because Binance rate-limits reconnection attempts. If you hammer the endpoint with immediate retries, you’ll get temporarily banned. The backoff resets to 1 second on successful connection, so normal reconnects happen quickly while sustained outages don’t burn through your rate limit budget.
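To make the schedule concrete, here is a small sketch that reproduces the same doubling-with-cap rule outside the stream class. `backoff_schedule` is a hypothetical helper used only for illustration.

```python
def backoff_schedule(base: float = 1.0, cap: float = 30.0, attempts: int = 7) -> list:
    """Return the sleep durations for consecutive failed reconnect attempts."""
    delays = []
    delay = base
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, cap)  # double each time, never exceed the cap
    return delays

print(backoff_schedule())
```

Seven consecutive failures sleep for 1, 2, 4, 8, 16, 30, and 30 seconds, about a minute and a half in total, after which every further attempt waits the full 30 seconds.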

One subtlety worth noting: the on_tick callback is an async function, so the entire pipeline from tick receipt through order placement runs on the same event loop. This is intentional. A blocking synchronous callback would stall the WebSocket receive loop for the full duration of order placement, and ticks would back up behind it. With an async callback, any step that awaits real I/O hands control back to the event loop, so Python can interleave tick processing with CLOB API calls without threading complexity.

Stage 2: Rolling Window Momentum Detector

The detector maintains a 60-second rolling window of price ticks and fires a signal when the net move exceeds the threshold.

from collections import deque
from enum import Enum
from typing import Optional

class Direction(Enum):
    UP = "UP"
    DOWN = "DOWN"

class MomentumDetector:
    THRESHOLD_PCT = 0.003   # 0.3%
    WINDOW_SECS = 60

    def __init__(self):
        self._window: deque[Tick] = deque()

    def update(self, tick: Tick) -> Optional[Direction]:
        self._window.append(tick)
        self._prune(tick.timestamp)

        if len(self._window) < 2:
            return None

        oldest = self._window[0].price
        newest = self._window[-1].price
        pct_move = (newest - oldest) / oldest

        if pct_move >= self.THRESHOLD_PCT:
            return Direction.UP
        if pct_move <= -self.THRESHOLD_PCT:
            return Direction.DOWN
        return None

    def _prune(self, now: float):
        cutoff = now - self.WINDOW_SECS
        while self._window and self._window[0].timestamp < cutoff:
            self._window.popleft()

The deque pruning keeps memory bounded regardless of how long the bot runs. Without pruning, the window grows unboundedly and the oldest price comparison becomes meaningless.

How the Momentum Detector Works

The detector solves a core problem in momentum-based trading: distinguishing real directional moves from noise. Traditional momentum indicators use closed candles or fixed time windows, but a prediction market bot can’t wait for a candle to close—by then the Polymarket market makers have already repriced.

The rolling window approach is different. Instead of waiting for a time boundary, we track every single tick that arrives and continuously evaluate whether the oldest and newest prices in our 60-second window differ by at least 0.3%. This means the signal can fire at any moment during the window, not just at fixed intervals.

The algorithm’s elegance lies in its simplicity: for every new tick, we append it to the deque, prune old ticks that fall outside the window, and check if the price move from oldest to newest exceeds our threshold. The cost is amortized O(1) per tick: the append and the comparison are constant time, and although a single prune can remove many ticks at once, each tick is appended once and removed at most once over the session.
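The append, prune, compare cycle is easier to see on a handful of ticks. This is a compact stand-in for the detector, not the production class: `RollingMove` is a hypothetical name, it returns the raw move instead of a direction, and the prices are made up.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Tick:
    timestamp: float  # unix seconds
    price: float

class RollingMove:
    """Tracks the signed price move between the oldest and newest tick in the window."""
    def __init__(self, window_secs: float = 60.0):
        self.window_secs = window_secs
        self.window: deque = deque()

    def update(self, tick: Tick) -> float:
        self.window.append(tick)
        cutoff = tick.timestamp - self.window_secs
        # Drop ticks that have aged out; each tick is removed at most once
        while self.window and self.window[0].timestamp < cutoff:
            self.window.popleft()
        oldest = self.window[0].price
        return (tick.price - oldest) / oldest

# Synthetic ticks: a steady climb, then a pause
ticks = [Tick(0, 100_000.0), Tick(20, 100_050.0), Tick(40, 100_200.0),
         Tick(55, 100_310.0), Tick(90, 100_320.0)]
rolling = RollingMove()
for t in ticks:
    move = rolling.update(t)
    flag = "SIGNAL" if abs(move) >= 0.003 else ""
    print(f"t={t.timestamp:>3.0f}s move={move:+.4%} window={len(rolling.window)} {flag}")
```

The fourth tick is the only one that crosses 0.3% (a 0.31% move from the tick at t=0). By t=90 the first two ticks have been pruned, so the comparison is against the t=40 price and the move falls back to roughly 0.12%.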

Why a deque instead of a circular buffer? A circular buffer would require pre-allocating memory for 60 seconds of ticks (roughly 1000+ ticks per second on active Binance, so 60K+ entries). A deque grows naturally with actual tick volume and shrinks as old ticks are discarded. On slow market days, memory usage is negligible. On volatile days, the deque expands to match real activity.

The deque approach also handles market microstructure better than you might expect. During a flash crash or sudden spike, ticks come in clusters — multiple trades at the same or nearby prices fire within milliseconds. A rolling window sees all of them, whereas a sampled-tick approach would miss intermediate prices and underestimate the move. This is critical for latency arbitrage: you want to detect moves as early as possible, and the deque gives you continuous visibility into the price action rather than discrete snapshots.

Another practical advantage: the deque handles reconnects gracefully. When the WebSocket reconnects, you can reseed the deque from historical REST API data (the last 120 seconds of klines, which more than covers the 60-second window), and the detector has full context for the next signal. A circular buffer would need explicit reset logic. A deque only needs to be refilled.
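A minimal sketch of the reseeding step, assuming the rows have already been fetched and parsed from the Binance REST klines endpoint. It relies on the documented row layout (index 4 is the close price as a string, index 6 is the close time in milliseconds). `ticks_from_klines` is a hypothetical helper, and it deliberately skips the still-open candle, whose close time lies in the future.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Tick:
    timestamp: float  # unix seconds
    price: float

def ticks_from_klines(rows: list, now: float, window_secs: float = 60.0) -> deque:
    """Build a seeded window with one synthetic tick per closed candle inside the window."""
    cutoff = now - window_secs
    seeded: deque = deque()
    for row in sorted(rows, key=lambda r: r[6]):   # oldest first
        close_ts = row[6] / 1000                   # close time, ms -> s
        if cutoff <= close_ts <= now:
            seeded.append(Tick(timestamp=close_ts, price=float(row[4])))
    return seeded
```

The result can replace the detector's internal window before the stream resumes. Candle closes are much coarser than ticks, so the first signal after a reconnect is computed from approximate data until real ticks refill the window.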

Threshold Calibration

0.3% in 60 seconds is the calibrated threshold for BTC. How I arrived at it:

  • 0.15% in 30s: fires constantly — 15-20 signals per day. Most resolve as noise.
  • 0.3% in 60s: fires 3-8 times per day. 62% of signals land in-range (market resolves in signal direction).
  • 0.5% in 60s: fires 0-2 times per day. Too infrequent to validate or build statistical confidence.

The 62% in-range rate translates directly to your p_true estimate for position sizing — see the math behind directional betting in binary markets for how to convert that into expected value and Kelly fraction.
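As a rough illustration of that conversion, the sketch below computes expected value and the Kelly fraction for a binary share that pays 1 USDC if the market resolves your way. The 0.62 is the in-range rate quoted above. The 0.55 entry price is a hypothetical fill chosen for illustration, and fees and slippage are ignored.

```python
def expected_value_per_share(p_true: float, price: float) -> float:
    """EV in USDC of one share bought at `price` that pays 1 with probability `p_true`."""
    return p_true * (1.0 - price) - (1.0 - p_true) * price

def kelly_fraction(p_true: float, price: float) -> float:
    """Kelly fraction of bankroll for that share: (p - price) / (1 - price), floored at 0."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    return max(0.0, (p_true - price) / (1.0 - price))

p_true = 0.62  # in-range rate from the calibration above
entry = 0.55   # hypothetical entry price
print(f"EV per share:   {expected_value_per_share(p_true, entry):+.3f} USDC")
print(f"Kelly fraction: {kelly_fraction(p_true, entry):.3f}")
```

The EV simplifies to p_true minus the entry price, so the edge disappears as soon as the entry price reaches your estimate of p_true. Full Kelly is aggressive when p_true comes from a small sample, which is why a fraction of it is the usual starting point.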

The threshold needs recalibration for other assets. ETH has a different noise floor than BTC. SOL and XRP are noisier at shorter windows.
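A recalibration pass can be as simple as replaying recorded ticks against several threshold and window pairs and counting how often each one fires. The sketch below is one way to do that under stated assumptions: `count_signals` is a hypothetical helper, it counts a signal only when the state changes (flat to up, up to down, and so on) rather than on every tick above threshold, and the tick data is synthetic.

```python
from collections import deque

def count_signals(ticks, threshold: float, window_secs: float) -> int:
    """Count threshold crossings over (timestamp, price) pairs, one per new move."""
    window: deque = deque()
    count = 0
    prev_state = 0  # -1 = down signal, 0 = no signal, +1 = up signal
    for ts, price in ticks:
        window.append((ts, price))
        while window[0][0] < ts - window_secs:
            window.popleft()
        move = (price - window[0][1]) / window[0][1]
        state = 1 if move >= threshold else -1 if move <= -threshold else 0
        if state != 0 and state != prev_state:
            count += 1  # only the first tick of a move counts
        prev_state = state
    return count

# Synthetic (timestamp, price) pairs: a rally, a pullback, then a small bounce
TICKS = [(0, 100.0), (10, 100.1), (20, 100.2), (30, 100.35), (40, 100.4),
         (100, 100.4), (110, 100.2), (120, 100.0), (130, 99.9),
         (200, 99.9), (210, 100.1), (220, 100.1)]
for threshold, window in [(0.0015, 30), (0.003, 60), (0.005, 60)]:
    print(f"{threshold:.2%} in {window}s -> {count_signals(TICKS, threshold, window)} signals")
```

On real data you would run this over several days per asset and pair each count with the resolution outcome, since signal frequency alone says nothing about hit rate.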

Stage 3: Signal Guard

Without a signal guard, a sustained 3-minute BTC rally keeps re-triggering the detector for as long as the 60-second move stays above the threshold, and the bot opens several long positions on what is effectively one trade. The signal guard prevents this.

import time

class SignalGuard:
    COOLDOWN_SECS = 120  # 2 minutes

    def __init__(self):
        self._last_direction: Optional[Direction] = None
        self._last_signal_ts: float = 0

    def should_trade(self, direction: Direction) -> bool:
        now = time.time()
        time_since_last = now - self._last_signal_ts

        # Same direction within cooldown: suppress
        if (direction == self._last_direction and
                time_since_last < self.COOLDOWN_SECS):
            return False

        self._last_direction = direction
        self._last_signal_ts = now
        return True

The guard resets on direction change. A BTC DOWN signal immediately after a BTC UP position is a new, independent signal — not a duplicate. Only same-direction signals within the cooldown window are suppressed.

The 120-second cooldown is calibrated to the 5-minute market duration. A cooldown shorter than 60 seconds lets the bot stack positions on a single sustained move — you end up with three long entries that are really one trade with 3x the risk. A cooldown longer than 180 seconds suppresses legitimate second signals in volatile markets where BTC can make two independent moves within the same 5-minute window. The 120-second sweet spot prevents stacking while preserving the ability to trade a genuine reversal.

In practice, the signal guard prevents roughly 30-40% of raw signals from reaching the executor. This sounds like a lot of missed trades, but those suppressed signals are almost always duplicates of a sustained move — entering a second position on the same momentum rarely adds edge and always adds risk. The guard is one of the highest-value components per line of code in the entire pipeline.

Stage 4: CLOB Order Executor

The executor receives a validated signal and places a maker order on the Polymarket CLOB. The BASE_BET_USDC here is fixed for simplicity — in production, a self-tuning system adjusts this value dynamically based on recent win rate.

import asyncio
from typing import Optional

from py_clob_client.client import ClobClient
from py_clob_client.clob_types import OrderArgs
from py_clob_client.order_builder.constants import BUY

class CLOBExecutor:
    MIN_SHARES = 5       # Polymarket minimum
    BASE_BET_USDC = 15   # dollar amount per trade

    def __init__(self, client: ClobClient, market_registry):
        self._client = client
        self._registry = market_registry

    async def execute(self, direction: Direction) -> Optional[str]:
        market = self._registry.get_active_5m_btc_market(direction)
        if not market:
            return None

        # Check market has enough time remaining
        if market.secs_remaining <= 30:
            return None

        # Compute shares from dollar amount
        mid_price = market.mid_price
        shares = self.BASE_BET_USDC / mid_price

        if shares < self.MIN_SHARES:
            return None

        order_args = OrderArgs(
            token_id=market.token_id,
            price=mid_price,
            size=round(shares, 2),
            side=BUY,
        )

        # The SDK call is synchronous (blocking HTTP), so run it in a worker
        # thread to keep the event loop free for incoming ticks.
        result = await asyncio.to_thread(
            self._client.create_and_post_order, order_args
        )
        # Assumption: the SDK returns the parsed JSON response as a dict with an
        # "orderID" key. Verify this against the py_clob_client version you run.
        return result.get("orderID") if isinstance(result, dict) else None

The expiry check (secs_remaining <= 30) prevents placing orders on markets that are about to close. Without it, you fill on a market that resolves 10 seconds later and cannot place an exit if needed. I learned this the hard way — an early version of the bot placed an order with 8 seconds remaining, got filled, and the market resolved before I could even check the position status. The capital was locked for the full resolution cycle.

The create_and_post_order call is critical. The Polymarket SDK has both create_order and create_and_post_order. The first only builds the order object locally without submitting it — a naming trap that cost me an evening of debugging when orders appeared to succeed but never showed up on the CLOB. Always use create_and_post_order for actual submission.

Wiring It Together: Async Event Loop

The full pipeline runs on a single asyncio event loop. The WebSocket receive loop, momentum detection, and CLOB placement all happen asynchronously without blocking each other.

import logging

log = logging.getLogger(__name__)

async def main():
    # `client` (a configured ClobClient) and `registry` are assumed to be
    # created during startup, before this function runs.
    detector = MomentumDetector()
    guard = SignalGuard()
    executor = CLOBExecutor(client, registry)

    async def on_tick(tick: Tick):
        direction = detector.update(tick)
        if direction is None:
            return

        if not guard.should_trade(direction):
            return

        order_id = await executor.execute(direction)
        if order_id:
            log.info(f"Placed order {order_id} {direction.value}")

    stream = BinanceStream(on_tick)
    await stream.run()

asyncio.run(main())

The event loop is single-threaded but non-blocking. on_tick is called for every Binance trade tick. The CLOB placement is awaited inline — if the CLOB call takes 50ms, the next tick is processed after it completes.

For strategies with heavier logic, you can offload placement to a separate task:

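background_tasks = set()  # strong references; the event loop only keeps weak ones

async def on_tick(tick: Tick):
    direction = detector.update(tick)
    if direction and guard.should_trade(direction):
        task = asyncio.create_task(executor.execute(direction))
        background_tasks.add(task)  # keep the task alive until it finishes
        task.add_done_callback(background_tasks.discard)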

This lets tick processing continue while placement runs in the background.

Latency Breakdown

On a well-tuned Amsterdam VPS, the pipeline latency from Binance tick to CLOB order acknowledgement:

Stage                                     Latency
Binance WS → Python receive               1-5ms
Deque update + threshold check            <1ms
Signal guard check                        <1ms
CLOB order construction                   <1ms
CLOB API round trip (Amsterdam→London)    5-12ms
Total                                     ~10-20ms

20ms from Binance tick to CLOB order. The Polymarket repricing lag is 30-90 seconds. You have 30,000-90,000ms of edge window, and you use about 20ms of it.

The math still holds even with US East latency (130-150ms total). The bottleneck is not network — it is whether anyone else detected the signal first.

What Can Go Wrong

WebSocket disconnects during a live position. Your position is still open, but you have no incoming price data to decide when to exit. Solution: implement position health checks on reconnect that query CLOB for open positions before resuming signal processing.

Market not found for a signal direction. The 5-minute BTC market rolls over every 5 minutes. If a signal fires at minute 4:58, the current market might have 2 seconds remaining. Your registry needs to handle market lookup gracefully with a fallback to the next available market.
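One way to express that fallback, assuming the registry can list the current and upcoming markets with their remaining time. The `Market` shape and `pick_market` are hypothetical, and the 30-second floor mirrors the executor's expiry check.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Market:
    """Hypothetical registry entry; a real one would carry more fields."""
    token_id: str
    secs_remaining: float

def pick_market(markets: Sequence[Market], min_secs: float = 30.0) -> Optional[Market]:
    """Return the soonest-expiring market that still has more than `min_secs` left."""
    tradable = [m for m in markets if m.secs_remaining > min_secs]
    return min(tradable, key=lambda m: m.secs_remaining, default=None)
```

With a signal at 4:58, the current market (2 seconds left) is filtered out and the next one is returned. If nothing qualifies the function returns None and the signal is skipped rather than forced onto a closing market.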

CLOB rate limits. If you place too many orders in rapid succession, the CLOB returns rate limit errors. The signal guard helps here, but also implement a per-minute order count limiter with an asyncio.sleep backoff.
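A sketch of a per-minute limiter that fits this description. `OrderRateLimiter` is a hypothetical class, and the order cap you pass in is your own choice, since the actual CLOB limits are not stated here. The clock is injectable so the logic can be tested without waiting.

```python
import time
from collections import deque
from typing import Callable

class OrderRateLimiter:
    """Sliding-window limiter: at most `max_orders` within any `per_secs` seconds."""

    def __init__(self, max_orders: int, per_secs: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._max = max_orders
        self._per = per_secs
        self._clock = clock
        self._sent: deque = deque()  # timestamps of recent orders

    def wait_time(self) -> float:
        """Seconds to wait before the next order is allowed (0.0 means go now)."""
        now = self._clock()
        while self._sent and now - self._sent[0] >= self._per:
            self._sent.popleft()  # forget orders that left the window
        if len(self._sent) < self._max:
            return 0.0
        return self._per - (now - self._sent[0])

    def record(self) -> None:
        """Call once for every order actually submitted."""
        self._sent.append(self._clock())
```

In the executor this becomes `delay = limiter.wait_time()`, then `await asyncio.sleep(delay)` when the delay is non-zero, then `limiter.record()` right before submission. For a strategy with a 30-90 second edge, dropping the signal when the delay is long is often the better choice than waiting.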

Silent order rejections. The CLOB occasionally rejects orders with a 200 OK but no order_id in the response. Always check the return value — do not assume success from HTTP status alone.

Testing the Pipeline Before Going Live

Before deploying this pipeline with real money, you need a systematic way to validate that every stage works correctly in isolation and as a connected system. The testing approach I used saved me from at least three bugs that would have cost real capital.

Start with the momentum detector. Feed it historical tick data from the Binance REST API and verify that signals fire at the expected timestamps. I downloaded 24 hours of aggTrade data for three different volatility regimes: a quiet Sunday, a normal weekday, and a day with a major price move. The detector should produce zero signals on the quiet day, three to eight on the normal day, and ten to fifteen on the volatile day. If the numbers are wildly different, your threshold is miscalibrated.

Next, test the signal guard in isolation. Create a synthetic sequence of signals: UP, UP, UP (should suppress the second and third), then DOWN (should pass through immediately), then DOWN again within two minutes (should suppress). This catches the most common guard bug, which is accidentally suppressing direction changes along with duplicates.
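That sequence can be written as a small deterministic test. The sketch uses a variant of the guard that takes the timestamp as an argument so the test does not have to sleep. `ClockedGuard` is a hypothetical name, and the timestamps are arbitrary.

```python
from enum import Enum
from typing import Optional

class Direction(Enum):
    UP = "UP"
    DOWN = "DOWN"

class ClockedGuard:
    """Same suppression rule as the signal guard, with `now` supplied by the caller."""
    COOLDOWN_SECS = 120

    def __init__(self):
        self._last_direction: Optional[Direction] = None
        self._last_ts = 0.0

    def should_trade(self, direction: Direction, now: float) -> bool:
        # Same direction inside the cooldown: suppress without touching state
        if direction == self._last_direction and now - self._last_ts < self.COOLDOWN_SECS:
            return False
        self._last_direction = direction
        self._last_ts = now
        return True

# (seconds, direction): three UPs, an immediate reversal, a repeat, then a late repeat
sequence = [(0, Direction.UP), (30, Direction.UP), (60, Direction.UP),
            (70, Direction.DOWN), (100, Direction.DOWN), (200, Direction.DOWN)]
guard = ClockedGuard()
results = [guard.should_trade(direction, now) for now, direction in sequence]
print(results)  # [True, False, False, True, False, True]
```

The last entry passes because 130 seconds have elapsed since the accepted DOWN at t=70. Suppressed signals do not restart the cooldown, which matches the guard shown earlier.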

The executor is the hardest to test without real money. I built a dry-run mode that constructs the order object and logs it without calling the CLOB API. This validates the share calculation, the minimum share check, and the expiry guard. Run the dry-run executor against live market data for a full trading day and manually verify that every logged order makes sense: correct direction, reasonable share count, valid token ID, and sufficient time remaining on the market.
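A dry-run path can be sketched as a pure function that applies the same checks as the executor and returns the order it would have sent. Everything here is illustrative: `MarketSnapshot` and `build_dry_run_order` are hypothetical names, the order is a plain dict rather than an SDK object, and the price sanity check is an extra guard that the executor above does not have.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("dry_run")

@dataclass
class MarketSnapshot:
    """Hypothetical view of a market at signal time."""
    token_id: str
    mid_price: float
    secs_remaining: float

def build_dry_run_order(market: MarketSnapshot, bet_usdc: float = 15.0,
                        min_shares: float = 5.0, min_secs: float = 30.0) -> Optional[dict]:
    """Apply the executor's checks and return the order dict, or None with a logged reason."""
    if market.secs_remaining <= min_secs:
        log.info("skip %s: %.0fs remaining", market.token_id, market.secs_remaining)
        return None
    if not 0.0 < market.mid_price < 1.0:
        log.info("skip %s: implausible mid price %s", market.token_id, market.mid_price)
        return None
    shares = round(bet_usdc / market.mid_price, 2)
    if shares < min_shares:
        log.info("skip %s: %.2f shares is below the minimum", market.token_id, shares)
        return None
    order = {"token_id": market.token_id, "price": market.mid_price,
             "size": shares, "side": "BUY"}
    log.info("DRY RUN order: %s", order)  # nothing is sent to the CLOB
    return order
```

Logging a reason on every skipped branch is what makes silently dropped signals visible when you review a day of dry-run output.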

Finally, run the full pipeline end-to-end in dry-run mode for at least 48 hours. Watch for three specific failure patterns. First, zombie signals where the detector fires but the guard or executor silently drops the signal without logging why. Second, phantom orders where the executor logs an order that should have been suppressed by the guard. Third, timing failures where the executor tries to place an order on an expired market because the registry returned stale data.

The 48-hour window matters because it covers both high-volatility and low-volatility periods. A pipeline that works perfectly during active US trading hours might break during the Asian session when tick frequency drops and the rolling window behaves differently with sparse data.

Monitoring in Production

Once the pipeline is live, you need real-time visibility into every stage. Logging alone is not sufficient because you cannot watch logs continuously, and by the time you notice a problem in the logs, the damage is already done.

I built a lightweight health monitor that tracks four metrics:

  • Signal rate measures how many raw signals the detector produces per hour. A sudden drop to zero means the WebSocket is probably disconnected.
  • Guard pass rate measures what percentage of signals survive the guard. If this drops below fifty percent, the cooldown might be too aggressive for current market conditions.
  • Executor success rate measures what percentage of passed signals result in a confirmed order on the CLOB. A drop here usually means an API issue or an allowance problem.
  • Average latency measures the time from tick receipt to order acknowledgment. If this creeps above 100 milliseconds, something in the pipeline is blocking.

These four metrics tell you the health of the entire pipeline at a glance. I push them to a Slack channel every fifteen minutes during active trading hours. Any metric that crosses a threshold triggers an immediate alert. The alerting is simple: if signal rate hits zero for fifteen minutes during US market hours, something is wrong and the bot needs attention.
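A minimal sketch of how those four numbers and their alerts could be derived from plain counters. The names (`PipelineCounters`, `health_report`, `alerts`) are hypothetical, the thresholds are the ones mentioned above, and delivery to Slack is left out.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PipelineCounters:
    """Counts accumulated over one reporting interval; `hours` must be > 0."""
    hours: float
    raw_signals: int
    passed_guard: int
    orders_confirmed: int
    total_latency_ms: float  # summed tick-to-ack latency of confirmed orders

def ratio(num: float, den: float) -> Optional[float]:
    """Return num/den, or None when the denominator is zero (metric undefined)."""
    return num / den if den else None

def health_report(c: PipelineCounters) -> dict:
    return {
        "signal_rate_per_hour": c.raw_signals / c.hours,
        "guard_pass_rate": ratio(c.passed_guard, c.raw_signals),
        "executor_success_rate": ratio(c.orders_confirmed, c.passed_guard),
        "avg_latency_ms": ratio(c.total_latency_ms, c.orders_confirmed),
    }

def alerts(report: dict) -> list:
    """Apply the thresholds from the text; tune them to your own setup."""
    out = []
    if report["signal_rate_per_hour"] == 0:
        out.append("no signals: WebSocket may be disconnected")
    if report["guard_pass_rate"] is not None and report["guard_pass_rate"] < 0.5:
        out.append("guard pass rate below 50%")
    if report["avg_latency_ms"] is not None and report["avg_latency_ms"] > 100:
        out.append("average latency above 100ms")
    return out
```

Undefined ratios come back as None instead of zero, so a quiet interval with no signals raises only the signal-rate alert and does not also trip the pass-rate check.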

The most valuable metric over time turned out to be the guard pass rate. When market regime shifts from trending to ranging, the pass rate changes significantly. During strong trends, the guard suppresses sixty percent of signals because the same directional move generates multiple triggers. During choppy markets, the pass rate rises to eighty percent because signals alternate direction frequently. Tracking this ratio over weeks gave me insight into when the strategy was in a favorable regime versus when I should expect lower returns.

Signal Frequency Expectations

On active BTC trading days:

  • 3-8 signals with 0.3%/60s threshold
  • Each signal potentially captures a 5-minute market (or what remains of it)
  • 1-3 of those signals will be suppressed by the guard as duplicates of a sustained move

On quiet days (low volatility): 0-2 signals. This is fine. The strategy is about edge quality, not trade frequency.

Written by Chudi Nnorukam

I develop products using AI-assisted workflows — from concept to production in days. chudi.dev is a live public experiment in AI-visible web architecture, designed for human readers, LLM retrieval, and AI agent interoperability. 5+ deployed products including production trading systems, SaaS tools, and automation platforms.

FAQ

Why use Binance as the signal source instead of Polymarket itself?

Polymarket's 5-minute BTC markets reprice in response to Binance — not the other way around. Binance spot is the primary market. Monitoring it directly gives you the earliest possible signal before Polymarket market makers update their quotes.

What is the aggTrade stream and why is it better than klines?

aggTrade streams each aggregated trade as it happens, merging only the fills from a single taker order at the same price. Klines aggregate trades into OHLCV candles, introducing up to 60 seconds of delay depending on candle interval. For latency-sensitive strategies, aggTrade is the only option.

How do you prevent the bot from trading the same signal twice?

A signal guard tracks the last signal direction and timestamp. If a new signal fires in the same direction within a cooldown window (typically 2 minutes), it is suppressed. This prevents stacking multiple positions on a single sustained move.

What happens when the Binance WebSocket disconnects?

Implement reconnect logic with exponential backoff. On reconnect, reseed the rolling window from REST API klines covering at least the last 60 seconds before resuming. Without reseeding, the first 60 seconds after reconnect are blind.

Should I run the Binance listener and CLOB executor in separate processes?

For simple strategies, a single asyncio event loop handles both cleanly. For strategies with heavy order management logic, separate processes with a queue improve isolation and prevent order handling from blocking signal detection.
