
Build a Binance-Polymarket Signal Pipeline in 2026

Build a Binance-to-Polymarket trading pipeline: real-time signal detection, deduplication, and sub-100ms order execution for momentum trading.
Why this matters
The pipeline has four stages: Binance aggTrade WebSocket stream, a rolling window momentum detector, a signal guard that prevents duplicate entries, and a CLOB order executor. Calibration matters — 0.3% in 60 seconds is the threshold that generates 3-8 signals per day without trading noise. Below that you drown in false triggers.
Prediction markets are slow to reprice. Binance spot is fast. The pipeline between them is where the edge lives.
This post covers the full signal pipeline for a Polymarket latency arbitrage bot: from Binance WebSocket stream to CLOB order placement, including the filtering, deduplication, and async architecture that prevents duplicate entries and missed signals. If you want the full system architecture first, start with how I built the Polymarket trading bot — this post is a deep-dive on Stages 1 through 3 of that architecture.
TL;DR
- Stream Binance aggTrade via WebSocket — tick-level data, not candles
- Rolling 60-second window detects >0.3% momentum
- Signal guard suppresses re-entry on the same direction
- Async executor places CLOB maker order within 100-200ms of signal detection
- Run on Amsterdam VPS: 5-12ms to Polymarket CLOB in London
Why This Pipeline Exists
Polymarket lists binary markets on 5-minute BTC price movements: “Will BTC be higher in 5 minutes?” Market makers reprice YES/NO probabilities based on Binance spot. When BTC moves sharply, there is a 30-90 second lag before Polymarket odds fully reflect the move.
The pipeline exploits that lag by:
- Detecting the move on Binance first
- Placing a maker order on Polymarket before market makers reprice
- Collecting the spread between entry price and resolved probability — the expected value math that makes this profitable is straightforward once you have a calibrated p_true
Stage 1: Binance WebSocket Stream
The signal source is Binance’s aggTrade stream — not klines (candlesticks). Here is why that distinction matters:
- aggTrade: fires on every individual trade, sub-second latency
- klines/1m: fires once per minute, 0-60 second stale data window
For a strategy where the edge window is 30-90 seconds, 1-minute klines are too coarse: their average staleness of about 30 seconds is already 33-100% of the edge itself, and the worst case of 60 seconds can exceed it. You need tick data.
```python
import asyncio
import websockets
import json
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tick:
    timestamp: float  # unix seconds
    price: float


class BinanceStream:
    SYMBOL = "btcusdt"
    URL = f"wss://stream.binance.com:9443/ws/{SYMBOL}@aggTrade"

    def __init__(self, on_tick):
        self._on_tick = on_tick
        self._running = False

    async def run(self):
        self._running = True
        backoff = 1
        while self._running:
            try:
                async with websockets.connect(self.URL) as ws:
                    backoff = 1  # reset on successful connect
                    async for raw in ws:
                        msg = json.loads(raw)
                        tick = Tick(
                            timestamp=msg["T"] / 1000,
                            price=float(msg["p"])
                        )
                        await self._on_tick(tick)
            except Exception as e:
                await asyncio.sleep(backoff)
                backoff = min(backoff * 2, 30)  # exponential backoff, max 30s

    def stop(self):
        self._running = False
```

The reconnect loop with exponential backoff is not optional. Binance WebSocket connections drop 2-3 times per 24-hour session in my experience, sometimes silently: no error event, no close frame, just a dead socket. Without reconnect logic, your bot goes silent and you don’t notice until you check P&L and find it hasn’t traded in 6 hours.
The exponential backoff caps at 30 seconds because Binance rate-limits reconnection attempts. If you hammer the endpoint with immediate retries, you’ll get temporarily banned. The backoff resets to 1 second on successful connection, so normal reconnects happen quickly while sustained outages don’t burn through your rate limit budget.
One subtlety worth noting: the on_tick callback is an async function, so the entire pipeline from tick receipt through order placement runs on the same event loop. This is intentional. A blocking synchronous call inside the callback would stall the WebSocket receive loop for the duration of order placement, delaying every tick that arrives in the meantime. Keeping the callback async lets Python interleave tick processing with CLOB API calls without threading complexity, provided nothing inside it blocks the loop.
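The silent-drop case is worth guarding against explicitly. The following is a minimal sketch, not the original bot's code: `iter_with_timeout` is a hypothetical helper, and the 10-second staleness limit is my assumption about how long BTCUSDT aggTrade can plausibly stay quiet.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable

STALE_AFTER_SECS = 10.0  # assumption: tune to the symbol's normal tick rate


async def iter_with_timeout(
    recv: Callable[[], Awaitable[str]],
    timeout: float = STALE_AFTER_SECS,
) -> AsyncIterator[str]:
    """Yield messages from recv(); raise asyncio.TimeoutError if none
    arrives within `timeout` seconds."""
    while True:
        # wait_for cancels the pending recv() and raises on timeout
        yield await asyncio.wait_for(recv(), timeout)
```

Inside `BinanceStream.run`, replacing `async for raw in ws` with `async for raw in iter_with_timeout(ws.recv)` would turn a dead socket into an exception, which the existing `except` branch already converts into a reconnect.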
Stage 2: Rolling Window Momentum Detector
The detector maintains a 60-second rolling window of price ticks and fires a signal when the net move exceeds the threshold.
```python
from collections import deque
from enum import Enum
from typing import Optional


class Direction(Enum):
    UP = "UP"
    DOWN = "DOWN"


class MomentumDetector:
    THRESHOLD_PCT = 0.003  # 0.3%
    WINDOW_SECS = 60

    def __init__(self):
        self._window: deque[Tick] = deque()

    def update(self, tick: Tick) -> Optional[Direction]:
        self._window.append(tick)
        self._prune(tick.timestamp)
        if len(self._window) < 2:
            return None
        oldest = self._window[0].price
        newest = self._window[-1].price
        pct_move = (newest - oldest) / oldest
        if pct_move >= self.THRESHOLD_PCT:
            return Direction.UP
        if pct_move <= -self.THRESHOLD_PCT:
            return Direction.DOWN
        return None

    def _prune(self, now: float):
        cutoff = now - self.WINDOW_SECS
        while self._window and self._window[0].timestamp < cutoff:
            self._window.popleft()
```

The deque pruning keeps memory bounded regardless of how long the bot runs. Without pruning, the window grows unboundedly and the oldest price comparison becomes meaningless.
How the Momentum Detector Works
The detector solves a core problem in momentum-based trading: distinguishing real directional moves from noise. Traditional momentum indicators use closed candles or fixed time windows, but a prediction market bot can’t wait for a candle to close—by then the Polymarket market makers have already repriced.
The rolling window approach is different. Instead of waiting for a time boundary, we track every single tick that arrives and continuously evaluate whether the oldest and newest prices in our 60-second window differ by at least 0.3%. This means the signal can fire at any moment during the window, not just at fixed intervals.
The algorithm’s elegance lies in its simplicity: for every new tick, we append it to the deque, prune old ticks that fall outside the window, and check if the price move from oldest to newest exceeds our threshold. Each update is O(1) amortized: a single prune can remove many ticks at once, but every tick is appended once and removed at most once over the session.
Why a deque instead of a circular buffer? A circular buffer would require pre-allocating memory for 60 seconds of ticks (roughly 1000+ ticks per second on active Binance, so 60K+ entries). A deque grows naturally with actual tick volume and shrinks as old ticks are discarded. On slow market days, memory usage is negligible. On volatile days, the deque expands to match real activity.
The deque approach also handles market microstructure better than you might expect. During a flash crash or sudden spike, ticks come in clusters — multiple trades at the same or nearby prices fire within milliseconds. A rolling window sees all of them, whereas a sampled-tick approach would miss intermediate prices and underestimate the move. This is critical for latency arbitrage: you want to detect moves as early as possible, and the deque gives you continuous visibility into the price action rather than discrete snapshots.
Another practical advantage: the deque naturally handles reconnects gracefully. When the WebSocket reconnects, you can reseed the deque from historical REST API data (the last 120 seconds of klines), and the detector will have full context for the next signal. A circular buffer would need explicit reset logic. A rolling window with a deque just works.
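Reseeding can be sketched as a pure conversion step. This is a minimal illustration under assumptions: the rows follow Binance's documented kline array layout, they were fetched at a fine interval such as 1s (a 1m interval would give only one or two points per window), and `ticks_from_klines` and `reseed` are hypothetical helper names. The HTTP fetch itself is left out.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Sequence


@dataclass
class Tick:
    timestamp: float  # unix seconds
    price: float


def ticks_from_klines(rows: Sequence[Sequence], now: float,
                      window_secs: float = 60.0) -> List[Tick]:
    """Turn kline rows into synthetic ticks that fall inside the window.

    Row layout: [open_time_ms, open, high, low, close, volume,
    close_time_ms, ...]; only close time and close price are used.
    """
    ticks = [Tick(timestamp=row[6] / 1000, price=float(row[4])) for row in rows]
    cutoff = now - window_secs
    return [t for t in ticks if cutoff <= t.timestamp <= now]


def reseed(window: Deque[Tick], rows: Sequence[Sequence], now: float) -> None:
    """Replace the window contents with kline-derived ticks, oldest first."""
    window.clear()
    window.extend(sorted(ticks_from_klines(rows, now), key=lambda t: t.timestamp))


# Two hand-written rows in the kline layout, for illustration only
rows = [
    [1700000000000, "100.0", "101.0", "99.0", "100.5", "1.0", 1700000000999],
    [1700000001000, "100.5", "100.6", "100.4", "100.6", "1.0", 1700000001999],
]
window: Deque[Tick] = deque()
reseed(window, rows, now=1700000002.0)
```

Close prices are a lossy stand-in for real ticks, so a signal that fires immediately after a reseed deserves more suspicion than one built from live data.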
Threshold Calibration
0.3% in 60 seconds is the calibrated threshold for BTC. How I arrived at it:
- 0.15% in 30s: fires constantly — 15-20 signals per day. Most resolve as noise.
- 0.3% in 60s: fires 3-8 times per day. 62% of signals land in-range (market resolves in signal direction).
- 0.5% in 60s: fires 0-2 times per day. Too infrequent to validate or build statistical confidence.
The 62% in-range rate translates directly to your p_true estimate for position sizing — see the math behind directional betting in binary markets for how to convert that into expected value and Kelly fraction.
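As a worked example of that conversion, here is a small sketch. The 0.50 entry price is an assumption for illustration, fees are ignored, and the function names are hypothetical. For a binary share bought at price c that pays 1.0 with probability p, expected value per share is p - c and the full-Kelly fraction is (p - c) / (1 - c).

```python
def expected_value_per_share(p_true: float, price: float) -> float:
    """EV of one binary share bought at `price`, paying 1.0 with prob p_true."""
    # win: gain (1 - price); lose: forfeit price. Simplifies to p_true - price.
    return p_true * (1.0 - price) - (1.0 - p_true) * price


def kelly_fraction(p_true: float, price: float) -> float:
    """Full-Kelly bankroll fraction for the same bet; 0.0 when there is no edge."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    return max(0.0, (p_true - price) / (1.0 - price))


ev = expected_value_per_share(0.62, 0.50)
kelly = kelly_fraction(0.62, 0.50)
print(f"EV per share: {ev:.2f}, full Kelly: {kelly:.2f}")
```

With p_true = 0.62 and an entry at 0.50, that works out to roughly 0.12 of expected value per share and a full-Kelly fraction of about 0.24. The edge shrinks quickly as the entry price approaches p_true.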
The threshold needs recalibration for other assets. ETH has a different noise floor than BTC. SOL and XRP are noisier at shorter windows.
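A calibration pass like the one above can be sketched as a sweep over (threshold, window) pairs on recorded ticks. This is an illustrative sketch, not the original calibration code: the tick series is synthetic, and counting one signal per burst rather than one per tick is my choice.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Tuple

PriceTick = Tuple[float, float]  # (unix seconds, price)


def count_signals(ticks: Iterable[PriceTick], threshold: float,
                  window_secs: float) -> int:
    """Count rising-edge threshold crossings: one per burst, not one per tick."""
    window: Deque[PriceTick] = deque()
    count = 0
    active = False  # True while the move stays beyond the threshold
    for ts, price in ticks:
        window.append((ts, price))
        while window[0][0] < ts - window_secs:
            window.popleft()
        move = (price - window[0][1]) / window[0][1]
        beyond = abs(move) >= threshold
        if beyond and not active:
            count += 1
        active = beyond
    return count


def sweep(ticks: List[PriceTick],
          grid: List[Tuple[float, float]]) -> Dict[Tuple[float, float], int]:
    """Signal count for every (threshold, window_secs) pair in the grid."""
    return {(thr, win): count_signals(ticks, thr, win) for thr, win in grid}


# Synthetic series: one tick per second with two step moves
ticks = [(float(t), 100.0 if t < 60 else 100.2 if t < 120 else 100.55)
         for t in range(180)]
grid = [(0.0015, 30.0), (0.003, 60.0), (0.005, 60.0)]
result = sweep(ticks, grid)
print(result)
```

Running the same sweep on recorded ETH or SOL ticks is the quickest way to see how far their noise floor sits from BTC's.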
Stage 3: Signal Guard
Without a signal guard, a sustained 3-minute BTC rally fires 3 separate signals — and the bot opens 3 long positions on what is effectively one trade. The signal guard prevents this.
```python
import time


class SignalGuard:
    COOLDOWN_SECS = 120  # 2 minutes

    def __init__(self):
        self._last_direction: Optional[Direction] = None
        self._last_signal_ts: float = 0

    def should_trade(self, direction: Direction) -> bool:
        now = time.time()
        time_since_last = now - self._last_signal_ts
        # Same direction within cooldown: suppress
        if (direction == self._last_direction and
                time_since_last < self.COOLDOWN_SECS):
            return False
        self._last_direction = direction
        self._last_signal_ts = now
        return True
```

The guard resets on direction change. A BTC DOWN signal immediately after a BTC UP position is a new, independent signal, not a duplicate. Only same-direction signals within the cooldown window are suppressed.
The 120-second cooldown is calibrated to the 5-minute market duration. A cooldown shorter than 60 seconds lets the bot stack positions on a single sustained move — you end up with three long entries that are really one trade with 3x the risk. A cooldown longer than 180 seconds suppresses legitimate second signals in volatile markets where BTC can make two independent moves within the same 5-minute window. The 120-second sweet spot prevents stacking while preserving the ability to trade a genuine reversal.
In practice, the signal guard prevents roughly 30-40% of raw signals from reaching the executor. This sounds like a lot of missed trades, but those suppressed signals are almost always duplicates of a sustained move — entering a second position on the same momentum rarely adds edge and always adds risk. The guard is one of the highest-value components per line of code in the entire pipeline.
Stage 4: CLOB Order Executor
The executor receives a validated signal and places a maker order on the Polymarket CLOB. The BASE_BET_USDC here is fixed for simplicity — in production, a self-tuning system adjusts this value dynamically based on recent win rate.
```python
import asyncio
from typing import Optional

from py_clob_client.client import ClobClient
from py_clob_client.clob_types import OrderArgs
from py_clob_client.order_builder.constants import BUY


class CLOBExecutor:
    MIN_SHARES = 5  # Polymarket minimum
    BASE_BET_USDC = 15  # dollar amount per trade

    def __init__(self, client: ClobClient, market_registry):
        self._client = client
        self._registry = market_registry

    async def execute(self, direction: Direction) -> Optional[str]:
        market = self._registry.get_active_5m_btc_market(direction)
        if not market:
            return None
        # Check market has enough time remaining
        if market.secs_remaining <= 30:
            return None
        # Compute shares from dollar amount
        mid_price = market.mid_price
        shares = self.BASE_BET_USDC / mid_price
        if shares < self.MIN_SHARES:
            return None
        order_args = OrderArgs(
            token_id=market.token_id,
            price=mid_price,
            size=round(shares, 2),
            side=BUY,
        )
        # The SDK call is synchronous; run it in a worker thread so the
        # event loop keeps receiving ticks while the request is in flight.
        result = await asyncio.to_thread(
            self._client.create_and_post_order, order_args
        )
        # Assumes the SDK returns the parsed JSON response as a dict with an
        # "orderID" key; adjust if your SDK version returns another shape.
        return result.get("orderID") if isinstance(result, dict) else None
```

The expiry check (secs_remaining <= 30) prevents placing orders on markets that are about to close. Without it, you fill on a market that resolves 10 seconds later and cannot place an exit if needed. I learned this the hard way: an early version of the bot placed an order with 8 seconds remaining, got filled, and the market resolved before I could even check the position status. The capital was locked for the full resolution cycle.
The create_and_post_order call is critical. The Polymarket SDK has both create_order and create_and_post_order. The first only builds the order object locally without submitting it — a naming trap that cost me an evening of debugging when orders appeared to succeed but never showed up on the CLOB. Always use create_and_post_order for actual submission.
Wiring It Together: Async Event Loop
The full pipeline runs on a single asyncio event loop. The WebSocket receive loop, momentum detection, and CLOB placement all happen asynchronously without blocking each other.
```python
import logging

log = logging.getLogger(__name__)


async def main():
    # `client` (a ClobClient) and `registry` are assumed to be built elsewhere
    detector = MomentumDetector()
    guard = SignalGuard()
    executor = CLOBExecutor(client, registry)

    async def on_tick(tick: Tick):
        direction = detector.update(tick)
        if direction is None:
            return
        if not guard.should_trade(direction):
            return
        order_id = await executor.execute(direction)
        if order_id:
            log.info(f"Placed order {order_id} - {direction.value}")

    stream = BinanceStream(on_tick)
    await stream.run()


asyncio.run(main())
```

The event loop is single-threaded. on_tick is called for every Binance trade tick, and the CLOB placement is awaited inline: if the CLOB call takes 50ms, ticks that arrive in the meantime are processed only after it completes.
For strategies with heavier logic, you can offload placement to a separate task:
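```python
pending: set[asyncio.Task] = set()  # strong references keep tasks alive


async def on_tick(tick: Tick):
    direction = detector.update(tick)
    if direction and guard.should_trade(direction):
        task = asyncio.create_task(executor.execute(direction))
        pending.add(task)
        task.add_done_callback(pending.discard)
```

This lets tick processing continue while placement runs in the background. The set holds a reference to each task until it finishes, because the event loop itself keeps only weak references and an unreferenced task can be garbage-collected mid-flight.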
Latency Breakdown
On a well-tuned Amsterdam VPS, the pipeline latency from Binance tick to CLOB order acknowledgement:
| Stage | Latency |
|---|---|
| Binance WS → Python receive | 1-5ms |
| Deque update + threshold check | <1ms |
| Signal guard check | <1ms |
| CLOB order construction | <1ms |
| CLOB API round trip (Amsterdam→London) | 5-12ms |
| Total | ~10-20ms |
20ms from Binance tick to CLOB order. The Polymarket repricing lag is 30-90 seconds. You have 30,000-90,000ms of edge window, and you use 20ms of it.
The math still holds even with US East latency (130-150ms total). The bottleneck is not network — it is whether anyone else detected the signal first.
What Can Go Wrong
WebSocket disconnects during a live position. Your position is still open, but you have no incoming price data to decide when to exit. Solution: implement position health checks on reconnect that query CLOB for open positions before resuming signal processing.
Market not found for a signal direction. The 5-minute BTC market rolls over every 5 minutes. If a signal fires at minute 4:58, the current market might have 2 seconds remaining. Your registry needs to handle market lookup gracefully with a fallback to the next available market.
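One way to express that fallback is a small selection function. This is a sketch under assumptions: the `Market` dataclass is a hypothetical stand-in for whatever the registry actually stores, and the 30-second cutoff mirrors the executor's expiry check.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_SECS_REMAINING = 30  # same cutoff the executor uses


@dataclass
class Market:  # hypothetical shape; real registry fields may differ
    token_id: str
    end_ts: float  # unix seconds at which the market resolves

    def secs_remaining(self, now: float) -> float:
        return self.end_ts - now


def pick_market(markets: List[Market], now: float) -> Optional[Market]:
    """Return the soonest-resolving market that still has enough time left."""
    tradable = [m for m in markets if m.secs_remaining(now) > MIN_SECS_REMAINING]
    return min(tradable, key=lambda m: m.end_ts, default=None)


now = 1_700_000_298.0
markets = [
    Market("current", end_ts=now + 2),   # about to resolve: skipped
    Market("next", end_ts=now + 302),    # next 5-minute market: chosen
]
chosen = pick_market(markets, now)
```

Returning None when nothing qualifies keeps the decision to skip the trade explicit in the caller.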
CLOB rate limits. If you place too many orders in rapid succession, the CLOB returns rate limit errors. The signal guard helps here, but also implement a per-minute order count limiter with an asyncio.sleep backoff.
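A per-minute limiter of that kind might look like the sketch below. The cap of 10 orders per 60 seconds is an arbitrary placeholder, not a documented Polymarket limit, and `OrderRateLimiter` is a hypothetical name.

```python
import asyncio
import time
from collections import deque
from typing import Callable, Deque


class OrderRateLimiter:
    """Allow at most max_orders per rolling period; wait otherwise."""

    def __init__(self, max_orders: int = 10, period_secs: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._max = max_orders
        self._period = period_secs
        self._clock = clock
        self._sent: Deque[float] = deque()  # timestamps of recent orders

    async def acquire(self) -> None:
        while True:
            now = self._clock()
            # drop timestamps that have left the rolling window
            while self._sent and now - self._sent[0] >= self._period:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return
            # sleep until the oldest order leaves the window, then re-check
            await asyncio.sleep(self._period - (now - self._sent[0]))
```

Calling `await limiter.acquire()` immediately before the order submission keeps the waiting on the event loop, so tick processing continues while an order is held back.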
Silent order rejections. The CLOB occasionally rejects orders with a 200 OK but no order_id in the response. Always check the return value — do not assume success from HTTP status alone.
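A defensive check on the response could be sketched as follows. The field names `success`, `errorMsg`, and `orderID` are my assumption about the response shape, so verify them against what your SDK version actually returns.

```python
from typing import Any, Optional


def extract_order_id(response: Any) -> Optional[str]:
    """Return the order id, or None if the order was not clearly accepted.

    Assumption: the response is a dict with an "orderID" key and, on
    failure, a false "success" flag or a non-empty "errorMsg".
    """
    if not isinstance(response, dict):
        return None
    if response.get("success") is False or response.get("errorMsg"):
        return None
    order_id = response.get("orderID")
    return order_id if isinstance(order_id, str) and order_id else None
```

Anything that returns None here should be logged with the full response body, since that is the only evidence of why the order never reached the book.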
Testing the Pipeline Before Going Live
Before deploying this pipeline with real money, you need a systematic way to validate that every stage works correctly in isolation and as a connected system. The testing approach I used saved me from at least three bugs that would have cost real capital.
Start with the momentum detector. Feed it historical tick data from the Binance REST API and verify that signals fire at the expected timestamps. I downloaded 24 hours of aggTrade data for three different volatility regimes: a quiet Sunday, a normal weekday, and a day with a major price move. The detector should produce zero signals on the quiet day, three to eight on the normal day, and ten to fifteen on the volatile day. If the numbers are wildly different, your threshold is miscalibrated.
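A replay check along those lines can be sketched without touching the network. It assumes rows in the shape Binance's REST aggTrades endpoint returns (price string under "p", trade time in milliseconds under "T"); the sample rows are hand-written, and `replay` is a hypothetical helper that records only the first tick of each signal.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


def replay(agg_trades: List[Dict], threshold: float = 0.003,
           window_secs: float = 60.0) -> List[Tuple[float, str]]:
    """Replay aggTrades rows; return (timestamp, direction) per new signal."""
    window: Deque[Tuple[float, float]] = deque()
    signals: List[Tuple[float, str]] = []
    last: Optional[str] = None  # direction on the previous tick, if any
    for row in sorted(agg_trades, key=lambda r: r["T"]):
        ts, price = row["T"] / 1000, float(row["p"])
        window.append((ts, price))
        while window[0][0] < ts - window_secs:
            window.popleft()
        move = (price - window[0][1]) / window[0][1]
        direction = ("UP" if move >= threshold
                     else "DOWN" if move <= -threshold else None)
        if direction and direction != last:
            signals.append((ts, direction))
        last = direction
    return signals


BASE_MS = 1_700_000_000_000
rows = [
    {"T": BASE_MS, "p": "100.00"},
    {"T": BASE_MS + 30_000, "p": "100.10"},
    {"T": BASE_MS + 50_000, "p": "100.35"},   # +0.35% within 60s
    {"T": BASE_MS + 200_000, "p": "100.35"},
    {"T": BASE_MS + 230_000, "p": "99.90"},   # about -0.45% within 60s
]
signals = replay(rows)
```

Comparing the returned timestamps against moves you can see on a chart is a quick sanity check before trusting the per-day counts.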
Next, test the signal guard in isolation. Create a synthetic sequence of signals: UP, UP, UP (should suppress the second and third), then DOWN (should pass through immediately), then DOWN again within two minutes (should suppress). This catches the most common guard bug, which is accidentally suppressing direction changes along with duplicates.
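That synthetic sequence is easy to run if the guard's clock can be injected. The sketch below uses a hypothetical `ClockedSignalGuard` with the same suppression logic as the guard above, except that it takes a `clock` callable instead of calling time.time() directly.

```python
import time
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Direction(Enum):
    UP = "UP"
    DOWN = "DOWN"


class ClockedSignalGuard:
    COOLDOWN_SECS = 120

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        self._last_direction: Optional[Direction] = None
        self._last_signal_ts: float = 0.0

    def should_trade(self, direction: Direction) -> bool:
        now = self._clock()
        # same direction within cooldown: suppress
        if (direction == self._last_direction
                and now - self._last_signal_ts < self.COOLDOWN_SECS):
            return False
        self._last_direction = direction
        self._last_signal_ts = now
        return True


def run_sequence(steps: List[Tuple[float, Direction]]) -> List[bool]:
    """Feed (seconds_offset, direction) pairs through a guard on a fake clock."""
    now = [1000.0]  # mutable cell read by the fake clock
    guard = ClockedSignalGuard(clock=lambda: now[0])
    results = []
    for offset, direction in steps:
        now[0] = 1000.0 + offset
        results.append(guard.should_trade(direction))
    return results


UP, DOWN = Direction.UP, Direction.DOWN
results = run_sequence([(0, UP), (10, UP), (20, UP),
                        (30, DOWN), (60, DOWN), (151, DOWN)])
```

The last step adds one case beyond the sequence described above: a same-direction signal after the cooldown has expired, which should pass.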
The executor is the hardest to test without real money. I built a dry-run mode that constructs the order object and logs it without calling the CLOB API. This validates the share calculation, the minimum share check, and the expiry guard. Run the dry-run executor against live market data for a full trading day and manually verify that every logged order makes sense: correct direction, reasonable share count, valid token ID, and sufficient time remaining on the market.
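A dry-run path can be as small as a function that plans the order and logs it. This is a sketch, not the original dry-run mode: `MarketSnapshot` and `PlannedOrder` are hypothetical types, and the price sanity check is my addition.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("dry_run")


@dataclass
class MarketSnapshot:  # hypothetical stand-in for the registry's market object
    token_id: str
    mid_price: float
    secs_remaining: float


@dataclass
class PlannedOrder:
    token_id: str
    price: float
    size: float


def plan_order(market: MarketSnapshot, bet_usdc: float = 15.0,
               min_shares: float = 5.0,
               min_secs: float = 30.0) -> Optional[PlannedOrder]:
    """Apply the executor's checks and log the order instead of sending it."""
    if market.secs_remaining <= min_secs:
        log.info("skip %s: %.0fs remaining", market.token_id, market.secs_remaining)
        return None
    if not 0.0 < market.mid_price < 1.0:  # added sanity check
        log.info("skip %s: implausible mid %.4f", market.token_id, market.mid_price)
        return None
    shares = bet_usdc / market.mid_price
    if shares < min_shares:
        log.info("skip %s: %.2f shares below minimum", market.token_id, shares)
        return None
    order = PlannedOrder(market.token_id, market.mid_price, round(shares, 2))
    log.info("DRY RUN would place %s", order)
    return order


planned = plan_order(MarketSnapshot("tok", mid_price=0.5, secs_remaining=120))
```

Logging the reason for every skipped order is what makes the later manual review practical.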
Finally, run the full pipeline end-to-end in dry-run mode for at least 48 hours. Watch for three specific failure patterns. First, zombie signals where the detector fires but the guard or executor silently drops the signal without logging why. Second, phantom orders where the executor logs an order that should have been suppressed by the guard. Third, timing failures where the executor tries to place an order on an expired market because the registry returned stale data.
The 48-hour window matters because it covers both high-volatility and low-volatility periods. A pipeline that works perfectly during active US trading hours might break during the Asian session when tick frequency drops and the rolling window behaves differently with sparse data.
Monitoring in Production
Once the pipeline is live, you need real-time visibility into every stage. Logging alone is not sufficient because you cannot watch logs continuously, and by the time you notice a problem in the logs, the damage is already done.
I built a lightweight health monitor that tracks four metrics. Signal rate measures how many raw signals the detector produces per hour. A sudden drop to zero means the WebSocket is probably disconnected. Guard pass rate measures what percentage of signals survive the guard. If this drops below fifty percent, the cooldown might be too aggressive for current market conditions. Executor success rate measures what percentage of passed signals result in a confirmed order on the CLOB. A drop here usually means an API issue or an allowance problem. Average latency measures the time from tick receipt to order acknowledgment. If this creeps above 100 milliseconds, something in the pipeline is blocking.
These four metrics tell you the health of the entire pipeline at a glance. I push them to a Slack channel every fifteen minutes during active trading hours. Any metric that crosses a threshold triggers an immediate alert. The alerting is simple: if signal rate hits zero for fifteen minutes during US market hours, something is wrong and the bot needs attention.
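The four metrics reduce to a handful of counters. The sketch below is one possible shape, with a hypothetical `PipelineHealth` class; the alert thresholds (zero signals, 50% pass rate, 100 ms latency) are the ones mentioned above, and delivery to Slack is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PipelineHealth:
    raw_signals: int = 0
    passed_guard: int = 0
    confirmed_orders: int = 0
    latencies_ms: List[float] = field(default_factory=list)

    def record(self, passed: bool, confirmed: bool = False,
               latency_ms: Optional[float] = None) -> None:
        """Record one raw signal and what happened to it downstream."""
        self.raw_signals += 1
        self.passed_guard += int(passed)
        self.confirmed_orders += int(confirmed)
        if latency_ms is not None:
            self.latencies_ms.append(latency_ms)

    def summary(self, hours: float) -> dict:
        lat = self.latencies_ms
        return {
            "signal_rate_per_hour": self.raw_signals / hours,
            "guard_pass_rate": (self.passed_guard / self.raw_signals
                                if self.raw_signals else None),
            "executor_success_rate": (self.confirmed_orders / self.passed_guard
                                      if self.passed_guard else None),
            "avg_latency_ms": sum(lat) / len(lat) if lat else None,
        }

    def alerts(self, hours: float) -> List[str]:
        s = self.summary(hours)
        out = []
        if s["signal_rate_per_hour"] == 0:
            out.append("no signals: check the WebSocket")
        if s["guard_pass_rate"] is not None and s["guard_pass_rate"] < 0.5:
            out.append("guard pass rate below 50%")
        if s["avg_latency_ms"] is not None and s["avg_latency_ms"] > 100:
            out.append("average latency above 100 ms")
        return out


health = PipelineHealth()
health.record(passed=True, confirmed=True, latency_ms=20.0)
health.record(passed=False)
health.record(passed=False)
report = health.summary(hours=1.0)
```

Resetting the counters after each report keeps every summary scoped to a single interval.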
The most valuable metric over time turned out to be the guard pass rate. When market regime shifts from trending to ranging, the pass rate changes significantly. During strong trends, the guard suppresses sixty percent of signals because the same directional move generates multiple triggers. During choppy markets, the pass rate rises to eighty percent because signals alternate direction frequently. Tracking this ratio over weeks gave me insight into when the strategy was in a favorable regime versus when I should expect lower returns.
Signal Frequency Expectations
On active BTC trading days:
- 3-8 signals with 0.3%/60s threshold
- Each signal potentially captures a 5-minute market (or what remains of it)
- 1-3 of those signals will be suppressed by the guard as duplicates of a sustained move
On quiet days (low volatility): 0-2 signals. This is fine. The strategy is about edge quality, not trade frequency.
FAQ
Why use Binance as the signal source instead of Polymarket itself?
Polymarket's 5-minute BTC markets reprice in response to Binance — not the other way around. Binance spot is the primary market. Monitoring it directly gives you the earliest possible signal before Polymarket market makers update their quotes.
What is the aggTrade stream and why is it better than klines?
aggTrade streams every individual trade as it happens. Klines aggregate trades into OHLCV candles, introducing up to 60 seconds of delay depending on candle interval. For latency-sensitive strategies, aggTrade is the only option.
How do you prevent the bot from trading the same signal twice?
A signal guard tracks the last signal direction and timestamp. If a new signal fires in the same direction within a cooldown window (typically 2 minutes), it is suppressed. This prevents stacking multiple positions on a single sustained move.
What happens when the Binance WebSocket disconnects?
Implement reconnect logic with exponential backoff. On reconnect, reseed the rolling window from REST API klines for the last 60 seconds before resuming. Without reseeding, the first 60 seconds after reconnect are blind.
Should I run the Binance listener and CLOB executor in separate processes?
For simple strategies, a single asyncio event loop handles both cleanly. For strategies with heavy order management logic, separate processes with a queue improve isolation and prevent order handling from blocking signal detection.
Sources & Further Reading
Sources
- Binance WebSocket API — Aggregate Trade Streams Official documentation for the aggTrade WebSocket stream used as the signal source.
- Python asyncio — Event Loop Reference for the async event loop pattern used to multiplex WebSocket and CLOB operations.
- Polymarket py_clob_client Python SDK for CLOB order placement on Polymarket.
Further Reading
- Polymarket Bot: Build a Python Trader in 2 Hours Build a Polymarket arbitrage bot: capture price gaps between Binance and CLOB in milliseconds. Full code walkthrough from WebSocket to execution.
- Self-Tuning Position Sizing in Python: 5 Adaptive Rules How to build a self-tuning position sizing system that adjusts bet size based on recent performance — without overfitting or over-reacting to variance.
- Directional Betting Math: 3 Profitable Strategies Learn Kelly criterion and position sizing for binary markets. Stop losing money on directional bets without a math-backed strategy.