System Architecture

Technical overview of how the Kalshi Unusual Market Activity Monitor is built and deployed.

Kalshi API ──▶ Ingestion ──▶ PostgreSQL ──▶ Scoring ──▶ Alerts ──▶ Email
                   │                           │           │
              Every 5 min                 Every 5 min  Every 5 min
              (markets,                   (6 signals   (if score
               trades,                     computed,    exceeds
               orderbooks)                 weighted,    threshold)
                                           labeled)
Tech Stack
| Layer | Technology |
|---|---|
| Web Framework | FastAPI (async Python) |
| Server | Uvicorn |
| Database | PostgreSQL via asyncpg + SQLAlchemy |
| Scheduler | APScheduler (in-process) |
| Email | Resend API |
| Templates | Jinja2 + Pico CSS |
| Hosting | Railway (Hobby plan, auto-deploy from GitHub) |
Data Pipeline
Data flows through five stages on a continuous loop:
1. Ingestion Layer
The ingestion layer pulls data from the Kalshi public API and stores it in PostgreSQL.
Market Sync
Streams all open markets from Kalshi page-by-page via async generators (200 per batch) to stay under the 512MB RAM limit. Skips the Sports category. Maps each market to its event category. Marks markets as eligible or ineligible based on volume and open interest thresholds.
Trade Sync
For each eligible market, fetches recent trades since the last known timestamp. Stores individual trade records with price, size, side, and timestamp.
Orderbook Sync
Snapshots the current bid/ask/spread for eligible markets. Hardened against malformed API data with try/except guards around float conversions.
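The float hardening described above can be sketched as a small guard; the helper name is illustrative, not the project's actual function:

```python
def safe_float(value, default=None):
    """Parse a possibly malformed API field into a float.

    Returns `default` instead of raising when the value is missing
    or not numeric, so one bad orderbook row can't abort the sync.
    """
    try:
        return float(value)
    except (TypeError, ValueError):
        return default
```

With this in place, a snapshot row with a missing or garbage bid simply stores `None` (or a chosen default) rather than crashing the whole sync pass.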
API Client
Async httpx client with cursor-based pagination, automatic rate limit handling (auto-retry on HTTP 429), and batched async generators to keep memory usage low.
2. Scoring Engine
The core intelligence. Computes a composite anomaly score (0–100) for each eligible market using six weighted signals. Each signal compares the market's current behavior to its own 7-day historical baseline.
| Signal | Weight | What It Detects |
|---|---|---|
| Trade Size | 30% | Individual trades that are abnormally large vs. the market's historical baseline |
| Price Impact | 25% | Price moving more than expected given trade volume |
| Liquidity | 20% | Spread widening or thinning books — someone draining liquidity |
| Clustering | 15% | Bursts of trades concentrated in a short time window |
| Timing | 5% | Activity during off-hours when informed traders tend to operate |
| Cross-Market | 5% | Correlated unusual activity across related markets in the same event |
A confirmation gate prevents false positives: at least one primary signal (Trade Size or Price Impact) must score 60+ and one secondary signal must score 40+ for a market to reach "High" or above. Markets failing the gate are capped at 54.99.
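The weighting and gate can be sketched as follows, assuming each signal has already been normalized to 0-100 (the dict keys are illustrative names, not the project's identifiers):

```python
WEIGHTS = {
    "trade_size": 0.30, "price_impact": 0.25, "liquidity": 0.20,
    "clustering": 0.15, "timing": 0.05, "cross_market": 0.05,
}
PRIMARY = {"trade_size", "price_impact"}

def composite_score(signals: dict) -> float:
    """Weighted 0-100 composite with the confirmation gate applied.

    Without a primary signal at 60+ AND a secondary signal at 40+,
    the result is capped at 54.99, below the "High" band.
    """
    score = sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)
    primary_ok = any(signals[s] >= 60 for s in PRIMARY)
    secondary_ok = any(signals[s] >= 40 for s in WEIGHTS if s not in PRIMARY)
    if not (primary_ok and secondary_ok):
        score = min(score, 54.99)
    return round(score, 2)
```

The cap means a market can be loudly anomalous on secondary signals alone and still never cross the alert threshold, which is the point of the gate.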
For full scoring details, see How It Works.
3. Alert System
Alert Generator
After each scoring cycle, checks if any market exceeds the user-configured score threshold (default: 70). Includes deduplication: won't re-alert the same market within 4 hours unless the score jumps by 15+ points (escalation detection).
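The deduplication rule amounts to one comparison against the market's last alert; a sketch with the current time injected for clarity (function and parameter names are illustrative):

```python
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(hours=4)
ESCALATION_JUMP = 15

def should_alert(score: float, last_alert, now: datetime) -> bool:
    """Suppress repeat alerts inside the 4-hour window unless the
    score has escalated by 15+ points since the last alert.

    `last_alert` is None or a (timestamp, score) tuple.
    """
    if last_alert is None:
        return True
    last_ts, last_score = last_alert
    if now - last_ts >= DEDUP_WINDOW:
        return True
    return score - last_score >= ESCALATION_JUMP
```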
Email Delivery
Sends via the Resend API using async httpx. Supports multiple comma-separated recipients. Two email types:
- Instant alerts — fired per-market the moment a threshold is crossed
- Daily recap — top 10 alerts from the past 24 hours, sent at 8pm ET
4. Database Schema
PostgreSQL with connection resilience: pool_pre_ping, 300s pool recycle, and 5-attempt startup retry with backoff.
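A sketch of those resilience settings using SQLAlchemy's async API; the DSN and helper names are illustrative, and in practice the URL comes from the environment:

```python
import asyncio

from sqlalchemy.ext.asyncio import create_async_engine

def make_engine(url: str):
    """Engine configured with the resilience settings described above."""
    return create_async_engine(
        url,
        pool_size=3,           # small pool for the 512MB container
        max_overflow=5,
        pool_pre_ping=True,    # probe connections before handing them out
        pool_recycle=300,      # recycle connections older than 300 seconds
    )

async def retry(op, attempts: int = 5, base_delay: float = 2.0):
    """Run `op` up to `attempts` times with exponential backoff,
    re-raising only after the final attempt fails."""
    for attempt in range(attempts):
        try:
            return await op()
        except Exception:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)
```

At startup the app would wrap its first connection in `retry`, so a slow-starting Postgres service doesn't kill the container.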
| Table | Purpose |
|---|---|
| markets | All tracked markets — ticker, title, category, price, volume, eligibility flags |
| trades | Individual trade records — price, size, side, timestamp |
| orderbook_snapshots | Point-in-time bid/ask/spread snapshots |
| alert_scores | Every scoring cycle result per market — all 6 sub-scores + composite |
| baselines | Rolling 7-day statistical baselines per market |
| alerts | Generated alerts with deduplication tracking |
| user_alert_states | Watch/dismiss states per market |
| user_settings | Key-value store for email, threshold, and recap preferences |
5. Web Interface
Server-rendered HTML with Jinja2 templates and Pico CSS. HTMX provides interactive features (watch/dismiss buttons) without a JavaScript framework.
| Page | Route | Purpose |
|---|---|---|
| Dashboard | / | Live table of markets scored Medium+, sorted by score |
| Market Detail | /market/{ticker} | Deep dive: score breakdown bars, score history, recent trades, explanation text |
| Watched | /watched | User's personal watchlist |
| Alert History | /alerts | All generated alerts with label filtering |
| How It Works | /about | Full scoring model documentation |
| Architecture | /architecture | This page — system technical overview |
| Settings | /settings | Email config, threshold, daily recap toggle |
| Health Check | /ping | Returns alive status and database type |
All timestamps display in 12-hour AM/PM Eastern time via a custom Jinja2 filter that converts UTC to US/Eastern with proper EST/EDT handling.
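That filter is a plain function registered on the Jinja2 environment; a sketch, assuming stored timestamps are naive UTC (the format string is illustrative):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("US/Eastern")

def eastern(dt: datetime) -> str:
    """Jinja2 filter: render a UTC datetime in 12-hour Eastern time.

    ZoneInfo picks the correct EST or EDT offset for the given date,
    so no hardcoded -4/-5 offset is needed.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumed: DB stores naive UTC
    return dt.astimezone(EASTERN).strftime("%b %d, %I:%M %p %Z")
```

Registration is one line on the app's template environment, e.g. `env.filters["eastern"] = eastern`, after which templates can write `{{ alert.created_at | eastern }}`.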
6. Background Scheduler
All background jobs run in-process via APScheduler. Each job is wrapped in an independent try/except so one failure doesn't cascade to the others.
| Job | Interval | What It Does |
|---|---|---|
| Market Sync | 5 min | Fetch and upsert all open markets from Kalshi |
| Data Ingestion | 5 min | Fetch trades and orderbook snapshots for eligible markets |
| Scoring Cycle | 5 min | Run the 6-signal model on all eligible markets |
| Alert Generation | 5 min | Check scores, create alerts, send emails |
| Baseline Computation | 6 hours | Recompute 7-day rolling baselines per market |
| Data Retention | 24 hours | Purge trades and snapshots older than 14 days |
| Daily Recap Email | Daily 8pm ET | Send recap of top alerts from the past 24 hours |
7. Deployment
GitHub push ──▶ Railway auto-deploy ──▶ Docker build ──▶ Container start
├── Uvicorn (web server)
├── APScheduler (background jobs)
└── PostgreSQL (Railway service)
Single-container architecture — the web server and all background jobs run in one process. This keeps costs minimal ($5/month on Railway Hobby) but means a redeploy briefly pauses both the UI and data ingestion.
Memory Management
Railway's container has a 512MB RAM limit. To stay within this:
- Market data is streamed page-by-page via async generators instead of loaded all at once
- Sports markets are skipped entirely during ingestion (they account for a large portion of Kalshi's catalog)
- gc.collect() is called after each batch to free memory promptly
- Database connection pool is kept small (3 connections + 5 overflow)
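The batch-then-collect pattern looks roughly like this; `process_in_batches` is an illustrative name for the loop that drives ingestion:

```python
import gc

async def process_in_batches(batches, handle):
    """Consume an async generator of pages one batch at a time,
    releasing each batch before fetching the next so peak memory
    stays near one 200-market page rather than the full catalog."""
    async for batch in batches:
        await handle(batch)
        del batch       # drop the last reference to this page
        gc.collect()    # reclaim promptly inside the 512MB container
```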
Resilience
- Database: 5-attempt startup retry with exponential backoff, pool_pre_ping to detect stale connections, 300s pool recycle
- API: Automatic retry on rate limits (HTTP 429) with backoff
- Jobs: Each scheduled job runs in independent try/except — one failure won't take down the others
- Timezone: Uses ZoneInfo("US/Eastern") for proper EST/EDT handling instead of hardcoded offsets