Prediction market API rate limits 2026

Name: Prediction market API rate limits 2026
Creator: OpenChainBench
Published: 2026-06-16T11:48:27.000Z
License: https://creativecommons.org/licenses/by/4.0/

doi:10.5281/zenodo.20800312

Prediction market API rate limits, tested with a daily ramp

Warm latency on book, price and list endpoints of five prediction market venue APIs, plus a daily rate limit ramp that records added latency and throttle onset per request tier.

TL;DR. As of 2026-08-01, Polymarket leads warm book latency at 200 ms (p50, 24h) in the benchmark "Prediction market API rate limits, tested with a daily ramp". Source: OpenChainBench, https://openchainbench.com/benchmarks/pm-rate-limits.

Read this carefully

Ramp tiers stay well inside documented budgets where they exist, at most 7 percent of Polymarket's published book allowance, and every run aborts at the first sign of sustained errors. This benchmark measures throttle onset behaviour at polite request rates, not maximum venue throughput.

Every prediction market venue documents its API rate limits differently, and some do not document them at all. Polymarket publishes per-endpoint budgets but fronts its CLOB with Cloudflare, which queues bursts instead of rejecting them. Kalshi documents a token bucket per access tier. Manifold publishes a per-IP budget and explicitly welcomes bots. Limitless documents nothing. Builders sizing a polling loop or a trading bot against these APIs need two numbers nobody publishes: the real latency of the hot endpoints at a polite request rate, and what actually happens as the request rate climbs. This benchmark measures both. A harness in three regions probes the book, price and list endpoints of each venue continuously over warm connections. Once a day per venue it also runs a short ramp, 60 seconds per tier at rising request rates, recording added latency versus the same-hour baseline and the onset of any 429 or 5xx responses. The leaderboard sorts by warm book endpoint p50; the ramp results render as companion panels.
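To make "added latency versus the same-hour baseline" concrete, here is a minimal sketch. It assumes added latency is the difference between the median of a ramp tier and the median of the baseline samples from the same hour; the page does not spell out the exact statistic, and the function name and sample values are illustrative only.

```python
from statistics import median


def added_latency_ms(tier_samples_ms, baseline_samples_ms):
    """Median latency of a ramp tier minus the median of the same-hour baseline.

    Assumption: a difference of medians. The benchmark may use another statistic.
    """
    if not tier_samples_ms or not baseline_samples_ms:
        raise ValueError("both sample lists must be non-empty")
    return median(tier_samples_ms) - median(baseline_samples_ms)


# Made-up round-trip times in milliseconds, for illustration only.
baseline = [198.0, 201.0, 205.0, 199.0, 202.0]
tier = [230.0, 241.0, 236.0, 228.0, 250.0]

print(added_latency_ms(tier, baseline))  # 35.0
```

A positive value means the tier was slower than the baseline; a value near zero means the venue absorbed the extra request rate without visible queuing.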

Methodology

Each venue is probed on three endpoint classes: book (the order book of a pinned market, or its closest equivalent), price (a single market quote) and list (the venue's market listing). Probes run every 5 to 7 seconds over a warm keep-alive pool, with a separate cold-connect probe once a minute that forces a fresh TCP and TLS handshake. Every sample records full round trip, time to first byte and a CDN cache flag taken from the response headers, because some venues serve part of their API from an edge cache and a cached read says nothing about the API origin. The pinned market is re-selected daily at 00:00 UTC (most liquid, near the money, expiring more than 24 hours out) and is re-pinned immediately when a venue starts answering with errors that are really our pin's fault, such as a market that resolved intraday. Those samples are classified probe_invalid and never count against the venue. The daily ramp fires from one region at a time on disjoint UTC hours, 60 seconds per tier at rates that stay well inside documented budgets where they exist, and aborts the moment throttled plus server errors exceed 1 percent of the requests in a 10-second window.
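The abort rule can be sketched as a sliding-window guard. This is an illustration under stated assumptions, not the harness's actual code: the class name AbortGuard is hypothetical, "throttled" is taken to mean HTTP 429, "server errors" to mean any 5xx, and the 1 percent threshold is read as a share of the requests seen in the last 10 seconds.

```python
from collections import deque


class AbortGuard:
    """Tracks responses in a sliding time window and signals when to abort."""

    def __init__(self, window_s=10.0, max_error_ratio=0.01):
        self.window_s = window_s
        self.max_error_ratio = max_error_ratio
        self._events = deque()  # (timestamp_s, is_error) pairs, oldest first

    def record(self, timestamp_s, status):
        # Assumption: 429 counts as throttled, 500-599 as server error.
        is_error = status == 429 or 500 <= status <= 599
        self._events.append((timestamp_s, is_error))
        # Drop events that have fallen out of the window.
        cutoff = timestamp_s - self.window_s
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()

    def should_abort(self):
        if not self._events:
            return False
        errors = sum(1 for _, is_error in self._events if is_error)
        return errors / len(self._events) > self.max_error_ratio


guard = AbortGuard()
for i in range(100):
    guard.record(i * 0.1, 200)   # 100 clean responses over about 10 s
guard.record(9.95, 429)          # 1 error in 101 requests: still under 1 percent
first = guard.should_abort()
guard.record(9.96, 503)          # 2 errors in 102 requests: over 1 percent
second = guard.should_abort()
print(first, second)  # False True
```

Timestamps are passed in explicitly so the guard can be exercised without real network traffic; a live harness would feed it a monotonic clock reading per response.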

Frequently asked

What is the Polymarket API rate limit?

Polymarket documents per endpoint budgets for the CLOB API, on the order of 1500 requests per 10 seconds for the book endpoint at the time of writing. In practice the CLOB sits behind Cloudflare, which queues bursts rather than rejecting them: at our daily ramp tiers (up to 100 requests per 10s) we observe added latency instead of 429 responses. The ramp panel above shows the measured added latency per tier, refreshed daily.
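The "at most 7 percent" figure quoted elsewhere on this page follows directly from the two numbers above. A small sketch of the arithmetic, using only the values stated here (both expressed per 10 seconds); the function name is illustrative:

```python
def budget_fraction(tier_requests, documented_requests):
    """Share of a documented budget that a ramp tier consumes (same time window)."""
    return tier_requests / documented_requests


top_tier_per_10s = 100       # highest ramp tier, from the text above
book_budget_per_10s = 1500   # documented book budget, "on the order of"

print(f"{budget_fraction(top_tier_per_10s, book_budget_per_10s):.1%}")  # 6.7%
```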

What happens if you exceed prediction market API rate limits?

It depends on the venue's architecture. Kalshi runs a documented token bucket and returns 429 when the bucket is empty. Polymarket's Cloudflare front queues requests, so you see rising latency before you ever see an error. Manifold documents 500 requests per minute per IP and asks bots to stay on one IP. Limitless documents nothing. The daily ramp records which of these behaviours each venue actually exhibits at each tier.

What is the Kalshi API rate limit?

Kalshi documents tiered rate limits per access level, with the basic read tier around 20 requests per second at the time of writing. It is the only venue in this cohort with a documented token bucket. Our ramp stops at the first 429 on Kalshi out of respect for that documented contract, and the tier where it happens (or does not) is recorded in the ramp panel.
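A client that wants to stay under a token bucket limit can mirror the same mechanism locally. The sketch below is a generic token bucket, not Kalshi's implementation: the refill rate of 20 per second comes from the figure above, while the capacity of 20 (the burst size) is an assumption made for illustration.

```python
class TokenBucket:
    """Client-side token bucket: refills at a fixed rate up to a capacity."""

    def __init__(self, rate_per_s, capacity, now=0.0):
        self.rate = rate_per_s
        self.capacity = capacity
        self.tokens = float(capacity)  # start full
        self.last = now

    def try_acquire(self, now):
        """Return True and spend one token if a request may be sent at `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_s=20, capacity=20)  # capacity is an assumed value
burst = [bucket.try_acquire(0.0) for _ in range(21)]
print(burst.count(True))            # 20: the 21st request at t=0 is refused
later = [bucket.try_acquire(0.5) for _ in range(11)]
print(later.count(True))            # 10: half a second refills 10 tokens
```

Gating requests on try_acquire keeps a polling loop from ever emptying the server-side bucket, provided the local parameters are no more generous than the venue's.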

Which prediction market has the fastest API?

Polymarket currently has the lowest warm book endpoint latency at 200 ms (p50 over the last 24h). Note what this does and does not measure: it is the latency of the venue's own public API over a warm connection at a polite request rate, per region. It is not data freshness via third party providers, which is a separate benchmark (pm-data-freshness).

Does Manifold have an API rate limit?

Yes, 500 requests per minute per IP, documented, and Manifold explicitly welcomes bots as long as they stay on a single IP. Be aware that the whole API is served behind a cache with max-age=5 and stale-while-revalidate=10, so polling faster than every 6 seconds mostly returns cached responses. Our probes space out to 7 seconds per URL and label every sample that still comes back from cache.
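To see what those two directives imply for a polling loop, the following sketch parses a Cache-Control header value and derives the window during which a cache may answer. The header string is assembled from the two directives named above; the exact header Manifold sends may carry more directives, and the function names are illustrative.

```python
def parse_cache_control(header):
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def freshness_window_s(header):
    """Return (fresh_for_s, may_be_served_stale_until_s) from max-age and
    stale-while-revalidate. Missing directives count as 0."""
    d = parse_cache_control(header)
    max_age = int(d.get("max-age") or 0)
    swr = int(d.get("stale-while-revalidate") or 0)
    return max_age, max_age + swr


print(freshness_window_s("max-age=5, stale-while-revalidate=10"))  # (5, 15)
```

With these values a response is fresh for 5 seconds, and a cache may keep serving it for up to 15 seconds after it was fetched while it revalidates in the background.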

What are the Limitless API rate limits?

Limitless does not document rate limits. Our ramp therefore uses the most conservative tiers in the cohort (10/20/40 requests per 10s) with an automatic abort. One measured behaviour worth knowing: error responses are CDN cached for four hours, so a request for an expired market keeps returning the same cached 400 long after the market resolved. A client that does not handle this will misread the API's state for hours.
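One way a client might guard against this is to check whether an error response came out of a cache before trusting it. The sketch below relies on the standard HTTP Age response header; whether the Limitless CDN actually emits that header is an assumption, and the function name is hypothetical.

```python
def cached_error_age_s(status, headers):
    """Return the Age in seconds of an error response served from cache, else None.

    Assumption: the CDN sets the standard Age header on cached responses.
    """
    if status < 400:
        return None
    lowered = {k.lower(): v for k, v in headers.items()}
    age = lowered.get("age")
    if age is None:
        return None
    try:
        seconds = int(age.strip())
    except ValueError:
        return None
    return seconds if seconds > 0 else None


print(cached_error_age_s(400, {"Age": "7200"}))  # 7200
print(cached_error_age_s(400, {}))               # None
```

A non-None result suggests the error describes the API's state that many seconds ago, so the client should re-check the market through another route, such as the list endpoint, before acting on it.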

How do you test rate limits without abusing the APIs?

Four guardrails. Tiers stay well inside documented budgets where they exist (at most 7 percent of Polymarket's book allowance). Every run aborts as soon as throttled plus server errors exceed 1 percent of a 10 second window, and Kalshi stops at the first 429. Only one region ramps a venue at a time, on disjoint UTC hours. And every request carries an identifying User-Agent with a contact address, so a venue can reach us or filter us selectively. The measurement is throttle onset, not stress to failure.

Why measure the venue APIs directly instead of a data provider?

They answer different questions. If you build directly on a venue, its native API latency and throttle behaviour set your floor, and that is what this benchmark measures. If you consume the venue through a data provider such as Codex or Predexon, what matters is how fresh the provider's relay is, which is measured separately in the pm-data-freshness benchmark. Both pages link to each other so you can compare the paths.

Source code: github.com/ChainBench/OpenChainBench/tree/main/harnesses/pm-rate-limits