Methodology

How every benchmark on OpenChainBench is measured, reported and reproduced. Open by design. Every claim on this page is checkable against the underlying spec, harness and Prometheus dataset.

Section I

Design principles

  1. I

    Identical inputs

    Every provider sees the same request: same pair, same notional, same destination, submitted at the same moment from the same region. If inputs differ, we say so.

  2. II

    Honest aggregates

    We report p50, p90 and p99 latency along with success rate. Means are reported but never used as a headline, because tail behaviour is what users feel.

  3. III

    Auditable runs

    Raw metrics are stored in Prometheus and exposed publicly. Anyone can re-run the harness against the same endpoints and verify the numbers match.

  4. IV

    No cherry-picking

    The benchmark plan is committed before each run: providers, routes, cadence, timeout. Adding or removing providers after seeing results requires a published correction.

  5. V

    Neutral presentation

    No spec marks a winner ahead of time. Tables sort mechanically by p50; readers compare the columns themselves.
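
One way to picture principles IV and V is the sketch below. It is illustrative only and not taken from the OpenChainBench harness: the `BenchmarkPlan` class, its field names, the `plan_fingerprint` helper and the `p50_ms` row key are all hypothetical, and hashing the plan is an assumed way of making a pre-run commitment checkable.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BenchmarkPlan:
    """Hypothetical plan record; the fields mirror the four items named above."""
    providers: tuple[str, ...]
    routes: tuple[str, ...]
    cadence_seconds: int
    timeout_seconds: float


def plan_fingerprint(plan: BenchmarkPlan) -> str:
    """Return a SHA-256 hex digest of the plan's canonical JSON form.

    Publishing this digest before a run (an assumption, not a documented
    step) would let readers check that the plan did not change afterwards.
    """
    canonical = json.dumps(asdict(plan), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def sort_by_p50(rows):
    """Sort result rows by p50 only, breaking ties by provider name."""
    return sorted(rows, key=lambda row: (row["p50_ms"], row["provider"]))
```

Because the dataclass is frozen and the JSON form is key-sorted, two plans with the same providers, routes, cadence and timeout always produce the same digest, while changing any one of them produces a different digest.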

Section II

Statistical conventions

Latency aggregates
Reported as p50, p90, p99 and arithmetic mean over the run window. Failed requests (timeout, 5xx, malformed response) are excluded from latency aggregates and counted as failures in the success rate.
24h range
Min and max of p50 observed across the rolling 24-hour window. Captures the volatility of each provider, not just its central tendency.
Δ field
Each provider's p50 expressed as a percentage delta from the field mean. Negative is below the field, positive is above.
Success rate
Share of requests returning a usable result within the published timeout. The only metric that includes failures.
Region normalisation
Where a benchmark is multi-region, the headline figure is the cross-region median. Per-region figures appear on every benchmark page.
Significance
Differences smaller than the within-provider standard deviation are noted but not framed as a ranking.
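
The conventions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the harness's own code: it uses the nearest-rank percentile definition (the published figures may come from a different estimator, such as Prometheus histogram interpolation), it takes the field mean to be the mean of the providers' p50 values, and for the significance check it compares against the larger of the two providers' standard deviations. All function names are hypothetical.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list, with q given in 0..100."""
    if not sorted_values:
        raise ValueError("no successful samples to aggregate")
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarise(samples):
    """Aggregate (latency_ms, ok) pairs for one provider over a run window.

    Failed requests are left out of the latency figures but still count
    in the denominator of the success rate.
    """
    ok_latencies = sorted(latency for latency, ok in samples if ok)
    return {
        "p50": percentile(ok_latencies, 50),
        "p90": percentile(ok_latencies, 90),
        "p99": percentile(ok_latencies, 99),
        "mean": statistics.fmean(ok_latencies),
        "success_rate": len(ok_latencies) / len(samples),
    }


def delta_from_field(p50_by_provider):
    """Percentage delta of each provider's p50 from the mean p50 of the field."""
    field_mean = statistics.fmean(p50_by_provider.values())
    return {
        name: (p50 - field_mean) / field_mean * 100
        for name, p50 in p50_by_provider.items()
    }


def range_24h(p50_observations):
    """Min and max of the p50 values observed across the rolling window."""
    return min(p50_observations), max(p50_observations)


def is_rankable(p50_a, p50_b, stdev_a, stdev_b):
    """True when the p50 gap is at least the larger within-provider stdev."""
    return abs(p50_a - p50_b) >= max(stdev_a, stdev_b)
```

For example, with p50 values of 100 ms and 300 ms the field mean is 200 ms, so `delta_from_field` gives -50 for the first provider and +50 for the second.
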
Section III

Reproducing a result

  1. 01

    Clone the harness from the link at the bottom of any benchmark report.

  2. 02

    Set API keys for the providers you want to include. Public endpoints work for most aggregators; some bridges require allow-listing.

  3. 03

    Run the harness; it exposes /metrics over HTTP. Point a local Prometheus at it, or query the public OpenChainBench Prometheus directly.

  4. 04

    Run for at least 24 hours to get a comparable sample size (n typically ≥ 1,000 per provider per region).

  5. 05

    Compare your aggregates to the published numbers. If they diverge, file a data-quality issue or a provider correction, as described under Corrections, and attach a reproducer.
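
Steps 03 and 05 could look roughly like the sketch below, which queries a Prometheus server through its standard `/api/v1/query` endpoint and flags providers whose local figure differs from the published one. Several things here are assumptions made for illustration: the metric name `ocb_request_latency_seconds_bucket`, the `provider` label, the `localhost:9090` address and the 10% tolerance are not taken from the harness, so substitute the names your Prometheus actually exposes.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical PromQL: p50 per provider over the last 24 hours.
# The metric name and the "provider" label are assumed, not documented.
P50_QUERY = (
    "histogram_quantile(0.5, sum by (le, provider) "
    "(rate(ocb_request_latency_seconds_bucket[24h])))"
)


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def parse_vector(payload, label="provider"):
    """Turn an instant-vector response into a {label value: float} mapping."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample carries its value as [unix_timestamp, "value as string"].
    return {s["metric"].get(label, ""): float(s["value"][1]) for s in data["result"]}


def fetch_vector(base_url, promql, timeout=10.0):
    """Run an instant query against a Prometheus server and parse the result."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_vector(json.load(response))


def diverging(local, published, tolerance_pct=10.0):
    """List providers whose local value is off by more than tolerance_pct."""
    flagged = []
    for name in sorted(published):
        if name not in local:
            continue
        gap_pct = abs(local[name] - published[name]) / published[name] * 100
        if gap_pct > tolerance_pct:
            flagged.append(name)
    return flagged
```

A local check might call `fetch_vector("http://localhost:9090", P50_QUERY)` and pass the result to `diverging` together with the published p50 values copied from the benchmark page. The tolerance is a judgment call; the Significance convention suggests that small gaps are not meaningful on their own.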

Section IV

Corrections

Found a number you can't reproduce? File a data-quality issue (the published figure looks wrong) or a provider correction (your service measures a different value). Material errors are corrected in place with a dated note on the affected report.

Read more about the project on the About page or browse the source on GitHub.