Gas oracle vs realized percentile

Absolute gap in gwei between each oracle's predicted priority-fee tier and the realized percentile in the next mined block, per chain.

Read this carefully

Lower gap is NOT the same as "best oracle". Inclusion-confidence oracles (Blocknative, Etherscan) deliberately over-predict to guarantee next-block inclusion and show larger gaps here by design. Percentile-tracking oracles (PublicNode feeHistory, Owlracle) mirror the realized rewards distribution and hug the realized number by construction. Read this column as "distance from realized percentile", not "recommended for production".

This benchmark answers the question wallets, swap routers and bridge UIs all face before sending a transaction: which gas oracle actually matches what the next block will charge, on the chain my product runs on? Marketing pages quote "fast / standard / slow" without ever publishing the gap between prediction and reality. We normalize each oracle's tiers onto a unified p25 / p50 / p75 / p90 / p99 scheme, take the predicted priority fee in gwei, then compare it against the realized percentile computed directly from the actual transactions in the next mined block. The absolute error (|predicted - realized|, in gwei) is the headline.

Coverage: Ethereum mainnet (Blocknative + PublicNode feeHistory + Owlracle + Etherscan v2) and Polygon (same four). Use the chain tab above to slice the leaderboard. Avalanche C-Chain was dropped because its auto-tuning fee market drives the priority fee to ~0 by design, which collapses prediction-error to ~0 across all oracles and makes the bench non-informative.

What this bench does NOT capture: (a) inclusion latency (an oracle that under-predicts by 0.05 gwei looks "accurate" but the user's tx waits one extra block); (b) over-pay cost in USD (depends on gas_used and ETH price); (c) fundamental oracle reliability outside the priority-fee dimension. Read this number as "how close to the realized percentile, in gwei", not as "which oracle should I integrate end-to-end".
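The tier normalization and the headline error can be sketched in a few lines of Python. This is an illustration, not the harness code: the vendor labels in `TIER_TO_PERCENTILE` and their assignment to percentiles are hypothetical, and the function names are made up for this example.

```python
from typing import Dict

# Hypothetical mapping from vendor tier labels to the unified percentile scheme.
# Both the label names and their assignments are illustrative assumptions.
TIER_TO_PERCENTILE = {
    "slow": "p25",
    "standard": "p50",
    "fast": "p75",
    "rapid": "p90",
    "instant": "p99",
}


def normalize_tiers(raw_tiers: Dict[str, float]) -> Dict[str, float]:
    """Re-key {vendor label: priority fee in gwei} onto unified percentile names.

    Labels that are not in the mapping are dropped.
    """
    return {
        TIER_TO_PERCENTILE[label]: fee
        for label, fee in raw_tiers.items()
        if label in TIER_TO_PERCENTILE
    }


def absolute_errors(
    predicted: Dict[str, float], realized: Dict[str, float]
) -> Dict[str, float]:
    """Return |predicted - realized| in gwei for every tier present in both inputs."""
    return {
        tier: abs(predicted[tier] - realized[tier])
        for tier in predicted
        if tier in realized
    }
```

A caller would normalize one oracle's response, then pass it to `absolute_errors` together with the realized percentiles of the next mined block.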

Methodology

We measure how accurately each gas oracle predicts the priority fee a transaction must pay to land in the next block, on two EIP-1559 chains (Ethereum, Polygon). Every oracle is polled at its tier-tolerant cadence and its predicted priority fee is buffered, per chain, together with the predicted block height. When that block is mined, the harness pulls the full block via `eth_getBlockByNumber(.., true)` on the chain's PublicNode RPC, computes the realized priority percentile from the actual `maxPriorityFeePerGas` values across all included transactions, and records the absolute error per (oracle, tier, chain) as both a gauge and a histogram. p50 / p90 / p99 are computed via Prometheus `quantile_over_time` over the 24 h window.

Per-chain coverage: Ethereum and Polygon both have all four oracles (Etherscan v2 free tier covers chainid 1 and 137). The Etherscan call goes through a global rate-gate (≥6 s between any two Etherscan requests across chains) because the no-key limit of 1 request per 5 s is shared per IP. Pending-buffer growth is surfaced per (oracle, chain) so a temporarily backlogged realizer on one chain doesn't silently inflate that chain's scores.

Two chains are deliberately excluded: BNB Chain (not fully EIP-1559, no dynamic base fee) and Avalanche C-Chain (auto-tuning fee market drives the priority fee to ~0, collapsing prediction-error to ~0 across all oracles and making the bench non-informative).
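A global rate-gate of the kind described for Etherscan could look like the sketch below. The class name, the injectable `clock` / `sleep` parameters and the `ETHERSCAN_GATE` instance are assumptions made for illustration; the actual harness may implement the gate differently.

```python
import threading
import time


class MinIntervalGate:
    """Serializes callers so consecutive passes are at least `min_interval` seconds apart."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock      # injectable so the gate can be tested without real waiting
        self._sleep = sleep
        self._lock = threading.Lock()
        self._last = None        # time of the most recent pass, None before the first one

    def wait(self) -> float:
        """Block until the gate opens and return the number of seconds slept."""
        # The lock is held while sleeping on purpose: the gate is global,
        # so callers from every chain queue up behind each other.
        with self._lock:
            delay = 0.0
            if self._last is not None:
                delay = max(0.0, self._last + self._min_interval - self._clock())
            if delay > 0:
                self._sleep(delay)
            self._last = self._clock()
            return delay


# One shared instance for all chains: 6 s spacing against a 1 request / 5 s limit.
ETHERSCAN_GATE = MinIntervalGate(6.0)
```

Each poller would call `ETHERSCAN_GATE.wait()` immediately before issuing its Etherscan request, regardless of which chain it serves.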

Frequently asked

What does this benchmark actually measure?

The absolute difference, in gwei, between each gas oracle's predicted p50 priority fee and the realized p50 priority fee in the next mined block. Lower error = closer prediction. Owlracle currently leads the active chain tab at 0.010 gwei. What it does NOT measure: inclusion latency (an oracle that under-predicts looks 'accurate' but the user's tx waits longer), USD over-pay cost (depends on gas_used × ETH price), or fundamental oracle reliability beyond priority-fee accuracy.

Why these specific chains (Ethereum, Polygon)?

Both are EIP-1559 with a proper dynamic base fee, which makes 'priority-fee prediction error' an apples-to-apples comparable metric across them. Avalanche C-Chain was dropped because its auto-tuning fee market collapses the priority fee to ~0 by design, which makes the prediction-error metric uniformly ~0 across all oracles and non-informative. BNB Chain is excluded because it isn't EIP-1559 (effective fee = gasPrice only, priority is structural noise). Optimism, Base, Arbitrum and other L2 rollups are excluded because their priority fee is ~0 (centralised sequencer, no MEV) and the relevant cost is the L1 data fee, a different metric that belongs in a separate bench. Solana uses lamports per CU with Jito MEV, an entirely different fee model.

Which oracle should I pick for my wallet?

Read both p50 (typical minute) AND p99 (worst 1% of minutes) for the chain your product runs on. A low p50 means the oracle is usually accurate; a low p99 means it doesn't blow out during gas spikes. Owlracle currently leads the active tab at 0.010 gwei, but the right oracle for your integration depends on whether you optimise for typical or tail behaviour, whether you can tolerate over-pay (then prefer a slightly higher-tier prediction), and whether the oracle's free quota fits your call volume. The bench gives you the live numbers; it cannot tell you which trade-off your product wants.
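To see why p50 and p99 can disagree, here is a small sketch with made-up error series for two hypothetical oracles. The `quantile` helper uses linear interpolation between closest ranks, which is our understanding of how Prometheus `quantile_over_time` behaves; treat that as an assumption rather than a guarantee.

```python
import math
from typing import Sequence


def quantile(q: float, samples: Sequence[float]) -> float:
    """Quantile q in [0, 1] with linear interpolation between closest ranks."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = q * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = min(len(ordered) - 1, lower + 1)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Made-up per-minute absolute errors in gwei, not measured data.
steady = [0.05] * 100                 # always slightly off
spiky = [0.01] * 98 + [2.0, 3.0]      # usually very close, badly off during two spikes

for name, errors in (("steady", steady), ("spiky", spiky)):
    print(name, "p50:", quantile(0.50, errors), "p99:", quantile(0.99, errors))
```

The spiky series wins on p50 and loses on p99, which is exactly the trade-off between typical and tail behaviour described above.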

What about gas prediction on Optimism / Base / Arbitrum?

These L2 rollups have priority fee ≈ 0 because the sequencer is centralised (no MEV competition, no public mempool). The actually meaningful cost on an L2 is the L1 data fee, what the sequencer pays Ethereum mainnet to post the batch, which is a different prediction problem. We're considering a separate `l2-data-fee-prediction` bench for that. Including L2s in this bench would silently produce flat ~0 numbers that pollute the comparison.

How is gas prediction error measured here, technically?

Per chain, every oracle is polled at its tier-tolerant cadence and the predicted priority fee per tier is buffered with the predicted block height. When that block is mined, the harness pulls the full block via `eth_getBlockByNumber(.., true)` on the chain's PublicNode RPC and computes the realized percentile from the actual `maxPriorityFeePerGas` values across every included transaction. Absolute error (|predicted - realized|, in gwei) is recorded per (oracle, tier, chain) as both a gauge and a histogram. p50 / p90 / p99 are computed via Prometheus `quantile_over_time` over the last 24 h. Empty or low-tx blocks are flagged separately because the realized percentile is noisy when transaction count is near zero.
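The realized-percentile step can be sketched as follows, assuming `block` is the JSON-RPC result of `eth_getBlockByNumber(.., true)` with hex-string fee fields. The nearest-rank method, the `MIN_TX_COUNT` threshold and the choice to skip transactions that carry no `maxPriorityFeePerGas` (legacy transactions) are assumptions of this sketch, not confirmed harness behaviour.

```python
from typing import Optional, Tuple

WEI_PER_GWEI = 10**9
MIN_TX_COUNT = 10  # hypothetical threshold below which a block is flagged as low-tx


def realized_percentile_gwei(block: dict, pct: int) -> Tuple[Optional[float], bool]:
    """Return (realized percentile in gwei, low_tx flag) for one full block.

    `pct` is an integer percentile such as 50, 90 or 99.
    """
    tips = sorted(
        int(tx["maxPriorityFeePerGas"], 16)
        for tx in block.get("transactions", [])
        # Assumption: transactions without the field are skipped.
        if tx.get("maxPriorityFeePerGas") is not None
    )
    low_tx = len(tips) < MIN_TX_COUNT
    if not tips:
        return None, low_tx
    # Nearest-rank percentile using integer ceiling division to avoid float rounding.
    rank = max(1, (pct * len(tips) + 99) // 100)
    return tips[rank - 1] / WEI_PER_GWEI, low_tx
```

The returned flag lets a caller record the error for low-tx blocks under a separate label instead of mixing it into the main series.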

Why is gas prediction so hard?

EIP-1559 makes the base fee deterministic (it adjusts by up to ±12.5% per block, depending on how far the previous block's gas used was from its gas target), so every oracle agrees on base fee within a fraction of a gwei. The hard part is predicting the priority-fee distribution in the *next* block. Priority fees are set by users in response to mempool congestion, which shifts on swap activity, MEV bot deployments, NFT mints and DEX volume in a way no historical-lookback model can fully anticipate. The leaderboard surfaces which oracle's lookback / inference scheme tracks reality best per chain, sustained over 24 h.
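The deterministic half of the problem fits in a few lines. The sketch below follows the base-fee update rule from the EIP-1559 specification with the Ethereum mainnet constants; other chains may use different constants, so treat the values as an assumption outside Ethereum.

```python
ELASTICITY_MULTIPLIER = 2            # gas target = gas limit / 2
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # caps the change at 1/8 = 12.5% per block


def next_base_fee(parent_base_fee: int, parent_gas_used: int, parent_gas_limit: int) -> int:
    """Base fee (in wei) of the next block, given the parent block's header fields."""
    gas_target = parent_gas_limit // ELASTICITY_MULTIPLIER
    if parent_gas_used == gas_target:
        return parent_base_fee
    if parent_gas_used > gas_target:
        # Parent was above target: raise the base fee, by at least 1 wei.
        delta = max(
            parent_base_fee * (parent_gas_used - gas_target)
            // gas_target
            // BASE_FEE_MAX_CHANGE_DENOMINATOR,
            1,
        )
        return parent_base_fee + delta
    # Parent was below target: lower the base fee.
    delta = (
        parent_base_fee * (gas_target - parent_gas_used)
        // gas_target
        // BASE_FEE_MAX_CHANGE_DENOMINATOR
    )
    return parent_base_fee - delta
```

Because every input comes from the parent block header, any oracle can compute this exactly; no comparable closed form exists for the priority-fee distribution.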

Source code: github.com/OpenChainBench/OpenChainBench/tree/main/harnesses/gas-estimation