O R A C L E
the receipts

How accurate are we?

We predict pump.fun mint graduation in the first 30-60 seconds. Every prediction logged before the outcome. Resolved against on-chain truth (98.4% of labels). Three numbers, three different questions — read them together:

The three headline metrics (live values load on the page):

- Graduates (backtest): share of ≥70% confidence calls that graduated, historically. Leave-one-out k-NN on mints.
- Graduates (live, current product): share of ACT-band predictions that actually graduated. Clean-entry ≥70% predictions since current rule activation, n= resolved.
- Sustains 30m post-bond (live): share of graduated mints that held ≥80% of grad price 30 min later. n= resolved post-bond outcomes.
The three numbers measure different things and shouldn't be averaged. Graduates (backtest) asks "how would the model have predicted historical mints?" via leave-one-out cross-validation. Graduates (live) asks "of predictions made under the current rule, how many graduated?" — anchored to the active rule's activation, published only at n ≥ 30 (otherwise: warming). Sustains 30m post-bond asks "of mints that did graduate, how many held value 30 min later?" The third number is sourced independently from on-chain DEX prices and answers the question a trader actually has — graduation alone is not a profit thesis.
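The backtest metric described above can be sketched as a leave-one-out k-NN evaluation. Everything in this block is an illustrative assumption (the feature vectors, k, and squared-Euclidean distance are placeholders, not the production model); it shows only the shape of the computation:

```python
def knn_confidence(features, labels, query, k=25, exclude=None):
    """Fraction of the k nearest historical mints that graduated."""
    dists = []
    for i, f in enumerate(features):
        if i == exclude:
            continue  # leave-one-out: a mint never votes on itself
        d = sum((a - b) ** 2 for a, b in zip(f, query))
        dists.append((d, labels[i]))
    dists.sort(key=lambda t: t[0])
    votes = [lab for _, lab in dists[:k]]
    return sum(votes) / len(votes)

def backtest(features, labels, threshold=0.70, k=25):
    """Graduation rate among mints the model would have called at >= threshold."""
    called, graduated = 0, 0
    for i in range(len(features)):
        conf = knn_confidence(features, labels, features[i], k=k, exclude=i)
        if conf >= threshold:
            called += 1
            graduated += labels[i]
    return graduated / called if called else None
```

Excluding the query mint from its own neighbor set is what makes the historical number honest: each prediction is scored as if that mint had never been seen.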

By confidence band

The model is honest about uncertainty: lower-confidence calls graduate at lower rates, exactly as a calibrated model should. The Telegram bot only fires at ≥70%.

If we say · Actual graduation rate · Sample size
(band table loads live)
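The by-band table can be reproduced from a resolved prediction log with a simple grouping. A minimal sketch; the band edges here are illustrative, not the product's actual cut points:

```python
def band_table(predictions, bands=((0.7, 0.8), (0.8, 0.9), (0.9, 1.01))):
    """predictions: list of (confidence, graduated_bool) pairs.

    Returns one row per band: ((lo, hi), actual graduation rate, sample size).
    """
    rows = []
    for lo, hi in bands:
        hits = [grad for conf, grad in predictions if lo <= conf < hi]
        n = len(hits)
        rate = sum(hits) / n if n else None  # no rate published on empty bands
        rows.append(((lo, hi), rate, n))
    return rows
```

Comparing the first column ("if we say") against the second ("actual rate") per band is the whole calibration check: a well-calibrated model's rows sit near the diagonal.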
SYSTEM STATUS
CALIBRATED · STABLE

runner_prob fields — directional, recalibration pending

The runner_prob_2x/5x/10x_from_now fields exposed at /api/v1/probe and /api/live are currently directional: mints with higher runner_prob do hit at higher rates, but the absolute probability is overstated by roughly 11-13 percentage points in the high-confidence bins (≥0.5). The saturation case (kNN reports 1.0 because all neighbors hit) is the loudest miss: runner_prob_5x_from_launch at predicted 0.99 has an actual hit rate around 0.29.

Magnitude recalibration is in progress via the existing apply_calibration infrastructure (the same self-correcting curve grad_prob uses). Until recalibration is verified, treat runner_prob fields as a ranking signal, not as a literal probability. Consumers making sizing decisions on the absolute number should discount by ~12pp at the high end. The full audit (/api/scope documents the field, n=89,077 sample) is intentionally surfaced here rather than hidden — same discipline as the warming label on the live rate above.
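Until recalibration is verified, the interim treatment suggested above amounts to a flat haircut at the high end. A minimal sketch, assuming the ~12pp discount and the ≥0.5 bin boundary quoted in the audit; the function name is hypothetical and not part of the API:

```python
def discounted_runner_prob(p, cutoff=0.5, discount=0.12):
    """Haircut high-confidence runner_prob values pending recalibration.

    Below the cutoff the field is left as-is; at or above it, subtract the
    observed overstatement (clamped at zero). Ranking order is preserved
    within each regime, so the field remains usable as a ranking signal.
    """
    if p >= cutoff:
        return max(0.0, p - discount)
    return p
```

A flat haircut is only a stopgap: a fitted calibration curve (as grad_prob's apply_calibration pipeline is described as doing) would replace it once verified.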

raw JSON: /api/accuracy  ·  NFA · DYOR · prediction model output, not financial advice