We scored the HN front page. Here's how wrong we are.
Every 10 minutes a Fleet box pulls the live news.ycombinator.com front page and runs every story through the same model that scores user drafts. The numbers below are model output vs HN reality — not curated. Last snapshot 9 min ago · 30 stories.
Foresyn · Calibration over time (last 7 days of 10-min snapshots)
Spearman ρ = ranking match; closer to 1 is better. Top-10 hit rate = of our 10 highest-predicted, how many landed in HN's actual top-10.
ρ now 0.13 (+0.32) · Top-10 now 40% (+10pp) · 13 snapshots · 2.4 d window
[Chart: Spearman ρ and top-10 hit rate per snapshot. x-axis: time (UTC), 05-08 to 05-10. y-axis: -0.50 to 0.75, with reference lines at zero (random) and useful (ρ ≥ 0.3). Latest values: ρ 0.13, top-10 40%.]
Snapshot (UTC) · Spearman ρ · top-10 hit rate (n=30 for every snapshot)
2026-05-08 15:17 · -0.184 · 30%
2026-05-08 16:20 · -0.358 · 20%
2026-05-08 16:50 · 0.269 · 40%
2026-05-08 17:28 · 0.257 · 40%
2026-05-08 17:28 · 0.263 · 40%
2026-05-09 14:33 · 0.388 · 40%
2026-05-09 16:27 · 0.415 · 50%
2026-05-10 22:54 · 0.237 · 50%
2026-05-10 23:05 · 0.239 · 40%
2026-05-10 23:15 · 0.216 · 40%
2026-05-10 23:25 · 0.220 · 40%
2026-05-10 23:35 · 0.140 · 40%
2026-05-10 23:45 · 0.133 · 40%
Green band = useful zone (ρ ≥ 0.3); pink band = worse than chance. Each dot is one 10-minute snapshot — hover for exact values. Latest sample value floats at the right edge of each line.
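For readers who want to check the two chart metrics themselves, here is a minimal Python sketch. It is not our production code: the function names are made up for illustration, ties are given average ranks (one common convention), and the top-10 sets are assumed to be chosen by score rather than by front-page position. Under that assumption, the set-overlap definition gives 40% on the 30 predicted/actual pairs listed further down this page.

```python
from statistics import mean


def average_ranks(values):
    """Rank values 1..n in ascending order; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(predicted, actual):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rp, ra = average_ranks(predicted), average_ranks(actual)
    mp, ma = mean(rp), mean(ra)
    cov = sum((x - mp) * (y - ma) for x, y in zip(rp, ra))
    var_p = sum((x - mp) ** 2 for x in rp)
    var_a = sum((y - ma) ** 2 for y in ra)
    return cov / (var_p * var_a) ** 0.5


def top_k_hit_rate(predicted, actual, k=10):
    """Share of our k highest-predicted stories that are also in the actual top k."""
    idx = range(len(predicted))
    top_pred = set(sorted(idx, key=lambda i: -predicted[i])[:k])
    top_actual = set(sorted(idx, key=lambda i: -actual[i])[:k])
    return len(top_pred & top_actual) / k
```

A tie exactly at the k-th place would make the top-k set depend on sort order; the sketch does not handle that case specially.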
Spearman ρ: 0.13
MAE log-score: 2.10
Top-10 hit rate: 40%
Bias direction: under by 1.91 log
ρ closer to 1 = our ranking matches HN's. MAE on log: ~0.5 means typically off by a factor of about 1.6 (e^0.5) on raw points, so the current 2.10 is roughly an 8× miss. Top-10 hit rate: of our 10 highest-predicted, how many landed in HN's actual top-10. Bias direction: which way the average miss goes.
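The log-score figures can be reproduced with a few lines of Python. This is a sketch, not the production code: the function names are illustrative, and the use of log(1 + points) is inferred from the numbers on this page rather than stated anywhere. With that transform, the 30 predicted/actual pairs in this snapshot give an MAE of 2.10 and a mean signed error of -1.91, matching the cards above.

```python
import math


def log_score_errors(predicted, actual):
    """Signed error per story: log1p(predicted) - log1p(actual).

    Assumption: scores are compared as natural log of (1 + points).
    """
    return [math.log1p(p) - math.log1p(a) for p, a in zip(predicted, actual)]


def mae_and_bias(predicted, actual):
    """Return (mean absolute error, mean signed error) on the log scale."""
    errors = log_score_errors(predicted, actual)
    mae = sum(abs(e) for e in errors) / len(errors)
    bias = sum(errors) / len(errors)  # negative means we under-predict
    return mae, bias


def typical_factor(mae_log):
    """Convert a log-scale MAE into a multiplicative miss on raw points."""
    return math.exp(mae_log)


# Single-story check against the "biggest underestimate" card:
# predicted 3, actual 629.
print(round(log_score_errors([3], [629])[0], 2))  # -5.06
```

Working on the log scale keeps one 600-point outlier from dominating the average, at the cost of making the headline number harder to read; `typical_factor` converts it back into "off by N×".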
Best call
Plex's price hikes prove I was right to switch to Jellyfin
predicted 19 · actual 17 · 0.11 log off
Biggest underestimate
I returned to AWS and was reminded why I left
we said 3 · HN says 629 · missed by +5.06 log
Biggest overestimate
The people preserving the scientific practice of bird banding
we said 143 · HN says 16 · we overshot by 2.14 log
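The three highlight cards can be derived from the same signed log error. The sketch below assumes the log(1 + points) transform (inferred from the figures shown, not confirmed) and uses a hypothetical `pick_highlights` helper; the real selection logic may differ, for example in how it breaks ties.

```python
import math


def signed_log_error(predicted, actual):
    """log1p(predicted) - log1p(actual); positive means we overshot."""
    return math.log1p(predicted) - math.log1p(actual)


def pick_highlights(stories):
    """stories: list of (title, predicted, actual) tuples.

    Returns the story with the smallest absolute log error, the most
    negative error, and the most positive error.
    """
    scored = [(signed_log_error(p, a), title) for title, p, a in stories]
    return {
        "best_call": min(scored, key=lambda s: abs(s[0])),
        "biggest_underestimate": min(scored, key=lambda s: s[0]),
        "biggest_overestimate": max(scored, key=lambda s: s[0]),
    }


stories = [
    ("Plex's price hikes prove I was right to switch to Jellyfin", 19, 17),
    ("I returned to AWS and was reminded why I left", 3, 629),
    ("The people preserving the scientific practice of bird banding", 143, 16),
    ("Space Cadet Pinball on Linux", 231, 305),
]
for label, (error, title) in pick_highlights(stories).items():
    print(f"{label}: {title} ({error:+.2f} log)")
```

If every story in a snapshot were under-predicted, "biggest overestimate" would simply be the least-bad underestimate, so a real implementation would probably want to check the sign before labelling it.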
Foresyn · Live front page · predicted vs actual
1. Hardware Attestation as Monopoly Enabler (grapheneos.social)
predicted 235 · actual 731 (HN 3.1× higher) by ChuckMcM | 268 comments
2. Local AI needs to be the norm (unix.foo)
predicted 206 · actual 413 (HN 2.0× higher) by cylo | 214 comments
3. Incident Report: CVE-2024-YIKES (nesbitt.io)
predicted 9 · actual 339 (HN 37.7× higher) by miniBill | 84 comments
4. Running local models on an M4 with 24GB memory (jola.dev)
predicted 6 · actual 14 (HN 2.3× higher) by shintoist | 2 comments
5. Obsidian plugin was abused to deploy a remote access trojan (cyber.netsecops.io)
predicted 4 · actual 26 (HN 6.5× higher) by cmbailey | 8 comments
6. Why modern parents feel more sleep deprived than our ancestors did (bbc.com)
predicted 12 · actual 54 (HN 4.5× higher) by 1659447091 | 27 comments
7. Ask HN: What are you working on? (May 2026)
predicted 11 · actual 107 (HN 9.7× higher) by david927 | 374 comments
8. First tunnel element of the Fehmarnbelt Tunnel immersed (arup.com)
predicted 3 · actual 21 (HN 7.0× higher) by robin_reala | 5 comments
9. Traces Of Humanity (tracesofhumanity.org)
predicted 3 · actual 117 (HN 39.0× higher) by alex77456 | 19 comments
10. Maryland citizens hit with $2B power grid upgrade for out-of-state AI (tomshardware.com)
predicted 6 · actual 86 (HN 14.3× higher) by lemonberry | 29 comments
11. Guy Goma's Accidental BBC Interview Lives on After 20 Years (nytimes.com)
predicted 6 · actual 18 (HN 3.0× higher) by nxobject | 4 comments
12. I returned to AWS and was reminded why I left (fourlightyears.blogspot.com)
predicted 3 · actual 629 (HN 209.7× higher) by andrewstuart | 462 comments
13. Stop MitM on the first SSH connection, on any VPS or cloud provider (joachimschipper.nl)
predicted 4 · actual 64 (HN 16.0× higher) by JoachimSchipper | 37 comments
14. Eight More 8-bit Era Microprocessors (2024) (thechipletter.substack.com)
predicted 7 · actual 42 (HN 6.0× higher) by klelatti | 11 comments
15. The locals don't know (quarter--mile.com)
predicted 56 · actual 80 (HN 1.4× higher) by herbertl | 57 comments
16. Lakebase architecture delivers faster Postgres writes (databricks.com)
predicted 54 · actual 84 (HN 1.6× higher) by sp_from_db | 24 comments
17. The people preserving the scientific practice of bird banding (thenarwhal.ca)
predicted 143 · actual 16 (we 8.9× higher) by bookofjoe | 0 comments
18. What's a mathematician to do? (2010) (mathoverflow.net)
predicted 30 · actual 138 (HN 4.6× higher) by ipnon | 72 comments
19. Idempotency is easy until the second request is different (blog.dochia.dev)
predicted 3 · actual 271 (HN 90.3× higher) by ludovicianul | 173 comments
20. Louis Rossmann offers to pay legal fees for a threatened OrcaSlicer developer (tomshardware.com)
predicted 13 · actual 438 (HN 33.7× higher) by iancmceachern | 233 comments
21. Show HN: An index of indie web/blog indexes (theindex.fyi)
predicted 7 · actual 90 (HN 12.9× higher) by rocketpastsix | 28 comments
22. Think Linear Algebra (2023) (allendowney.github.io)
predicted 5 · actual 149 (HN 29.8× higher) by tamnd | 17 comments
23. Space Cadet Pinball on Linux (brennan.io)
predicted 231 · actual 305 (HN 1.3× higher) by jandeboevrie | 101 comments
24. Task Paralysis and AI (g5t.de)
predicted 5 · actual 180 (HN 36.0× higher) by MrGilbert | 104 comments
25. Walking slower? Your ears, not your knees, might be the problem (wsj.com)
predicted 3 · actual 76 (HN 25.3× higher) by marc__1 | 58 comments
26. Spain has become one of Europe’s cheapest power markets (janrosenow.substack.com)
predicted 3 · actual 135 (HN 45.0× higher) by marc__1 | 112 comments
27. YC's Biggest Scandals (ycombinator.fyi)
predicted 4 · actual 215 (HN 53.8× higher) by laserduck | 76 comments
28. Plex's price hikes prove I was right to switch to Jellyfin (androidauthority.com)
predicted 19 · actual 17 (we 1.1× higher) by Brajeshwar | 16 comments
29. 9 Mothers (YC P26) Is Hiring (jobs.ashbyhq.com)
predicted 3 · actual 1 (we 3.0× higher) by ukd1 | 0 comments
30. Shunting-Yard Animation (somethingorotherwhatever.com)
predicted 15 · actual 54 (HN 3.6× higher) by s1291 | 15 comments
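The "HN 3.1× higher" / "we 8.9× higher" labels in the list above are plain ratios of raw points, with the larger number on top. A minimal sketch of that labelling follows. The function name is hypothetical, and two behaviours are assumptions because no row on this page shows them: both counts must be positive, and an exact match is labelled on the HN side.

```python
def ratio_label(predicted, actual):
    """Describe which side is higher and by what factor, to one decimal place."""
    if predicted <= 0 or actual <= 0:
        # Assumption: zero-point stories are not labelled by this sketch.
        raise ValueError("both predicted and actual must be positive")
    if actual >= predicted:
        return f"HN {actual / predicted:.1f}× higher"
    return f"we {predicted / actual:.1f}× higher"


print(ratio_label(235, 731))  # HN 3.1× higher
print(ratio_label(143, 16))   # we 8.9× higher
```

The ratio is on raw points, so it reads differently from the log-score miss used in the summary cards: 3 vs 629 is "209.7× higher" here but 5.06 on the log scale.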
What this measures, what it doesn't. We score every story on the live front page through our 39K-post HN corpus. Actual scores are still in flight — a story 1h old at 80 pts may end at 400; this snapshot freezes the moment of capture. Stories posted after our corpus ingest could in principle leak into their own kNN neighbors, but front-page items are usually hours old against a months-old corpus, so leak risk is small. We re-snapshot every 10 minutes via a Fleet systemd timer (Vercel Hobby crons cap at daily; the off-platform timer hits an authenticated manual-trigger path on the same endpoint).
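One way to rule the kNN leak out entirely, rather than rely on it being unlikely, is to drop any corpus entry that is the story itself or that was posted at or after the story's own timestamp before picking neighbors. The sketch below is illustrative only: the `Post` fields and the `eligible_neighbors` helper are hypothetical and do not describe how our pipeline is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int       # hypothetical: HN item id
    created_at: float  # hypothetical: Unix timestamp in seconds
    points: int


def eligible_neighbors(query: Post, corpus: list[Post]) -> list[Post]:
    """Keep only corpus posts that cannot leak the query's own outcome.

    A post is excluded if it is the query itself or if it was created
    at or after the query's creation time.
    """
    return [
        post
        for post in corpus
        if post.post_id != query.post_id and post.created_at < query.created_at
    ]


corpus = [
    Post(post_id=1, created_at=1_000.0, points=40),
    Post(post_id=2, created_at=2_000.0, points=300),  # the query itself, already ingested
    Post(post_id=3, created_at=3_000.0, points=12),   # posted after the query
]
query = Post(post_id=2, created_at=2_000.0, points=0)
print([post.post_id for post in eligible_neighbors(query, corpus)])  # [1]
```

The filter costs one pass over the candidate set and turns "leak risk is small" into "leak cannot happen", at the price of discarding newer posts that might otherwise be legitimate neighbors.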