About Foresyn Hacker News

Soft launch — 2026-05-10. Posting this on HN on Monday. Until then it's open for friends + the @nn_for_science community to test, break, and call out anything weird. Bugs / UX holes / model-disagreement examples → DM @crimeacs on Telegram or open an issue against the public surface.

What this is

Foresyn Hacker News predicts your HN virality before you post. Paste a draft. We score it against a corpus of real HN stories, simulate the top comments — including the takedowns — your post would attract, and generate rewrites that actually beat the base score. Click a rewrite; the predicted score and the audience reaction both refresh, and you keep iterating until you're ready to submit on real HN.

The pixel-faithful HN chrome is the joke; the predictor + comment simulator are what make the loop useful. Every shipped rewrite comes with measured evidence that it lifts the score, not vibes. A live accuracy ledger tracks the predictor against the actual HN front page once a day.

Not affiliated with Hacker News or Y Combinator. A Foresyn project.

Corpus

Two numbers show up across the site, and they are intentionally distinct. The training corpus is 148,400 stories, used by scripts/hackernews/train_predictor.py with a chronological holdout (train < 2025-07-01, val 2025-07-01 to 2025-12-01, holdout after) so neighbor leakage from the future can't inflate metrics. Live serving uses ~39K embedded stories stored in a halfvec(3072) HNSW index on Supabase, the higher-quality slice we kept for fast cosine kNN at query time. Every prediction you see on the site retrieves its top-50 neighbors from this 39K subset; the score model itself was trained on the full 148K, so it generalizes beyond what's indexed.
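For concreteness, here is a minimal sketch of what that serving-time lookup can look like with pgvector on Postgres, assuming a table along the lines of hn_stories(id, title, score, embedding halfvec(3072)); the table and column names are illustrative, not the production Supabase schema:

    # Sketch only: top-k cosine neighbors from a pgvector halfvec column.
    import psycopg  # psycopg 3

    def top_neighbors(conn, query_embedding: list[float], k: int = 50):
        # pgvector's <=> operator is cosine distance; an HNSW index built with
        # halfvec_cosine_ops accelerates this ORDER BY.
        vec = "[" + ",".join(f"{x:.6f}" for x in query_embedding) + "]"
        sql = """
            SELECT id, title, score, 1 - (embedding <=> %s::halfvec) AS cosine_sim
            FROM hn_stories
            ORDER BY embedding <=> %s::halfvec
            LIMIT %s
        """
        with conn.cursor() as cur:
            cur.execute(sql, (vec, vec, k))
            return cur.fetchall()

The index itself would be created with halfvec_cosine_ops so the ORDER BY stays an index scan instead of a full sort.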

How the predictor works

  1. Title + first 200 chars of body are embedded with gemini-embedding-001 (3072 dims).
  2. Top-50 cosine neighbors are retrieved from the halfvec(3072) HNSW index over the ~39K serving subset (see Corpus above).
  3. ~31 features are computed: kNN-derived (neighbor score p10/p50/p90, mean cosine, front-page rate), time of day, domain priors, title craft (length, prefixes, punctuation).
  4. A LightGBM gradient-boosted regressor predicts log1p(score), with α=0.1/0.5/0.9 quantile heads for the interval, plus a calibrated binary classifier for “made the front page” (score≥100). Live serving runs the same trained model compiled to dependency-free JavaScript via m2cgen (see “Why these targets are believable” below); the earlier heuristic blend of kNN-derived features is kept only as a fallback. A sketch of the quantile-head setup follows this list.
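A minimal sketch of those heads, assuming X is the ~31-feature matrix and y_log is the log1p(score) target produced by the training script; hyperparameters and names here are illustrative, not the ones in scripts/hackernews/train_predictor.py:

    # Sketch only: quantile regressors for the interval plus a front-page classifier.
    import numpy as np
    import lightgbm as lgb

    def fit_heads(X_train, y_log_train, front_page_train):
        # One quantile regressor per interval bound (alpha = 0.1 / 0.5 / 0.9).
        quantile_heads = {
            alpha: lgb.LGBMRegressor(objective="quantile", alpha=alpha,
                                     n_estimators=500).fit(X_train, y_log_train)
            for alpha in (0.1, 0.5, 0.9)
        }
        # Binary head for "made the front page" (score >= 100); calibration
        # would be fit afterwards on a held-out slice.
        front_page_head = lgb.LGBMClassifier(n_estimators=500).fit(X_train, front_page_train)
        return quantile_heads, front_page_head

    def predict(quantile_heads, front_page_head, x_rows):
        point = np.expm1(quantile_heads[0.5].predict(x_rows))       # median score estimate
        low = np.expm1(quantile_heads[0.1].predict(x_rows))         # interval lower bound
        high = np.expm1(quantile_heads[0.9].predict(x_rows))        # interval upper bound
        p_front = front_page_head.predict_proba(x_rows)[:, 1]       # P(score >= 100)
        return point, (low, high), p_front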

How the comments are simulated

For each draft, we pull the 5 closest historical neighbors and fetch up to 5 of each neighbor's real top comments via the Algolia HN API (cached). Gemini Flash mirrors the style and archetype mix in the neighborhood — skeptic, pedant, anecdote, historical precedent, correction — and produces 5 comments your draft would attract, with [src] links back to the real comments that anchored each one. Not real votes; an audience preview.
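A minimal sketch of the neighbor-comment fetch against the public Algolia HN API; the endpoint is real, but the caching layer and exact query parameters the site uses are assumptions:

    # Sketch only: pull a neighbor story's top comments from Algolia's HN index.
    import requests

    ALGOLIA_SEARCH = "https://hn.algolia.com/api/v1/search"

    def top_comments(story_id: int, limit: int = 5) -> list[dict]:
        # tags=comment,story_<id> restricts results to comments on that story.
        resp = requests.get(
            ALGOLIA_SEARCH,
            params={"tags": f"comment,story_{story_id}", "hitsPerPage": limit},
            timeout=10,
        )
        resp.raise_for_status()
        return [
            {"id": hit["objectID"], "author": hit.get("author"), "text": hit.get("comment_text")}
            for hit in resp.json().get("hits", [])
        ]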

How the rewrites work

8 candidates are generated across distinct strategies (Show HN, contrarian, number, question, personal, outcome, punchy, technical), each scored against the same predictor. We filter to score-positive deltas only — if fewer than 3 beat the base, a second round runs with the round-1 winners and losers as feedback. Only rewrites that actually lift the score are shown. Click any of them and the page re-scores + regenerates.
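A minimal sketch of that selection loop; generate_candidates and score stand in for the LLM call and the predictor and are not real module names:

    # Sketch only: keep rewrites that beat the base prediction, retry once if too few do.
    def pick_rewrites(draft, base_score, generate_candidates, score, min_winners=3):
        strategies = ["show_hn", "contrarian", "number", "question",
                      "personal", "outcome", "punchy", "technical"]
        scored = [(c, score(c)) for c in generate_candidates(draft, strategies)]  # 8 candidates
        winners = [(c, s) for c, s in scored if s > base_score]   # score-positive deltas only

        if len(winners) < min_winners:
            # Second round: feed round-1 winners and losers back as context.
            losers = [(c, s) for c, s in scored if s <= base_score]
            for c in generate_candidates(draft, strategies, feedback=(winners, losers)):
                s = score(c)
                if s > base_score:
                    winners.append((c, s))

        return sorted(winners, key=lambda cs: cs[1], reverse=True)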

Honest holdout metrics

Chronological split — train on items before the cutoff, evaluate on items posted strictly after. Single training run, no cherry-picking.

Model version: lgbm_v1_2026-05-10
Trained at: 2026-05-10T12:50:06Z
Corpus size: 148,400
Spearman ρ on log_score: 0.326
MAE on log_score: 1.652
AUC on score≥100: 0.667
Brier on score≥100: 0.211
Precision@K=30: 0.833

Trained via scripts/hackernews/train_predictor.py. kNN-derived features are computed locally with brute-force chunked numpy kNN over the entire corpus, and have been time-causal (neighbor.time < candidate.time) since v2; a sketch of that pass is below.
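A minimal sketch of the time-causal pass, assuming unit-normalized embeddings (N × 3072), Unix timestamps, and final scores; chunk size and names are illustrative:

    # Sketch only: chunked brute-force kNN where each story may only see older neighbors.
    import numpy as np

    def causal_knn_features(emb, times, scores, k=50, chunk=1024):
        n = emb.shape[0]
        feats = np.zeros((n, 4))  # neighbor-score p10 / p50 / p90, mean cosine
        for start in range(0, n, chunk):
            end = min(start + chunk, n)
            sims = emb[start:end] @ emb.T              # cosine sim for unit-normalized rows
            for i in range(start, end):
                mask = times < times[i]                # only neighbors from the past
                mask[i] = False                        # never the story itself
                if not mask.any():
                    continue
                row = sims[i - start][mask]
                idx = np.argsort(row)[-k:]             # top-k most similar past stories
                neigh_scores = scores[mask][idx]
                feats[i] = [np.percentile(neigh_scores, 10),
                            np.percentile(neigh_scores, 50),
                            np.percentile(neigh_scores, 90),
                            row[idx].mean()]
        return feats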

Why these targets are believable

Early-vote stochasticity caps achievable AUC near 0.80; published title-only baselines on HN sit in the low-0.70s (e.g. Brust 2014 hit AUC 0.74–0.77 with logistic regression on title + hostname). We blend embedding-derived neighbor signals, time-of-day, domain priors, and title craft. Modern transformer fine-tunes generally underperform GBDT here because they overfit the small corpus. The live accuracy ledger tracks these claims against the actual HN front page once daily — biggest miss in each direction is surfaced, not buried.

Live serving uses the trained LightGBM — emitted as zero-dependency JavaScript via m2cgen (a stack of nested if-else over the 31-feature input vector, tree-shaken into the API function bundle). No ONNX runtime, no WASM, no proxy hop — the model is part of the same Vercel function. The earlier heuristic blend is kept as a fallback if anything in the JS evaluator throws, and remains accessible via HN_PREDICTOR=heuristic env override. See /predictions for the live calibration against the actual HN front page.
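A minimal sketch of that export step; the tiny fit and output path here are illustrative stand-ins for the regressor trained by scripts/hackernews/train_predictor.py:

    # Sketch only: compile a LightGBM model to dependency-free JavaScript with m2cgen.
    import numpy as np
    import lightgbm as lgb
    import m2cgen as m2c

    X = np.random.rand(200, 31)                               # placeholder 31-feature matrix
    y = np.log1p(np.random.poisson(5, size=200).astype(float))
    reg = lgb.LGBMRegressor(n_estimators=50).fit(X, y)

    js_code = m2c.export_to_javascript(reg)                   # plain JS: nested if/else over the input vector
    with open("hn_score_model.js", "w") as f:
        f.write(js_code)                                      # bundled into the Vercel API function

The emitted file has no runtime dependency, which is what lets it live inside the same function as the rest of the API.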

How to give feedback

Soft launch is open through the weekend before the Monday HN post. Things I specifically want to know from anyone who actually posts on HN:

  • Submit a draft you've actually shipped (or are about to). Does the model agree with what happened?
  • The simulated comments — do they read like the comments your post would actually attract? Where do they miss?
  • Auto-improve — does the "winning" rewrite read as actually better, or just keyword-stuffed?
  • Anything that looks broken, slow, or confusing.

Drop bugs / UX issues / model-disagreement examples in the @nn_for_science channel comments, or DM @crimeacs directly. Edit history on every submission is retained, so you can paste a permalink and I can see the exact draft and score you're flagging.

What this is not

  • Not a clickbait optimizer. Rewrites are bounded by URL/body content.
  • Not a YC product. The Foresyn-F logo and footer are explicit.
  • Not live HN. Corpus is a snapshot; /predictions is the only live surface.

Privacy

Submissions are public. We hash your IP for rate limiting (peppered sha256, 32 hex chars), nothing else. Edit history is retained on your submission so you can compare versions.
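A minimal sketch of that rate-limit key, assuming the pepper lives in an environment variable (the name here is made up) and that the digest is truncated to 32 hex chars:

    # Sketch only: peppered SHA-256 of the client IP, truncated for storage.
    import hashlib
    import os

    def rate_limit_key(ip: str) -> str:
        pepper = os.environ["IP_HASH_PEPPER"]                     # hypothetical variable name
        digest = hashlib.sha256(f"{pepper}:{ip}".encode()).hexdigest()
        return digest[:32]                                        # 32 hex chars, as stated above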