We tracked specific, dated predictions from CEOs, researchers, skeptics, calibrated forecasters, and Reddit crowds, spanning 1965 to 2026, and scored every one that could be evaluated against reality.
We collected 199 specific, dated AI predictions from CEOs, researchers, independent bloggers, professional skeptics, calibrated forecasters, and anonymous Reddit commenters. We scored 160 of them against what actually happened (the remaining 39 are still pending).
The average accuracy score across all scored predictions is 0.60, barely above the midpoint of our 0–1 scale. That number should make you uncomfortable, because these aren't random people. This group includes the CEO of OpenAI, the co-founder of Google DeepMind, Nobel laureates, and tenured professors at Princeton and Stanford. In total: 70 unique speakers and 160 scored predictions.
Three findings stand out from the data:
1. The extremes score worst. Maximum doom (Yudkowsky, 0.12) and maximum hype (Patterson, 0.10) are equally wrong. The most confident people in either direction cluster at the bottom of the leaderboard.
2. People closest to production consistently outpredict everyone else. NVIDIA engineers who ship actual products (Briski 0.92, Deierling 0.94, Das 0.88) outscored CEOs, academics, and futurists by a wide margin. In our data, proximity to real-world deployment is the single best predictor of prediction accuracy.
3. Specificity correlates with accuracy. Vague predictions ("AI will transform everything") score poorly. Specific predictions ("agentic scaffolding will drive the next leap, not bigger models") score well. The more precisely you can be wrong, the more likely you are to be right.
| Rank | Speaker | Accuracy | Predictions scored | % safe |
|---|---|---|---|---|
| 1 | Kevin Deierling | 0.94 | 1 | 0% |
| 2 | Kari Briski | 0.92 | 1 | 0% |
| 3 | Armstrong & Sotala (MIRI) | 0.92 | 1 | 0% |
| 4 | Colette Kress | 0.91 | 1 | 0% |
| 5 | Demis Hassabis | 0.90 | 3 | 5% |
| 6 | Ajeya Cotra | 0.90 | 4 | 5% |
| 7 | Scott Alexander | 0.88 | 9 | 10% |
| 8 | Eric Schmidt | 0.88 | 1 | 0% |
| 9 | Mark Zuckerberg | 0.88 | 1 | 0% |
| 10 | Brad Gerstner | 0.88 | 1 | 0% |
| 11 | David Ferris | 0.86 | 2 | 10% |
| 12 | Andrew Ng | 0.85 | 5 | 10% |
| 13 | Ilya Sutskever | 0.84 | 1 | 0% |
| 14 | Fei-Fei Li | 0.84 | 1 | 0% |
| 15 | Andy Jassy | 0.84 | 1 | 20% |
| 16 | Thomas Kurian | 0.84 | 1 | 10% |
| 17 | Holden Karnofsky | 0.83 | 5 | 10% |
| 18 | Satya Nadella | 0.82 | 1 | 0% |
| 19 | Chamath Palihapitiya | 0.80 | 1 | 0% |
| 20 | Zvi Mowshowitz | 0.78 | 11 | 10% |
| 21 | Gary Marcus | 0.76 | 12 | 40% |
| 22 | Yann LeCun | 0.76 | 1 | 0% |
| 23 | Tim Cook | 0.76 | 1 | 30% |
| 24 | Lisa Su | 0.76 | 1 | 20% |
| 25 | Epoch AI | 0.72 | 1 | 0% |
| 26 | Goldman Sachs | 0.72 | 1 | 20% |
| 27 | McKinsey Survey | 0.71 | 3 | 30% |
| 28 | AAAI 2025 Panel | 0.70 | 1 | 20% |
| 29 | Jensen Huang | 0.68 | 8 | 15% |
| 30 | Arvind Narayanan | 0.68 | 2 | 15% |
| 31 | Rodney Brooks | 0.58 | 3 | 30% |
| 32 | Sam Altman | 0.53 | 7 | 5% |
| 33 | Arvind Krishna | 0.52 | 1 | 10% |
| 34 | Dario Amodei | 0.48 | 5 | 5% |
| 35 | Francois Chollet | 0.46 | 2 | 0% |
| 36 | I.J. Good | 0.44 | 1 | 0% |
| 37 | Hans Moravec | 0.39 | 1 | 0% |
| 38 | David Shapiro | 0.34 | 3 | 0% |
| 39 | Vernor Vinge | 0.34 | 2 | 0% |
| 40 | Ed Zitron | 0.30 | 7 | 0% |
| 41 | Gordon E. Moore | 0.28 | 1 | 0% |
| 42 | Cal Newport | 0.26 | 4 | 10% |
| 43 | Elon Musk | 0.23 | 2 | 0% |
| 44 | Emad Mostaque | 0.22 | 1 | 0% |
| 45 | Emily Bender | 0.17 | 2 | 0% |
| 46 | Tetlock / Superforecasters | 0.17 | 5 | 0% |
| 47 | Herbert Simon | 0.16 | 2 | 0% |
| 48 | Eliezer Yudkowsky | 0.12 | 1 | 0% |
| 49 | Michael Burry | 0.12 | 1 | 0% |
| 50 | David Patterson | 0.10 | 1 | 0% |
| Category | Average accuracy | Predictions |
|---|---|---|
| Infrastructure | 0.87 | 10 |
| Safety | 0.68 | 10 |
| Adoption | 0.65 | 31 |
| Capability | 0.63 | 64 |
| Economic | 0.61 | 9 |
| Market | 0.42 | 18 |
| Timeline | 0.42 | 18 |
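As a sanity check, the category table can be tied back to the headline number. The short sketch below copies the rows from the table above and computes the prediction-weighted mean; the variable names are ours, chosen only for illustration.

```python
# (category, average accuracy, scored predictions), copied from the table above
categories = [
    ("Infrastructure", 0.87, 10),
    ("Safety", 0.68, 10),
    ("Adoption", 0.65, 31),
    ("Capability", 0.63, 64),
    ("Economic", 0.61, 9),
    ("Market", 0.42, 18),
    ("Timeline", 0.42, 18),
]

# Total number of scored predictions across all categories
total_predictions = sum(count for _, _, count in categories)

# Weight each category average by how many predictions it contains
weighted_mean = sum(avg * count for _, avg, count in categories) / total_predictions

print(total_predictions)        # 160
print(round(weighted_mean, 2))  # 0.6
```

The counts add up to the 160 scored predictions, and the weighted mean rounds to the 0.60 overall average. Capability carries the most weight (64 of 160 predictions), so its 0.63 pulls the overall figure more than any other category.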
Each dot is a scored prediction. Higher contrarian scores mean bolder claims.
These eight people got it significantly more right than wrong. The pattern is striking: they were close to production, specific about mechanisms, and honest about limitations.
These predictions scored worst. The striking thing: they come from both extremes. Maximum optimism and maximum pessimism are equally wrong.
Professional AI skeptics have built careers on "AI can't do X" claims. How did they actually perform?
Skeptics as a group scored 0.30 — about half the accuracy of the optimists. But one skeptic stands apart from the rest.
Princeton's Narayanan was the only skeptic who scored well, and the reason is instructive. He was specific about what he criticized. He didn't make sweeping "AI is fake" claims. Instead, he targeted predictive AI in hiring and criminal justice — and he was right. Those products genuinely are unreliable.
His second prediction — that AI wouldn't cause mass unemployment — scored 0.80. By April 2026, no mass job apocalypse had materialized.
The most damning pattern among AI skeptics isn't that they're wrong. It's that they never acknowledge when they're wrong.
Gary Marcus has never published "OK, Claude can actually build full apps now and I was wrong about the capability ceiling." Ed Zitron has never written "OpenAI hitting $5B revenue means I was wrong about product-market fit." Emily Bender has never said "reasoning models have moved significantly beyond stochastic parrots."
Meanwhile, their high-scoring predictions are disproportionately "safe" calls — predicting that AI will still have errors, still hallucinate, still not achieve AGI by absurdly optimistic deadlines. That's like predicting it will rain in Seattle. Technically correct, zero insight value.
The most durable pattern in AI prediction history: every generation of researchers believes AGI is just around the corner. It never is.
Philip Tetlock studied 284 experts making 82,361 predictions over two decades. His finding: the experts with the biggest media platforms were consistently the worst predictors. He called them "hedgehogs" — people who know one big thing and force everything through that lens. The accurate ones were "foxes" — people who know many small things and update constantly.
Our data is consistent with this. Here's what separates predictors from influencers:
| Trait | Predictor | Influencer |
|---|---|---|
| Framing | "There's a 70% chance X happens by 2027" | "X is DEFINITELY happening / DEFINITELY not" |
| Confidence | Calibrated — matches actual uncertainty | Maximum — uncertainty is bad for engagement |
| Speed | Slow. Waits for evidence. | Fast. First take = best take for clicks. |
| Output | Long-form analysis, explicit models | Hot takes, threads, podcast clips |
| When wrong | "I was wrong. Here's my updated model." | Quietly moves on. Deletes old posts. |
| On camera | Boring. Lots of caveats. | Entertaining. Strong opinions. |
| Priority | Being right over time | Being interesting right now |
The starkest comparison in our data: Scott Alexander (Astral Codex Ten) scored 0.88 across 9 predictions. Ed Zitron (Better Offline) scored 0.30 across 7 predictions. Zitron has a larger audience. Alexander has a better track record. The market rewards confidence, not calibration.
Each prediction was scored on three dimensions, weighted and combined into a 0–1 overall accuracy score:
| Dimension | Weight | What It Measures |
|---|---|---|
| Direction | 40% | Did the predicted thing move in the predicted direction? Binary 0/1. |
| Timing | 30% | How far off was the timeline? Measured in months early or late. |
| Magnitude | 30% | How close was the scale of the prediction to reality? 0 = wildly off, 1 = nailed it. |
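A minimal sketch of how the three dimensions could combine into one score. The weights come from the table above. The table says timing is measured in months early or late but does not say how months map onto a 0–1 value, so the linear falloff and the `MAX_MONTHS_OFF` cutoff below are assumptions made for illustration, as are all function names.

```python
# Weights taken from the scoring table
WEIGHTS = {"direction": 0.40, "timing": 0.30, "magnitude": 0.30}

# ASSUMPTION: hypothetical cutoff. A prediction this many months off
# (early or late) gets a timing score of 0. Not specified in the source.
MAX_MONTHS_OFF = 24


def timing_score(months_off: float) -> float:
    """Map a timing error in months to 0-1 with a linear falloff (assumed)."""
    return max(0.0, 1.0 - abs(months_off) / MAX_MONTHS_OFF)


def overall_accuracy(direction_correct: bool, months_off: float, magnitude: float) -> float:
    """Combine direction (binary), timing, and magnitude into a 0-1 score."""
    if not 0.0 <= magnitude <= 1.0:
        raise ValueError("magnitude must be between 0 and 1")
    direction = 1.0 if direction_correct else 0.0
    return (
        WEIGHTS["direction"] * direction
        + WEIGHTS["timing"] * timing_score(months_off)
        + WEIGHTS["magnitude"] * magnitude
    )


# Right direction, six months late, magnitude judged at 0.8
print(round(overall_accuracy(True, 6, 0.8), 3))   # 0.865
# Wrong direction, but perfect timing and magnitude
print(round(overall_accuracy(False, 0, 1.0), 3))  # 0.6
```

One consequence of the weights, independent of the assumed timing curve: a prediction that gets the direction wrong can score at most 0.60, and one that gets only the direction right scores at least 0.40.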
Scored: Prediction window has passed, fully evaluable.
Partial: Some evidence available, preliminary score assigned.
Pending: Can't score yet; prediction window still open. Pending predictions are excluded from averages.
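The exclusion rule can be sketched as a filter applied before averaging. The records below are hypothetical placeholders, not rows from our dataset, and the sketch assumes that partial predictions count toward averages with their preliminary scores, which the status definitions imply but do not state outright.

```python
from statistics import mean

# Hypothetical records for illustration only; not real data from the study
predictions = [
    {"speaker": "Speaker A", "status": "scored", "score": 0.9},
    {"speaker": "Speaker A", "status": "partial", "score": 0.7},
    {"speaker": "Speaker A", "status": "pending", "score": None},
    {"speaker": "Speaker B", "status": "scored", "score": 0.2},
]


def speaker_averages(records):
    """Average score per speaker, skipping predictions that are still pending."""
    scores_by_speaker = {}
    for record in records:
        if record["status"] == "pending":
            continue  # window still open: excluded from averages
        scores_by_speaker.setdefault(record["speaker"], []).append(record["score"])
    return {speaker: mean(scores) for speaker, scores in scores_by_speaker.items()}


averages = speaker_averages(predictions)
for speaker, average in averages.items():
    print(speaker, round(average, 2))
```

Here Speaker A's average is taken over two predictions rather than three, because the pending one contributes nothing until its window closes.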
Predictions were sourced from published interviews, blog posts, research papers, official company announcements, Forbes roundups, McKinsey surveys, LessWrong prediction markets, and Reddit threads. All quotes are verbatim with original source citations.
Outcomes were assessed against publicly available data as of April 2026: company earnings reports, industry surveys, product launches, market data, and independent research.