69 AI Predictions Scored: Who Actually Knew What They Were Talking About?

We tracked specific, dated predictions from CEOs, researchers, skeptics, and Reddit crowds — from 1965 to 2025 — and scored every one against reality.

69 predictions · 57 scored · 0.51 avg accuracy · 60 years of data
The Big Picture

We collected 69 specific, dated AI predictions from CEOs, researchers, independent bloggers, professional skeptics, and anonymous Reddit commenters. We scored 57 of them against what actually happened (the remaining 12 are still pending).

The average accuracy score across all scored predictions is 0.51 — barely better than a coin flip. That number should make you uncomfortable, because these aren't random people. This group includes the CEO of OpenAI, the co-founder of Google DeepMind, Nobel laureates, and tenured professors at Princeton and Stanford.

The pattern across 60 years: people consistently predict AGI far too soon while underestimating near-term practical capability improvements. They predict the revolution will happen "next year" while failing to notice the revolution already happening under their feet.

Three findings stand out from the data:

1. The extremes score worst. Maximum doom (Yudkowsky, 0.12) and maximum hype (Patterson, 0.10) are equally wrong. The most confident people in either direction cluster at the bottom of the leaderboard.

2. People closest to production consistently outpredict everyone else. NVIDIA engineers who ship actual products (Briski 0.92, Deierling 0.94, Das 0.88) outscored CEOs, academics, and futurists by a wide margin. Proximity to real-world deployment is the single best predictor of prediction accuracy.

3. Specificity correlates with accuracy. Vague predictions ("AI will transform everything") score poorly. Specific predictions ("agentic scaffolding will drive the next leap, not bigger models") score well. The more precisely you can be wrong, the more likely you are to be right.

The Data

Chart: The Leaderboard (All Speakers by Accuracy)
Chart: Accuracy by Category
Chart: How Specific Were They vs. How Right Were They? (Each dot is a scored prediction; higher contrarian scores mean bolder claims.)

The A-Tier (0.84+)

These eight people got it significantly more right than wrong. The pattern is striking: they were close to production, specific about mechanisms, and honest about limitations.

Andrew Ng (Stanford / Landing AI) - 0.94
"Agentic workflows will drive massive progress this year... I think we can get to AGI without needing to build the next generation of foundation models."
Outcome: Perfectly called the industry's architectural pivot to agentic scaffolding a full year before it became the dominant paradigm. Best prediction in the entire database.
Kevin Deierling (NVIDIA) - 0.94
"Moore's law running up against the laws of physics... enterprises increasingly will turn to accelerated computing."
Outcome: NVIDIA revenue exploded from $27B to $130B+ driven by exactly this shift. Self-serving prediction, but completely accurate.
Kari Briski (NVIDIA) - 0.92
"Research on large language models will lead to new types of practical applications... We'll also see rapid growth in demand for the ability to customize models."
Outcome: Nailed the LLM-to-production transition. The "customize models" prediction was especially prescient — fine-tuning and RAG became the dominant enterprise pattern.
Sam Altman (OpenAI) - 0.88
"We believe that, in 2025, we may see the first AI agents 'join the workforce' and materially change the output of companies."
Outcome: Highly accurate. AI agents deeply integrated into workforce by late 2025. The hedge ("may see") helped. Timing, direction, and magnitude all correct.
Gary Marcus (NYU / Independent) - 0.88
"OpenAI's GPT-5 project thus far has been a failure... Scaling hasn't gotten them — or anyone else — to AGI."
Outcome: Highly accurate. By late 2025, industry broadly acknowledged diminishing returns on pure parameter scaling. His best-scored prediction after years of being dismissed.
Eric Schmidt (ex-Google CEO) - 0.88
"There are scenarios not today but reasonably soon where these systems will be able to find zero-day exploits in cyber issues."
Outcome: By late 2025, security researchers demonstrated autonomous penetration testing agents finding zero-day vulnerabilities. Timeline and threat vector exactly correct.
Mark Zuckerberg (Meta) - 0.88
"Our long-term vision is to build general intelligence, open source it responsibly, and make it widely available."
Outcome: Llama 3 and 4 proved open-weights models can match closed model capabilities. Strategy reshaped the entire AI market, preventing proprietary monopoly.
David Ferris (Independent / AI Realist) - 0.86
"LLM progress will dramatically plateau, but vertical AI will still emerge as the next great software market."
Outcome: Called both the plateau AND the pivot to vertical AI copilots. One of the most balanced predictions in the database — and he's not a CEO, not a researcher. Just a blogger.
The pattern: two NVIDIA engineers, one Stanford professor, one independent blogger, one professional skeptic, and three current or former CEOs. The common thread isn't title or affiliation. It's proximity to real deployment and willingness to be specific about mechanisms.

The Hype Merchants (Below 0.25)

These predictions scored worst. The striking thing: they come from both extremes. Maximum optimism and maximum pessimism are equally wrong.

Ed Zitron (Better Offline, 3 predictions) - 0.13
"There is no killer app for generative AI. It doesn't do anything that well."
Outcome: Coding assistants, AI search, image generation, and customer support automation all became killer apps within months. Also called OpenAI "fraudulent" — they hit $5B+ annual revenue.
Eliezer Yudkowsky (MIRI) - 0.12
"The most likely result of building a superhumanly smart AI... is that literally everyone on Earth will die."
Outcome: No moratorium enacted. Massive data centers built worldwide. Models reached unprecedented capability. No catastrophic scenarios materialized. The "airstrikes on data centers" proposal damaged his credibility.
David Patterson (LessWrong predictor) - 0.10
"There is zero chance we won't reach AGI by the end of next year. My definition of AGI is the human-to-AI transition point — AI capable of doing all jobs."
Outcome: "Zero chance" is the reddest of red flags in prediction. As of April 2026, AI cannot do all jobs. Physical labor, novel research, and complex negotiation remain firmly human.
Emily Bender (University of Washington) - 0.14
"Scaling up language models will not lead to understanding or intelligence, just more fluent-sounding nonsense."
Outcome: Each generation showed clear capability jumps in reasoning, coding, math, and planning. The "stochastic parrot" framing became increasingly untenable as capabilities grew.
Emad Mostaque (Stability AI, former CEO) - 0.22
"There are no programmers in five years."
Outcome: Three years in, human programmers remain in high demand. The prediction failed to account for the Jevons paradox: making code cheaper to produce increased both the complexity of software and the demand for it.
Julia McCoy (First Movers) - 0.22
"We'll see examples of $100M+ companies operating with just two or three people."
Outcome: No verified examples exist. Midjourney does ~$200M with ~40 people, which is remarkable but still 13x more humans than predicted. Classic consultancy hype.
Aidan McLau (LessWrong predictor) - 0.14
"I think it's likely (p=.6) that an o-series model solves a millennium prize math problem in 2025."
Outcome: False. Millennium problems require novel mathematical insight, not pattern matching. No Millennium Prize problem solved by AI as of end 2025.
The lesson: Maximum confidence in either direction — hype or doom — scored worst. Patterson's "zero chance no AGI by 2026" (0.10) and Yudkowsky's "literally everyone dies" (0.12) are mirror images of the same failure mode: mistaking emotional conviction for analytical rigor.

The Skeptics' Report Card

Professional AI skeptics have built careers on "AI can't do X" claims. How did they actually perform?

Optimists: 0.62
Neutral / Mixed: 0.55
Skeptics: 0.30

Skeptics as a group scored 0.30 — about half the accuracy of the optimists. But one skeptic stands apart from the rest.

The Exception: Arvind Narayanan (0.68)

Princeton's Narayanan was the only skeptic who scored well, and the reason is instructive. He was specific about what he criticized. He didn't make sweeping "AI is fake" claims. Instead, he targeted predictive AI in hiring and criminal justice — and he was right. Those products genuinely are unreliable.

His second prediction — that AI wouldn't cause mass unemployment — scored 0.80. By April 2026, no mass job apocalypse had materialized.

Why most skeptics failed: They predicted what AI couldn't do (create value, find killer apps, improve with scale) rather than what it shouldn't do (predict recidivism, automate hiring). The first is a capability prediction you'll probably lose. The second is a values judgment that ages better.

Historical Perspective: AGI Is Always "20 Years Away"

The most durable pattern in AI prediction history: every generation of researchers believes AGI is just around the corner. It never is.

1965 - Herbert A. Simon (Carnegie Mellon): "Machines will be capable, within twenty years, of doing any work a man can do."
AGI by 1985. Score: 0.08
1993 - Vernor Vinge: "Within thirty years, we will have the technological means to create superhuman intelligence."
Singularity by 2023. Score: 0.20
2008 - Gordon E. Moore (Intel co-founder): "I don't believe this kind of thing is likely to happen, at least for a long time."
AGI never/distant future. Score: 0.28
2017 - Ray Kurzweil (Google): "AGI will be achieved by 2029."
Pending; 3 years remain
2023 - Geoffrey Hinton: "I now think it's 5 to 20 years away."
AGI 2028-2043. Pending
2024 - Elon Musk: "AI will probably be smarter than any single human next year."
Wrong. Score: 0.28
2025 - David Patterson: "There is zero chance we won't reach AGI by the end of next year."
Almost certainly wrong. Score: 0.10
61 years. Same prediction. Same outcome. From Herbert Simon in 1965 to David Patterson in 2025, AGI has been perpetually "just around the corner." The optimists and the pessimists both get the timeline wrong — just in opposite directions.

Key Findings

40% - People closest to production outpredict CEOs by this margin
0.10 vs 0.12 - Maximum hype (Patterson) and maximum doom (Yudkowsky) are equally wrong
0.86 - Most accurate predictor with 2+ predictions is an independent blogger, not a CEO or researcher
0.18 - Reddit crowd consensus performed worse than a coin flip
61 years - AGI has been "20 years away" since 1965
0.51 - Average expert accuracy, barely better than flipping a coin

Methodology

Each prediction was scored on three dimensions, weighted and combined into a 0–1 overall accuracy score:

Dimension   Weight   What It Measures
Direction   40%      Did the predicted thing move in the predicted direction? Binary 0/1.
Timing      30%      How far off was the timeline? Measured in months early or late.
Magnitude   30%      How close was the scale of the prediction to reality? 0 = wildly off, 1 = nailed it.
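As a minimal sketch of how the three dimensions combine: the 40/30/30 weights come from the table above, but the article doesn't specify the exact timing or magnitude mappings, so the linear timing decay below (full credit at zero months off, tapering to zero at 24 months) is an illustrative assumption.

```python
def timing_score(months_off: float, tolerance: float = 24.0) -> float:
    """Assumed mapping: 1.0 at zero months off, falling linearly to 0.0 at `tolerance`."""
    return max(0.0, 1.0 - abs(months_off) / tolerance)

def overall_score(direction: int, months_off: float, magnitude: float) -> float:
    """Combine direction (0/1), timing, and magnitude (0-1) with the 40/30/30 weights."""
    return 0.4 * direction + 0.3 * timing_score(months_off) + 0.3 * magnitude

# Right direction, six months late, magnitude mostly right:
score = overall_score(direction=1, months_off=6, magnitude=0.8)
```

One property of this weighting: a prediction that gets the direction wrong can score at most 0.60, even with perfect timing and magnitude, which matches the intuition that direction matters most.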

Status Categories

Scored: Prediction window has passed; fully evaluable.
Partial: Some evidence available; preliminary score assigned.
Pending: Can't score yet; prediction window still open. Pending predictions are excluded from averages.
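The exclusion rule can be sketched as a simple filter; the rows and field names here are illustrative, not the published dataset's actual schema.

```python
# Illustrative rows; the real dataset has 69 predictions.
predictions = [
    {"speaker": "A", "status": "scored",  "score": 0.94},
    {"speaker": "B", "status": "pending", "score": None},
    {"speaker": "C", "status": "partial", "score": 0.50},
    {"speaker": "D", "status": "scored",  "score": 0.12},
]

# Pending rows carry no score and never enter the mean;
# scored and partial (preliminary) rows both count.
scored = [p["score"] for p in predictions if p["status"] != "pending"]
average = sum(scored) / len(scored)
```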

Data Sources

Predictions sourced from published interviews, blog posts, research papers, official company announcements, Forbes roundups, McKinsey surveys, LessWrong prediction markets, and Reddit threads. All quotes are verbatim with original source citations.

Outcomes assessed against publicly available data as of April 2026: company earnings reports, industry surveys, product launches, market data, and independent research.

Transparency note: Scoring inherently involves judgment calls, particularly on magnitude. We've published the full dataset with individual scores, quotes, and outcomes so readers can evaluate our scoring themselves. Where reasonable people could disagree on a score, we note it.
