Daniel Kokotajlo

Karma: 36,449

Was a philosophy PhD student, left to work at AI Impacts, then Center on Long-Term Risk, then OpenAI. Quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI. Now executive director of the AI Futures Project. I subscribe to Crocker’s Rules and am especially interested to hear unsolicited constructive criticism. http://sl4.org/crocker.html

Some of my favorite memes:

(by Rob Wiblin)

Comic. Megan & Cueball show White Hat a graph of a line going up, not yet at, but heading towards, a threshold labelled "BAD". White Hat: "So things will be bad?" Megan: "Unless someone stops it." White Hat: "Will someone do that?" Megan: "We don't know, that's why we're showing you." White Hat: "Well, let me know if that happens!" Megan: "Based on this conversation, it already has."

(xkcd)

My EA Journey, depicted on the whiteboard at CLR:

(h/t Scott Alexander)

Alex Blechman @AlexBlechman Sci-Fi Author: In my book I invented the Torment Nexus as a cautionary tale Tech Company: At long last, we have created the Torment Nexus from classic sci-fi novel Don't Create The Torment Nexus 5:49 PM Nov 8, 2021. Twitter Web App

AI 2040: Plan A

Daniel Kokotajlo, elifland, Thomas Larsen, romeo, bhalstead and ryan_greenblatt

9 Jul 2026 16:25 UTC

518 points

125 comments 1 min read LW link

(www.ai-2040.com)

A Research Agenda for Secret Loyalties

Joe Kwon, Alfie Lamerton, draganover, Dave Banerjee, Bronson Schoen, Daniel Kokotajlo, ryan_greenblatt, Owain_Evans, Fabien Roger and Tom Davidson

13 May 2026 17:34 UTC

39 points

5 comments 3 min read LW link

Q1 2026 Timelines Update

Daniel Kokotajlo, elifland and bhalstead

2 Apr 2026 22:23 UTC

91 points

19 comments 2 min read LW link

(open.substack.com)

Types of Handoff to AIs

Daniel Kokotajlo

16 Mar 2026 22:24 UTC

65 points

11 comments 8 min read LW link

Grading AI 2027’s 2025 Predictions

Daniel Kokotajlo and elifland

13 Feb 2026 0:18 UTC

64 points

4 comments 9 min read LW link

(blog.ai-futures.org)

Clarifying how our AI timelines forecasts have changed since AI 2027

elifland, Daniel Kokotajlo and bhalstead

27 Jan 2026 22:58 UTC

70 points

12 comments 6 min read LW link

(blog.ai-futures.org)

AI Futures Timelines and Takeoff Model: Dec 2025 Update

elifland, bhalstead, Alex Kastner and Daniel Kokotajlo

31 Dec 2025 22:34 UTC

149 points

34 comments 25 min read LW link

Response to titotal’s critique of our AI 2027 timelines model

elifland and Daniel Kokotajlo

16 Dec 2025 0:51 UTC

46 points

6 comments 43 min read LW link

(aifuturesnotes.substack.com)

Vitalik’s Response to AI 2027

Daniel Kokotajlo

11 Jul 2025 21:43 UTC

124 points

53 comments 12 min read LW link

(vitalik.eth.limo)

My pitch for the AI Village

Daniel Kokotajlo

24 Jun 2025 15:00 UTC

183 points

35 comments 5 min read LW link

METR’s Observations of Reward Hacking in Recent Frontier Models

Daniel Kokotajlo

9 Jun 2025 18:03 UTC

100 points

9 comments 11 min read LW link

(metr.org)

Training AGI in Secret would be Unsafe and Unethical

Daniel Kokotajlo

18 Apr 2025 12:27 UTC

149 points

16 comments 6 min read LW link

AI 2027: What Superintelligence Looks Like

Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V and romeo

3 Apr 2025 16:23 UTC

697 points

223 comments 41 min read LW link

(ai-2027.com)

OpenAI: Detecting misbehavior in frontier reasoning models

Daniel Kokotajlo

11 Mar 2025 2:17 UTC

183 points

26 comments 4 min read LW link

(openai.com)

What goals will AIs have? A list of hypotheses

Daniel Kokotajlo

3 Mar 2025 20:08 UTC

91 points

20 comments 18 min read LW link

Extended analogy between humans, corporations, and AIs.

Daniel Kokotajlo

13 Feb 2025 0:03 UTC

36 points

2 comments 6 min read LW link

Why Don’t We Just… Shoggoth+Face+Paraphraser?

Daniel Kokotajlo and abramdemski

19 Nov 2024 20:53 UTC

175 points

59 comments 14 min read LW link 1 review

Self-Awareness: Taxonomy and eval suite proposal

Daniel Kokotajlo

17 Feb 2024 1:47 UTC

76 points

2 comments 11 min read LW link

AI Timelines

habryka, Daniel Kokotajlo, Ajeya Cotra and Ege Erdil

10 Nov 2023 5:28 UTC

302 points

144 comments 51 min read LW link 2 reviews

Linkpost for Jan Leike on Self-Exfiltration

Daniel Kokotajlo

13 Sep 2023 21:23 UTC

59 points

1 comment 2 min read LW link

(aligned.substack.com)