Elliott Thornley (EJT)

Karma: 1,056

elliott-thornley.com

AI safety can be a Pascal’s mugging even if p(doom) is high

Elliott Thornley (EJT)

25 Apr 2026 16:16 UTC

22 points

9 comments · 1 min read · LW link

Preference gaps as a safeguard against AI self-replication

tbs and Elliott Thornley (EJT)

26 Nov 2025 14:49 UTC

10 points

2 comments · 11 min read · LW link

Shutdownable Agents through POST-Agency

Elliott Thornley (EJT)

16 Sep 2025 12:09 UTC

32 points

8 comments · 54 min read · LW link

(arxiv.org)

Towards shutdownable agents via stochastic choice

Elliott Thornley (EJT), alexr, christosi and LAThomson

8 Jul 2024 10:14 UTC

59 points

11 comments · 23 min read · LW link

(arxiv.org)

The Shutdown Problem: Incomplete Preferences as a Solution

Elliott Thornley (EJT)

23 Feb 2024 16:01 UTC

62 points

33 comments · 41 min read · LW link

The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists

Elliott Thornley (EJT)

23 Oct 2023 21:00 UTC

79 points

28 comments · 39 min read · LW link

(philpapers.org)

The price is right

Elliott Thornley (EJT)

16 Oct 2023 16:34 UTC

42 points

3 comments · 4 min read · LW link

(openairopensea.substack.com)

[Question] What are some examples of AIs instantiating the ‘nearest unblocked strategy problem’?

Elliott Thornley (EJT)

4 Oct 2023 11:05 UTC

6 points

4 comments · 1 min read · LW link

EJT’s Shortform

Elliott Thornley (EJT)

26 Sep 2023 15:19 UTC

4 points

16 comments · 1 min read · LW link

There are no coherence theorems

Dan H and Elliott Thornley (EJT)

20 Feb 2023 21:25 UTC

158 points

136 comments · 19 min read · LW link · 1 review