Sharp Left Turn

Tag

Last edit: 3 Sep 2022 3:54 UTC by Multicore

A Sharp Left Turn is a scenario where, as an AI trains, its capabilities generalize across many domains while the alignment properties that held at earlier stages fail to generalize to the new domains.

See also: Threat Models, AI Takeoff, AI Risk

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res

15 Jun 2022 13:10 UTC

279 points

53 comments, 10 min read, 1 review

Refining the Sharp Left Turn threat model, part 1: claims and mechanisms

Vika, Vikrant Varma, Ramana Kumar and Mary Phuong

12 Aug 2022 15:17 UTC

85 points

4 comments, 3 min read, 1 review

(vkrakovna.wordpress.com)

We may be able to see sharp left turns coming

Ethan Perez and Neel Nanda

3 Sep 2022 2:55 UTC

53 points

29 comments, 2 min read

Reframing inner alignment

davidad

11 Dec 2022 13:53 UTC

53 points

13 comments, 4 min read

The Sharp Right Turn: sudden deceptive alignment as a convergent goal

avturchin

6 Jun 2023 9:59 UTC

38 points

5 comments, 1 min read

[Question] A few Alignment questions: utility optimizers, SLT, sharp left turn and identifiability

Igor Timofeev

26 Sep 2023 0:27 UTC

5 points

1 comment, 2 min read

We don’t understand what happened with culture enough

Jan_Kulveit

9 Oct 2023 9:54 UTC

86 points

21 comments, 6 min read

Evolution Solved Alignment (what sharp left turn?)

jacob_cannell

12 Oct 2023 4:15 UTC

20 points

89 comments, 4 min read

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

Vika, Vikrant Varma, Ramana Kumar and Rohin Shah

25 Nov 2022 14:36 UTC

39 points

9 comments, 6 min read

(vkrakovna.wordpress.com)

[Question] How is the “sharp left turn defined”?

Chris_Leong

9 Dec 2022 0:04 UTC

14 points

4 comments, 1 min read

Victoria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi

12 Jan 2023 17:09 UTC

40 points

3 comments, 4 min read

(www.theinsideview.ai)

[Interview w/ Quintin Pope] Evolution, values, and AI Safety

fowlertm

24 Oct 2023 13:53 UTC

11 points

0 comments, 1 min read

Goal Alignment Is Robust To the Sharp Left Turn

Thane Ruthenis

13 Jul 2022 20:23 UTC

47 points

16 comments, 4 min read

It matters when the first sharp left turn happens

Adam Jermyn

29 Sep 2022 20:12 UTC

44 points

9 comments, 4 min read

Evolution provides no evidence for the sharp left turn

Quintin Pope

11 Apr 2023 18:43 UTC

202 points

62 comments, 15 min read

Smoke without fire is scary

Adam Jermyn

4 Oct 2022 21:08 UTC

51 points

22 comments, 4 min read

Disentangling inner alignment failures

Erik Jenner

10 Oct 2022 18:50 UTC

20 points

5 comments, 4 min read

A caveat to the Orthogonality Thesis

Wuschel Schulz

9 Nov 2022 15:06 UTC

38 points

10 comments, 2 min read

Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn

Zvi

5 Oct 2023 11:39 UTC

129 points

29 comments, 9 min read

A smart enough LLM might be deadly simply if you run it for long enough

Mikhail Samin

5 May 2023 20:49 UTC

16 points

16 comments, 8 min read
