
Sharp Left Turn

Last edit: 30 Dec 2024 9:49 UTC by Dakara

A sharp left turn is a scenario in which, as an AI trains, its capabilities generalize across many domains while the alignment properties that held at earlier stages fail to generalize to those new domains.

See also: Threat Models, AI Takeoff, AI Risk

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res · 15 Jun 2022 13:10 UTC
275 points
55 comments · 10 min read · LW link · 1 review

Refining the Sharp Left Turn threat model, part 1: claims and mechanisms

12 Aug 2022 15:17 UTC
86 points
4 comments · 3 min read · LW link · 1 review
(vkrakovna.wordpress.com)

We may be able to see sharp left turns coming

3 Sep 2022 2:55 UTC
54 points
29 comments · 1 min read · LW link

Reframing inner alignment

davidad · 11 Dec 2022 13:53 UTC
53 points
13 comments · 4 min read · LW link

[Interview w/ Quintin Pope] Evolution, values, and AI Safety

fowlertm · 24 Oct 2023 13:53 UTC
11 points
0 comments · 1 min read · LW link

[Question] A few Alignment questions: utility optimizers, SLT, sharp left turn and identifiability

Igor Timofeev · 26 Sep 2023 0:27 UTC
6 points
1 comment · 2 min read · LW link

We don’t understand what happened with culture enough

Jan_Kulveit · 9 Oct 2023 9:54 UTC
87 points
22 comments · 6 min read · LW link · 1 review

“Sharp Left Turn” discourse: An opinionated review

Steven Byrnes · 28 Jan 2025 18:47 UTC
216 points
31 comments · 31 min read · LW link

Superintelligence’s goals are likely to be random

Mikhail Samin · 13 Mar 2025 22:41 UTC
6 points
6 comments · 5 min read · LW link

Victoria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi · 12 Jan 2023 17:09 UTC
40 points
3 comments · 4 min read · LW link
(www.theinsideview.ai)

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

25 Nov 2022 14:36 UTC
39 points
9 comments · 6 min read · LW link
(vkrakovna.wordpress.com)

[Question] How is the “sharp left turn” defined?

Chris_Leong · 9 Dec 2022 0:04 UTC
14 points
4 comments · 1 min read · LW link

Evolution Solved Alignment (what sharp left turn?)

jacob_cannell · 12 Oct 2023 4:15 UTC
23 points
89 comments · 4 min read · LW link

The Sharp Right Turn: sudden deceptive alignment as a convergent goal

avturchin · 6 Jun 2023 9:59 UTC
38 points
5 comments · 1 min read · LW link

Evolution provides no evidence for the sharp left turn

Quintin Pope · 11 Apr 2023 18:43 UTC
206 points
65 comments · 15 min read · LW link · 1 review

[Question] Has Eliezer publicly and satisfactorily responded to attempted rebuttals of the analogy to evolution?

kaler · 28 Jul 2024 12:23 UTC
10 points
14 comments · 1 min read · LW link

Moral gauge theory: A speculative suggestion for AI alignment

James Diacoumis · 23 Feb 2025 11:42 UTC
6 points
2 comments · 8 min read · LW link

Smoke without fire is scary

Adam Jermyn · 4 Oct 2022 21:08 UTC
52 points
22 comments · 4 min read · LW link

A caveat to the Orthogonality Thesis

Wuschel Schulz · 9 Nov 2022 15:06 UTC
38 points
10 comments · 2 min read · LW link

A smart enough LLM might be deadly simply if you run it for long enough

Mikhail Samin · 5 May 2023 20:49 UTC
19 points
16 comments · 8 min read · LW link

Agency overhang as a proxy for Sharp left turn

7 Nov 2024 12:14 UTC
6 points
0 comments · 5 min read · LW link

It matters when the first sharp left turn happens

Adam Jermyn · 29 Sep 2022 20:12 UTC
45 points
9 comments · 4 min read · LW link

Disentangling inner alignment failures

Erik Jenner · 10 Oct 2022 18:50 UTC
23 points
5 comments · 4 min read · LW link

Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn

Zvi · 5 Oct 2023 11:39 UTC
129 points
29 comments · 9 min read · LW link

Goal Alignment Is Robust To the Sharp Left Turn

Thane Ruthenis · 13 Jul 2022 20:23 UTC
43 points
16 comments · 4 min read · LW link