Sharp Left Turn

Tag. Last edit: 3 Sep 2022 3:54 UTC by Multicore

A Sharp Left Turn is a scenario where, as an AI trains, its capabilities generalize across many domains while the alignment properties that held at earlier stages fail to generalize to the new domains.

See also: Threat Models, AI Takeoff, AI Risk

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res, 15 Jun 2022 13:10 UTC
279 points
53 comments · 10 min read · LW link · 1 review

Refining the Sharp Left Turn threat model, part 1: claims and mechanisms

12 Aug 2022 15:17 UTC
85 points
4 comments · 3 min read · LW link · 1 review

We may be able to see sharp left turns coming

3 Sep 2022 2:55 UTC
53 points
29 comments · 2 min read · LW link

Reframing inner alignment

davidad, 11 Dec 2022 13:53 UTC
53 points
13 comments · 4 min read · LW link

The Sharp Right Turn: sudden deceptive alignment as a convergent goal

avturchin, 6 Jun 2023 9:59 UTC
38 points
5 comments · 1 min read · LW link

[Question] A few Alignment questions: utility optimizers, SLT, sharp left turn and identifiability

Igor Timofeev, 26 Sep 2023 0:27 UTC
5 points
1 comment · 2 min read · LW link

We don’t understand what happened with culture enough

Jan_Kulveit, 9 Oct 2023 9:54 UTC
86 points
21 comments · 6 min read · LW link

Evolution Solved Alignment (what sharp left turn?)

jacob_cannell, 12 Oct 2023 4:15 UTC
20 points
89 comments · 4 min read · LW link

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

25 Nov 2022 14:36 UTC
39 points
9 comments · 6 min read · LW link

[Question] How is the “sharp left turn defined”?

Chris_Leong, 9 Dec 2022 0:04 UTC
14 points
4 comments · 1 min read · LW link

Victoria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi, 12 Jan 2023 17:09 UTC
40 points
3 comments · 4 min read · LW link

[Interview w/ Quintin Pope] Evolution, values, and AI Safety

fowlertm, 24 Oct 2023 13:53 UTC
11 points
0 comments · 1 min read · LW link

Goal Alignment Is Robust To the Sharp Left Turn

Thane Ruthenis, 13 Jul 2022 20:23 UTC
47 points
16 comments · 4 min read · LW link

It matters when the first sharp left turn happens

Adam Jermyn, 29 Sep 2022 20:12 UTC
44 points
9 comments · 4 min read · LW link

Evolution provides no evidence for the sharp left turn

Quintin Pope, 11 Apr 2023 18:43 UTC
201 points
62 comments · 15 min read · LW link

Smoke without fire is scary

Adam Jermyn, 4 Oct 2022 21:08 UTC
51 points
22 comments · 4 min read · LW link

Disentangling inner alignment failures

Erik Jenner, 10 Oct 2022 18:50 UTC
20 points
5 comments · 4 min read · LW link

A caveat to the Orthogonality Thesis

Wuschel Schulz, 9 Nov 2022 15:06 UTC
38 points
10 comments · 2 min read · LW link

Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn

Zvi, 5 Oct 2023 11:39 UTC
129 points
29 comments · 9 min read · LW link

A smart enough LLM might be deadly simply if you run it for long enough

Mikhail Samin, 5 May 2023 20:49 UTC
16 points
16 comments · 8 min read · LW link