Sharp Left Turn

Tag. Last edit: 3 Sep 2022 3:54 UTC by Multicore

A Sharp Left Turn is a scenario where, as an AI trains, its capabilities generalize across many domains while the alignment properties that held at earlier stages fail to generalize to the new domains.

See also: Threat Models, AI Takeoff, AI Risk

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res · 15 Jun 2022 13:10 UTC
261 points
48 comments · 10 min read · LW link

Refining the Sharp Left Turn threat model, part 1: claims and mechanisms

12 Aug 2022 15:17 UTC
78 points
3 comments · 3 min read · LW link
(vkrakovna.wordpress.com)

We may be able to see sharp left turns coming

3 Sep 2022 2:55 UTC
46 points
28 comments · 1 min read · LW link

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

25 Nov 2022 14:36 UTC
39 points
8 comments · 6 min read · LW link
(vkrakovna.wordpress.com)

[Question] How is the “sharp left turn defined”?

Chris_Leong · 9 Dec 2022 0:04 UTC
13 points
4 comments · 1 min read · LW link

Victoria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi · 12 Jan 2023 17:09 UTC
39 points
3 comments · 4 min read · LW link
(www.theinsideview.ai)

Reframing inner alignment

davidad · 11 Dec 2022 13:53 UTC
48 points
13 comments · 4 min read · LW link

Goal Alignment Is Robust To the Sharp Left Turn

Thane Ruthenis · 13 Jul 2022 20:23 UTC
46 points
15 comments · 4 min read · LW link

It matters when the first sharp left turn happens

Adam Jermyn · 29 Sep 2022 20:12 UTC
35 points
9 comments · 4 min read · LW link

Smoke without fire is scary

Adam Jermyn · 4 Oct 2022 21:08 UTC
49 points
22 comments · 4 min read · LW link

Disentangling inner alignment failures

Erik Jenner · 10 Oct 2022 18:50 UTC
14 points
5 comments · 4 min read · LW link

A caveat to the Orthogonality Thesis

Wuschel Schulz · 9 Nov 2022 15:06 UTC
37 points
10 comments · 2 min read · LW link