AI
Core Tag
Last edit: 17 Jan 2025 22:24 UTC by Dakara
Artificial Intelligence is the study of creating intelligence in algorithms. AI Alignment is the task of ensuring [powerful] AI systems are aligned with human values and interests. The central concern is that a powerful enough AI, if not designed and implemented with sufficient understanding, would optimize something unintended by its creators and pose an existential threat to the future of humanity. This is known as the AI alignment problem.
Common terms in this space are superintelligence, AI Alignment, AI Safety, Friendly AI, Transformative AI, human-level intelligence, AI Governance, and Beneficial AI. This entry and the associated tag roughly encompass all of these topics: anything that is part of the broad cluster of understanding AI and its future impacts on our civilization deserves this tag.
AI Alignment
There are narrow conceptions of alignment, where you’re trying to get it to do something like cure Alzheimer’s disease without destroying the rest of the world. And there’s much more ambitious notions of alignment, where you’re trying to get it to do the right thing and achieve a happy intergalactic civilization.
But both the narrow and the ambitious alignment have in common that you’re trying to have the AI do that thing rather than making a lot of paperclips.
See also General Intelligence.