Research Agendas

Last edit: 16 Sep 2021 15:08 UTC by plex

Research Agendas lay out the areas of research that individuals or groups are working on, or that they believe would be valuable for others to work on. They help make research more legible and encourage discussion of priorities.

The Learning-Theoretic AI Alignment Research Agenda

Vanessa Kosoy · 4 Jul 2018 9:53 UTC
64 points
39 comments · 32 min read · LW link

New safety research agenda: scalable agent alignment via reward modeling

Vika · 20 Nov 2018 17:29 UTC
34 points
13 comments · 1 min read · LW link
(medium.com)

Research Agenda v0.9: Synthesising a human’s preferences into a utility function

Stuart_Armstrong · 17 Jun 2019 17:46 UTC
66 points
25 comments · 33 min read · LW link

Paul’s research agenda FAQ

zhukeepa · 1 Jul 2018 6:25 UTC
123 points
69 comments · 19 min read · LW link · 1 review

AI Governance: A Research Agenda

habryka · 5 Sep 2018 18:00 UTC
25 points
3 comments · 1 min read · LW link
(www.fhi.ox.ac.uk)

Embedded Agents

29 Oct 2018 19:53 UTC
194 points
41 comments · 1 min read · LW link · 2 reviews

Our take on CHAI’s research agenda in under 1500 words

Alex Flint · 17 Jun 2020 12:24 UTC
101 points
19 comments · 5 min read · LW link

Deconfusing Human Values Research Agenda v1

G Gordon Worley III · 23 Mar 2020 16:25 UTC
26 points
12 comments · 4 min read · LW link

Thoughts on Human Models

21 Feb 2019 9:10 UTC
124 points
32 comments · 10 min read · LW link · 1 review

MIRI’s technical research agenda

So8res · 23 Dec 2014 18:45 UTC
54 points
52 comments · 3 min read · LW link

Preface to CLR’s Research Agenda on Cooperation, Conflict, and TAI

JesseClifton · 13 Dec 2019 21:02 UTC
56 points
10 comments · 2 min read · LW link

Research agenda update

Steven Byrnes · 6 Aug 2021 19:24 UTC
53 points
40 comments · 7 min read · LW link

The Plan

johnswentworth · 10 Dec 2021 23:41 UTC
213 points
77 comments · 14 min read · LW link

Embedded Agency (full-text version)

15 Nov 2018 19:49 UTC
126 points
11 comments · 54 min read · LW link

Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda

3 Sep 2020 18:27 UTC
64 points
12 comments · 2 min read · LW link

New year, new research agenda post

Charlie Steiner · 12 Jan 2022 17:58 UTC
28 points
4 comments · 16 min read · LW link

Announcing the Alignment of Complex Systems Research Group

4 Jun 2022 4:10 UTC
65 points
17 comments · 5 min read · LW link

Ultra-simplified research agenda

Stuart_Armstrong · 22 Nov 2019 14:29 UTC
34 points
4 comments · 1 min read · LW link

Embedded Curiosities

8 Nov 2018 14:19 UTC
86 points
1 comment · 2 min read · LW link

Subsystem Alignment

6 Nov 2018 16:16 UTC
100 points
12 comments · 1 min read · LW link

Robust Delegation

4 Nov 2018 16:38 UTC
110 points
10 comments · 1 min read · LW link

Embedded World-Models

2 Nov 2018 16:07 UTC
87 points
16 comments · 1 min read · LW link

Decision Theory

31 Oct 2018 18:41 UTC
111 points
46 comments · 1 min read · LW link

Sections 1 & 2: Introduction, Strategy and Governance

JesseClifton · 17 Dec 2019 21:27 UTC
34 points
5 comments · 14 min read · LW link

Sections 3 & 4: Credibility, Peaceful Bargaining Mechanisms

JesseClifton · 17 Dec 2019 21:46 UTC
19 points
2 comments · 12 min read · LW link

Sections 5 & 6: Contemporary Architectures, Humans in the Loop

JesseClifton · 20 Dec 2019 3:52 UTC
27 points
4 comments · 10 min read · LW link

Section 7: Foundations of Rational Agency

JesseClifton · 22 Dec 2019 2:05 UTC
14 points
4 comments · 8 min read · LW link

Acknowledgements & References

JesseClifton · 14 Dec 2019 7:04 UTC
6 points
0 comments · 14 min read · LW link

Alignment proposals and complexity classes

evhub · 16 Jul 2020 0:27 UTC
33 points
26 comments · 13 min read · LW link

The Goodhart Game

John_Maxwell · 18 Nov 2019 23:22 UTC
13 points
5 comments · 5 min read · LW link

Immobile AI makes a move: anti-wireheading, ontology change, and model splintering

Stuart_Armstrong · 17 Sep 2021 15:24 UTC
32 points
3 comments · 2 min read · LW link

Testing The Natural Abstraction Hypothesis: Project Update

johnswentworth · 20 Sep 2021 3:44 UTC
83 points
15 comments · 8 min read · LW link

AI, learn to be conservative, then learn to be less so: reducing side-effects, learning preserved features, and going beyond conservatism

Stuart_Armstrong · 20 Sep 2021 11:56 UTC
14 points
4 comments · 3 min read · LW link

Paradigm-building: Introduction

Cameron Berg · 8 Feb 2022 0:06 UTC
24 points
0 comments · 2 min read · LW link

Resources for AI Alignment Cartography

Gyrodiot · 4 Apr 2020 14:20 UTC
43 points
8 comments · 9 min read · LW link

Introducing the Longevity Research Institute

sarahconstantin · 8 May 2018 3:30 UTC
53 points
20 comments · 1 min read · LW link
(srconstantin.wordpress.com)

Announcement: AI alignment prize round 3 winners and next round

cousin_it · 15 Jul 2018 7:40 UTC
93 points
7 comments · 1 min read · LW link

Machine Learning Projects on IDA

24 Jun 2019 18:38 UTC
49 points
3 comments · 2 min read · LW link

AI Alignment Research Overview (by Jacob Steinhardt)

Ben Pace · 6 Nov 2019 19:24 UTC
43 points
0 comments · 7 min read · LW link
(docs.google.com)

Creating Welfare Biology: A Research Proposal

ozymandias · 16 Nov 2017 19:06 UTC
20 points
5 comments · 4 min read · LW link

Research Agenda in reverse: what *would* a solution look like?

Stuart_Armstrong · 25 Jun 2019 13:52 UTC
34 points
25 comments · 1 min read · LW link

Forecasting AI Progress: A Research Agenda

10 Aug 2020 1:04 UTC
39 points
4 comments · 1 min read · LW link

Technical AGI safety research outside AI

Richard_Ngo · 18 Oct 2019 15:00 UTC
43 points
3 comments · 3 min read · LW link

Why I am not currently working on the AAMLS agenda

jessicata · 1 Jun 2017 17:57 UTC
28 points
1 comment · 5 min read · LW link

Which of these five AI alignment research projects ideas are no good?

rmoehn · 8 Aug 2019 7:17 UTC
25 points
13 comments · 1 min read · LW link

Research is polygamous! The importance of what you do needn’t be proportional to your awesomeness

diegocaleiro · 26 May 2013 22:29 UTC
35 points
43 comments · 2 min read · LW link

Funding Good Research

lukeprog · 27 May 2012 6:41 UTC
38 points
44 comments · 2 min read · LW link

Please voice your support for stem cell research

zaph · 22 May 2009 18:45 UTC
−5 points
4 comments · 1 min read · LW link

Notes on effective-altruism-related research, writing, testing fit, learning, and the EA Forum

MichaelA · 28 Mar 2021 23:43 UTC
14 points
0 comments · 4 min read · LW link

The Metaethics and Normative Ethics of AGI Value Alignment: Many Questions, Some Implications

Dario Citrini · 16 Sep 2021 16:13 UTC
6 points
0 comments · 8 min read · LW link

AI learns betrayal and how to avoid it

Stuart_Armstrong · 30 Sep 2021 9:39 UTC
30 points
4 comments · 2 min read · LW link

A FLI postdoctoral grant application: AI alignment via causal analysis and design of agents

PabloAMC · 13 Nov 2021 1:44 UTC
4 points
0 comments · 7 min read · LW link

Framing approaches to alignment and the hard problem of AI cognition

ryan_greenblatt · 15 Dec 2021 19:06 UTC
6 points
15 comments · 27 min read · LW link

An Open Philanthropy grant proposal: Causal representation learning of human preferences

PabloAMC · 11 Jan 2022 11:28 UTC
18 points
6 comments · 8 min read · LW link

Paradigm-building: The hierarchical question framework

Cameron Berg · 9 Feb 2022 16:47 UTC
11 points
16 comments · 3 min read · LW link

Question 1: Predicted architecture of AGI learning algorithm(s)

Cameron Berg · 10 Feb 2022 17:22 UTC
9 points
1 comment · 7 min read · LW link

Question 2: Predicted bad outcomes of AGI learning architecture

Cameron Berg · 11 Feb 2022 22:23 UTC
5 points
1 comment · 10 min read · LW link

Question 3: Control proposals for minimizing bad outcomes

Cameron Berg · 12 Feb 2022 19:13 UTC
5 points
1 comment · 7 min read · LW link

Question 5: The timeline hyperparameter

Cameron Berg · 14 Feb 2022 16:38 UTC
5 points
3 comments · 7 min read · LW link

Paradigm-building: Conclusion and practical takeaways

Cameron Berg · 15 Feb 2022 16:11 UTC
2 points
1 comment · 2 min read · LW link

Elicit: Language Models as Research Assistants

9 Apr 2022 14:56 UTC
64 points
5 comments · 13 min read · LW link