Agent Foundations
Tag
Why Agent Foundations? An Overly Abstract Explanation
johnswentworth
25 Mar 2022 23:17 UTC
257
points
54
comments
8
min read
LW
link
The Rocket Alignment Problem
Eliezer Yudkowsky
4 Oct 2018 0:38 UTC
199
points
42
comments
15
min read
LW
link
2
reviews
Challenges with Breaking into MIRI-Style Research
Chris_Leong
17 Jan 2022 9:23 UTC
74
points
15
comments
3
min read
LW
link
Some AI research areas and their relevance to existential safety
Andrew_Critch
19 Nov 2020 3:18 UTC
198
points
40
comments
50
min read
LW
link
2
reviews
AXRP Episode 15 - Natural Abstractions with John Wentworth
DanielFilan
23 May 2022 5:40 UTC
32
points
1
comment
57
min read
LW
link
[Question]
Does agent foundations cover all future ML systems?
Jonas Hallgren
25 Jul 2022 1:17 UTC
2
points
0
comments
1
min read
LW
link
Understanding Infra-Bayesianism: A Beginner-Friendly Video Series
Jack Parker
and
Connall Garrod
22 Sep 2022 13:25 UTC
136
points
6
comments
2
min read
LW
link
Prize and fast track to alignment research at ALTER
Vanessa Kosoy
17 Sep 2022 16:58 UTC
67
points
6
comments
3
min read
LW
link
Clarifying the Agent-Like Structure Problem
johnswentworth
29 Sep 2022 21:28 UTC
54
points
15
comments
6
min read
LW
link
You won’t solve alignment without agent foundations
Mikhail Samin
6 Nov 2022 8:07 UTC
21
points
3
comments
8
min read
LW
link
Contra “Strong Coherence”
DragonGod
4 Mar 2023 20:05 UTC
38
points
24
comments
1
min read
LW
link
Compositional language for hypotheses about computations
Vanessa Kosoy
11 Mar 2023 19:43 UTC
24
points
2
comments
12
min read
LW
link
Fixed points in mortal population games
ViktoriaMalyasova
14 Mar 2023 7:10 UTC
22
points
0
comments
12
min read
LW
link
(www.lesswrong.com)
[Question]
Critiques of the Agent Foundations agenda?
Jsevillamol
24 Nov 2020 16:11 UTC
16
points
3
comments
1
min read
LW
link
My take on agent foundations: formalizing metaphilosophical competence
zhukeepa
1 Apr 2018 6:33 UTC
20
points
6
comments
1
min read
LW
link
Another take on agent foundations: formalizing zero-shot reasoning
zhukeepa
1 Jul 2018 6:12 UTC
59
points
20
comments
12
min read
LW
link
Arguments about Highly Reliable Agent Designs as a Useful Path to Artificial Intelligence Safety
riceissa
and
Davidmanheim
27 Jan 2022 13:13 UTC
27
points
0
comments
1
min read
LW
link
(arxiv.org)
[Question]
Choice := Anthropics uncertainty? And potential implications for agency
Antoine de Scorraille
21 Apr 2022 16:38 UTC
6
points
1
comment
1
min read
LW
link
Understanding Selection Theorems
adamk
28 May 2022 1:49 UTC
41
points
3
comments
7
min read
LW
link
Bridging Expected Utility Maximization and Optimization
Whispermute
5 Aug 2022 8:18 UTC
24
points
5
comments
14
min read
LW
link
Discovering Agents
zac_kenton
18 Aug 2022 17:33 UTC
61
points
10
comments
6
min read
LW
link
Reward is not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning
Roman Leventov
12 Jan 2023 16:43 UTC
17
points
2
comments
2
min read
LW
link
(arxiv.org)
Normative vs Descriptive Models of Agency
mattmacdermott
2 Feb 2023 20:28 UTC
22
points
5
comments
4
min read
LW
link
A mostly critical review of infra-Bayesianism
matolcsid
28 Feb 2023 18:37 UTC
86
points
7
comments
28
min read
LW
link
Performance guarantees in classical learning theory and infra-Bayesianism
matolcsid
28 Feb 2023 18:37 UTC
8
points
4
comments
31
min read
LW
link
100 Dinners And A Workshop: Information Preservation And Goals
Stephen Fowler
28 Mar 2023 3:13 UTC
7
points
0
comments
7
min read
LW
link