Stuart_Armstrong (Stuart Armstrong)
Karma: 17,676
Alignment can improve generalisation through more robustly doing what a human wants—CoinRun example
Stuart_Armstrong · 21 Nov 2023 11:41 UTC · 68 points · 9 comments · 3 min read
How toy models of ontology changes can be misleading
Stuart_Armstrong · 21 Oct 2023 21:13 UTC · 41 points · 0 comments · 2 min read
Different views of alignment have different consequences for imperfect methods
Stuart_Armstrong · 28 Sep 2023 16:31 UTC · 31 points · 0 comments · 1 min read
Avoiding xrisk from AI doesn’t mean focusing on AI xrisk
Stuart_Armstrong · 2 May 2023 19:27 UTC · 64 points · 7 comments · 3 min read
What is a definition, how can it be extrapolated?
Stuart_Armstrong · 14 Mar 2023 18:08 UTC · 34 points · 5 comments · 7 min read
You’re not a simulation, ’cause you’re hallucinating
Stuart_Armstrong · 21 Feb 2023 12:12 UTC · 25 points · 6 comments · 1 min read
Large language models can provide “normative assumptions” for learning human preferences
Stuart_Armstrong · 2 Jan 2023 19:39 UTC · 29 points · 12 comments · 3 min read
Concept extrapolation for hypothesis generation
Stuart_Armstrong, patrickleask and rgorman · 12 Dec 2022 22:09 UTC · 20 points · 2 comments · 3 min read
Using GPT-Eliezer against ChatGPT Jailbreaking
Stuart_Armstrong and rgorman · 6 Dec 2022 19:54 UTC · 170 points · 85 comments · 9 min read
Benchmark for successful concept extrapolation/avoiding goal misgeneralization
Stuart_Armstrong · 4 Jul 2022 20:48 UTC · 82 points · 12 comments · 4 min read
Value extrapolation vs Wireheading
Stuart_Armstrong · 17 Jun 2022 15:02 UTC · 16 points · 1 comment · 1 min read
Georgism, in theory
Stuart_Armstrong · 15 Jun 2022 15:20 UTC · 40 points · 22 comments · 4 min read
How to get into AI safety research
Stuart_Armstrong · 18 May 2022 18:05 UTC · 44 points · 7 comments · 1 min read
GPT-3 and concept extrapolation
Stuart_Armstrong · 20 Apr 2022 10:39 UTC · 19 points · 27 comments · 1 min read
Concept extrapolation: key posts
Stuart_Armstrong · 19 Apr 2022 10:01 UTC · 13 points · 2 comments · 1 min read
AIs should learn human preferences, not biases
Stuart_Armstrong · 8 Apr 2022 13:45 UTC · 10 points · 0 comments · 1 min read
Different perspectives on concept extrapolation
Stuart_Armstrong · 8 Apr 2022 10:42 UTC · 48 points · 8 comments · 5 min read · 1 review
Value extrapolation, concept extrapolation, model splintering
Stuart_Armstrong · 8 Mar 2022 22:50 UTC · 16 points · 1 comment · 2 min read
[Link] Aligned AI AMA
Stuart_Armstrong · 1 Mar 2022 12:01 UTC · 18 points · 0 comments · 1 min read
More GPT-3 and symbol grounding
Stuart_Armstrong · 23 Feb 2022 18:30 UTC · 21 points · 7 comments · 3 min read