Peter S. Park
Karma: 132
How Do We Align an AGI Without Getting Socially Engineered? (Hint: Box It)
by Peter S. Park, NickyP, and Stephen Fowler
10 Aug 2022 18:14 UTC · 28 points · 30 comments · 11 min read · LW link
Can We Align a Self-Improving AGI?
by Peter S. Park
30 Aug 2022 0:14 UTC · 8 points · 5 comments · 11 min read · LW link
Why do we post our AI safety plans on the Internet?
by Peter S. Park
3 Nov 2022 16:02 UTC · 4 points · 4 comments · 11 min read · LW link
The limited upside of interpretability
by Peter S. Park
15 Nov 2022 18:46 UTC · 13 points · 11 comments · 1 min read · LW link
AI can exploit safety plans posted on the Internet
by Peter S. Park
4 Dec 2022 12:17 UTC · −15 points · 4 comments · 1 min read · LW link