FTPickle
Karma: 108
I think I’m just confused. Once a model exists, how do you “red-team” it to see whether it’s safe. Isn’t it already dangerous?
FTPickle, 18 Nov 2023 14:16 UTC, 21 points, 14 comments, 1 min read
[Question] Beginner’s question about RLHF
FTPickle, 8 Aug 2023 15:48 UTC, 1 point, 4 comments, 1 min read
Random Observation on AI goals
FTPickle, 8 Apr 2023 19:28 UTC, −11 points, 2 comments, 1 min read
The alien simulation meme doesn’t make sense
FTPickle, 24 Feb 2023 19:27 UTC, 4 points, 1 comment, 1 min read
I believe some AI doomers are overconfident
FTPickle, 20 Dec 2022 17:09 UTC, 8 points, 15 comments, 2 min read