papetoast
Karma: 999
I have a bachelor’s in CS. Looking for a job!
find me anywhere in linktr.ee/papetoast
Reinforcement learning towards broadly and persistently beneficial models
papetoast · 18 Jun 2026 22:11 UTC · 19 points · 0 comments · 1 min read · LW link · (alignment.openai.com)

Can public chat data predict real-world AI misalignments?
papetoast · 17 Jun 2026 3:53 UTC · 7 points · 0 comments · 1 min read · LW link · (alignment.openai.com)

Links #3: 2026/06 Part 1
papetoast · 15 Jun 2026 12:53 UTC · 9 points · 0 comments · 27 min read · LW link

Links #2: 2026/05 Part 2
papetoast · 31 May 2026 13:41 UTC · 8 points · 0 comments · 20 min read · LW link

Links #1: 2026/05 Part 1
papetoast · 18 May 2026 5:04 UTC · 10 points · 0 comments · 18 min read · LW link

Investigating the consequences of accidentally grading CoT during RL
papetoast · 8 May 2026 6:17 UTC · 24 points · 0 comments · 1 min read · LW link · (alignment.openai.com)

Auto-review of agent actions without synchronous human oversight
papetoast · 4 May 2026 2:12 UTC · 6 points · 0 comments · 1 min read · LW link · (alignment.openai.com)

papetoast’s Shortforms
papetoast · 20 Jan 2023 1:56 UTC · 1 point · 157 comments · 1 min read · LW link