papetoast

Karma: 999

I have a bachelor’s in CS. Looking for a job!

Find me anywhere at linktr.ee/papetoast

Reinforcement learning towards broadly and persistently beneficial models

papetoast, 18 Jun 2026 22:11 UTC
19 points
0 comments, 1 min read
(alignment.openai.com)

Can public chat data predict real-world AI misalignments?

papetoast, 17 Jun 2026 3:53 UTC
7 points
0 comments, 1 min read
(alignment.openai.com)

Links #3: 2026/06 Part 1

papetoast, 15 Jun 2026 12:53 UTC
9 points
0 comments, 27 min read

Links #2: 2026/05 Part 2

papetoast, 31 May 2026 13:41 UTC
8 points
0 comments, 20 min read

Links #1: 2026/05 Part 1

papetoast, 18 May 2026 5:04 UTC
10 points
0 comments, 18 min read

Investigating the consequences of accidentally grading CoT during RL

papetoast, 8 May 2026 6:17 UTC
24 points
0 comments, 1 min read
(alignment.openai.com)

Auto-review of agent actions without synchronous human oversight

papetoast, 4 May 2026 2:12 UTC
6 points
0 comments, 1 min read
(alignment.openai.com)

papetoast’s Shortforms

papetoast, 20 Jan 2023 1:56 UTC
1 point
157 comments, 1 min read