papetoast

Karma: 999

I have a bachelor’s in CS. Looking for a job!

Find me anywhere at linktr.ee/papetoast

Reinforcement learning towards broadly and persistently beneficial models

papetoast · 18 Jun 2026 22:11 UTC

19 points

0 comments · 1 min read · LW link

(alignment.openai.com)

Can public chat data predict real-world AI misalignments?

papetoast · 17 Jun 2026 3:53 UTC

7 points

0 comments · 1 min read · LW link

(alignment.openai.com)

Links #3: 2026/06 Part 1

papetoast · 15 Jun 2026 12:53 UTC

9 points

0 comments · 27 min read · LW link

Links #2: 2026/05 Part 2

papetoast · 31 May 2026 13:41 UTC

8 points

0 comments · 20 min read · LW link

Links #1: 2026/05 Part 1

papetoast · 18 May 2026 5:04 UTC

10 points

0 comments · 18 min read · LW link

Investigating the consequences of accidentally grading CoT during RL

papetoast · 8 May 2026 6:17 UTC

24 points

0 comments · 1 min read · LW link

(alignment.openai.com)

Auto-review of agent actions without synchronous human oversight

papetoast · 4 May 2026 2:12 UTC

6 points

0 comments · 1 min read · LW link

(alignment.openai.com)

papetoast’s Shortforms

papetoast · 20 Jan 2023 1:56 UTC

1 point

157 comments · 1 min read · LW link