Tuna

Karma: 34

Access to agent CoT makes monitors vulnerable to persuasion

25 Jul 2025 16:09 UTC
18 points
0 comments · 4 min read · LW link

Lessons from a year of university AI safety field building

6 Jun 2025 14:35 UTC
28 points
3 comments · 7 min read · LW link