jordine

Karma: 230

Here’s 18 Applications of Deception Probes

28 Aug 2025 18:59 UTC
36 points
0 comments, 22 min read

Can SAE steering reveal sandbagging?

15 Apr 2025 12:33 UTC
35 points
3 comments, 4 min read

Hanoi – ACX Meetups Everywhere Spring 2025

jordine, 25 Mar 2025 23:50 UTC
3 points
0 comments, 1 min read

Shallow review of technical AI safety, 2024

29 Dec 2024 12:01 UTC
195 points
35 comments, 41 min read

Hanoi Vietnam – ACX Meetups Everywhere Fall 2024

jordine, 29 Aug 2024 18:35 UTC
1 point
0 comments, 1 min read

Results from the AI x Democracy Research Sprint

14 Jun 2024 16:40 UTC
13 points
0 comments, 6 min read

Hanoi – ACX Meetups Everywhere Spring 2024

jordine, 30 Mar 2024 23:38 UTC
1 point
0 comments, 1 min read