jordine

Karma: 234

Here’s 18 Applications of Deception Probes

Aug 28, 2025, 6:59 PM
38 points

18 votes

0 comments · 22 min read · LW link

Can SAE steering reveal sandbagging?

Apr 15, 2025, 12:33 PM
35 points

11 votes

3 comments · 4 min read · LW link

Hanoi – ACX Meetups Everywhere Spring 2025

jordine · Mar 25, 2025, 11:50 PM
3 points

1 vote

0 comments · 1 min read · LW link

Shallow review of technical AI safety, 2024

Dec 29, 2024, 12:01 PM
195 points

75 votes

35 comments · 41 min read · LW link

Hanoi Vietnam – ACX Meetups Everywhere Fall 2024

jordine · Aug 29, 2024, 6:35 PM
1 point

1 vote

0 comments · 1 min read · LW link

Results from the AI x Democracy Research Sprint

Jun 14, 2024, 4:40 PM
13 points

7 votes

0 comments · 6 min read · LW link

Hanoi – ACX Meetups Everywhere Spring 2024

jordine · Mar 30, 2024, 11:38 PM
1 point

1 vote

0 comments · 1 min read · LW link