Sodium

Karma: 351

Trying to get into alignment. Have a low bar for reaching out!

AI Can be “Gradient Aware” Without Doing Gradient hacking.

Sodium · 20 Oct 2024 21:02 UTC
20 points
0 comments · 2 min read · LW link

(Maybe) A Bag of Heuristics is All There Is & A Bag of Heuristics is All You Need

Sodium · 3 Oct 2024 19:11 UTC
34 points
17 comments · 16 min read · LW link

Mira Murati leaves OpenAI / OpenAI to remove non-profit control

Sodium · 25 Sep 2024 21:15 UTC
58 points
4 comments · 2 min read · LW link

Sodium’s Shortform

Sodium · 21 Sep 2024 4:45 UTC
3 points
8 comments · 1 min read · LW link

John Schulman leaves OpenAI for Anthropic

Sodium · 6 Aug 2024 1:23 UTC
57 points
0 comments · 1 min read · LW link

Four ways I’ve made bad decisions

Sodium · 14 Jul 2024 22:18 UTC
17 points
1 comment · 3 min read · LW link

(Non-deceptive) Suboptimality Alignment

Sodium · 18 Oct 2023 2:07 UTC
5 points
1 comment · 9 min read · LW link

Universal and Transferable Adversarial Attacks on Aligned Language Models [paper link]

Sodium · 29 Jul 2023 3:21 UTC
16 points
0 comments · 1 min read · LW link
(arxiv.org)

NYT: The Surprising Thing A.I. Engineers Will Tell You if You Let Them

Sodium · 17 Apr 2023 18:59 UTC
11 points
2 comments · 1 min read · LW link
(www.nytimes.com)