Sodium

Karma: 351

Trying to get into alignment. Have a low bar for reaching out!


AI Can be “Gradient Aware” Without Doing Gradient hacking.

Sodium · 20 Oct 2024 21:02 UTC

20 points

0 comments · 2 min read · LW link

(Maybe) A Bag of Heuristics is All There Is & A Bag of Heuristics is All You Need

Sodium · 3 Oct 2024 19:11 UTC

34 points

17 comments · 16 min read · LW link

Mira Murati leaves OpenAI/ OpenAI to remove non-profit control

Sodium · 25 Sep 2024 21:15 UTC

58 points

4 comments · 2 min read · LW link

Sodium’s Shortform

Sodium · 21 Sep 2024 4:45 UTC

3 points

8 comments · 1 min read · LW link

John Schulman leaves OpenAI for Anthropic

Sodium · 6 Aug 2024 1:23 UTC

57 points

0 comments · 1 min read · LW link

Four ways I’ve made bad decisions

Sodium · 14 Jul 2024 22:18 UTC

17 points

1 comment · 3 min read · LW link

(Non-deceptive) Suboptimality Alignment

Sodium · 18 Oct 2023 2:07 UTC

5 points

1 comment · 9 min read · LW link

Universal and Transferable Adversarial Attacks on Aligned Language Models [paper link]

Sodium · 29 Jul 2023 3:21 UTC

16 points

0 comments · 1 min read · LW link

(arxiv.org)

NYT: The Surprising Thing A.I. Engineers Will Tell You if You Let Them

Sodium · 17 Apr 2023 18:59 UTC

11 points

2 comments · 1 min read · LW link

(www.nytimes.com)