
Steven Byrnes

Karma: 25,989

I’m an AGI safety / AI alignment researcher in Boston with a particular focus on brain algorithms. Research Fellow at Astera. See https://sjbyrnes.com/agi.html for a summary of my research and sorted list of writing. Physicist by training. Email: steven.byrnes@gmail.com. Leave me anonymous feedback here. I’m also at: RSS feed, X/Twitter, Bluesky, Substack, LinkedIn, and more at my website.

My AGI safety research—2025 review, ’26 plans

Steven Byrnes · 11 Dec 2025 17:05 UTC
127 points
4 comments · 12 min read · LW link

Reward Function Design: a starter pack

Steven Byrnes · 8 Dec 2025 19:15 UTC
79 points
7 comments · 16 min read · LW link

We need a field of Reward Function Design

Steven Byrnes · 8 Dec 2025 19:15 UTC
107 points
8 comments · 5 min read · LW link

6 reasons why “alignment-is-hard” discourse seems alien to human intuitions, and vice-versa

Steven Byrnes · 3 Dec 2025 18:37 UTC
287 points
54 comments · 17 min read · LW link

Social drives 2: “Approval Reward”, from norm-enforcement to status-seeking

Steven Byrnes · 12 Nov 2025 20:40 UTC
39 points
6 comments · 17 min read · LW link

Social drives 1: “Sympathy Reward”, from compassion to dehumanization

Steven Byrnes · 10 Nov 2025 14:53 UTC
36 points
3 comments · 13 min read · LW link

Excerpts from my neuroscience to-do list

Steven Byrnes · 6 Oct 2025 21:05 UTC
28 points
2 comments · 4 min read · LW link

Optical rectennas are not a promising clean energy technology

Steven Byrnes · 11 Sep 2025 23:08 UTC
90 points
2 comments · 6 min read · LW link

Neuroscience of human sexual attraction triggers (3 hypotheses)

Steven Byrnes · 25 Aug 2025 17:51 UTC
58 points
6 comments · 12 min read · LW link

Four ways learning Econ makes people dumber re: future AI

Steven Byrnes · 21 Aug 2025 17:52 UTC
357 points
49 comments · 6 min read · LW link
(x.com)

[Question] Inscrutability was always inevitable, right?

Steven Byrnes · 6 Aug 2025 21:57 UTC
99 points
33 comments · 2 min read · LW link

Perils of under- vs over-sculpting AGI desires

Steven Byrnes · 5 Aug 2025 18:13 UTC
58 points
13 comments · 23 min read · LW link

Interview with Steven Byrnes on Brain-like AGI, Foom & Doom, and Solving Technical Alignment

5 Aug 2025 0:05 UTC
52 points
1 comment · 89 min read · LW link
(lironshapira.substack.com)

Teaching kids to swim

Steven Byrnes · 29 Jul 2025 3:10 UTC
55 points
12 comments · 3 min read · LW link

“Behaviorist” RL reward functions lead to scheming

Steven Byrnes · 23 Jul 2025 16:55 UTC
56 points
6 comments · 12 min read · LW link

Foom & Doom 2: Technical alignment is hard

Steven Byrnes · 23 Jun 2025 17:19 UTC
161 points
65 comments · 28 min read · LW link

Foom & Doom 1: “Brain in a box in a basement”

Steven Byrnes · 23 Jun 2025 17:18 UTC
282 points
120 comments · 29 min read · LW link

Reward button alignment

Steven Byrnes · 22 May 2025 17:36 UTC
52 points
15 comments · 12 min read · LW link

Re SMTM: negative feedback on negative feedback

Steven Byrnes · 14 May 2025 19:50 UTC
56 points
1 comment · 22 min read · LW link

Video & transcript: Challenges for Safe & Beneficial Brain-Like AGI

Steven Byrnes · 8 May 2025 21:11 UTC
26 points
0 comments · 18 min read · LW link