AXRP Episode 38.1 - Alan Chan on Agent Infrastructure

DanielFilan 16 Nov 2024 23:30 UTC

12 points

0 comments 14 min read LW link

Cross-context abduction: LLMs make inferences about procedural training data leveraging declarative facts in earlier training data

Sohaib Imran 16 Nov 2024 23:22 UTC

36 points

11 comments 14 min read LW link

Why We Wouldn’t Build Aligned AI Even If We Could

Snowyiu 16 Nov 2024 20:19 UTC

10 points

7 comments 10 min read LW link

Which evals resources would be good?

Marius Hobbhahn 16 Nov 2024 14:24 UTC

51 points

4 comments 5 min read LW link

Private Capabilities, Public Alignment: De-escalating Without Disadvantage

wassname 16 Nov 2024 7:26 UTC

6 points

0 comments 5 min read LW link

OpenAI Email Archives (from Musk v. Altman and OpenAI blog)

habryka 16 Nov 2024 6:38 UTC

548 points

82 comments 51 min read LW link

Using Dangerous AI, But Safely?

habryka 16 Nov 2024 4:29 UTC

17 points

2 comments 43 min read LW link

Ayn Rand’s model of “living money”; and an upside of burnout

AnnaSalamon 16 Nov 2024 2:59 UTC

246 points

64 comments 5 min read LW link 2 reviews

Fundamental Uncertainty: Epilogue

Gordon Seidoh Worley 16 Nov 2024 0:57 UTC

10 points

0 comments 1 min read LW link

Making a conservative case for alignment

Cameron Berg, Kvee, phgubbins and Trent Hodgeson

15 Nov 2024 18:55 UTC

208 points

67 comments 7 min read LW link

The Case For Giving To The Shrimp Welfare Project

Bentham's Bulldog 15 Nov 2024 16:03 UTC

3 points

14 comments 7 min read LW link

Win/continue/lose scenarios and execute/replace/audit protocols

Buck 15 Nov 2024 15:47 UTC

64 points

3 comments 7 min read LW link 1 review

Antonym Heads Predict Semantic Opposites in Language Models

Jake Ward 15 Nov 2024 15:32 UTC

3 points

0 comments 5 min read LW link

Proposing the Conditional AI Safety Treaty (linkpost TIME)

otto.barten 15 Nov 2024 13:59 UTC

11 points

9 comments 3 min read LW link

(time.com)

A Theory of Equilibrium in the Offense-Defense Balance

Maxwell Tabarrok 15 Nov 2024 13:51 UTC

25 points

6 comments 2 min read LW link

(www.maximum-progress.com)

Boston Secular Solstice 2024: Call for Singers and Musicians

jefftk 15 Nov 2024 13:50 UTC

22 points

0 comments 1 min read LW link

(www.jefftk.com)

An Uncanny Moat

Adam Newgas 15 Nov 2024 11:39 UTC

14 points

0 comments 4 min read LW link

(www.boristhebrave.com)

If I care about measure, choices have additional burden (+AI generated LW-comments)

avturchin 15 Nov 2024 10:27 UTC

5 points

11 comments 2 min read LW link

What are Emotions?

Myles H 15 Nov 2024 4:20 UTC

5 points

13 comments 8 min read LW link

The Third Fundamental Question

Screwtape 15 Nov 2024 4:01 UTC

88 points

17 comments 6 min read LW link 1 review

Dance Differentiation

jefftk 15 Nov 2024 2:30 UTC

14 points

0 comments 1 min read LW link

(www.jefftk.com)

Breaking beliefs about saving the world

Oxidize 15 Nov 2024 0:46 UTC

−1 points

3 comments 9 min read LW link

College technical AI safety hackathon retrospective—Georgia Tech

yix 15 Nov 2024 0:22 UTC

44 points

2 comments 5 min read LW link

(open.substack.com)

Gwern Branwen interview on Dwarkesh Patel’s podcast: “How an Anonymous Researcher Predicted AI’s Trajectory”

Said Achmiz 14 Nov 2024 23:53 UTC

91 points

0 comments 1 min read LW link

(www.dwarkeshpatel.com)

Internal music player: phenomenology of earworms

dkl9 14 Nov 2024 23:29 UTC

6 points

4 comments 2 min read LW link

(dkl9.net)

The Foraging (Ex-)Bandit [Ruleset & Reflections]

abstractapplic 14 Nov 2024 20:16 UTC

27 points

3 comments 2 min read LW link

Seven lessons I didn’t learn from election day

Eric Neyman 14 Nov 2024 18:39 UTC

99 points

33 comments 13 min read LW link

(ericneyman.wordpress.com)

Effects of Non-Uniform Sparsity on Superposition in Toy Models

Shreyans Jain 14 Nov 2024 16:59 UTC

4 points

3 comments 6 min read LW link

AI #90: The Wall

Zvi 14 Nov 2024 14:10 UTC

32 points

8 comments 42 min read LW link

(thezvi.wordpress.com)

Evolutionary prompt optimization for SAE feature visualization

neverix, Daniel Tan, Dmitrii Kharlapenko, Neel Nanda and Arthur Conmy

14 Nov 2024 13:06 UTC

28 points

0 comments 9 min read LW link

AXRP Episode 38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems

DanielFilan 14 Nov 2024 7:00 UTC

14 points

0 comments 12 min read LW link

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI

Tamay 14 Nov 2024 6:13 UTC

33 points

0 comments 3 min read LW link

(epoch.ai)

Concrete Methods for Heuristic Estimation on Neural Networks

Oliver Daniels 14 Nov 2024 5:07 UTC

35 points

0 comments 27 min read LW link

Heresies in the Shadow of the Sequences

Cole Wyeth 14 Nov 2024 5:01 UTC

19 points

12 comments 2 min read LW link

Thoughts after the Wolfram and Yudkowsky discussion

Tahp 14 Nov 2024 1:43 UTC

25 points

13 comments 6 min read LW link

Neutrality

sarahconstantin 13 Nov 2024 23:10 UTC

162 points

29 comments 11 min read LW link 2 reviews

(sarahconstantin.substack.com)

Anvil Shortage

Screwtape 13 Nov 2024 22:57 UTC

133 points

19 comments 4 min read LW link 3 reviews

[Question] Using hex to get murder advice from GPT-4o

Laurence Freeman 13 Nov 2024 18:30 UTC

10 points

5 comments 6 min read LW link

Confronting the legion of doom.

Spiritus Dei 13 Nov 2024 17:03 UTC

−20 points

3 comments 5 min read LW link

Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever’s Recent Claims

garrison 13 Nov 2024 17:00 UTC

84 points

14 comments 8 min read LW link

(garrisonlovely.substack.com)

MIT FutureTech are hiring a Product and Data Visualization Designer

peterslattery 13 Nov 2024 14:48 UTC

2 points

0 comments 4 min read LW link

Sparks of Consciousness

Charlie Sanders 13 Nov 2024 4:58 UTC

2 points

0 comments 3 min read LW link

(www.dailymicrofiction.com)

Contra Musician Gender II

jefftk 13 Nov 2024 3:30 UTC

9 points

0 comments 1 min read LW link

(www.jefftk.com)

Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy

Joe Rogero 12 Nov 2024 23:55 UTC

34 points

17 comments 4 min read LW link

Incentive design and capability elicitation

Joe Carlsmith 12 Nov 2024 20:56 UTC

31 points

0 comments 12 min read LW link

The Humanitarian Economy

Kyle Furlong 12 Nov 2024 18:25 UTC

1 point

14 comments 6 min read LW link

Current Attitudes Toward AI Provide Little Data Relevant to Attitudes Toward AGI

Seth Herd 12 Nov 2024 18:23 UTC

19 points

2 comments 4 min read LW link

Basics of Handling Disagreements with People

Camille B. 12 Nov 2024 17:55 UTC

35 points

4 comments 6 min read LW link

Registrations Open for 2024 NYC Secular Solstice & Megameetup

Joe Rogero and Screwtape

12 Nov 2024 17:50 UTC

13 points

0 comments 1 min read LW link

2024 NYC Secular Solstice & Megameetup

Joe Rogero and Screwtape

12 Nov 2024 17:46 UTC

18 points

0 comments 1 min read LW link