Proactive ‘If-Then’ Safety Cases

Nathan Helm-Burger · 18 Nov 2024 21:16 UTC
10 points · 0 comments · 4 min read · LW link

[Question] Will Orion/Gemini 2/Llama-4 outperform o1

LuigiPagani · 18 Nov 2024 21:15 UTC
2 points · 3 comments · 1 min read · LW link

How to use bright light to improve your life.

Nat Martin · 18 Nov 2024 19:32 UTC
40 points · 10 comments · 10 min read · LW link

How likely is brain preservation to work?

Andy_McKenzie · 18 Nov 2024 16:58 UTC
26 points · 3 comments · 6 min read · LW link

Why imperfect adversarial robustness doesn’t doom AI control

18 Nov 2024 16:05 UTC
62 points · 25 comments · 2 min read · LW link

Ethical Implications of the Quantum Multiverse

Jonah Wilberg · 18 Nov 2024 16:00 UTC
7 points · 22 comments · 6 min read · LW link

Reducing x-risk might be actively harmful

MountainPath · 18 Nov 2024 14:25 UTC
5 points · 5 comments · 1 min read · LW link

Monthly Roundup #24: November 2024

Zvi · 18 Nov 2024 13:20 UTC
44 points · 14 comments · 50 min read · LW link
(thezvi.wordpress.com)

A Straightforward Explanation of the Good Regulator Theorem

Alfred Harwood · 18 Nov 2024 12:45 UTC
82 points · 29 comments · 14 min read · LW link

The Choice Transition

18 Nov 2024 12:30 UTC
54 points · 4 comments · 15 min read · LW link
(strangecities.substack.com)

Chat Bankman-Fried: an Exploration of LLM Alignment in Finance

claudia.biancotti · 18 Nov 2024 9:38 UTC
26 points · 4 comments · 1 min read · LW link

Proposal to increase fertility: University parent clubs

Fluffnutt · 18 Nov 2024 4:21 UTC
17 points · 3 comments · 1 min read · LW link

A small improvement to Wikipedia page on Pareto Efficiency

Edwin Evans · 18 Nov 2024 2:13 UTC
8 points · 0 comments · 1 min read · LW link

[Question] Why is Gemini telling the user to die?

Burny · 18 Nov 2024 1:44 UTC
13 points · 1 comment · 1 min read · LW link

“It’s a 10% chance which I did 10 times, so it should be 100%”

egor.timatkov · 18 Nov 2024 1:14 UTC
159 points · 59 comments · 2 min read · LW link

The Catastrophe of Shiny Objects

mindprison · 18 Nov 2024 0:24 UTC
−11 points · 0 comments · 3 min read · LW link

Do Deep Neural Networks Have Brain-like Representations?: A Summary of Disagreements

Joseph Emerson · 18 Nov 2024 0:07 UTC
9 points · 0 comments · 26 min read · LW link

Truth Terminal: A reconstruction of events

17 Nov 2024 23:51 UTC
5 points · 1 comment · 7 min read · LW link

Which AI Safety Benchmark Do We Need Most in 2025?

17 Nov 2024 23:50 UTC
2 points · 2 comments · 8 min read · LW link

“The Solomonoff Prior is Malign” is a special case of a simpler argument

David Matolcsi · 17 Nov 2024 21:32 UTC
131 points · 46 comments · 12 min read · LW link

Chess As The Model Game

criticalpoints · 17 Nov 2024 19:45 UTC
19 points · 0 comments · 8 min read · LW link
(eregis.github.io)

The grass is always greener in the environment that shaped your values

Karl Faulks · 17 Nov 2024 18:00 UTC
8 points · 0 comments · 3 min read · LW link

Announcing turntrout.com, my new digital home

TurnTrout · 17 Nov 2024 17:42 UTC
108 points · 33 comments · 1 min read · LW link
(turntrout.com)

Secular Solstice Songbook Update

jefftk · 17 Nov 2024 17:30 UTC
14 points · 2 comments · 1 min read · LW link
(www.jefftk.com)

Germany-wide ACX Meetup

Fernand0 · 17 Nov 2024 10:08 UTC
4 points · 0 comments · 1 min read · LW link

Project Adequate: Seeking Cofounders/Funders

Lorec · 17 Nov 2024 3:12 UTC
5 points · 7 comments · 8 min read · LW link

Trying Bluesky

jefftk · 17 Nov 2024 2:50 UTC
26 points · 16 comments · 1 min read · LW link
(www.jefftk.com)

AXRP Episode 38.1 - Alan Chan on Agent Infrastructure

DanielFilan · 16 Nov 2024 23:30 UTC
12 points · 0 comments · 14 min read · LW link

Cross-context abduction: LLMs make inferences about procedural training data leveraging declarative facts in earlier training data

Sohaib Imran · 16 Nov 2024 23:22 UTC
36 points · 11 comments · 14 min read · LW link

Why We Wouldn’t Build Aligned AI Even If We Could

Snowyiu · 16 Nov 2024 20:19 UTC
10 points · 7 comments · 10 min read · LW link

Which evals resources would be good?

Marius Hobbhahn · 16 Nov 2024 14:24 UTC
51 points · 4 comments · 5 min read · LW link

OpenAI Email Archives (from Musk v. Altman and OpenAI blog)

habryka · 16 Nov 2024 6:38 UTC
533 points · 81 comments · 51 min read · LW link

Using Dangerous AI, But Safely?

habryka · 16 Nov 2024 4:29 UTC
17 points · 2 comments · 43 min read · LW link

Ayn Rand’s model of “living money”; and an upside of burnout

AnnaSalamon · 16 Nov 2024 2:59 UTC
237 points · 59 comments · 5 min read · LW link

Fundamental Uncertainty: Epilogue

Gordon Seidoh Worley · 16 Nov 2024 0:57 UTC
10 points · 0 comments · 1 min read · LW link

Making a conservative case for alignment

15 Nov 2024 18:55 UTC
208 points · 67 comments · 7 min read · LW link

The Case For Giving To The Shrimp Welfare Project

Bentham's Bulldog · 15 Nov 2024 16:03 UTC
−4 points · 14 comments · 7 min read · LW link

Win/continue/lose scenarios and execute/replace/audit protocols

Buck · 15 Nov 2024 15:47 UTC
64 points · 2 comments · 7 min read · LW link

Antonym Heads Predict Semantic Opposites in Language Models

Jake Ward · 15 Nov 2024 15:32 UTC
3 points · 0 comments · 5 min read · LW link

Proposing the Conditional AI Safety Treaty (linkpost TIME)

otto.barten · 15 Nov 2024 13:59 UTC
11 points · 9 comments · 3 min read · LW link
(time.com)

A Theory of Equilibrium in the Offense-Defense Balance

Maxwell Tabarrok · 15 Nov 2024 13:51 UTC
25 points · 6 comments · 2 min read · LW link
(www.maximum-progress.com)

Boston Secular Solstice 2024: Call for Singers and Musicians

jefftk · 15 Nov 2024 13:50 UTC
22 points · 0 comments · 1 min read · LW link
(www.jefftk.com)

An Uncanny Moat

Adam Newgas · 15 Nov 2024 11:39 UTC
13 points · 0 comments · 4 min read · LW link
(www.boristhebrave.com)

If I care about measure, choices have additional burden (+AI generated LW-comments)

avturchin · 15 Nov 2024 10:27 UTC
5 points · 11 comments · 2 min read · LW link

What are Emotions?

Myles H · 15 Nov 2024 4:20 UTC
5 points · 13 comments · 8 min read · LW link

The Third Fundamental Question

Screwtape · 15 Nov 2024 4:01 UTC
66 points · 7 comments · 6 min read · LW link

Dance Differentiation

jefftk · 15 Nov 2024 2:30 UTC
14 points · 0 comments · 1 min read · LW link
(www.jefftk.com)

Breaking beliefs about saving the world

Oxidize · 15 Nov 2024 0:46 UTC
−1 points · 3 comments · 9 min read · LW link

College technical AI safety hackathon retrospective—Georgia Tech

yix · 15 Nov 2024 0:22 UTC
44 points · 2 comments · 5 min read · LW link
(open.substack.com)

Gwern Branwen interview on Dwarkesh Patel’s podcast: “How an Anonymous Researcher Predicted AI’s Trajectory”

Said Achmiz · 14 Nov 2024 23:53 UTC
87 points · 0 comments · 1 min read · LW link
(www.dwarkeshpatel.com)