Value/Utility: A History

Lorec 19 Nov 2024 23:01 UTC

9 points

0 comments 10 min read LW link

Why Don’t We Just… Shoggoth+Face+Paraphraser?

Daniel Kokotajlo and abramdemski

19 Nov 2024 20:53 UTC

175 points

59 comments 14 min read LW link 1 review

Every niche event should also be a meetup

DMMF 19 Nov 2024 20:47 UTC

18 points

0 comments 3 min read LW link

(danfrank.ca)

U.S.-China Economic and Security Review Commission pushes Manhattan Project-style AI initiative

worse 19 Nov 2024 18:42 UTC

56 points

7 comments 1 min read LW link

Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake

TurnTrout 19 Nov 2024 18:36 UTC

40 points

5 comments 1 min read LW link

(turntrout.com)

Evolution’s selection target depends on your weighting

tailcalled 19 Nov 2024 18:24 UTC

23 points

22 comments 1 min read LW link

AISN #44: The Trump Circle on AI Safety. Plus, Chinese researchers used Llama to create a military tool for the PLA, a Google AI system discovered a zero-day cybersecurity vulnerability, and Complex Systems

Corin Katzke, Julius, andrewz and Dan H

19 Nov 2024 16:36 UTC

9 points

0 comments 5 min read LW link

(newsletter.safe.ai)

Jakarta ACX December 2024 Meetup

Aud 19 Nov 2024 15:01 UTC

1 point

0 comments 1 min read LW link

Visualizing small Attention-only Transformers

Léo Dana 19 Nov 2024 9:37 UTC

4 points

0 comments 8 min read LW link

Stop blaming individuals for being obese—it’s a systemic problem

Declan Molony 19 Nov 2024 6:41 UTC

13 points

6 comments 7 min read LW link

Announcing the CLR Foundations Course and CLR S-Risk Seminars

JamesFaville 19 Nov 2024 1:18 UTC

18 points

0 comments 3 min read LW link

No Electricity in Manchuria

winstonBosan 19 Nov 2024 1:11 UTC

25 points

0 comments 5 min read LW link

Looking back on the Future of Humanity Institute—Asterisk

jakeeaton 19 Nov 2024 0:44 UTC

48 points

0 comments 1 min read LW link

Don’t Dismiss on Epistemics

ggex 19 Nov 2024 0:44 UTC

8 points

3 comments 2 min read LW link

Training AI agents to solve hard problems could lead to Scheming

Marius Hobbhahn and AlexMeinke

19 Nov 2024 0:10 UTC

73 points

12 comments 28 min read LW link

Proactive ‘If-Then’ Safety Cases

Nathan Helm-Burger 18 Nov 2024 21:16 UTC

10 points

0 comments 4 min read LW link

[Question] Will Orion/Gemini 2/Llama-4 outperform o1

LuigiPagani 18 Nov 2024 21:15 UTC

2 points

3 comments 1 min read LW link

How to use bright light to improve your life.

Nat Martin 18 Nov 2024 19:32 UTC

41 points

10 comments 10 min read LW link

How likely is brain preservation to work?

Andy_McKenzie 18 Nov 2024 16:58 UTC

26 points

3 comments 6 min read LW link

Why imperfect adversarial robustness doesn’t doom AI control

Buck and Claude+

18 Nov 2024 16:05 UTC

62 points

25 comments 2 min read LW link

Ethical Implications of the Quantum Multiverse

Jonah Wilberg 18 Nov 2024 16:00 UTC

7 points

22 comments 6 min read LW link

Reducing x-risk might be actively harmful

MountainPath 18 Nov 2024 14:25 UTC

5 points

5 comments 1 min read LW link

Monthly Roundup #24: November 2024

Zvi 18 Nov 2024 13:20 UTC

44 points

14 comments 50 min read LW link

(thezvi.wordpress.com)

A Straightforward Explanation of the Good Regulator Theorem

Alfred Harwood 18 Nov 2024 12:45 UTC

91 points

30 comments 14 min read LW link

The Choice Transition

owencb and Raymond Douglas

18 Nov 2024 12:30 UTC

54 points

6 comments 15 min read LW link 2 reviews

(strangecities.substack.com)

Chat Bankman-Fried: an Exploration of LLM Alignment in Finance

claudia.biancotti 18 Nov 2024 9:38 UTC

26 points

4 comments 1 min read LW link

Proposal to increase fertility: University parent clubs

Fluffnutt 18 Nov 2024 4:21 UTC

17 points

3 comments 1 min read LW link

A small improvement to Wikipedia page on Pareto Efficiency

Edwin Evans 18 Nov 2024 2:13 UTC

8 points

0 comments 1 min read LW link

[Question] Why is Gemini telling the user to die?

Burny 18 Nov 2024 1:44 UTC

13 points

1 comment 1 min read LW link

“It’s a 10% chance which I did 10 times, so it should be 100%”

egor.timatkov 18 Nov 2024 1:14 UTC

170 points

61 comments 2 min read LW link 1 review

The Catastrophe of Shiny Objects

mindprison 18 Nov 2024 0:24 UTC

−11 points

0 comments 3 min read LW link

Do Deep Neural Networks Have Brain-like Representations?: A Summary of Disagreements

Joseph Emerson 18 Nov 2024 0:07 UTC

9 points

0 comments 26 min read LW link

Truth Terminal: A reconstruction of events

crvr.fr and MTorrents

17 Nov 2024 23:51 UTC

6 points

1 comment 7 min read LW link

Which AI Safety Benchmark Do We Need Most in 2025?

Loïc Cabannes and William Ludington

17 Nov 2024 23:50 UTC

2 points

2 comments 8 min read LW link

“The Solomonoff Prior is Malign” is a special case of a simpler argument

David Matolcsi 17 Nov 2024 21:32 UTC

135 points

46 comments 12 min read LW link

Chess As The Model Game

criticalpoints 17 Nov 2024 19:45 UTC

19 points

0 comments 8 min read LW link

(eregis.github.io)

The grass is always greener in the environment that shaped your values

Karl Faulks 17 Nov 2024 18:00 UTC

8 points

0 comments 3 min read LW link

Announcing turntrout.com, my new digital home

TurnTrout 17 Nov 2024 17:42 UTC

108 points

33 comments 1 min read LW link

(turntrout.com)

Secular Solstice Songbook Update

jefftk 17 Nov 2024 17:30 UTC

14 points

2 comments 1 min read LW link

(www.jefftk.com)

Germany-wide ACX Meetup

Fernand0 17 Nov 2024 10:08 UTC

4 points

0 comments 1 min read LW link

Project Adequate: Seeking Cofounders/Funders

Lorec 17 Nov 2024 3:12 UTC

5 points

7 comments 8 min read LW link

Trying Bluesky

jefftk 17 Nov 2024 2:50 UTC

26 points

16 comments 1 min read LW link

(www.jefftk.com)

AXRP Episode 38.1 - Alan Chan on Agent Infrastructure

DanielFilan 16 Nov 2024 23:30 UTC

12 points

0 comments 14 min read LW link

Cross-context abduction: LLMs make inferences about procedural training data leveraging declarative facts in earlier training data

Sohaib Imran 16 Nov 2024 23:22 UTC

36 points

11 comments 14 min read LW link

Why We Wouldn’t Build Aligned AI Even If We Could

Snowyiu 16 Nov 2024 20:19 UTC

10 points

7 comments 10 min read LW link

Which evals resources would be good?

Marius Hobbhahn 16 Nov 2024 14:24 UTC

51 points

4 comments 5 min read LW link

Private Capabilities, Public Alignment: De-escalating Without Disadvantage

wassname 16 Nov 2024 7:26 UTC

6 points

0 comments 5 min read LW link

OpenAI Email Archives (from Musk v. Altman and OpenAI blog)

habryka 16 Nov 2024 6:38 UTC

548 points

82 comments 51 min read LW link

Using Dangerous AI, But Safely?

habryka 16 Nov 2024 4:29 UTC

17 points

2 comments 43 min read LW link

Ayn Rand’s model of “living money”; and an upside of burnout

AnnaSalamon 16 Nov 2024 2:59 UTC

246 points

64 comments 5 min read LW link 2 reviews