Schelling game evaluations for AI control

Olli Järviniemi

Oct 8, 2024, 12:01 PM

71 points

5 comments 11 min read LW link

If far-UV is so great, why isn’t it everywhere?

Austin Chen

Oct 19, 2024, 6:56 PM

70 points

23 comments LW link

(strainhardening.substack.com)

EIS XIV: Is mechanistic interpretability about to be practically useful?

scasper

Oct 11, 2024, 10:13 PM

68 points

4 comments 7 min read LW link

On Shifgrethor

JustisMills

Oct 27, 2024, 3:30 PM

67 points

18 comments 2 min read LW link

(justismills.substack.com)

An Opinionated Evals Reading List

Marius Hobbhahn and Jérémy Scheurer

Oct 15, 2024, 2:38 PM

65 points

0 comments 13 min read LW link

(www.apolloresearch.ai)

Occupational Licensing Roundup #1

Zvi

Oct 30, 2024, 11:00 AM

65 points

11 comments 11 min read LW link

(thezvi.wordpress.com)

AI research assistants competition 2024Q3: Tie between Elicit and You.com

Elizabeth

Oct 12, 2024, 3:10 PM

64 points

4 comments 3 min read LW link

(acesounderglass.com)

[Intuitive self-models] 6. Awakening / Enlightenment / PNSE

Steven Byrnes

Oct 22, 2024, 1:23 PM

64 points

8 comments 21 min read LW link

Electrostatic Airships?

DaemonicSigil

Oct 27, 2024, 4:32 AM

64 points

13 comments 3 min read LW link

(pbement.com)

Slightly More Than You Wanted To Know: Pregnancy Length Effects

JustisMills

Oct 21, 2024, 1:26 AM

63 points

4 comments 5 min read LW link

(justismills.substack.com)

Dario Amodei — Machines of Loving Grace

Matrice Jacobine

Oct 11, 2024, 9:43 PM

63 points

26 comments 1 min read LW link

(darioamodei.com)

Linkpost: Memorandum on Advancing the United States’ Leadership in Artificial Intelligence

Nisan

Oct 25, 2024, 4:37 AM

60 points

2 comments 1 min read LW link

(www.whitehouse.gov)

Against empathy-by-default

Steven Byrnes

Oct 16, 2024, 4:38 PM

60 points

24 comments 7 min read LW link

AI Alignment via Slow Substrates: Early Empirical Results With StarCraft II

Lester Leong

Oct 14, 2024, 4:05 AM

60 points

9 comments 12 min read LW link

How much I’m paying for AI productivity software (and the future of AI use)

jacquesthibs

Oct 11, 2024, 5:11 PM

59 points

18 comments 8 min read LW link

(jacquesthibodeau.com)

[Intuitive self-models] 5. Dissociative Identity (Multiple Personality) Disorder

Steven Byrnes

Oct 15, 2024, 1:31 PM

59 points

7 comments 11 min read LW link

AI #86: Just Think of the Potential

Zvi

Oct 17, 2024, 3:10 PM

58 points

8 comments 57 min read LW link

(thezvi.wordpress.com)

The Alignment Trap: AI Safety as Path to Power

crispweed

Oct 29, 2024, 3:21 PM

57 points

17 comments 5 min read LW link

(upcoder.com)

AI #87: Staying in Character

Zvi

Oct 29, 2024, 7:10 AM

57 points

3 comments 33 min read LW link

(thezvi.wordpress.com)

AI #84: Better Than a Podcast

Zvi

Oct 3, 2024, 3:00 PM

56 points

7 comments 52 min read LW link

(thezvi.wordpress.com)

Safe Predictive Agents with Joint Scoring Rules

Rubi J. Hudson

Oct 9, 2024, 4:38 PM

55 points

10 comments 17 min read LW link

How Likely Are Various Precursors of Existential Risk?

NunoSempere

Oct 28, 2024, 1:27 PM

55 points

4 comments 15 min read LW link

(blog.sentinel-team.org)

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)

Joe Carlsmith

Oct 28, 2024, 9:57 PM

54 points

5 comments 32 min read LW link

A path to human autonomy

Nathan Helm-Burger

Oct 29, 2024, 3:02 AM

53 points

16 comments 20 min read LW link

Can AI Outpredict Humans? Results From Metaculus’s Q3 AI Forecasting Benchmark

ChristianWilliams

Oct 10, 2024, 6:58 PM

53 points

2 comments LW link

(www.metaculus.com)

cancer rates after gene therapy

bhauth

Oct 16, 2024, 3:32 PM

53 points

2 comments 3 min read LW link

(bhauth.com)

The Mysterious Trump Buyers on Polymarket

Annapurna

Oct 18, 2024, 1:26 PM

52 points

10 comments 2 min read LW link

(jorgevelez.substack.com)

Parental Writing Selection Bias

jefftk

Oct 13, 2024, 2:00 PM

52 points

3 comments 1 min read LW link

(www.jefftk.com)

Prices are Bounties

Maxwell Tabarrok

Oct 12, 2024, 2:51 PM

51 points

13 comments 2 min read LW link

(www.maximum-progress.com)

[Intuitive self-models] 7. Hearing Voices, and Other Hallucinations

Steven Byrnes

Oct 29, 2024, 1:36 PM

51 points

2 comments 16 min read LW link

Claude Sonnet 3.5.1 and Haiku 3.5

Zvi

Oct 24, 2024, 2:50 PM

51 points

9 comments 16 min read LW link

(thezvi.wordpress.com)

[Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF

Leon Lang

Oct 22, 2024, 1:57 PM

51 points

2 comments 18 min read LW link

(arxiv.org)

Low Probability Estimation in Language Models

Gabriel Wu

Oct 18, 2024, 3:50 PM

50 points

0 comments 10 min read LW link

(www.alignment.org)

Toy Models of Feature Absorption in SAEs

chanind, hrdkbhatnagar, TomasD and Joseph Bloom

Oct 7, 2024, 9:56 AM

49 points

8 comments 10 min read LW link

Open Source Replication of Anthropic’s Crosscoder paper for model-diffing

Connor Kissane, robertzk, Arthur Conmy and Neel Nanda

Oct 27, 2024, 6:46 PM

48 points

4 comments 5 min read LW link

Demis Hassabis and Geoffrey Hinton Awarded Nobel Prizes

Anna Gajdova

Oct 9, 2024, 12:56 PM

48 points

14 comments 1 min read LW link

Evaluating the truth of statements in a world of ambiguous language.

Hastings

Oct 7, 2024, 6:08 PM

48 points

19 comments 2 min read LW link

D&D.Sci Coliseum: Arena of Data Evaluation and Ruleset

aphyer

Oct 29, 2024, 1:21 AM

47 points

13 comments 6 min read LW link

~80 Interesting Questions about Foundation Model Agent Safety

RohanS and Govind Pimpale

Oct 28, 2024, 4:37 PM

46 points

4 comments 15 min read LW link

Minimal Motivation of Natural Latents

johnswentworth and David Lorell

Oct 14, 2024, 10:51 PM

46 points

14 comments 3 min read LW link

AI as a powerful meme, via CGP Grey

TheManxLoiner

Oct 30, 2024, 6:31 PM

46 points

8 comments 4 min read LW link

Anthropic rewrote its RSP

Zach Stein-Perlman

Oct 15, 2024, 2:25 PM

46 points

19 comments 6 min read LW link

Motivation control

Joe Carlsmith

Oct 30, 2024, 5:15 PM

45 points

7 comments 52 min read LW link

Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence

EuanMcLean

Oct 29, 2024, 12:16 PM

45 points

9 comments 26 min read LW link

5 ways to improve CoT faithfulness

Caleb Biddulph

Oct 5, 2024, 8:17 PM

44 points

40 comments 6 min read LW link

Open Thread Fall 2024

habryka

Oct 5, 2024, 10:28 PM

44 points

193 comments 1 min read LW link

Start an Upper-Room UV Installation Company?

jefftk

Oct 19, 2024, 2:00 AM

44 points

9 comments 1 min read LW link

(www.jefftk.com)

MATS AI Safety Strategy Curriculum v2

DanielFilan and Ryan Kidd

Oct 7, 2024, 10:44 PM

43 points

6 comments 13 min read LW link

Startup Success Rates Are So Low Because the Rewards Are So Large

AppliedDivinityStudies

Oct 10, 2024, 8:22 PM

42 points

6 comments 2 min read LW link

IAPS: Mapping Technical Safety Research at AI Companies

Zach Stein-Perlman

Oct 24, 2024, 8:30 PM

42 points

13 comments LW link

(www.iaps.ai)
