Simulators

janus

2 Sep 2022 12:45 UTC

718 points

170 comments, 41 min read, LW link, 8 reviews

(generative.ink)

The Redaction Machine

Ben

20 Sep 2022 22:03 UTC

545 points

48 comments, 27 min read, LW link, 1 review

Losing the root for the tree

Biff Wiff

20 Sep 2022 4:53 UTC

512 points

31 comments, 9 min read, LW link, 1 review

You Are Not Measuring What You Think You Are Measuring

johnswentworth

20 Sep 2022 20:04 UTC

444 points

45 comments, 8 min read, LW link, 2 reviews

Why I think strong general AI is coming soon

porby

28 Sep 2022 5:40 UTC

344 points

141 comments, 34 min read, LW link, 1 review

The shard theory of human values

Quintin Pope and TurnTrout

4 Sep 2022 4:28 UTC

266 points

67 comments, 24 min read, LW link, 2 reviews

How I buy things when Lightcone wants them fast

Bird Concept

26 Sep 2022 5:02 UTC

241 points

21 comments, 8 min read, LW link

Announcing Balsa Research

Zvi

25 Sep 2022 22:50 UTC

235 points

64 comments, 2 min read, LW link, 1 review

(thezvi.wordpress.com)

How my team at Lightcone sometimes gets stuff done

Bird Concept

19 Sep 2022 5:47 UTC

204 points

43 comments, 7 min read, LW link, 1 review

7 traps that (we think) new alignment researchers often fall into

Orpheus16 and Thomas Larsen

27 Sep 2022 23:13 UTC

180 points

10 comments, 4 min read, LW link

Threat-Resistant Bargaining Megapost: Introducing the ROSE Value

Diffractor

28 Sep 2022 1:20 UTC

177 points

21 comments, 53 min read, LW link, 2 reviews

The Onion Test for Personal and Institutional Honesty

chanamessinger and Andrew_Critch

27 Sep 2022 15:26 UTC

177 points

32 comments, 3 min read, LW link, 3 reviews

Most People Start With The Same Few Bad Ideas

johnswentworth

9 Sep 2022 0:29 UTC

177 points

31 comments, 3 min read, LW link

Do bamboos set themselves on fire?

Malmesbury

19 Sep 2022 15:34 UTC

173 points

14 comments, 6 min read, LW link, 1 review

Interpreting Neural Networks through the Polytope Lens

Sid Black, Lee Sharkey, Connor Leahy, beren, CRG, merizian, Eric Winsor and Dan Braun

23 Sep 2022 17:58 UTC

149 points

29 comments, 33 min read, LW link

Public-facing Censorship Is Safety Theater, Causing Reputational Damage

Yitz

23 Sep 2022 5:08 UTC

149 points

42 comments, 6 min read, LW link

AI coordination needs clear wins

evhub

1 Sep 2022 23:41 UTC

148 points

16 comments, 2 min read, LW link, 1 review

Orexin and the Quest for more Waking Hours

ChristianKl

24 Sep 2022 19:54 UTC

145 points

51 comments, 5 min read, LW link

Takeaways from our robust injury classifier project [Redwood Research]

dmz

17 Sep 2022 3:55 UTC

143 points

12 comments, 6 min read, LW link, 1 review

Understanding Infra-Bayesianism: A Beginner-Friendly Video Series

Jack Parker and Connall Garrod

22 Sep 2022 13:25 UTC

140 points

6 comments, 2 min read, LW link

Monitoring for deceptive alignment

evhub

8 Sep 2022 23:07 UTC

130 points

8 comments, 9 min read, LW link

Gene drives: why the wait?

Metacelsus

19 Sep 2022 23:37 UTC

125 points

50 comments, 3 min read, LW link

(denovo.substack.com)

LW Petrov Day 2022 (Monday, 9/26)

Ruby

22 Sep 2022 2:56 UTC

122 points

111 comments, 5 min read, LW link

An Update on Academia vs. Industry (one year into my faculty job)

David Scott Krueger

3 Sep 2022 20:43 UTC

122 points

18 comments, 4 min read, LW link

Quintin’s alignment papers roundup—week 1

Quintin Pope

10 Sep 2022 6:39 UTC

122 points

6 comments, 9 min read, LW link

Rejected Early Drafts of Newcomb’s Problem

zahmahkibo

6 Sep 2022 19:04 UTC

116 points

5 comments, 3 min read, LW link

Announcing $5,000 bounty for (responsibly) ending malaria

lc

24 Sep 2022 4:28 UTC

116 points

40 comments, 4 min read, LW link

Petrov Day Retrospective: 2022

Ruby

28 Sep 2022 22:16 UTC

109 points

41 comments, 4 min read, LW link

My emotional reaction to the current funding situation

Sam F. Brown

9 Sep 2022 22:02 UTC

108 points

36 comments, 5 min read, LW link

(sambrown.eu)

Understanding Conjecture: Notes from Connor Leahy interview

Orpheus16

15 Sep 2022 18:37 UTC

107 points

23 comments, 15 min read, LW link

Funding is All You Need: Getting into Grad School by Hacking the NSF GRFP Fellowship

hapanin

22 Sep 2022 21:39 UTC

106 points

9 comments, 13 min read, LW link

Ukraine Post #12

Zvi

22 Sep 2022 14:40 UTC

104 points

3 comments, 16 min read, LW link

(thezvi.wordpress.com)

Evaluations project @ ARC is hiring a researcher and a webdev/engineer

Beth Barnes

9 Sep 2022 22:46 UTC

99 points

7 comments, 10 min read, LW link

[Linkpost] A survey on over 300 works about interpretability in deep networks

scasper

12 Sep 2022 19:07 UTC

97 points

7 comments, 2 min read, LW link

(arxiv.org)

The ethics of reclining airplane seats

braces

4 Sep 2022 17:59 UTC

95 points

72 comments, 1 min read, LW link

Inverse Scaling Prize: Round 1 Winners

Ethan Perez and Ian McKenzie

26 Sep 2022 19:57 UTC

93 points

16 comments, 4 min read, LW link

(irmckenzie.co.uk)

Why we’re not founding a human-data-for-alignment org

L Rudolf L and Matt Putz

27 Sep 2022 20:14 UTC

88 points

6 comments, 29 min read, LW link

(forum.effectivealtruism.org)

Let’s Terraform West Texas

blackstampede

4 Sep 2022 16:24 UTC

88 points

33 comments, 5 min read, LW link

Linkpost: Github Copilot productivity experiment

Daniel Kokotajlo

8 Sep 2022 4:41 UTC

88 points

4 comments, 1 min read, LW link

(github.blog)

Nearcast-based “deployment problem” analysis

HoldenKarnofsky

21 Sep 2022 18:52 UTC

87 points

2 comments, 26 min read, LW link

Dath Ilan’s Views on Stopgap Corrigibility

David Udell

22 Sep 2022 16:16 UTC

87 points

19 comments, 13 min read, LW link

(www.glowfic.com)

Towards deconfusing wireheading and reward maximization

leogao

21 Sep 2022 0:36 UTC

81 points

7 comments, 4 min read, LW link

AI Safety and Neighboring Communities: A Quick-Start Guide, as of Summer 2022

Sam Bowman

1 Sep 2022 19:15 UTC

76 points

2 comments, 7 min read, LW link

Builder/Breaker for Deconfusion

abramdemski

29 Sep 2022 17:36 UTC

73 points

9 comments, 9 min read, LW link

Bugs or Features?

qbolec

3 Sep 2022 7:04 UTC

73 points

9 comments, 2 min read, LW link

Stop Discouraging Microwave Formula Preparation

jefftk

2 Sep 2022 2:10 UTC

69 points

12 comments, 2 min read, LW link

(www.jefftk.com)

Solar Blackout Resistance

jefftk

8 Sep 2022 13:30 UTC

69 points

32 comments, 3 min read, LW link

(www.jefftk.com)

Ambiguity in Prediction Market Resolution is Harmful

aphyer

26 Sep 2022 16:22 UTC

69 points

17 comments, 5 min read, LW link

Toy Models of Superposition

evhub

21 Sep 2022 23:48 UTC

69 points

4 comments, 5 min read, LW link, 1 review

(transformer-circuits.pub)

Self-Control Secrets of the Puritan Masters

David Hugh-Jones

26 Sep 2022 9:04 UTC

68 points

3 comments, 5 min read, LW link

(wyclif.substack.com)