Pessimistic Shard Theory · Garrett Baker · Jan 25, 2023, 12:59 AM · 72 points · 13 comments · 3 min read
A general comment on discussions of genetic group differences · anonymous8101 · Jan 14, 2023, 2:11 AM · 71 points · 46 comments · 3 min read
“Status” can be corrosive; here’s how I handle it · Orpheus16 · Jan 24, 2023, 1:25 AM · 71 points · 8 comments · 6 min read
How we could stumble into AI catastrophe · HoldenKarnofsky · Jan 13, 2023, 4:20 PM · 71 points · 18 comments · 18 min read · (www.cold-takes.com)
Opportunity Cost Blackmail · adamShimi · Jan 2, 2023, 1:48 PM · 70 points · 11 comments · 2 min read · (epistemologicalvigilance.substack.com)
Some of my disagreements with List of Lethalities · TurnTrout · Jan 24, 2023, 12:25 AM · 70 points · 7 comments · 10 min read
Investing for a World Transformed by AI · PeterMcCluskey · Jan 1, 2023, 2:47 AM · 70 points · 24 comments · 6 min read · 1 review · (bayesianinvestor.com)
AGI safety field building projects I’d like to see · Severin T. Seehrich · Jan 19, 2023, 10:40 PM · 68 points · 28 comments · 9 min read
Infohazards vs Fork Hazards · jimrandomh · Jan 5, 2023, 9:45 AM · 68 points · 16 comments · 1 min read
Thoughts on hardware / compute requirements for AGI · Steven Byrnes · Jan 24, 2023, 2:03 PM · 63 points · 32 comments · 24 min read
Simulacra are Things · janus · Jan 8, 2023, 11:03 PM · 63 points · 7 comments · 2 min read
Dangers of deference · TsviBT · Jan 8, 2023, 2:36 PM · 62 points · 5 comments · 2 min read
Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind · DragonGod · Jan 13, 2023, 4:53 PM · 62 points · 12 comments · 1 min read · (arxiv.org)
Escape Velocity from Bullshit Jobs · Zvi · Jan 10, 2023, 2:30 PM · 61 points · 18 comments · 5 min read · (thezvi.wordpress.com)
My first year in AI alignment · Alex_Altair · Jan 2, 2023, 1:28 AM · 61 points · 10 comments · 7 min read
Announcing aisafety.training · JJ Hepburn · Jan 21, 2023, 1:01 AM · 61 points · 4 comments · 1 min read
Spooky action at a distance in the loss landscape · Jesse Hoogland and Filip Sondej · Jan 28, 2023, 12:22 AM · 61 points · 4 comments · 7 min read · (www.jessehoogland.com)
Movie Review: Megan · Zvi · Jan 23, 2023, 12:50 PM · 60 points · 19 comments · 24 min read · (thezvi.wordpress.com)
LW Filter Tags (Rationality/World Modeling now promoted in Latest Posts) · Ruby and RobertM · Jan 28, 2023, 10:14 PM · 60 points · 4 comments · 3 min read
Assigning Praise and Blame: Decoupling Epistemology and Decision Theory · adamShimi and Gabriel Alfour · Jan 27, 2023, 6:16 PM · 59 points · 5 comments · 3 min read
Conversational canyons · Henrik Karlsson · Jan 4, 2023, 6:55 PM · 59 points · 4 comments · 7 min read · (escapingflatland.substack.com)
Announcing Cavendish Labs · derikk and agg · Jan 19, 2023, 8:15 PM · 59 points · 5 comments · 2 min read · (forum.effectivealtruism.org)
Consequentialists: One-Way Pattern Traps · David Udell · Jan 16, 2023, 8:48 PM · 59 points · 3 comments · 14 min read
[Linkpost] TIME article: DeepMind’s CEO Helped Take AI Mainstream. Now He’s Urging Caution · Orpheus16 · Jan 21, 2023, 4:51 PM · 58 points · 2 comments · 3 min read · (time.com)
Inverse Scaling Prize: Second Round Winners · Ian McKenzie, Sam Bowman and Ethan Perez · Jan 24, 2023, 8:12 PM · 58 points · 17 comments · 15 min read
My Advice for Incoming SERI MATS Scholars · Johannes C. Mayer · Jan 3, 2023, 7:25 PM · 58 points · 6 comments · 4 min read
Linear Algebra Done Right, Axler · David Udell · Jan 2, 2023, 10:54 PM · 57 points · 6 comments · 9 min read
Evidence under Adversarial Conditions · PeterMcCluskey · Jan 9, 2023, 4:21 PM · 57 points · 1 comment · 3 min read · (bayesianinvestor.com)
Consider paying for literature or book reviews using bounties and dominant assurance contracts · Arjun Panickssery · Jan 15, 2023, 3:56 AM · 57 points · 7 comments · 2 min read
Gradient Filtering · Jozdien and janus · Jan 18, 2023, 8:09 PM · 56 points · 16 comments · 13 min read
What’s going on with ‘crunch time’? · rosehadshar · Jan 20, 2023, 9:42 AM · 54 points · 6 comments · 4 min read
Reflections on Deception & Generality in Scalable Oversight (Another OpenAI Alignment Review) · Shoshannah Tekofsky · Jan 28, 2023, 5:26 AM · 53 points · 7 comments · 7 min read
Why you should learn sign language · Noah Topper · Jan 18, 2023, 5:03 PM · 53 points · 23 comments · 7 min read · (naivebayes.substack.com)
Why and How to Graduate Early [U.S.] · Tego · Jan 29, 2023, 1:28 AM · 53 points · 9 comments · 8 min read · 1 review
Paper: Superposition, Memorization, and Double Descent (Anthropic) · LawrenceC · Jan 5, 2023, 5:54 PM · 53 points · 11 comments · 1 min read · (transformer-circuits.pub)
Critique of some recent philosophy of LLMs’ minds · Roman Leventov · Jan 20, 2023, 12:53 PM · 52 points · 8 comments · 20 min read
Contra Common Knowledge · abramdemski · Jan 4, 2023, 10:50 PM · 52 points · 31 comments · 16 min read
How Likely is Losing a Google Account? · jefftk · Jan 30, 2023, 12:20 AM · 52 points · 12 comments · 3 min read · (www.jefftk.com)
Beware safety-washing · Lizka · Jan 13, 2023, 1:59 PM · 51 points · 2 comments · 4 min read
The Thingness of Things · TsviBT · Jan 1, 2023, 10:19 PM · 51 points · 35 comments · 10 min read
11 heuristics for choosing (alignment) research projects · Orpheus16 and danesherbs · Jan 27, 2023, 12:36 AM · 50 points · 5 comments · 1 min read
[Simulators seminar sequence] #1 Background & shared assumptions · Jan, Charlie Steiner, Logan Riggs, janus, jacquesthibs, metasemi, Michael Oesterle, Lucas Teixeira, peligrietzer and remember · Jan 2, 2023, 11:48 PM · 50 points · 4 comments · 3 min read
[Question] Would it be good or bad for the US military to get involved in AI risk? · Grant Demaree · Jan 1, 2023, 7:02 PM · 50 points · 12 comments · 1 min read
Trying to isolate objectives: approaches toward high-level interpretability · Jozdien · Jan 9, 2023, 6:33 PM · 49 points · 14 comments · 8 min read
Citability of Lesswrong and the Alignment Forum · Leon Lang · Jan 8, 2023, 10:12 PM · 48 points · 2 comments · 1 min read
Language models can generate superior text compared to their input · ChristianKl · Jan 17, 2023, 10:57 AM · 48 points · 28 comments · 1 min read
[Crosspost] ACX 2022 Prediction Contest Results · Scott Alexander, Eric Neyman and Sam Marks · Jan 24, 2023, 6:56 AM · 48 points · 6 comments · 8 min read
[RFC] Possible ways to expand on “Discovering Latent Knowledge in Language Models Without Supervision” · gekaklam, Walter Laurito, Kaarel and Kay Kozaronek · Jan 25, 2023, 7:03 PM · 48 points · 6 comments · 12 min read
How-to Transformer Mechanistic Interpretability—in 50 lines of code or less! · StefanHex · Jan 24, 2023, 6:45 PM · 47 points · 5 comments · 13 min read
[Question] What specific thing would you do with AI Alignment Research Assistant GPT? · quetzal_rainbow · Jan 8, 2023, 7:24 PM · 47 points · 9 comments · 1 min read