AI #15: The Principle of Charity

Zvi

Jun 8, 2023, 12:10 PM

73 points

16 comments, 44 min read

(thezvi.wordpress.com)

A plea for solutionism on AI safety

jasoncrawford

Jun 9, 2023, 4:29 PM

72 points

6 comments, 6 min read

(rootsofprogress.org)

MetaAI: less is less for alignment.

Cleo Nardo

Jun 13, 2023, 2:08 PM

71 points

17 comments, 5 min read

Manifold Predicted the AI Extinction Statement and CAIS Wanted it Deleted

David Chee

Jun 12, 2023, 3:54 PM

71 points

15 comments, 12 min read

LEAst-squares Concept Erasure (LEACE)

tricky_labyrinth

Jun 7, 2023, 9:51 PM

68 points

10 comments, 1 min read

(twitter.com)

Adventist Health Study-2 supports pescetarianism more than veganism

Elizabeth

Jun 17, 2023, 8:10 PM

67 points

11 comments, 6 min read

(acesounderglass.com)

Introduction to Towards Causal Foundations of Safe AGI

tom4everitt, Lewis Hammond, Francis Rhys Ward, RyanCarey, James Fox, mattmacdermott and sbenthall

Jun 12, 2023, 5:55 PM

67 points

6 comments, 4 min read

“textbooks are all you need”

bhauth

Jun 21, 2023, 5:06 PM

66 points

18 comments, 2 min read

(arxiv.org)

Short timelines and slow, continuous takeoff as the safest path to AGI

rosehadshar and LintzA

Jun 21, 2023, 8:56 AM

65 points

15 comments, 7 min read

The ones who endure

Richard_Ngo

Jun 16, 2023, 2:40 PM

65 points

16 comments, 5 min read

(www.thinkingcomplete.com)

Man in the Arena

Richard_Ngo

Jun 26, 2023, 9:57 PM

65 points

6 comments, 8 min read

A Friendly Face (Another Failure Story)

Karl von Wendt, Sofia Bharadia, PeterDrotos, Artem Korotkov, mespa and mruwnik

Jun 20, 2023, 10:31 AM

65 points

21 comments, 16 min read

Which personality traits are real? Stress-testing the lexical hypothesis

tailcalled

Jun 21, 2023, 7:46 PM

65 points

5 comments, 9 min read, 1 review

TASRA: A Taxonomy and Analysis of Societal-Scale Risks from AI

Andrew_Critch

Jun 13, 2023, 5:04 AM

64 points

1 comment, 1 min read

UK Foundation Model Task Force—Expression of Interest

ojorgensen

Jun 18, 2023, 9:43 AM

64 points

2 comments, 1 min read

(twitter.com)

Uncertainty about the future does not imply that AGI will go well

Lauro Langosco

Jun 1, 2023, 5:38 PM

62 points

11 comments, 7 min read

AISafety.info “How can I help?” FAQ

steven0461 and Severin T. Seehrich

Jun 5, 2023, 10:09 PM

59 points

0 comments, 2 min read

A Double-Feature on The Extropians

Maxwell Tabarrok

Jun 3, 2023, 6:27 PM

59 points

4 comments, 1 min read

Ages Survey: Results

jefftk

Jun 5, 2023, 2:10 AM

57 points

10 comments, 5 min read

(www.jefftk.com)

Contingency: A Conceptual Tool from Evolutionary Biology for Alignment

clem_acs

Jun 12, 2023, 8:54 PM

57 points

2 comments, 14 min read

(acsresearch.org)

[Request]: Use “Epilogenics” instead of “Eugenics” in most circumstances

GeneSmith

Jun 1, 2023, 3:36 PM

56 points

49 comments, 1 min read

A “weak” AGI may attempt an unlikely-to-succeed takeover

RobertM

Jun 28, 2023, 8:31 PM

56 points

17 comments, 3 min read

The Control Problem: Unsolved or Unsolvable?

Remmelt

Jun 2, 2023, 3:42 PM

55 points

46 comments, 14 min read

formalizing the QACI alignment formal-goal

Tamsin Leake and JuliaHP

Jun 10, 2023, 3:28 AM

54 points

6 comments, 13 min read

(carado.moe)

Improvement on MIRI’s Corrigibility

WCargo and Charbel-Raphaël

Jun 9, 2023, 4:10 PM

54 points

8 comments, 13 min read

DSLT 1. The RLCT Measures the Effective Dimension of Neural Networks

Liam Carroll

Jun 16, 2023, 9:50 AM

54 points

10 comments, 13 min read

Mode collapse in RL may be fueled by the update equation

TurnTrout and MichaelEinhorn

Jun 19, 2023, 9:51 PM

53 points

10 comments, 8 min read

[Replication] Conjecture’s Sparse Coding in Small Transformers

Hoagy and Logan Riggs

Jun 16, 2023, 6:02 PM

52 points

0 comments, 5 min read

An Exercise to Build Intuitions on AGI Risk

Lauro Langosco

Jun 7, 2023, 6:35 PM

52 points

3 comments, 8 min read

Are Bayesian methods guaranteed to overfit?

Ege Erdil

Jun 17, 2023, 12:52 PM

52 points

5 comments, 3 min read

(www.yulingyao.com)

AXRP Episode 22 - Shard Theory with Quintin Pope

DanielFilan

Jun 15, 2023, 7:00 PM

52 points

11 comments, 93 min read

InternLM—China’s Best (Unverified)

Lao Mein

Jun 9, 2023, 7:39 AM

51 points

4 comments, 1 min read

A moral backlash against AI will probably slow down AGI development

geoffreymiller

Jun 7, 2023, 8:39 PM

51 points

10 comments, 14 min read

How to Think About Activation Patching

Neel Nanda

Jun 4, 2023, 2:17 PM

50 points

5 comments, 20 min read

(www.neelnanda.io)

Crystal Healing — or the Origins of Expected Utility Maximizers

Alexander Gietelink Oldenziel, RP and Kaarel

Jun 25, 2023, 3:18 AM

50 points

11 comments, 5 min read

The Case for Overconfidence is Overstated

Kevin Dorst

Jun 28, 2023, 5:21 PM

50 points

13 comments, 8 min read

(kevindorst.substack.com)

Causality: A Brief Introduction

tom4everitt, Lewis Hammond, Jonathan Richens, Francis Rhys Ward, RyanCarey, sbenthall and James Fox

Jun 20, 2023, 3:01 PM

49 points

18 comments, 6 min read

Instrumental Convergence? [Draft]

J. Dmitri Gallow

Jun 14, 2023, 8:21 PM

48 points

20 comments, 33 min read

Elon talked with senior Chinese leadership about AI X-risk

ChristianKl

Jun 7, 2023, 3:02 PM

47 points

2 comments, 1 min read

(www.youtube.com)

“Safety Culture for AI” is important, but isn’t going to be easy

Davidmanheim

Jun 26, 2023, 12:52 PM

47 points

2 comments, 2 min read

(forum.effectivealtruism.org)

My impression of singular learning theory

Ege Erdil

Jun 18, 2023, 3:34 PM

47 points

30 comments, 2 min read

AI #18: The Great Debate Debate

Zvi

Jun 29, 2023, 4:20 PM

47 points

9 comments, 52 min read

(thezvi.wordpress.com)

Updating Drexler’s CAIS model

Matthew Barnett

Jun 16, 2023, 10:53 PM

47 points

32 comments, 4 min read

AI #16: AI in the UK

Zvi

Jun 15, 2023, 1:20 PM

46 points

20 comments, 54 min read

(thezvi.wordpress.com)

Agentic Mess (A Failure Story)

Karl von Wendt, Sofia Bharadia, PeterDrotos, Artem Korotkov, mespa and mruwnik

Jun 6, 2023, 1:09 PM

46 points

5 comments, 13 min read

I can see how I am Dumb

Johannes C. Mayer

Jun 10, 2023, 7:18 PM

46 points

11 comments, 5 min read

Ban development of unpredictable powerful models?

TurnTrout

Jun 20, 2023, 1:43 AM

46 points

25 comments, 4 min read

We Are Less Wrong than E. T. Jaynes on Loss Functions in Human Society

Zack_M_Davis

Jun 5, 2023, 5:34 AM

46 points

14 comments, 2 min read

Why am I Me?

dadadarren

Jun 25, 2023, 12:07 PM

45 points

46 comments, 3 min read

Self-Blinded Caffeine RCT

niplav

Jun 27, 2023, 12:38 PM

45 points

9 comments, 8 min read