Manifold Halloween Hackathon

Austin Chen

Oct 23, 2023, 10:47 PM

8 points

2 votes

0 comments, 1 min read, LW link

Open Source Replication & Commentary on Anthropic’s Dictionary Learning Paper

Neel Nanda

Oct 23, 2023, 10:38 PM

93 points

42 votes

12 comments, 9 min read, LW link

The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists

EJT

Oct 23, 2023, 9:00 PM

79 points

29 votes

22 comments, 39 min read, LW link

(philpapers.org)

AI Alignment [Incremental Progress Units] this Week (10/22/23)

Logan Zoellner

Oct 23, 2023, 8:32 PM

22 points

10 votes

0 comments, 6 min read, LW link

(midwitalignment.substack.com)

z is not the cause of x

hrbigelow

Oct 23, 2023, 5:43 PM

6 points

5 votes

2 comments, 9 min read, LW link

Some of my predictable updates on AI

Aaron_Scher

Oct 23, 2023, 5:24 PM

32 points

15 votes

8 comments, 9 min read, LW link

Programmatic backdoors: DNNs can use SGD to run arbitrary stateful computation

Fabien Roger and Buck

Oct 23, 2023, 4:37 PM

107 points

46 votes

3 comments, 8 min read, LW link

Machine Unlearning Evaluations as Interpretability Benchmarks

NickyP and Nandi

Oct 23, 2023, 4:33 PM

33 points

17 votes

2 comments, 11 min read, LW link

VLM-RM: Specifying Rewards with Natural Language

ChengCheng, David Lindner and Ethan Perez

Oct 23, 2023, 2:11 PM

20 points

6 votes

2 comments, 5 min read, LW link

(far.ai)

Contra Dance Dialect Survey

jefftk

Oct 23, 2023, 1:40 PM

11 points

3 votes

0 comments, 1 min read, LW link

(www.jefftk.com)

[Question] Which LessWrongers are (aspiring) YouTubers?

Mati_Roy

Oct 23, 2023, 1:21 PM

22 points

8 votes

13 comments, 1 min read, LW link

[Question] What is an “anti-Occamian prior”?

Zane

Oct 23, 2023, 2:26 AM

35 points

18 votes

22 comments, 1 min read, LW link

Announcing Timaeus

Jesse Hoogland, Daniel Murfet, Alexander Gietelink Oldenziel and Stan van Wingerden

Oct 22, 2023, 11:59 AM

188 points

83 votes

15 comments, 4 min read, LW link

Into AI Safety—Episode 0

jacobhaimes

Oct 22, 2023, 3:30 AM

5 points

4 votes

1 comment, 1 min read, LW link

(into-ai-safety.github.io)

Thoughts On (Solving) Deep Deception

Jozdien

Oct 21, 2023, 10:40 PM

72 points

34 votes

6 comments, 6 min read, LW link

Best effort beliefs

Adam Zerner

Oct 21, 2023, 10:05 PM

14 points

11 votes

9 comments, 4 min read, LW link

How toy models of ontology changes can be misleading

Stuart_Armstrong

Oct 21, 2023, 9:13 PM

42 points

16 votes

0 comments, 2 min read, LW link

Soups as Spreads

jefftk

Oct 21, 2023, 8:30 PM

22 points

14 votes

0 comments, 1 min read, LW link

(www.jefftk.com)

Which COVID booster to get?

Sameerishere

Oct 21, 2023, 7:43 PM

8 points

3 votes

0 comments, 2 min read, LW link

Alignment Implications of LLM Successes: a Debate in One Act

Zack_M_Davis

Oct 21, 2023, 3:22 PM

266 points

124 votes

56 comments, 13 min read, LW link, 2 reviews

How to find a good moving service

Ziyue Wang

Oct 21, 2023, 4:59 AM

8 points

8 votes

0 comments, 3 min read, LW link

Apply for MATS Winter 2023-24!

utilistrutil, Ryan Kidd and LauraVaughan

Oct 21, 2023, 2:27 AM

104 points

35 votes

6 comments, 5 min read, LW link

[Question] Can we isolate neurons that recognize features vs. those which have some other role?

Joshua Clancy

Oct 21, 2023, 12:30 AM

4 points

4 votes

2 comments, 3 min read, LW link

Muddling Along Is More Likely Than Dystopia

Jeffrey Heninger

Oct 20, 2023, 9:25 PM

88 points

42 votes

10 comments, 8 min read, LW link

What’s Hard About The Shutdown Problem

johnswentworth

Oct 20, 2023, 9:13 PM

98 points

37 votes

33 comments, 4 min read, LW link

Holly Elmore and Rob Miles dialogue on AI Safety Advocacy

Bird Concept, Robert Miles and Holly_Elmore

Oct 20, 2023, 9:04 PM

163 points

60 votes

30 comments, 27 min read, LW link

TOMORROW: the largest AI Safety protest ever!

Holly_Elmore

Oct 20, 2023, 6:15 PM

105 points

56 votes

26 comments, 2 min read, LW link

The Overkill Conspiracy Hypothesis

ymeskhout

Oct 20, 2023, 4:51 PM

27 points

14 votes

9 comments, 7 min read, LW link

I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines

307th

Oct 20, 2023, 4:37 PM

125 points

86 votes

33 comments, 9 min read, LW link

Internal Target Information for AI Oversight

Paul Colognese

Oct 20, 2023, 2:53 PM

15 points

5 votes

0 comments, 5 min read, LW link

On the proper date for solstice celebrations

jchan

Oct 20, 2023, 1:55 PM

16 points

4 votes

0 comments, 4 min read, LW link

Are (at least some) Large Language Models Holographic Memory Stores?

Bill Benzon

Oct 20, 2023, 1:07 PM

11 points

6 votes

4 comments, 6 min read, LW link

Mechanistic interpretability of LLM analogy-making

Sergii

Oct 20, 2023, 12:53 PM

2 points

1 vote

0 comments, 4 min read, LW link

(grgv.xyz)

How To Socialize With Psycho(logist)s

Sable

Oct 20, 2023, 11:33 AM

37 points

17 votes

11 comments, 3 min read, LW link

(affablyevil.substack.com)

Revealing Intentionality In Language Models Through AdaVAE Guided Sampling

jdp

Oct 20, 2023, 7:32 AM

119 points

50 votes

15 comments, 22 min read, LW link

Features and Adversaries in MemoryDT

Joseph Bloom and Jay Bailey

Oct 20, 2023, 7:32 AM

31 points

15 votes

6 comments, 25 min read, LW link

AI Safety Hub Serbia Soft Launch

DusanDNesic

Oct 20, 2023, 7:11 AM

64 points

35 votes

1 comment, 3 min read, LW link

(forum.effectivealtruism.org)

Announcing new round of “Key Phenomena in AI Risk” Reading Group

DusanDNesic and Nora_Ammann

Oct 20, 2023, 7:11 AM

15 points

7 votes

2 comments, 1 min read, LW link

Unpacking the dynamics of AGI conflict that suggest the necessity of a premptive pivotal act

Eli Tyre

Oct 20, 2023, 6:48 AM

63 points

18 votes

2 comments, 8 min read, LW link

Genocide isn’t Decolonization

robotelvis

Oct 20, 2023, 4:14 AM

33 points

62 votes

20 comments, 5 min read, LW link

(messyprogress.substack.com)

Trying to understand John Wentworth’s research agenda

johnswentworth, habryka and David Lorell

Oct 20, 2023, 12:05 AM

96 points

42 votes

13 comments, 12 min read, LW link

Boost your productivity, happiness and health with this one weird trick

ajc586

Oct 19, 2023, 11:30 PM

9 points

8 votes

9 comments, 1 min read, LW link

A Good Explanation of Differential Gears

Johannes C. Mayer

Oct 19, 2023, 11:07 PM

48 points

19 votes

4 comments, 1 min read, LW link

(youtu.be)

Evening Wiki(pedia) Workout

mcint

Oct 19, 2023, 9:29 PM

1 point

1 vote

1 comment, 1 min read, LW link

New roles on my team: come build Open Phil’s technical AI safety program with me!

Ajeya Cotra

Oct 19, 2023, 4:47 PM

83 points

32 votes

6 comments, 4 min read, LW link

[Question] Infinite tower of meta-probability

fryolysis

Oct 19, 2023, 4:44 PM

6 points

7 votes

5 comments, 3 min read, LW link

A NotKillEveryoneIsm Argument for Accelerating Deep Learning Research

Logan Zoellner

Oct 19, 2023, 4:28 PM

−6 points

8 votes

6 comments, 5 min read, LW link

(midwitalignment.substack.com)

Knowledge Base 5: Business model

iwis

Oct 19, 2023, 4:06 PM

−4 points

3 votes

2 comments, 1 min read, LW link

AI #34: Chipping Away at Chip Exports

Zvi

Oct 19, 2023, 3:00 PM

36 points

26 votes

19 comments, 59 min read, LW link

(thezvi.wordpress.com)

Is Yann LeCun strawmanning AI x-risks?

Chris_Leong

Oct 19, 2023, 11:35 AM

26 points

18 votes

4 comments, 1 min read, LW link
