All 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 202220232024 2025

All Jan Feb Mar Apr May Jun Jul Aug Sep Oct NovDec

All 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 262728 29 30 31

METR is hiring!

Beth BarnesDec 26, 2023, 9:00 PM

65 points

1 comment1 min readLW link

Environmental allergies are curable? (Sublingual immunotherapy)

ChipmonkDec 26, 2023, 7:05 PM

47 points

10 comments1 min readLW link

Picasso in the Gallery of Babel

samhealyDec 26, 2023, 4:25 PM

12 points

12 comments4 min readLW link

Flagging Potentially Unfair Parenting

jefftkDec 26, 2023, 12:40 PM

69 points

1 comment1 min readLW link

(www.jefftk.com)

Link Collection: Impact Markets

Saul MunnDec 26, 2023, 9:01 AM

27 points

0 comments2 min readLW link

(www.brasstacks.blog)

How Emergency Medicine Solves the Alignment Problem

StrivingForLegibilityDec 26, 2023, 5:24 AM

41 points

4 comments6 min readLW link

Rationality outreach vs. rationality teaching

LenmarDec 26, 2023, 12:37 AM

7 points

2 comments1 min readLW link

Exploring the Residual Stream of Transformers for Mechanistic Interpretability — Explained

Zeping YuDec 26, 2023, 12:36 AM

7 points

1 comment11 min readLW link

[Question] Anki setup best practices?

Sinclair ChenDec 25, 2023, 10:34 PM

11 points

4 comments1 min readLW link

[Question] Why does expected utility matter?

Marco DiscendentiDec 25, 2023, 2:47 PM

18 points

21 comments4 min readLW link

Freeze Dried Raspberry Truffles

jefftkDec 25, 2023, 2:10 PM

14 points

0 comments1 min readLW link

(www.jefftk.com)

Pornographic and semi-pornographic ads on mainstream websites as an instance of the AI alignment problem?

greenrdDec 25, 2023, 1:19 PM

−1 points

5 comments12 min readLW link

Defense Against The Dark Arts: An Introduction

LyrongolemDec 25, 2023, 6:36 AM

24 points

36 comments20 min readLW link

Occlusions of Moral Knowledge

herschelDec 25, 2023, 5:55 AM

−1 points

0 comments2 min readLW link

(brothernin.substack.com)

[Question] Would you have a baby in 2024?

martinkunevDec 25, 2023, 1:52 AM

24 points

76 comments1 min readLW link

align your latent spaces

bhauthDec 24, 2023, 4:30 PM

27 points

8 comments2 min readLW link

(www.bhauth.com)

Viral Guessing Game

jefftkDec 24, 2023, 1:10 PM

19 points

0 comments1 min readLW link

(www.jefftk.com)

The Sugar Alignment Problem

Adam ZernerDec 24, 2023, 1:35 AM

5 points

3 comments7 min readLW link

A Crisper Explanation of Simulacrum Levels

Thane RuthenisDec 23, 2023, 10:13 PM

92 points

13 comments13 min readLW link

Hyperbolic Discounting and Pascal’s Mugging

Andrew Keenan RichardsonDec 23, 2023, 9:55 PM

9 points

0 comments7 min readLW link

AISN #28: Center for AI Safety 2023 Year in Review

Dan HDec 23, 2023, 9:31 PM

30 points

1 comment5 min readLW link

(newsletter.safe.ai)

“Inftoxicity” and other new words to describe malicious information and communication thereof

Jáchym FibírDec 23, 2023, 6:15 PM

−1 points

6 comments3 min readLW link

AI’s impact on biology research: Part I, today

octopoctaDec 23, 2023, 4:29 PM

31 points

6 comments2 min readLW link

AI Girlfriends Won’t Matter Much

Maxwell TabarrokDec 23, 2023, 3:58 PM

42 points

22 comments2 min readLW link

(maximumprogress.substack.com)

The Next Right Token

jefftkDec 23, 2023, 3:20 AM

14 points

0 comments1 min readLW link

(www.jefftk.com)

Fact Finding: Do Early Layers Specialise in Local Processing? (Post 5)

Neel Nanda, Senthooran Rajamanoharan, János Kramár and Rohin Shah

Dec 23, 2023, 2:46 AM

18 points

0 comments4 min readLW link

Fact Finding: How to Think About Interpreting Memorisation (Post 4)

Senthooran Rajamanoharan, Neel Nanda, János Kramár and Rohin Shah

Dec 23, 2023, 2:46 AM

22 points

0 comments9 min readLW link

Fact Finding: Trying to Mechanistically Understanding Early MLPs (Post 3)

Neel Nanda, Senthooran Rajamanoharan, János Kramár and Rohin Shah

Dec 23, 2023, 2:46 AM

10 points

1 comment16 min readLW link

Fact Finding: Simplifying the Circuit (Post 2)

Senthooran Rajamanoharan, Neel Nanda, János Kramár and Rohin Shah

Dec 23, 2023, 2:45 AM

25 points

3 comments14 min readLW link

Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1)

Neel Nanda, Senthooran Rajamanoharan, János Kramár and Rohin Shah

Dec 23, 2023, 2:44 AM

106 points

10 comments22 min readLW link 2 reviews

Measurement tampering detection as a special case of weak-to-strong generalization

ryan_greenblatt, Fabien Roger and Buck

Dec 23, 2023, 12:05 AM

57 points

10 comments4 min readLW link

How does a toy 2 digit subtraction transformer predict the difference?

Evan AndersDec 22, 2023, 9:17 PM

12 points

0 comments10 min readLW link

(evanhanders.blog)

Thoughts on Max Tegmark’s AI verification

Johannes C. MayerDec 22, 2023, 8:38 PM

10 points

0 comments3 min readLW link

Idealized Agents Are Approximate Causal Mirrors (+ Radical Optimism on Agent Foundations)

Thane RuthenisDec 22, 2023, 8:19 PM

75 points

14 comments6 min readLW link

AI safety advocates should consider providing gentle pushback following the events at OpenAI

civilsocietyDec 22, 2023, 6:55 PM

16 points

5 comments3 min readLW link

“Destroy humanity” as an immediate subgoal

Seth AhrenbachDec 22, 2023, 6:52 PM

3 points

13 comments3 min readLW link

Synthetic Restrictions

nano_brascaDec 22, 2023, 6:50 PM

10 points

0 comments4 min readLW link

Review Report of Davidson on Takeoff Speeds (2023)

Trent KannegieterDec 22, 2023, 6:48 PM

37 points

11 comments38 min readLW link

The problems with the concept of an infohazard as used by the LW community [Linkpost]

Noosphere89Dec 22, 2023, 4:13 PM

75 points

43 comments3 min readLW link

(www.beren.io)

Employee Incentives Make AGI Lab Pauses More Costly

Nikola JurkovicDec 22, 2023, 5:04 AM

28 points

12 comments3 min readLW link

The LessWrong 2022 Review: Review Phase

RobertMDec 22, 2023, 3:23 AM

58 points

7 comments2 min readLW link

The absence of self-rejection is self-acceptance

ChipmonkDec 21, 2023, 9:54 PM

24 points

1 comment1 min readLW link

(chipmonk.substack.com)

A Decision Theory Can Be Rational or Computable, but Not Both

StrivingForLegibilityDec 21, 2023, 9:02 PM

9 points

4 comments1 min readLW link

Most People Don’t Realize We Have No Idea How Our AIs Work

Thane RuthenisDec 21, 2023, 8:02 PM

159 points

42 comments1 min readLW link

Pseudonymity and Accusations

jefftkDec 21, 2023, 7:20 PM

52 points

20 comments3 min readLW link

(www.jefftk.com)

Attention on AI X-Risk Likely Hasn’t Distracted from Current Harms from AI

Erich_GrunewaldDec 21, 2023, 5:24 PM

26 points

2 comments17 min readLW link

(www.erichgrunewald.com)

“Alignment” is one of six words of the year in the Harvard Gazette

Nikola JurkovicDec 21, 2023, 3:54 PM

14 points

1 comment1 min readLW link

(news.harvard.edu)

AI #43: Functional Discoveries

ZviDec 21, 2023, 3:50 PM

52 points

26 comments49 min readLW link

(thezvi.wordpress.com)

Rating my AI Predictions

Robert_AIZIDec 21, 2023, 2:07 PM

22 points

5 comments2 min readLW link

(aizi.substack.com)

AI Safety Chatbot

markov and Robert Miles

Dec 21, 2023, 2:06 PM

61 points

11 comments4 min readLW link

Keyboard shortcuts

Keys shown in yellow (e.g., ]) are accesskeys, and require a browser-specific modifier key (or keys).

Keys shown in grey (e.g., ?) do not require any modifier keys.

General
? Show keyboard shortcuts
Esc Hide keyboard shortcuts

Site navigation
h Go to Home (a.k.a. “Frontpage”) view
f Go to Featured (a.k.a. “Curated”) view
a Go to All (a.k.a. “Community”) view
m Go to Meta view
v Go to Tags view
c Go to Recent Comments view
r Go to Archive view
q Go to Sequences view
t Go to About page
u Go to User or Login page
o Go to Inbox page

Page navigation
, Jump up to top of page
. Jump down to bottom of page
/ Jump to top of comments section
s Search

Page actions
n New post or comment
e Edit current post

Post/comment list views
. Focus next entry in list
, Focus previous entry in list
; Cycle between links in focused entry
Enter Go to currently focused entry
Esc Unfocus currently focused entry
] Go to next page
[ Go to previous page
\ Go to first page
e Edit currently focused post

Editor
k Bold text
i Italic text
l Insert hyperlink
q Blockquote text

Appearance
= Increase text size
- Decrease text size
0 Reset to default text size
′ Cycle through content width settings
1 Switch to default theme [A]
2 Switch to dark theme [B]
3 Switch to grey theme [C]
4 Switch to ultramodern theme [D]
5 Switch to simple theme [E]
6 Switch to brutalist theme [F]
7 Switch to ReadTheSequences theme [G]
8 Switch to classic Less Wrong theme [H]
9 Switch to modern Less Wrong theme [I]
; Open theme tweaker
Enter Save changes and close theme tweaker
Esc Close theme tweaker (without saving)

Slide shows
l Start/resume slideshow
Esc Exit slideshow
→↓ Next slide
←↑ Previous slide
Space Reset slide zoom

Miscellaneous
x Switch to next view on user page
z Switch to previous view on user page
` Toggle compact comment list view
g Toggle anti-kibitzer