Extinction-level Goodhart’s Law as a Property of the Environment

VojtaKovarik and Ida Mattsson

Feb 21, 2024, 5:56 PM

23 points

0 comments 10 min read LW link

Dynamics Crucial to AI Risk Seem to Make for Complicated Models

VojtaKovarik and Ida Mattsson

Feb 21, 2024, 5:54 PM

19 points

0 comments 9 min read LW link

Which Model Properties are Necessary for Evaluating an Argument?

VojtaKovarik and Ida Mattsson

Feb 21, 2024, 5:52 PM

18 points

2 comments 7 min read LW link

Weak vs Quantitative Extinction-level Goodhart’s Law

VojtaKovarik and Ida Mattsson

Feb 21, 2024, 5:38 PM

27 points

1 comment 2 min read LW link

Dual Wielding Kindle Scribes

mesaoptimizer Feb 21, 2024, 5:17 PM

57 points

18 comments 6 min read LW link

A Tale of Two Restaurant Types

Zvi Feb 21, 2024, 1:50 PM

15 points

0 comments 6 min read LW link

(thezvi.wordpress.com)

Less Wrong automated systems are inadvertently Censoring me

Roko Feb 21, 2024, 12:57 PM

6 points

52 comments 1 min read LW link

[Question] What is the research speed multiplier of the most advanced current LLMs?

wunan Feb 21, 2024, 12:39 PM

6 points

2 comments 1 min read LW link

Jailbreaking GPT-4 with the tool API

mishajw Feb 21, 2024, 11:16 AM

20 points

2 comments 4 min read LW link

Gut Renovating Another Bathroom

jefftk Feb 21, 2024, 3:00 AM

22 points

0 comments 2 min read LW link

(www.jefftk.com)

Thoughts for and against an ASI figuring out ethics for itself

sweenesm Feb 20, 2024, 11:40 PM

6 points

10 comments 3 min read LW link

AI #51: Altman’s Ambition

Zvi Feb 20, 2024, 7:50 PM

83 points

5 comments 38 min read LW link

(thezvi.wordpress.com)

The Third Gemini

Zvi Feb 20, 2024, 7:50 PM

30 points

2 comments 9 min read LW link

(thezvi.wordpress.com)

Why does generalization work?

Martín Soto Feb 20, 2024, 5:51 PM

43 points

16 comments 4 min read LW link

ChatGPT refuses to accept a challenge where it would get shot between the eyes [game theory]

Bill Benzon Feb 20, 2024, 4:55 PM

4 points

6 comments 4 min read LW link

Inducing human-like biases in moral reasoning LMs

Artyom Karpov, Austin Meek, Bogdan Ionut Cirstea and SCho

Feb 20, 2024, 4:28 PM

23 points

3 comments 14 min read LW link

Monthly Roundup #15: February 2024

Zvi Feb 20, 2024, 1:10 PM

22 points

7 comments 32 min read LW link

(thezvi.wordpress.com)

Selections From “The Trouble With Being Born”

Arjun Panickssery Feb 20, 2024, 10:07 AM

23 points

2 comments 2 min read LW link

(arjunpanickssery.substack.com)

Difficulty classes for alignment properties

Jozdien Feb 20, 2024, 9:08 AM

34 points

5 comments 2 min read LW link

Lessons from Failed Attempts to Model Sleeping Beauty Problem

Ape in the coat Feb 20, 2024, 6:43 AM

13 points

16 comments 14 min read LW link

flowing like water; hard like stone

lsusr and SilverFlame

Feb 20, 2024, 3:20 AM

27 points

4 comments 4 min read LW link

Theism Isn’t So Crazy

omnizoid Feb 20, 2024, 3:20 AM

−31 points

11 comments 19 min read LW link

[Question] Getting started at distillations: can critique mine?

Joyee Chen Feb 20, 2024, 12:49 AM

2 points

0 comments 1 min read LW link

Auditing LMs with counterfactual search: a tool for control and ELK

Jacob Pfau Feb 20, 2024, 12:02 AM

28 points

6 comments 10 min read LW link

Rationalist Storytelling (French)

Camille Berger Feb 19, 2024, 10:25 PM

3 points

0 comments 1 min read LW link

Abs-E (or, speak only in the positive)

dkl9 Feb 19, 2024, 9:14 PM

29 points

24 comments 2 min read LW link

(dkl9.net)

Retirement Accounts and Short Timelines

jefftk Feb 19, 2024, 6:50 PM

83 points

35 comments 2 min read LW link

(www.jefftk.com)

How Technical AI Safety Researchers Can Help Implement Punitive Damages to Mitigate Catastrophic AI Risk

Gabriel Weil Feb 19, 2024, 6:00 PM

20 points

0 comments 4 min read LW link

Protocol evaluations: good analogies vs control

Fabien Roger Feb 19, 2024, 6:00 PM

42 points

10 comments 11 min read LW link

When Should Copyright Get Shorter?

Maxwell Tabarrok Feb 19, 2024, 4:03 PM

11 points

14 comments 4 min read LW link

(www.maximum-progress.com)

Auto-matching hidden layers in Pytorch LLMs

chanind Feb 19, 2024, 12:40 PM

2 points

0 comments 3 min read LW link

I’d also take $7 trillion

bhauth Feb 19, 2024, 3:31 AM

47 points

12 comments 10 min read LW link

(www.bhauth.com)

On coincidences and Bayesian reasoning, as applied to the origins of COVID-19

viking_math Feb 19, 2024, 1:14 AM

62 points

28 comments 14 min read LW link

Solution to the two envelopes problem for moral weights

MichaelStJules Feb 19, 2024, 12:15 AM

9 points

1 comment LW link

Conspiracy Investigation Done Right

ymeskhout Feb 19, 2024, 12:09 AM

24 points

0 comments 6 min read LW link

Scientific Method

Andrij “Androniq” Ghorbunov Feb 18, 2024, 9:06 PM

24 points

4 comments 30 min read LW link

[Question] Weighing reputational and moral consequences of leaving Russia or staying

spza Feb 18, 2024, 7:36 PM

29 points

24 comments 1 min read LW link

Things I’ve Grieved

Raemon Feb 18, 2024, 7:32 PM

125 points

6 comments 2 min read LW link

Senses of “knowing” a person

dkl9 Feb 18, 2024, 7:13 PM

3 points

0 comments 1 min read LW link

(dkl9.net)

The Jolly Green Giant Chronicles [ChatGPT]

Bill Benzon Feb 18, 2024, 5:28 PM

4 points

0 comments 8 min read LW link

Intuition for 1 + 2 + 3 + … = −1/12

Shankar Sivarajan Feb 18, 2024, 4:46 PM

18 points

28 comments 3 min read LW link

No Clickbait—Misalignment Database

Kabir Kumar Feb 18, 2024, 5:35 AM

6 points

10 comments 1 min read LW link

Idea: NV⁻ Centers for Brain Interpretability

James Camacho Feb 18, 2024, 5:28 AM

6 points

1 comment 3 min read LW link

Celiacs don’t need to live in fear

Jarrah Feb 18, 2024, 2:30 AM

16 points

4 comments 4 min read LW link

“What if we could redesign society from scratch? The promise of charter cities.” [Rational Animations video]

Jackson Wagner Feb 18, 2024, 12:57 AM

40 points

7 comments LW link

(www.youtube.com)

Evaluating Solar

jefftk Feb 17, 2024, 9:50 PM

26 points

5 comments 2 min read LW link

(www.jefftk.com)

Opinions survey 2 (with rationalism score at the end)

tailcalled Feb 17, 2024, 12:03 PM

2 points

11 comments 1 min read LW link

(docs.google.com)

Achieving AI Alignment through Deliberate Uncertainty in Multiagent Systems

Florian_Dietz Feb 17, 2024, 8:45 AM

4 points

0 comments 13 min read LW link

Communication, consciousness, and belief strength measures

Jakub Smékal Feb 17, 2024, 5:45 AM

1 point

0 comments 3 min read LW link

San Fernando Valley Rationality: February 22, 2024

Thomas Broadley Feb 17, 2024, 1:58 AM

3 points

0 comments 1 min read LW link
