All 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 202220232024 2025

All Jan Feb Mar Apr May Jun Jul Aug Sep OctNovDec

All 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 192021 22 23 24 25 26 27 28 29 30

Cheap Model → Big Model design

Maxwell PetersonNov 19, 2023, 10:50 PM

15 points

4 votes

Overall karma indicates overall quality.

2 comments7 min readLW link

Human-like systematic generalization through a meta-learning neural network

BurnyNov 19, 2023, 9:41 PM

8 points

5 votes

Overall karma indicates overall quality.

0 comments2 min readLW link

(twitter.com)

“Benevolent [ie, Ruler] AI is a bad idea” and a suggested alternative (not author)

the gears to ascensionNov 19, 2023, 8:22 PM

22 points

12 votes

Overall karma indicates overall quality.

11 comments1 min readLW link

(www.palladiummag.com)

Alignment is Hard: An Uncomputable Alignment Problem

Alexander BistagneNov 19, 2023, 7:38 PM

−5 points

9 votes

Overall karma indicates overall quality.

4 comments1 min readLW link

(github.com)

New paper shows truthfulness & instruction-following don’t generalize by default

joshcNov 19, 2023, 7:27 PM

60 points

33 votes

Overall karma indicates overall quality.

0 comments4 min readLW link

In favour of a sovereign state of Gaza

Yair HalberstadtNov 19, 2023, 4:08 PM

8 points

13 votes

Overall karma indicates overall quality.

3 comments4 min readLW link

My Criticism of Singular Learning Theory

Joar SkalseNov 19, 2023, 3:19 PM

83 points

52 votes

Overall karma indicates overall quality.

56 comments12 min readLW link

“Why can’t you just turn it off?”

RokoNov 19, 2023, 2:46 PM

48 points

46 votes

Overall karma indicates overall quality.

25 comments1 min readLW link

Spaciousness In Partner Dance: A Naturalism Demo

LoganStrohlNov 19, 2023, 7:00 AM

78 points

22 votes

Overall karma indicates overall quality.

6 comments19 min readLW link 1 review

Altman firing retaliation incoming?

trevorNov 19, 2023, 12:10 AM

50 points

47 votes

Overall karma indicates overall quality.

23 comments5 min readLW link

When Will AIs Develop Long-Term Planning?

PeterMcCluskeyNov 19, 2023, 12:08 AM

18 points

7 votes

Overall karma indicates overall quality.

5 comments4 min readLW link

(bayesianinvestor.com)

Killswitch

JunioNov 18, 2023, 10:53 PM

2 points

2 votes

Overall karma indicates overall quality.

0 comments3 min readLW link

Superalignment

Douglas_ReayNov 18, 2023, 10:37 PM

−4 points

10 votes

Overall karma indicates overall quality.

4 comments1 min readLW link

(openai.com)

Predictable Defect-Cooperate?

quetzal_rainbowNov 18, 2023, 3:38 PM

7 points

3 votes

Overall karma indicates overall quality.

1 comment2 min readLW link

I think I’m just confused. Once a model exists, how do you “red-team” it to see whether it’s safe. Isn’t it already dangerous?

FTPickleNov 18, 2023, 2:16 PM

21 points

11 votes

Overall karma indicates overall quality.

13 comments1 min readLW link

AI Safety Camp 2024

Linda LinseforsNov 18, 2023, 10:37 AM

15 points

9 votes

Overall karma indicates overall quality.

1 comment4 min readLW link

(aisafety.camp)

Post-EAG Music Party

jefftkNov 18, 2023, 3:00 AM

14 points

4 votes

Overall karma indicates overall quality.

2 comments2 min readLW link

(www.jefftk.com)

Letter to a Sonoma County Jail Cell

MadHatterNov 18, 2023, 2:24 AM

9 points

13 votes

Overall karma indicates overall quality.

1 comment1 min readLW link

(open.substack.com)

1. A Sense of Fairness: Deconfusing Ethics

RogerDearnaleyNov 17, 2023, 8:55 PM

17 points

10 votes

Overall karma indicates overall quality.

8 comments15 min readLW link

Sam Altman fired from OpenAI

LawrenceCNov 17, 2023, 8:42 PM

192 points

93 votes

Overall karma indicates overall quality.

75 comments1 min readLW link

(openai.com)

On the lethality of biased human reward ratings

Eli Tyre and johnswentworth

Nov 17, 2023, 6:59 PM

48 points

18 votes

Overall karma indicates overall quality.

10 comments37 min readLW link

Coup probes: Catching catastrophes with probes trained off-policy

Fabien RogerNov 17, 2023, 5:58 PM

93 points

31 votes

Overall karma indicates overall quality.

9 comments11 min readLW link 1 review

On Lies and Liars

Gabriel AlfourNov 17, 2023, 5:13 PM

31 points

25 votes

Overall karma indicates overall quality.

4 comments14 min readLW link

(cognition.cafe)

Classifying representations of sparse autoencoders (SAEs)

AnnahNov 17, 2023, 1:54 PM

15 points

7 votes

Overall karma indicates overall quality.

6 comments2 min readLW link

R&D is a Huge Externality, So Why Do Markets Do So Much of it?

Maxwell TabarrokNov 17, 2023, 1:14 PM

15 points

8 votes

Overall karma indicates overall quality.

14 comments3 min readLW link

(maximumprogress.substack.com)

On excluding dangerous information from training

ShayBenMosheNov 17, 2023, 11:14 AM

23 points

11 votes

Overall karma indicates overall quality.

5 comments3 min readLW link

The dangers of reproducing while old

garymmNov 17, 2023, 5:55 AM

23 points

6 votes

Overall karma indicates overall quality.

6 comments1 min readLW link

(www.garymm.org)

I put odds on ends with Nathan Young

KatjaGraceNov 17, 2023, 5:40 AM

8 points

1 vote

Overall karma indicates overall quality.

0 comments1 min readLW link

(worldspiritsockpuppet.com)

Debate helps supervise human experts [Paper]

habrykaNov 17, 2023, 5:25 AM

29 points

10 votes

Overall karma indicates overall quality.

6 comments1 min readLW link

(github.com)

A to Z of things

KatjaGraceNov 17, 2023, 5:20 AM

71 points

25 votes

Overall karma indicates overall quality.

8 comments1 min readLW link 1 review

(worldspiritsockpuppet.com)

On Tapping Out

ScrewtapeNov 17, 2023, 3:23 AM

52 points

29 votes

Overall karma indicates overall quality.

14 comments8 min readLW link 1 review

Eliciting Latent Knowledge in Comprehensive AI Services Models

acabodiNov 17, 2023, 2:36 AM

6 points

3 votes

Overall karma indicates overall quality.

0 comments5 min readLW link

Some Rules for an Algebra of Bayes Nets

johnswentworth and David Lorell

Nov 16, 2023, 11:53 PM

98 points

24 votes

Overall karma indicates overall quality.

45 comments14 min readLW link 1 review

How much to update on recent AI governance moves?

habryka and So8res

Nov 16, 2023, 11:46 PM

112 points

41 votes

Overall karma indicates overall quality.

5 comments29 min readLW link

New LessWrong feature: Dialogue Matching

Bird ConceptNov 16, 2023, 9:27 PM

106 points

35 votes

Overall karma indicates overall quality.

22 comments3 min readLW link

Towards Evaluating AI Systems for Moral Status Using Self-Reports

Ethan Perez and Robbo

Nov 16, 2023, 8:18 PM

45 points

14 votes

Overall karma indicates overall quality.

3 comments1 min readLW link

(arxiv.org)

Social Dark Matter

Duncan Sabien (Inactive)Nov 16, 2023, 8:00 PM

367 points

271 votes

Overall karma indicates overall quality.

129 comments34 min readLW link 2 reviews

AI #38: Let’s Make a Deal

ZviNov 16, 2023, 7:50 PM

44 points

24 votes

Overall karma indicates overall quality.

2 comments55 min readLW link

(thezvi.wordpress.com)

Forecasting AI (Overview)

jsteinhardtNov 16, 2023, 7:00 PM

35 points

11 votes

Overall karma indicates overall quality.

0 comments2 min readLW link

(bounded-regret.ghost.io)

We Should Talk About This More. Epistemic World Collapse as Imminent Safety Risk of Generative AI.

Joerg WeissNov 16, 2023, 6:46 PM

11 points

5 votes

Overall karma indicates overall quality.

2 comments29 min readLW link

Intelligence in systems (human, AI) can be conceptualized as the resolution and throughput at which a system can process and affect Shannon information.

AiresJLNov 16, 2023, 5:46 PM

0 points

2 votes

Overall karma indicates overall quality.

0 comments2 min readLW link

Life on the Grid (Part 2)

rogersbaconNov 16, 2023, 5:22 PM

7 points

3 votes

Overall karma indicates overall quality.

0 comments15 min readLW link

(www.secretorum.life)

The impossibility of rationally analyzing partisan news

RationalDinoNov 16, 2023, 4:19 PM

4 points

7 votes

Overall karma indicates overall quality.

4 comments1 min readLW link

We are Peacecraft.ai!

MadHatterNov 16, 2023, 2:15 PM

15 points

15 votes

Overall karma indicates overall quality.

20 comments2 min readLW link

A dialectical view of the history of AI, Part 1: We’re only in the antithesis phase. [A synthesis is in the future.]

Bill BenzonNov 16, 2023, 12:34 PM

6 points

5 votes

Overall karma indicates overall quality.

0 comments12 min readLW link

[Question] How much fraud is there in academia?

ChristianKlNov 16, 2023, 11:50 AM

23 points

9 votes

Overall karma indicates overall quality.

10 comments1 min readLW link

Learning coefficient estimation: the details

Zach FurmanNov 16, 2023, 3:19 AM

36 points

13 votes

Overall karma indicates overall quality.

0 comments2 min readLW link

(colab.research.google.com)

[Question] AI Safety orgs- what’s your biggest bottleneck right now?

Kabir KumarNov 16, 2023, 2:02 AM

1 point

4 votes

Overall karma indicates overall quality.

0 comments1 min readLW link

My critique of Eliezer’s deeply irrational beliefs

JorterderNov 16, 2023, 12:34 AM

−35 points

13 votes

Overall karma indicates overall quality.

1 comment9 min readLW link

(docs.google.com)

Extrapolating from Five Words

Gordon Seidoh WorleyNov 15, 2023, 11:21 PM

40 points

21 votes

Overall karma indicates overall quality.

11 comments2 min readLW link

Keyboard shortcuts

Keys shown in yellow (e.g., ]) are accesskeys, and require a browser-specific modifier key (or keys).

Keys shown in grey (e.g., ?) do not require any modifier keys.

General
? Show keyboard shortcuts
Esc Hide keyboard shortcuts

Site navigation
h Go to Home (a.k.a. “Frontpage”) view
f Go to Featured (a.k.a. “Curated”) view
a Go to All (a.k.a. “Community”) view
m Go to Meta view
v Go to Tags view
c Go to Recent Comments view
r Go to Archive view
q Go to Sequences view
t Go to About page
u Go to User or Login page
o Go to Inbox page

Page navigation
, Jump up to top of page
. Jump down to bottom of page
/ Jump to top of comments section
s Search

Page actions
n New post or comment
e Edit current post

Post/comment list views
. Focus next entry in list
, Focus previous entry in list
; Cycle between links in focused entry
Enter Go to currently focused entry
Esc Unfocus currently focused entry
] Go to next page
[ Go to previous page
\ Go to first page
e Edit currently focused post

Editor
k Bold text
i Italic text
l Insert hyperlink
q Blockquote text

Appearance
= Increase text size
- Decrease text size
0 Reset to default text size
′ Cycle through content width settings
1 Switch to default theme [A]
2 Switch to dark theme [B]
3 Switch to grey theme [C]
4 Switch to ultramodern theme [D]
5 Switch to simple theme [E]
6 Switch to brutalist theme [F]
7 Switch to ReadTheSequences theme [G]
8 Switch to classic Less Wrong theme [H]
9 Switch to modern Less Wrong theme [I]
; Open theme tweaker
Enter Save changes and close theme tweaker
Esc Close theme tweaker (without saving)

Slide shows
l Start/resume slideshow
Esc Exit slideshow
→↓ Next slide
←↑ Previous slide
Space Reset slide zoom

Miscellaneous
x Switch to next view on user page
z Switch to previous view on user page
` Toggle compact comment list view
g Toggle anti-kibitzer