In defense of probably wrong mechanistic models

evhub

Dec 6, 2022, 11:24 PM

55 points

33 votes

Overall karma indicates overall quality.

10 comments · 2 min read · LW link

AI Safety in a Vulnerable World: Requesting Feedback on Preliminary Thoughts

Jordan Arel

Dec 6, 2022, 10:35 PM

4 points

3 votes

2 comments · 3 min read · LW link

ChatGPT and the Human Race

Ben Reilly

Dec 6, 2022, 9:38 PM

6 points

8 votes

1 comment · 3 min read · LW link

[Question] How do finite factored sets compare with phase space?

Alex_Altair

Dec 6, 2022, 8:05 PM

15 points

5 votes

1 comment · 1 min read · LW link

Mesa-Optimizers via Grokking

orthonormal

Dec 6, 2022, 8:05 PM

36 points

18 votes

4 comments · 6 min read · LW link

Using GPT-Eliezer against ChatGPT Jailbreaking

Stuart_Armstrong and rgorman

Dec 6, 2022, 7:54 PM

170 points

134 votes

85 comments · 9 min read · LW link

The Parable of the Crimp

Phosphorous

Dec 6, 2022, 6:41 PM

11 points

7 votes

3 comments · 3 min read · LW link

The Categorical Imperative Obscures

Gordon Seidoh Worley

Dec 6, 2022, 5:48 PM

17 points

10 votes

17 comments · 2 min read · LW link

MIRI’s “Death with Dignity” in 60 seconds.

Cleo Nardo

Dec 6, 2022, 5:18 PM

60 points

37 votes

4 comments · 1 min read · LW link

Things roll downhill

awenonian

Dec 6, 2022, 3:27 PM

19 points

13 votes

0 comments · 1 min read · LW link

EA & LW Forums Weekly Summary (28th Nov − 4th Dec 22′)

Zoe Williams

Dec 6, 2022, 9:38 AM

10 points

3 votes

1 comment · 17 min read · LW link

Take 5: Another problem for natural abstractions is laziness.

Charlie Steiner

Dec 6, 2022, 7:00 AM

31 points

9 votes

4 comments · 3 min read · LW link

Verification Is Not Easier Than Generation In General

johnswentworth

Dec 6, 2022, 5:20 AM

74 points

51 votes

27 comments · 1 min read · LW link

Shh, don’t tell the AI it’s likely to be evil

naterush

Dec 6, 2022, 3:35 AM

19 points

9 votes

9 comments · 1 min read · LW link

[Question] What are the major underlying divisions in AI safety?

Chris_Leong

Dec 6, 2022, 3:28 AM

5 points

3 votes

2 comments · 1 min read · LW link

[Link] Why I’m optimistic about OpenAI’s alignment approach

janleike

Dec 5, 2022, 10:51 PM

98 points

48 votes

15 comments · 1 min read · LW link

(aligned.substack.com)

The No Free Lunch theorem for dummies

Steven Byrnes

Dec 5, 2022, 9:46 PM

37 points

17 votes

16 comments · 3 min read · LW link

ChatGPT and Ideological Turing Test

Viliam

Dec 5, 2022, 9:45 PM

42 points

17 votes

1 comment · 1 min read · LW link

ChatGPT on Spielberg’s A.I. and AI Alignment

Bill Benzon

Dec 5, 2022, 9:10 PM

5 points

5 votes

0 comments · 4 min read · LW link

Updating my AI timelines

Matthew Barnett

Dec 5, 2022, 8:46 PM

145 points

81 votes

50 comments · 2 min read · LW link

Steering Behaviour: Testing for (Non-)Myopia in Language Models

Evan R. Murphy and Megan Kinniment

Dec 5, 2022, 8:28 PM

40 points

19 votes

19 comments · 10 min read · LW link

College Admissions as a Brutal One-Shot Game

devansh

Dec 5, 2022, 8:05 PM

8 points

34 votes

26 comments · 2 min read · LW link

Analysis of AI Safety surveys for field-building insights

Ash Jafari

Dec 5, 2022, 7:21 PM

11 points

5 votes

2 comments · 5 min read · LW link

Testing Ways to Bypass ChatGPT’s Safety Features

Robert_AIZI

Dec 5, 2022, 6:50 PM

7 points

7 votes

4 comments · 5 min read · LW link

(aizi.substack.com)

Foresight for AGI Safety Strategy: Mitigating Risks and Identifying Golden Opportunities

jacquesthibs

Dec 5, 2022, 4:09 PM

28 points

16 votes

6 comments · 8 min read · LW link

Aligned Behavior is not Evidence of Alignment Past a Certain Level of Intelligence

Ronny Fernandez

Dec 5, 2022, 3:19 PM

19 points

9 votes

5 comments · 7 min read · LW link

[Question] How should I judge the impact of giving $5k to a family of three kids and two mentally ill parents?

Blake

Dec 5, 2022, 1:42 PM

12 points

6 votes

10 comments · 1 min read · LW link

Is the “Valley of Confused Abstractions” real?

jacquesthibs

Dec 5, 2022, 1:36 PM

20 points

15 votes

11 comments · 2 min read · LW link

Take 4: One problem with natural abstractions is there’s too many of them.

Charlie Steiner

Dec 5, 2022, 10:39 AM

37 points

16 votes

4 comments · 1 min read · LW link

[Question] What are some good Lesswrong-related accounts or hashtags on Mastodon that I should follow?

SpectrumDT

Dec 5, 2022, 9:42 AM

2 points

2 votes

0 comments · 1 min read · LW link

[Question] Who are some prominent reasonable people who are confident that AI won’t kill everyone?

Optimization Process

Dec 5, 2022, 9:12 AM

72 points

36 votes

54 comments · 1 min read · LW link

Monthly Shorts 11/22

Celer

Dec 5, 2022, 7:30 AM

8 points

2 votes

0 comments · 3 min read · LW link

(keller.substack.com)

A ChatGPT story about ChatGPT doom

Matt He

Dec 5, 2022, 5:40 AM

6 points

7 votes

2 comments · 4 min read · LW link

A Tentative Timeline of The Near Future (2022-2025) for Self-Accountability

Yitz

Dec 5, 2022, 5:33 AM

26 points

12 votes

0 comments · 4 min read · LW link

Nook Nature

Duncan Sabien (Inactive)

Dec 5, 2022, 4:10 AM

54 points

38 votes

18 comments · 10 min read · LW link

Probably good projects for the AI safety ecosystem

Ryan Kidd

Dec 5, 2022, 2:26 AM

78 points

51 votes

40 comments · 2 min read · LW link

Historical Notes on Charitable Funds

jefftk

Dec 4, 2022, 11:30 PM

28 points

8 votes

0 comments · 3 min read · LW link

(www.jefftk.com)

AGI as a Black Swan Event

Stephen McAleese

Dec 4, 2022, 11:00 PM

8 points

9 votes

8 comments · 7 min read · LW link

South Bay ACX/LW Pre-Holiday Get-Together

IS

Dec 4, 2022, 10:57 PM

10 points

2 votes

0 comments · 1 min read · LW link

ChatGPT is settling the Chinese Room argument

averros

Dec 4, 2022, 8:25 PM

−7 points

10 votes

7 comments · 1 min read · LW link

Race to the Top: Benchmarks for AI Safety

Isabella Duan

Dec 4, 2022, 6:48 PM

29 points

11 votes

6 comments · 1 min read · LW link

Open & Welcome Thread—December 2022

niplav

Dec 4, 2022, 3:06 PM

8 points

4 votes

22 comments · 1 min read · LW link

AI can exploit safety plans posted on the Internet

Peter S. Park

Dec 4, 2022, 12:17 PM

−15 points

12 votes

4 comments · 1 min read · LW link

ChatGPT seems overconfident to me

qbolec

Dec 4, 2022, 8:03 AM

19 points

6 votes

3 comments · 16 min read · LW link

Could an AI be Religious?

mk54

Dec 4, 2022, 5:00 AM

−12 points

7 votes

14 comments · 1 min read · LW link

Can GPT-3 Write Contra Dances?

jefftk

Dec 4, 2022, 3:00 AM

6 points

4 votes

4 comments · 10 min read · LW link

(www.jefftk.com)

Take 3: No indescribable heavenworlds.

Charlie Steiner

Dec 4, 2022, 2:48 AM

23 points

11 votes

12 comments · 2 min read · LW link

Summary of a new study on out-group hate (and how to fix it)

DirectedEvolution

Dec 4, 2022, 1:53 AM

60 points

22 votes

30 comments · 3 min read · LW link

(www.pnas.org)

[Question] Will the first AGI agent have been designed as an agent (in addition to an AGI)?

nahoj

Dec 3, 2022, 8:32 PM

1 point

2 votes

8 comments · 1 min read · LW link

Logical induction for software engineers

Alex Flint

Dec 3, 2022, 7:55 PM

163 points

60 votes

8 comments · 27 min read · LW link · 1 review
