Announcing Athena—Women in AI Alignment Research

Claire Short

Nov 7, 2023, 9:46 PM

80 points

60 votes

Overall karma indicates overall quality.

2 comments · 3 min read · LW link

Vote on Interesting Disagreements

Ben Pace

Nov 7, 2023, 9:35 PM

159 points

62 votes

131 comments · 1 min read · LW link

What is democracy for?

Johnstone

Nov 7, 2023, 6:17 PM

−5 points

10 votes

10 comments · 7 min read · LW link

Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation

Soroush Pour, rusheb, Quentin FEUILLADE--MONTIXI, Arush and scasper

Nov 7, 2023, 5:59 PM

38 points

22 votes

2 comments · 2 min read · LW link

(arxiv.org)

Implementing Decision Theory

justinpombrio

Nov 7, 2023, 5:55 PM

26 points

11 votes

12 comments · 3 min read · LW link

Mirror, Mirror on the Wall: How Do Forecasters Fare by Their Own Call?

nikos

Nov 7, 2023, 5:39 PM

14 points

8 votes

5 comments · 14 min read · LW link

Symbiotic self-alignment of AIs.

Spiritus Dei

Nov 7, 2023, 5:18 PM

1 point

3 votes

0 comments · 3 min read · LW link

AMA: Earning to Give

jefftk

Nov 7, 2023, 4:20 PM

53 points

17 votes

8 comments · 1 min read · LW link

(www.jefftk.com)

The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs

Quentin FEUILLADE--MONTIXI and Pierre Peigné

Nov 7, 2023, 4:12 PM

52 points

28 votes

21 comments · 6 min read · LW link

Preface to the Sequence on LLM Psychology

Quentin FEUILLADE--MONTIXI

Nov 7, 2023, 4:12 PM

33 points

21 votes

0 comments · 2 min read · LW link

What I’ve been reading, November 2023

jasoncrawford

Nov 7, 2023, 1:37 PM

23 points

4 votes

1 comment · 5 min read · LW link

(rootsofprogress.org)

AI Alignment [Progress] this Week (11/05/2023)

Logan Zoellner

Nov 7, 2023, 1:26 PM

24 points

7 votes

0 comments · 4 min read · LW link

(midwitalignment.substack.com)

On the UK Summit

Zvi

Nov 7, 2023, 1:10 PM

74 points

27 votes

6 comments · 30 min read · LW link

(thezvi.wordpress.com)

Box inversion revisited

Jan_Kulveit

Nov 7, 2023, 11:09 AM

40 points

23 votes

3 comments · 8 min read · LW link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall

Nov 7, 2023, 9:43 AM

56 points

28 votes

0 comments · 10 min read · LW link

The Perils of Professionalism

Screwtape

Nov 7, 2023, 12:07 AM

49 points

22 votes

1 comment · 10 min read · LW link

How to (hopefully ethically) make money off of AGI

habryka, Zvi, Cosmos and NoahK

Nov 6, 2023, 11:35 PM

175 points

101 votes

95 comments · 32 min read · LW link · 1 review

cost estimation for 2 grid energy storage systems

bhauth

Nov 6, 2023, 11:32 PM

16 points

5 votes

12 comments · 7 min read · LW link

(www.bhauth.com)

A bet on critical periods in neural networks

kave and Garrett Baker

Nov 6, 2023, 11:21 PM

24 points

10 votes

1 comment · 6 min read · LW link

Job listing: Communications Generalist / Project Manager

Gretta Duleba

Nov 6, 2023, 8:21 PM

49 points

12 votes

7 comments · 1 min read · LW link

Askesis: a model of the cerebellum

MadHatter

Nov 6, 2023, 8:19 PM

7 points

4 votes

2 comments · 1 min read · LW link

(github.com)

LQPR: An Algorithm for Reinforcement Learning with Provable Safety Guarantees

MadHatter

Nov 6, 2023, 8:17 PM

6 points

11 votes

0 comments · 1 min read · LW link

(github.com)

ACX Meetup Leipzig

Roman Leipe

Nov 6, 2023, 6:33 PM

1 point

1 vote

0 comments · 1 min read · LW link

[Question] Does bulemia work?

lc

Nov 6, 2023, 5:58 PM

5 points

11 votes

18 comments · 1 min read · LW link

Why building ventures in AI Safety is particularly challenging

Heramb

Nov 6, 2023, 4:27 PM

1 point

1 vote

0 comments · 1 min read · LW link

(forum.effectivealtruism.org)

What is true is already so. Owning up to it doesn’t make it worse.

RamblinDash

Nov 6, 2023, 3:49 PM

20 points

10 votes

2 comments · 1 min read · LW link

An illustrative model of backfire risks from pausing AI research

Maxime Riché

Nov 6, 2023, 2:30 PM

33 points

17 votes

3 comments · 11 min read · LW link

Proposal for improving state of alignment research

Iknownothing

Nov 6, 2023, 1:55 PM

2 points

3 votes

0 comments · 1 min read · LW link

Are language models good at making predictions?

dynomight

Nov 6, 2023, 1:10 PM

76 points

32 votes

14 comments · 4 min read · LW link

(dynomight.net)

Tips, tricks, lessons and thoughts on hosting hackathons

gergogaspar

Nov 6, 2023, 11:03 AM

3 points

2 votes

0 comments · 11 min read · LW link

Announcing TAIS 2024

Blaine

Nov 6, 2023, 8:38 AM

23 points

11 votes

0 comments · 1 min read · LW link

(tais2024.cc)

Taboo Wall

Screwtape

Nov 6, 2023, 3:51 AM

19 points

5 votes

0 comments · 3 min read · LW link

When and why should you use the Kelly criterion?

Garrett Baker, philh and River

Nov 5, 2023, 11:26 PM

27 points

8 votes

25 comments · 16 min read · LW link

On Overhangs and Technological Change

Roko

Nov 5, 2023, 10:58 PM

50 points

24 votes

19 comments · 2 min read · LW link

xAI announces Grok, beats GPT-3.5

Nikola Jurkovic

Nov 5, 2023, 10:11 PM

10 points

5 votes

6 comments · 1 min read · LW link

(x.ai)

Disentangling four motivations for acting in accordance with UDT

Julian Stastny

Nov 5, 2023, 9:26 PM

35 points

20 votes

4 comments · 7 min read · LW link

AI as Super-Demagogue

RationalDino

Nov 5, 2023, 9:21 PM

11 points

10 votes

12 comments · 9 min read · LW link

EA orgs’ legal structure inhibits risk taking and information sharing on the margin

Elizabeth

Nov 5, 2023, 7:13 PM

136 points

55 votes

17 comments · 4 min read · LW link

Eric Schmidt on recursive self-improvement

Nikola Jurkovic

Nov 5, 2023, 7:05 PM

24 points

14 votes

3 comments · 1 min read · LW link

(www.youtube.com)

Pivotal Acts might Not be what You Think they are

Johannes C. Mayer

Nov 5, 2023, 5:23 PM

41 points

22 votes

13 comments · 3 min read · LW link

The Assumed Intent Bias

silentbob

Nov 5, 2023, 4:28 PM

51 points

27 votes

13 comments · 6 min read · LW link

Go flash blinking lights at printed text right now

lemonhope

Nov 5, 2023, 7:29 AM

15 points

10 votes

9 comments · 1 min read · LW link

Life of GPT

Odd anon

Nov 5, 2023, 4:55 AM

6 points

7 votes

2 comments · 5 min read · LW link

Lightning Talks

Screwtape

Nov 5, 2023, 3:27 AM

6 points

1 vote

3 comments · 4 min read · LW link

Utility is not the selection target

tailcalled

Nov 4, 2023, 10:48 PM

24 points

18 votes

1 comment · 1 min read · LW link

Stuxnet, not Skynet: Humanity’s disempowerment by AI

Roko

Nov 4, 2023, 10:23 PM

107 points

50 votes

24 comments · 6 min read · LW link

The 6D effect: When companies take risks, one email can be very powerful.

scasper

Nov 4, 2023, 8:08 PM

286 points

150 votes

42 comments · 3 min read · LW link

Genetic fitness is a measure of selection strength, not the selection target

Kaj_Sotala

Nov 4, 2023, 7:02 PM

58 points

40 votes

44 comments · 18 min read · LW link

The Soul Key

Richard_Ngo

Nov 4, 2023, 5:51 PM

114 points

46 votes

10 comments · 8 min read · LW link · 1 review

(www.narrativeark.xyz)

[Linkpost] Concept Alignment as a Prerequisite for Value Alignment

Bogdan Ionut Cirstea

Nov 4, 2023, 5:34 PM

27 points

7 votes

0 comments · 1 min read · LW link

(arxiv.org)
