Three pillars for avoiding AGI catastrophe: Technical alignment, deployment decisions, and coordination

Alex Lintz · 3 Aug 2022 23:15 UTC
22 points
0 comments · 12 min read · LW link

Precursor checking for deceptive alignment

evhub · 3 Aug 2022 22:56 UTC
24 points
0 comments · 14 min read · LW link

Transformer language models are doing something more general

Numendil · 3 Aug 2022 21:13 UTC
53 points
6 comments · 2 min read · LW link

[Question] Some doubts about Non Superintelligent AIs

aditya malik · 3 Aug 2022 19:55 UTC
0 points
4 comments · 1 min read · LW link

Announcing Squiggle: Early Access

ozziegooen · 3 Aug 2022 19:48 UTC
51 points
7 comments · 7 min read · LW link
(forum.effectivealtruism.org)

Survey: What (de)motivates you about AI risk?

Daniel_Friedrich · 3 Aug 2022 19:17 UTC
1 point
0 comments · 1 min read · LW link
(forms.gle)

Externalized reasoning oversight: a research direction for language model alignment

tamera · 3 Aug 2022 12:03 UTC
130 points
23 comments · 6 min read · LW link

Open & Welcome Thread—Aug/Sep 2022

Thomas · 3 Aug 2022 10:22 UTC
9 points
32 comments · 1 min read · LW link

[Question] How does one recognize information and differentiate it from noise?

M. Y. Zuo · 3 Aug 2022 3:57 UTC
4 points
29 comments · 1 min read · LW link

Law-Following AI 4: Don’t Rely on Vicarious Liability

Cullen · 2 Aug 2022 23:26 UTC
5 points
2 comments · 3 min read · LW link

Two-year update on my personal AI timelines

Ajeya Cotra · 2 Aug 2022 23:07 UTC
288 points
60 comments · 16 min read · LW link

What are the Red Flags for Neural Network Suffering? - Seeds of Science call for reviewers

rogersbacon · 2 Aug 2022 22:37 UTC
24 points
6 comments · 1 min read · LW link

Againstness

CFAR!Duncan · 2 Aug 2022 19:29 UTC
47 points
7 comments · 9 min read · LW link

(Summary) Sequence Highlights—Thinking Better on Purpose

qazzquimby · 2 Aug 2022 17:45 UTC
33 points
3 comments · 11 min read · LW link

Progress links and tweets, 2022-08-02

jasoncrawford · 2 Aug 2022 17:03 UTC
9 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

[Question] I want to donate some money (not much, just what I can afford) to AGI Alignment research, to whatever organization has the best chance of making sure that AGI goes well and doesn’t kill us all. What are my best options, where can I make the most difference per dollar?

lumenwrites · 2 Aug 2022 12:08 UTC
15 points
9 comments · 1 min read · LW link

Thinking without priors?

Q Home · 2 Aug 2022 9:17 UTC
7 points
0 comments · 9 min read · LW link

[Question] Would quantum immortality mean subjective immortality?

n0ah · 2 Aug 2022 4:54 UTC
2 points
10 comments · 1 min read · LW link

Turbocharging

CFAR!Duncan · 2 Aug 2022 0:01 UTC
50 points
3 comments · 9 min read · LW link

Letter from leading Soviet Academicians to party and government leaders of the Soviet Union regarding signs of decline and structural problems of the economic-political system (1970)

M. Y. Zuo · 1 Aug 2022 22:35 UTC
20 points
10 comments · 16 min read · LW link

Technical AI Alignment Study Group

Eric K · 1 Aug 2022 18:33 UTC
5 points
0 comments · 1 min read · LW link

[Question] Is there any writing about prompt engineering for humans?

Alex Hollow · 1 Aug 2022 12:52 UTC
18 points
8 comments · 1 min read · LW link

Meditation course claims 65% enlightenment rate: my review

KatWoods · 1 Aug 2022 11:25 UTC
111 points
33 comments · 14 min read · LW link

[Question] Which intro-to-AI-risk text would you recommend to...

Sherrinford · 1 Aug 2022 9:36 UTC
12 points
1 comment · 1 min read · LW link

Polaris, Five-Second Versions, and Thought Lengths

CFAR!Duncan · 1 Aug 2022 7:14 UTC
46 points
12 comments · 8 min read · LW link

A Word is Worth 1,000 Pictures

Kully · 1 Aug 2022 4:08 UTC
1 point
0 comments · 2 min read · LW link

On akrasia: starting at the bottom

seecrow · 1 Aug 2022 4:08 UTC
33 points
2 comments · 3 min read · LW link

[Question] How likely do you think worse-than-extinction type fates to be?

span1 · 1 Aug 2022 4:08 UTC
3 points
3 comments · 1 min read · LW link

Don’t be a Maxi

Cole Killian · 31 Jul 2022 23:59 UTC
15 points
7 comments · 2 min read · LW link
(colekillian.com)

Abstraction sacrifices causal clarity

Marv K · 31 Jul 2022 19:24 UTC
2 points
0 comments · 3 min read · LW link

Time-logging programs and/or spreadsheets (2022)

mikbp · 31 Jul 2022 18:18 UTC
3 points
3 comments · 1 min read · LW link

Conservatism is a rational response to epistemic uncertainty

contrarianbrit · 31 Jul 2022 18:04 UTC
2 points
11 comments · 9 min read · LW link
(thomasprosser.substack.com)

South Bay ACX/LW Meetup

IS · 31 Jul 2022 15:30 UTC
2 points
0 comments · 1 min read · LW link

Perverse Independence Incentives

jefftk · 31 Jul 2022 14:40 UTC
58 points
3 comments · 1 min read · LW link
(www.jefftk.com)

Wolfram Research v Cook

Kenny · 31 Jul 2022 13:35 UTC
7 points
2 comments · 8 min read · LW link

Wanted: Notation for credal resilience

PeterH · 31 Jul 2022 7:35 UTC
21 points
12 comments · 1 min read · LW link

Anatomy of a Dating Document

squidious · 31 Jul 2022 2:40 UTC
26 points
24 comments · 4 min read · LW link
(opalsandbonobos.blogspot.com)

chinchilla’s wild implications

nostalgebraist · 31 Jul 2022 1:18 UTC
415 points
128 comments · 10 min read · LW link · 1 review

AGI-level reasoner will appear sooner than an agent; what the humanity will do with this reasoner is critical

Roman Leventov · 30 Jul 2022 20:56 UTC
24 points
10 comments · 1 min read · LW link

[Question] What job should I do?

Tom Paine · 30 Jul 2022 9:15 UTC
2 points
8 comments · 1 min read · LW link

How transparency changed over time

ViktoriaMalyasova · 30 Jul 2022 4:36 UTC
21 points
0 comments · 6 min read · LW link

Translating between Latent Spaces

30 Jul 2022 3:25 UTC
27 points
2 comments · 8 min read · LW link

Drexler’s Nanotech Forecast

PeterMcCluskey · 30 Jul 2022 0:45 UTC
25 points
28 comments · 3 min read · LW link
(www.bayesianinvestor.com)

Humans Reflecting on HRH

leogao · 29 Jul 2022 21:56 UTC
26 points
4 comments · 2 min read · LW link

Comparing Four Approaches to Inner Alignment

Lucas Teixeira · 29 Jul 2022 21:06 UTC
35 points
1 comment · 9 min read · LW link

Questions for a Theory of Narratives

Marv K · 29 Jul 2022 19:31 UTC
5 points
4 comments · 4 min read · LW link

Focusing

CFAR!Duncan · 29 Jul 2022 19:15 UTC
107 points
23 comments · 14 min read · LW link

Conjecture: Internal Infohazard Policy

29 Jul 2022 19:07 UTC
131 points
6 comments · 19 min read · LW link

Abstracting The Hardness of Alignment: Unbounded Atomic Optimization

adamShimi · 29 Jul 2022 18:59 UTC
66 points
3 comments · 16 min read · LW link

Bucket Errors

CFAR!Duncan · 29 Jul 2022 18:50 UTC
40 points
7 comments · 11 min read · LW link