Resolve Cycles

CFAR!Duncan 16 Jul 2022 23:17 UTC

149 points

8 comments, 10 min read, LW link

Alignment as Game Design

Shoshannah Tekofsky 16 Jul 2022 22:36 UTC

11 points

7 comments, 2 min read, LW link

Risk Management from a Climbers Perspective

Annapurna 16 Jul 2022 21:14 UTC

5 points

0 comments, 6 min read, LW link

(jorgevelez.substack.com)

Cognitive Instability, Physicalism, and Free Will

dadadarren 16 Jul 2022 13:13 UTC

5 points

27 comments, 2 min read, LW link

(www.sleepingbeautyproblem.com)

All AGI safety questions welcome (especially basic ones) [July 2022]

plex and Robert Miles

16 Jul 2022 12:57 UTC

84 points

132 comments, 3 min read, LW link

QNR Prospects

PeterMcCluskey 16 Jul 2022 2:03 UTC

40 points

3 comments, 8 min read, LW link

(www.bayesianinvestor.com)

To-do waves

Paweł Sysiak 16 Jul 2022 1:19 UTC

3 points

0 comments, 3 min read, LW link

Moneypumping Bryan Caplan’s Belief in Free Will

Morpheus 16 Jul 2022 0:46 UTC

5 points

9 comments, 1 min read, LW link

A summary of every “Highlights from the Sequences” post

Orpheus16 15 Jul 2022 23:01 UTC

99 points

7 comments, 17 min read, LW link

Safety Implications of LeCun’s path to machine intelligence

Ivan Vendrov 15 Jul 2022 21:47 UTC

103 points

18 comments, 6 min read, LW link

Comfort Zone Exploration

CFAR!Duncan 15 Jul 2022 21:18 UTC

55 points

2 comments, 12 min read, LW link

A time-invariant version of Laplace’s rule

Jsevillamol and Ege Erdil

15 Jul 2022 19:28 UTC

76 points

13 comments, 17 min read, LW link

(epochai.org)

An attempt to break circularity in science

bilibili 15 Jul 2022 18:32 UTC

3 points

5 comments, 1 min read, LW link

A story about a duplicitous API

LiLiLi 15 Jul 2022 18:26 UTC

2 points

0 comments, 1 min read, LW link

Highlights from the memoirs of Vannevar Bush

jasoncrawford 15 Jul 2022 18:08 UTC

11 points

0 comments, 13 min read, LW link

(rootsofprogress.org)

Notes on Learning the Prior

carboniferous_umbraculum 15 Jul 2022 17:28 UTC

25 points

2 comments, 25 min read, LW link

Review of The Engines of Cognition

William Gasarch 15 Jul 2022 14:13 UTC

14 points

5 comments, 15 min read, LW link

A review of Nate Hilger’s The Parent Trap

David Hugh-Jones 15 Jul 2022 9:30 UTC

15 points

8 comments, 4 min read, LW link

(wyclif.substack.com)

Musings on the Human Objective Function

Michael Soareverix 15 Jul 2022 7:13 UTC

3 points

0 comments, 3 min read, LW link

Peter Singer’s first published piece on AI

Fai 15 Jul 2022 6:18 UTC

20 points

5 comments, 1 min read, LW link

(link.springer.com)

Don’t use ‘infohazard’ for collectively destructive info

Eliezer Yudkowsky 15 Jul 2022 5:13 UTC

89 points

34 comments, 1 min read, LW link, 2 reviews

(www.facebook.com)

Upcoming heatwave: advice

stavros 15 Jul 2022 5:03 UTC

16 points

13 comments, 3 min read, LW link

A note about differential technological development

So8res 15 Jul 2022 4:46 UTC

200 points

34 comments, 6 min read, LW link

Inward and outward steelmanning

Q Home 14 Jul 2022 23:32 UTC

13 points

6 comments, 18 min read, LW link

Potato diet: A post mortem and an answer to SMTM’s article

Épiphanie Gédéon 14 Jul 2022 23:18 UTC

48 points

34 comments, 16 min read, LW link

Proposed Orthogonality Theses #2-5

rjbg 14 Jul 2022 22:59 UTC

8 points

0 comments, 2 min read, LW link

Better Quiddler

jefftk 14 Jul 2022 17:40 UTC

17 points

0 comments, 1 min read, LW link

(www.jefftk.com)

Circumventing interpretability: How to defeat mind-readers

Lee Sharkey 14 Jul 2022 16:59 UTC

119 points

15 comments, 33 min read, LW link

Covid 7/14/22: BA.2.75 Plus Tax

Zvi 14 Jul 2022 14:40 UTC

39 points

9 comments, 8 min read, LW link

(thezvi.wordpress.com)

Criticism of EA Criticism Contest

Zvi 14 Jul 2022 14:30 UTC

108 points

17 comments, 31 min read, LW link, 1 review

(thezvi.wordpress.com)

Humans provide an untapped wealth of evidence about alignment

TurnTrout and Quintin Pope

14 Jul 2022 2:31 UTC

213 points

94 comments, 9 min read, LW link, 1 review

[Question] How to impress students with recent advances in ML?

Charbel-Raphaël 14 Jul 2022 0:03 UTC

12 points

2 comments, 1 min read, LW link

Notes on Love

David Gross 13 Jul 2022 23:35 UTC

18 points

3 comments, 29 min read, LW link

Deep learning curriculum for large language model alignment

Jacob_Hilton 13 Jul 2022 21:58 UTC

57 points

3 comments, 1 min read, LW link

(github.com)

Artificial Sandwiching: When can we test scalable alignment protocols without humans?

Sam Bowman 13 Jul 2022 21:14 UTC

42 points

6 comments, 5 min read, LW link

[Question] Any tips for eliciting one’s own latent knowledge?

MSRayne 13 Jul 2022 21:12 UTC

16 points

20 comments, 2 min read, LW link

Goal Alignment Is Robust To the Sharp Left Turn

Thane Ruthenis 13 Jul 2022 20:23 UTC

43 points

16 comments, 4 min read, LW link

Making decisions using multiple worldviews

Richard_Ngo 13 Jul 2022 19:15 UTC

50 points

10 comments, 11 min read, LW link

[Question] App idea to help with reading STEM textbooks (feedback request)

DirectedEvolution 13 Jul 2022 18:28 UTC

16 points

8 comments, 2 min read, LW link

MIRI Conversations: Technology Forecasting & Gradualism (Distillation)

CallumMcDougall 13 Jul 2022 15:55 UTC

31 points

1 comment, 20 min read, LW link

Passing Up Pay

jefftk 13 Jul 2022 14:10 UTC

29 points

8 comments, 5 min read, LW link

(www.jefftk.com)

[Question] How could the universe be infinitely large?

amarai 13 Jul 2022 13:45 UTC

0 points

8 comments, 1 min read, LW link

John von Neumann on how to safely progress with technology

Dalton Mabery 13 Jul 2022 11:07 UTC

14 points

0 comments, 1 min read, LW link

Everyone is an Imposter

Tharin 13 Jul 2022 8:46 UTC

19 points

1 comment, 9 min read, LW link

(echoesandchimes.com)

[Question] Which AI Safety research agendas are the most promising?

Chris_Leong 13 Jul 2022 7:54 UTC

27 points

5 comments, 1 min read, LW link

Straw-Steelmanning

Chris van Merwijk 13 Jul 2022 5:48 UTC

29 points

2 comments, 1 min read, LW link

Alien Message Contest: Solution

DaemonicSigil 13 Jul 2022 4:07 UTC

29 points

2 comments, 4 min read, LW link

[Question] What is wrong with this approach to corrigibility?

Rafael Cosman 12 Jul 2022 22:55 UTC

7 points

9 comments, 1 min read, LW link

Acceptability Verification: A Research Agenda

David Udell and evhub

12 Jul 2022 20:11 UTC

50 points

0 comments, 1 min read, LW link

(docs.google.com)

Progress links and tweets, 2022-07-12

jasoncrawford 12 Jul 2022 15:30 UTC

12 points

0 comments, 1 min read, LW link

(rootsofprogress.org)