Potential Alignment mental tool: Keeping track of the types

Donald Hobson · 22 Nov 2021 20:05 UTC

29 points

1 comment · 2 min read · LW link

Yudkowsky and Christiano discuss “Takeoff Speeds”

Eliezer Yudkowsky · 22 Nov 2021 19:35 UTC

212 points

176 comments · 60 min read · LW link · 1 review

Morally underdefined situations can be deadly

Stuart_Armstrong · 22 Nov 2021 14:48 UTC

17 points

8 comments · 2 min read · LW link

A Bayesian Aggregation Paradox

Jsevillamol · 22 Nov 2021 10:39 UTC

87 points

23 comments · 7 min read · LW link

[Question] Do factored sets elucidate anything about how to update everyday beliefs?

TekhneMakre · 22 Nov 2021 6:51 UTC

5 points

1 comment · 1 min read · LW link

Even if you’re right, you’re wrong

DanielFilan · 22 Nov 2021 5:40 UTC

18 points

5 comments · 1 min read · LW link

(danielfilan.com)

The Meta-Puzzle

DanielFilan · 22 Nov 2021 5:30 UTC

28 points

28 comments · 3 min read · LW link

(danielfilan.com)

Some real examples of gradient hacking

Oliver Sourbut · 22 Nov 2021 0:11 UTC

17 points

8 comments · 2 min read · LW link

“The Wisdom of the Lazy Teacher”

Richard_Kennaway · 21 Nov 2021 21:11 UTC

17 points

5 comments · 1 min read · LW link

Vitalik: Cryptoeconomics and X-Risk Researchers Should Listen to Each Other More

Emerson Spartz · 21 Nov 2021 18:53 UTC

47 points

9 comments · 5 min read · LW link

Giving Up On T-Mobile

jefftk · 21 Nov 2021 16:50 UTC

13 points

6 comments · 2 min read · LW link

(www.jefftk.com)

From language to ethics by automated reasoning

Michele Campolo · 21 Nov 2021 15:16 UTC

4 points

4 comments · 6 min read · LW link

Split and Commit

Duncan Sabien (Inactive) · 21 Nov 2021 6:27 UTC

218 points

35 comments · 5 min read · LW link · 1 review

What’s the weirdest way to win this game?

Adam Scherlis · 21 Nov 2021 5:18 UTC

9 points

5 comments · 1 min read · LW link

(adam.scherlis.com)

Eat the cute animals instead

Andrew Vlahos · 21 Nov 2021 1:06 UTC

−4 points

2 comments · 1 min read · LW link

Chris Voss negotiation MasterClass: review

VipulNaik · 20 Nov 2021 22:39 UTC

70 points

15 comments · 33 min read · LW link

ACX Montreal Meetup Dec 4 2021

E · 20 Nov 2021 17:49 UTC

8 points

0 comments · 1 min read · LW link

The Maker of MIND

Tomás B. · 20 Nov 2021 16:28 UTC

160 points

21 comments · 11 min read · LW link

South Bay ACX/LW Meetup—CHANGED LOCATION

IS · 20 Nov 2021 14:42 UTC

11 points

0 comments · 1 min read · LW link

The Emperor’s New Clothes: a story of motivated stupidity

David Hugh-Jones · 20 Nov 2021 13:24 UTC

10 points

5 comments · 3 min read · LW link

(wyclif.substack.com)

[Book Review] “Sorceror’s Apprentice” by Tahir Shah

lsusr · 20 Nov 2021 11:29 UTC

96 points

11 comments · 7 min read · LW link

Competence/Confidence

Duncan Sabien (Inactive) · 20 Nov 2021 8:59 UTC

60 points

19 comments · 1 min read · LW link

Awesome-github Post-Scarcity List

lorepieri · 20 Nov 2021 8:47 UTC

3 points

6 comments · 1 min read · LW link

A Certain Formalization of Corrigibility Is VNM-Incoherent

TurnTrout · 20 Nov 2021 0:30 UTC

68 points

24 comments · 8 min read · LW link

More detailed proposal for measuring alignment of current models

Beth Barnes · 20 Nov 2021 0:03 UTC

31 points

0 comments · 8 min read · LW link

Ambitious Altruistic Software Engineering Efforts: Opportunities and Benefits

ozziegooen · 19 Nov 2021 17:55 UTC

42 points

1 comment · 9 min read · LW link

(forum.effectivealtruism.org)

[Question] Which booster shot to get and when?

NormanPerlmutter · 19 Nov 2021 8:52 UTC

22 points

17 comments · 2 min read · LW link

Goodhart: Endgame

Charlie Steiner · 19 Nov 2021 1:26 UTC

25 points

3 comments · 8 min read · LW link

Reaction and Reply to Sasha Chapin on Bad In-group Norms

Nicholas Kross · 19 Nov 2021 1:13 UTC

6 points

0 comments · 3 min read · LW link

(www.thinkingmuchbetter.com)

[Question] Does anyone know what Marvin Minsky is talking about here?

delton137 · 19 Nov 2021 0:56 UTC

1 point

6 comments · 3 min read · LW link

How To Get Into Independent Research On Alignment/Agency

johnswentworth · 19 Nov 2021 0:00 UTC

365 points

38 comments · 13 min read · LW link · 2 reviews

“Acquisition of Chess Knowledge in AlphaZero”: probing AZ over time

jsd · 18 Nov 2021 23:24 UTC

11 points

9 comments · 1 min read · LW link

(arxiv.org)

Ngo and Yudkowsky on AI capability gains

Eliezer Yudkowsky and Richard_Ngo

18 Nov 2021 22:19 UTC

132 points

61 comments · 38 min read · LW link · 1 review

Covid 11/18: Paxlovid Remains Illegal

Zvi · 18 Nov 2021 15:50 UTC

55 points

36 comments · 14 min read · LW link

(thezvi.wordpress.com)

Satisficers Tend To Seek Power: Instrumental Convergence Via Retargetability

TurnTrout · 18 Nov 2021 1:54 UTC

86 points

8 comments · 17 min read · LW link

(www.overleaf.com)

Forecasting: Zeroth and First Order

jsteinhardt · 18 Nov 2021 1:30 UTC

33 points

6 comments · 5 min read · LW link

(bounded-regret.ghost.io)

Experience on Methotrexate

jefftk · 17 Nov 2021 22:40 UTC

13 points

0 comments · 2 min read · LW link

(www.jefftk.com)

Applications for AI Safety Camp 2022 Now Open!

adamShimi · 17 Nov 2021 21:42 UTC

47 points

3 comments · 1 min read · LW link

[Question] Did EcoHealth create SARS-CoV-2?

jamal · 17 Nov 2021 20:42 UTC

3 points

7 comments · 1 min read · LW link

Sasha Chapin on bad social norms in rationality/EA

Kaj_Sotala · 17 Nov 2021 9:43 UTC

52 points

22 comments · 5 min read · LW link

(sashachapin.substack.com)

[Question] What are the mutual benefits of AGI-human collaboration that would otherwise be unobtainable?

M. Y. Zuo · 17 Nov 2021 3:09 UTC

1 point

4 comments · 1 min read · LW link

Quadratic Voting and Collusion

leogao · 17 Nov 2021 0:19 UTC

42 points

24 comments · 2 min read · LW link

Taking a simplified model

dominicq · 16 Nov 2021 22:21 UTC

9 points

8 comments · 1 min read · LW link

The Greedy Doctor Problem

Jan · 16 Nov 2021 22:06 UTC

6 points

10 comments · 12 min read · LW link

(universalprior.substack.com)

Equity premium puzzles

Ege Erdil and Metaculus

16 Nov 2021 20:50 UTC

20 points

4 comments · 12 min read · LW link

(www.metaculus.com)

Why I am no longer driven

dominicq · 16 Nov 2021 20:43 UTC

72 points

16 comments · 4 min read · LW link

Super intelligent AIs that don’t require alignment

Yair Halberstadt · 16 Nov 2021 19:55 UTC

10 points

2 comments · 6 min read · LW link

Why Save The Drowning Child: Ethics Vs Theory

Raymond Douglas · 16 Nov 2021 19:07 UTC

17 points

12 comments · 4 min read · LW link

Two Stupid AI Alignment Ideas

aphyer · 16 Nov 2021 16:13 UTC

27 points

3 comments · 4 min read · LW link

[linkpost] Project Blueprint: ‘Measuring and then maximally reversing the quantified biological age of my organs’

matteodimaio · 16 Nov 2021 2:48 UTC

2 points

0 comments · 1 min read · LW link