Gradient hacking: definitions and examples

Richard_Ngo 29 Jun 2022 21:35 UTC

44 points

2 comments 5 min read LW link

Progress links and tweets, 2022-06-29

jasoncrawford 29 Jun 2022 21:33 UTC

9 points

0 comments 1 min read LW link

(rootsofprogress.org)

[Question] Correcting human error vs doing exactly what you’re told—is there literature on this in context of general system design?

Jan Czechowski 29 Jun 2022 21:30 UTC

6 points

0 comments 1 min read LW link

Latent Adversarial Training

Adam Jermyn 29 Jun 2022 20:04 UTC

58 points

13 comments 5 min read LW link

Game Review: This Merchant Life

Zvi 29 Jun 2022 18:30 UTC

20 points

0 comments 13 min read LW link

(thezvi.wordpress.com)

Limits to Legibility

Jan_Kulveit 29 Jun 2022 17:42 UTC

168 points

12 comments 5 min read LW link 1 review

Will Capabilities Generalise More?

Ramana Kumar 29 Jun 2022 17:12 UTC

133 points

39 comments 4 min read LW link

Kevin Kelly’s “103 Bits of Advice,” Expanded

Dalton Mabery 29 Jun 2022 13:36 UTC

19 points

0 comments 5 min read LW link

The table of different sampling assumptions in anthropics

avturchin 29 Jun 2022 10:41 UTC

39 points

5 comments 12 min read LW link

Can We Align AI by Having It Learn Human Preferences? I’m Scared (summary of last third of Human Compatible)

apollonianblues 29 Jun 2022 4:09 UTC

19 points

3 comments 6 min read LW link

Kurzgesagt – The Last Human (Youtube)

habryka 29 Jun 2022 3:28 UTC

54 points

7 comments 1 min read LW link

(www.youtube.com)

[Question] Literature on How to Maximize Preferences

josh 28 Jun 2022 22:41 UTC

1 point

0 comments 1 min read LW link

Challenge: A Much More Alien Message

kman 28 Jun 2022 21:50 UTC

24 points

7 comments 1 min read LW link

It’s Probably Not Lithium

Natália 28 Jun 2022 21:24 UTC

447 points

186 comments 28 min read LW link 1 review

Reflections on Living in “Guess Culture”

Dalton Mabery 28 Jun 2022 21:00 UTC

13 points

1 comment 3 min read LW link

[Question] What is the LessWrong Logo(?) Supposed to Represent?

DragonGod 28 Jun 2022 20:20 UTC

8 points

6 comments 1 min read LW link

What Are You Tracking In Your Head?

johnswentworth 28 Jun 2022 19:30 UTC

303 points

84 comments 4 min read LW link 1 review

Why is so much political commentary misleading?

contrarianbrit 28 Jun 2022 17:10 UTC

−2 points

5 comments 6 min read LW link

(thomasprosser.substack.com)

CFAR Handbook: Introduction

CFAR!Duncan 28 Jun 2022 16:53 UTC

124 points

12 comments 1 min read LW link

Units of Exchange

CFAR!Duncan 28 Jun 2022 16:53 UTC

102 points

28 comments 11 min read LW link

Scott Aaronson and Steven Pinker Debate AI Scaling

Liron 28 Jun 2022 16:04 UTC

37 points

7 comments 1 min read LW link

(scottaaronson.blog)

A physicist’s approach to Origins of Life

pchvykov 28 Jun 2022 15:23 UTC

12 points

6 comments 16 min read LW link

What success looks like

Marius Hobbhahn, MaxRa, JasperGeh and Yannick_Muehlhaeuser

28 Jun 2022 14:38 UTC

19 points

4 comments 1 min read LW link

(forum.effectivealtruism.org)

Four reasons I find AI safety emotionally compelling

KatWoods and AmberDawn

28 Jun 2022 14:10 UTC

39 points

3 comments 4 min read LW link

Some alternative AI safety research projects

Michele Campolo 28 Jun 2022 14:09 UTC

9 points

0 comments 3 min read LW link

Doom doubts—is inner alignment a likely problem?

Crissman 28 Jun 2022 12:42 UTC

6 points

7 comments 1 min read LW link

Low-Friction MBTA Predictions

jefftk 28 Jun 2022 12:30 UTC

15 points

0 comments 1 min read LW link

(www.jefftk.com)

What Diet Books Don’t Teach: A book review and a request for more reading

Lone Pine 28 Jun 2022 12:27 UTC

22 points

34 comments 4 min read LW link

Assessing AlephAlphas Multimodal Model

p.b. 28 Jun 2022 9:28 UTC

30 points

5 comments 3 min read LW link

[Question] Is there any way someone could post about public policy relating to abortion access (or another sensitive subject) on LessWrong without getting super downvoted?

Evan_Gaensbauer 28 Jun 2022 5:45 UTC

18 points

20 comments 1 min read LW link

[Test Post Please Ignore] Testing polling features

Lone Pine 28 Jun 2022 4:35 UTC

7 points

5 comments 1 min read LW link

Yann LeCun, A Path Towards Autonomous Machine Intelligence [link]

Bill Benzon 27 Jun 2022 23:29 UTC

5 points

1 comment 1 min read LW link

Limits of Bodily Autonomy

jefftk 27 Jun 2022 19:50 UTC

28 points

18 comments 1 min read LW link

(www.jefftk.com)

[Question] Systems Biology for self study

Ulisse Mini 27 Jun 2022 19:36 UTC

5 points

2 comments 1 min read LW link

[Yann Lecun] A Path Towards Autonomous Machine Intelligence

DragonGod 27 Jun 2022 19:24 UTC

38 points

14 comments 1 min read LW link

(openreview.net)

Exploring Mild Behaviour in Embedded Agents

Megan Kinniment 27 Jun 2022 18:56 UTC

21 points

4 comments 18 min read LW link

Epistemic modesty and how I think about AI risk

Aryeh Englander 27 Jun 2022 18:47 UTC

22 points

4 comments 4 min read LW link

Deliberation Everywhere: Simple Examples

Oliver Sourbut 27 Jun 2022 17:26 UTC

28 points

3 comments 15 min read LW link

Deliberation, Reactions, and Control: Tentative Definitions and a Restatement of Instrumental Convergence

Oliver Sourbut 27 Jun 2022 17:25 UTC

13 points

0 comments 11 min read LW link

[Question] Are long-form dating profiles productive?

AABoyles 27 Jun 2022 17:03 UTC

34 points

32 comments 1 min read LW link

Custom iPhone Widget to Encourage Less Wrong Use

Will Payne 27 Jun 2022 16:14 UTC

10 points

2 comments 2 min read LW link

(forum.effectivealtruism.org)

Announcing the Inverse Scaling Prize ($250k Prize Pool)

Ethan Perez, Ian McKenzie and Sam Bowman

27 Jun 2022 15:58 UTC

171 points

14 comments 7 min read LW link

Announcing Epoch: A research organization investigating the road to Transformative AI

Jsevillamol, Pablo Villalobos, Tamay, lennart, Marius Hobbhahn and anson.ho

27 Jun 2022 13:55 UTC

97 points

2 comments 2 min read LW link

(epochai.org)

Air Conditioner Repair

Zvi 27 Jun 2022 12:40 UTC

83 points

34 comments 4 min read LW link

(thezvi.wordpress.com)

[Question] Why Are Posts in the Sequences Tagged [Personal Blog] Instead of [Frontpage]?

DragonGod 27 Jun 2022 9:35 UTC

5 points

2 comments 1 min read LW link

Contest: An Alien Message

DaemonicSigil 27 Jun 2022 5:54 UTC

96 points

100 comments 1 min read LW link

Robin Hanson asks “Why Not Wait On AI Risk?”

Gunnar_Zarncke 26 Jun 2022 23:32 UTC

22 points

4 comments 1 min read LW link

(www.overcomingbias.com)

Sex Fairy Lore

pchvykov 26 Jun 2022 20:42 UTC

−25 points

10 comments 6 min read LW link

King David’s %: Establishing a new symbol for Bayesian probability.

Paul Logan 26 Jun 2022 19:47 UTC

−11 points

1 comment 5 min read LW link

(laulpogan.substack.com)

Training Trace Priors and Speed Priors

Adam Jermyn 26 Jun 2022 18:07 UTC

17 points

0 comments 3 min read LW link