Safetywashing

Adam Scholl · Jul 1, 2022, 11:56 AM

261 points

20 comments · 1 min read · LW link · 2 reviews

[Question] AGI alignment with what?

AlignmentMirror · Jul 1, 2022, 10:22 AM

6 points

10 comments · 1 min read · LW link

Open & Welcome Thread—July 2022

Kaj_Sotala · Jul 1, 2022, 7:47 AM

20 points

61 comments · 1 min read · LW link

[Question] What is the contrast to counterfactual reasoning?

Dominic Roser · Jul 1, 2022, 7:39 AM

5 points

10 comments · 1 min read · LW link

Meiosis is all you need

Metacelsus · Jul 1, 2022, 7:39 AM

41 points

3 comments · 2 min read · LW link

(denovo.substack.com)

[Question] How to Navigate Evaluating Politicized Research?

Davis_Kingsley · Jul 1, 2022, 5:59 AM

11 points

1 comment · 1 min read · LW link

One is (almost) normal in base π

Adam Scherlis · Jul 1, 2022, 4:05 AM

14 points

0 comments · 1 min read · LW link

(adam.scherlis.com)

AI safety university groups: a promising opportunity to reduce existential risk

mic · Jul 1, 2022, 3:59 AM

14 points

0 comments · 11 min read · LW link

Looking back on my alignment PhD

TurnTrout · Jul 1, 2022, 3:19 AM

334 points

66 comments · 11 min read · LW link

Selection processes for subagents

Ryan Kidd · Jun 30, 2022, 11:57 PM

36 points

2 comments · 9 min read · LW link

[Question] Cryonics-adjacent question

Flaglandbase · Jun 30, 2022, 11:03 PM

12 points

3 comments · 1 min read · LW link

Forecasts are not enough

Ege Erdil · Jun 30, 2022, 10:00 PM

44 points

5 comments · 5 min read · LW link

Murphyjitsu: an Inner Simulator algorithm

CFAR!Duncan · Jun 30, 2022, 9:50 PM

67 points

24 comments · 11 min read · LW link · 2 reviews

GPT-3 Catching Fish in Morse Code

Megan Kinniment · Jun 30, 2022, 9:22 PM

117 points

27 comments · 8 min read · LW link

Metacognition in the Rat

Jacob Falkovich · Jun 30, 2022, 8:53 PM

19 points

0 comments · 6 min read · LW link

On viewquakes

Dalton Mabery · Jun 30, 2022, 8:08 PM

8 points

0 comments · 2 min read · LW link

The Track Record of Futurists Seems … Fine

HoldenKarnofsky · Jun 30, 2022, 7:40 PM

91 points

25 comments · 12 min read · LW link

(www.cold-takes.com)

Quick survey on AI alignment resources

frances_lorenz · Jun 30, 2022, 7:09 PM

14 points

0 comments · 1 min read · LW link

[Linkpost] Solving Quantitative Reasoning Problems with Language Models

Yitz · Jun 30, 2022, 6:58 PM

76 points

15 comments · 2 min read · LW link

(storage.googleapis.com)

Failing to fix a dangerous intersection

alyssavance · Jun 30, 2022, 6:09 PM

110 points

17 comments · 2 min read · LW link

Most Functions Have Undesirable Global Extrema

En Kepeig · Jun 30, 2022, 5:10 PM

8 points

5 comments · 3 min read · LW link

Hedonistic Isotopes:

Trozxzr · Jun 30, 2022, 4:49 PM

1 point

0 comments · 1 min read · LW link

Abadarian Trades

David Udell · Jun 30, 2022, 4:41 PM

17 points

22 comments · 2 min read · LW link

Covid 6/30/22: Vaccine Update Update

Zvi · Jun 30, 2022, 2:00 PM

32 points

6 comments · 12 min read · LW link

(thezvi.wordpress.com)

[Question] How should I talk about optimal but not subgame-optimal play?

JamesFaville · Jun 30, 2022, 1:58 PM

5 points

1 comment · 3 min read · LW link

Formal Philosophy and Alignment Possible Projects

Daniel Herrmann · Jun 30, 2022, 10:42 AM

34 points

5 comments · 8 min read · LW link

Bangalore LW/ACX Meetup in person

Aditya · Jun 30, 2022, 7:21 AM

5 points

2 comments · 1 min read · LW link

Cultivating And Destroying Agency

hath · Jun 30, 2022, 3:59 AM

105 points

11 comments · 9 min read · LW link

$500 bounty for alignment contest ideas

Orpheus16 · Jun 30, 2022, 1:56 AM

29 points

5 comments · 2 min read · LW link

any good rationalist guides to nutrition / healthy eating?

Ben A · Jun 30, 2022, 12:50 AM

7 points

15 comments · 1 min read · LW link

A summary of every Replacing Guilt post

Orpheus16 · Jun 30, 2022, 12:46 AM

35 points

3 comments · 10 min read · LW link

(forum.effectivealtruism.org)

Gradient hacking: definitions and examples

Richard_Ngo · Jun 29, 2022, 9:35 PM

38 points

2 comments · 5 min read · LW link

Progress links and tweets, 2022-06-29

jasoncrawford · Jun 29, 2022, 9:33 PM

9 points

0 comments · 1 min read · LW link

(rootsofprogress.org)

[Question] Correcting human error vs doing exactly what you’re told—is there literature on this in context of general system design?

Jan Czechowski · Jun 29, 2022, 9:30 PM

6 points

0 comments · 1 min read · LW link

Latent Adversarial Training

Adam Jermyn · Jun 29, 2022, 8:04 PM

52 points

13 comments · 5 min read · LW link

Game Review: This Merchant Life

Zvi · Jun 29, 2022, 6:30 PM

20 points

0 comments · 13 min read · LW link

(thezvi.wordpress.com)

Limits to Legibility

Jan_Kulveit · Jun 29, 2022, 5:42 PM

157 points

11 comments · 5 min read · LW link · 1 review

Will Capabilities Generalise More?

Ramana Kumar · 29 Jun 2022 17:12 UTC

133 points

39 comments · 4 min read · LW link

Kevin Kelly’s “103 Bits of Advice,” Expanded

Dalton Mabery · 29 Jun 2022 13:36 UTC

19 points

0 comments · 5 min read · LW link

The table of different sampling assumptions in anthropics

avturchin · 29 Jun 2022 10:41 UTC

39 points

5 comments · 12 min read · LW link

Can We Align AI by Having It Learn Human Preferences? I’m Scared (summary of last third of Human Compatible)

apollonianblues · 29 Jun 2022 4:09 UTC

19 points

3 comments · 6 min read · LW link

Kurzgesagt – The Last Human (Youtube)

habryka · 29 Jun 2022 3:28 UTC

54 points

7 comments · 1 min read · LW link

(www.youtube.com)

[Question] Literature on How to Maximize Preferences

josh · 28 Jun 2022 22:41 UTC

1 point

0 comments · 1 min read · LW link

Challenge: A Much More Alien Message

kman · 28 Jun 2022 21:50 UTC

24 points

7 comments · 1 min read · LW link

It’s Probably Not Lithium

Natália · 28 Jun 2022 21:24 UTC

444 points

187 comments · 28 min read · LW link · 1 review

Reflections on Living in “Guess Culture”

Dalton Mabery · 28 Jun 2022 21:00 UTC

13 points

1 comment · 3 min read · LW link

[Question] What is the LessWrong Logo(?) Supposed to Represent?

DragonGod · 28 Jun 2022 20:20 UTC

8 points

6 comments · 1 min read · LW link

What Are You Tracking In Your Head?

johnswentworth · 28 Jun 2022 19:30 UTC

289 points

83 comments · 4 min read · LW link · 1 review

Why is so much political commentary misleading?

contrarianbrit · 28 Jun 2022 17:10 UTC

−2 points

5 comments · 6 min read · LW link

(thomasprosser.substack.com)

CFAR Handbook: Introduction

CFAR!Duncan · 28 Jun 2022 16:53 UTC

119 points

12 comments · 1 min read · LW link