Alignment’s phlogiston

Eleni Angelou

18 Aug 2022 22:27 UTC

10 points

2 comments · 2 min read · LW link

Announcing the Distillation for Alignment Practicum (DAP)

Jonas Hallgren and CallumMcDougall

18 Aug 2022 19:50 UTC

23 points

3 comments · 3 min read · LW link

A conversation about progress and safety

jasoncrawford

18 Aug 2022 18:36 UTC

12 points

0 comments · 7 min read · LW link

(rootsofprogress.org)

Discovering Agents

zac_kenton

18 Aug 2022 17:33 UTC

77 points

11 comments · 6 min read · LW link

Oops It’s Time To Overthrow the Organizer Day!

Screwtape

18 Aug 2022 16:40 UTC

65 points

5 comments · 4 min read · LW link

Bias towards simple functions; application to alignment?

DavidHolmes

18 Aug 2022 16:15 UTC

5 points

8 comments · 2 min read · LW link

What Games These Days?

jefftk

18 Aug 2022 14:30 UTC

24 points

6 comments · 3 min read · LW link

(www.jefftk.com)

Covid 8/18/22: CDC Admits Mistakes

Zvi

18 Aug 2022 14:30 UTC

46 points

9 comments · 17 min read · LW link

(thezvi.wordpress.com)

In Defense Of Making Money

George3d6

18 Aug 2022 14:10 UTC

70 points

13 comments · 7 min read · LW link

(www.epistem.ink)

Astral Codex Ten meetup in Prague [Oct 6]

Jiří Nádvorník

18 Aug 2022 12:15 UTC

4 points

0 comments · 1 min read · LW link

Playing Without Affordances

Alex Hollow

18 Aug 2022 11:53 UTC

11 points

0 comments · 1 min read · LW link

(alexhollow.wordpress.com)

Goal-directedness: relativising complexity

Morgan_Rogers

18 Aug 2022 9:48 UTC

3 points

0 comments · 11 min read · LW link

What’s up with the bad Meta projects?

Yitz

18 Aug 2022 5:34 UTC

42 points

29 comments · 1 min read · LW link

Announcing Encultured AI: Building a Video Game

Andrew_Critch and Nick Hay

18 Aug 2022 2:16 UTC

103 points

26 comments · 4 min read · LW link

Detroit ACX September Meetup

MattArnold

18 Aug 2022 0:48 UTC

1 point

0 comments · 1 min read · LW link

Matt Yglesias on AI Policy

Grant Demaree

17 Aug 2022 23:57 UTC

25 points

1 comment · 1 min read · LW link

(www.slowboring.com)

Spoons and Myofascial Trigger Points

vitaliya

17 Aug 2022 22:54 UTC

5 points

3 comments · 1 min read · LW link

Concrete Advice for Forming Inside Views on AI Safety

Neel Nanda

17 Aug 2022 22:02 UTC

30 points

6 comments · 10 min read · LW link

Progress links and tweets, 2022-08-17

jasoncrawford

17 Aug 2022 21:27 UTC

11 points

0 comments · 2 min read · LW link

(rootsofprogress.org)

Conditioning, Prompts, and Fine-Tuning

Adam Jermyn

17 Aug 2022 20:52 UTC

38 points

9 comments · 4 min read · LW link

The Core of the Alignment Problem is...

Thomas Larsen, Jeremy Gillen and JamesH

17 Aug 2022 20:07 UTC

76 points

10 comments · 9 min read · LW link

[Question] Could the simulation argument also apply to dreams?

Nathan1123

17 Aug 2022 19:55 UTC

6 points

4 comments · 3 min read · LW link

Interpretability Tools Are an Attack Channel

Thane Ruthenis

17 Aug 2022 18:47 UTC

42 points

14 comments · 1 min read · LW link

Human Mimicry Mainly Works When We’re Already Close

johnswentworth

17 Aug 2022 18:41 UTC

83 points

16 comments · 5 min read · LW link

Thoughts on ‘List of Lethalities’

Alex Lawsen

17 Aug 2022 18:33 UTC

27 points

0 comments · 10 min read · LW link

The longest training run

Jsevillamol, Tamay, Owen D and anson.ho

17 Aug 2022 17:18 UTC

71 points

12 comments · 9 min read · LW link

(epochai.org)

Spoiler-Free Review: Across the Obelisk

Zvi

17 Aug 2022 14:30 UTC

17 points

0 comments · 6 min read · LW link

(thezvi.wordpress.com)

Autonomy as taking responsibility for reference maintenance

Ramana Kumar

17 Aug 2022 12:50 UTC

61 points

3 comments · 5 min read · LW link

Duplicating Rasberry Pi Images

jefftk

17 Aug 2022 12:10 UTC

9 points

4 comments · 4 min read · LW link

(www.jefftk.com)

ACX Meetup—Amsterdam

Pierre Vandenberghe

17 Aug 2022 9:56 UTC

2 points

1 comment · 1 min read · LW link

Insufficient awareness of how everything sucks

Flaglandbase

17 Aug 2022 8:01 UTC

−13 points

5 comments · 1 min read · LW link

Mesa-optimization for goals defined only within a training environment is dangerous

Rubi J. Hudson

17 Aug 2022 3:56 UTC

6 points

2 comments · 4 min read · LW link

ACX / SSC Meetup Singapore

DG

17 Aug 2022 2:08 UTC

2 points

1 comment · 1 min read · LW link

That-time-of-year Astral Codex Ten Meetup

Ben Smith

17 Aug 2022 0:02 UTC

3 points

2 comments · 1 min read · LW link

SSC Reno Meetup

Steven

16 Aug 2022 23:37 UTC

1 point

3 comments · 1 min read · LW link

My thoughts on direct work (and joining LessWrong)

RobertM

16 Aug 2022 18:53 UTC

58 points

4 comments · 6 min read · LW link

We can make the future a million years from now go better [video]

Writer

16 Aug 2022 13:03 UTC

7 points

1 comment · 6 min read · LW link

(youtu.be)

The Open Society and Its Enemies: Summary and Thoughts

matto

16 Aug 2022 11:44 UTC

12 points

4 comments · 17 min read · LW link

An introduction to signalling theory

Mvolz

16 Aug 2022 9:37 UTC

17 points

1 comment · 5 min read · LW link

Understanding differences between humans and intelligence-in-general to build safe AGI

Florian_Dietz

16 Aug 2022 8:27 UTC

7 points

8 comments · 1 min read · LW link

Against population ethics

jasoncrawford

16 Aug 2022 5:19 UTC

29 points

39 comments · 3 min read · LW link

Deception as the optimal: mesa-optimizers and inner alignment

Eleni Angelou

16 Aug 2022 4:49 UTC

11 points

0 comments · 5 min read · LW link

Crowdsourcing Anki Decks

Arden

16 Aug 2022 2:53 UTC

1 point

0 comments · 1 min read · LW link

What Makes an Idea Understandable? On Architecturally and Culturally Natural Ideas.

NickyP, Peter S. Park and Stephen Fowler

16 Aug 2022 2:09 UTC

21 points

2 comments · 16 min read · LW link

Dwarves & D.Sci: Data Fortress Evaluation & Ruleset

aphyer

16 Aug 2022 0:15 UTC

27 points

10 comments · 8 min read · LW link

I’m mildly skeptical that blindness prevents schizophrenia

Steven Byrnes

15 Aug 2022 23:36 UTC

95 points

9 comments · 4 min read · LW link

What’s General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?

johnswentworth

15 Aug 2022 22:48 UTC

158 points

18 comments · 10 min read · LW link

“What Mistakes Are You Making Right Now?”

David Udell

15 Aug 2022 21:19 UTC

13 points

2 comments · 1 min read · LW link

On Preference Manipulation in Reward Learning Processes

Felix Hofstätter

15 Aug 2022 19:32 UTC

8 points

0 comments · 4 min read · LW link

Cambist Booking: Discussing What We Value

Screwtape

15 Aug 2022 18:24 UTC

5 points

1 comment · 1 min read · LW link