Matt Yglesias on AI Policy

Grant Demaree 17 Aug 2022 23:57 UTC

25 points

1 comment 1 min read LW link

(www.slowboring.com)

Spoons and Myofascial Trigger Points

vitaliya 17 Aug 2022 22:54 UTC

5 points

3 comments 1 min read LW link

Concrete Advice for Forming Inside Views on AI Safety

Neel Nanda 17 Aug 2022 22:02 UTC

30 points

6 comments 10 min read LW link

Progress links and tweets, 2022-08-17

jasoncrawford 17 Aug 2022 21:27 UTC

11 points

0 comments 2 min read LW link

(rootsofprogress.org)

Conditioning, Prompts, and Fine-Tuning

Adam Jermyn 17 Aug 2022 20:52 UTC

38 points

9 comments 4 min read LW link

The Core of the Alignment Problem is...

Thomas Larsen, Jeremy Gillen and JamesH

17 Aug 2022 20:07 UTC

76 points

10 comments 9 min read LW link

[Question] Could the simulation argument also apply to dreams?

Nathan1123 17 Aug 2022 19:55 UTC

6 points

4 comments 3 min read LW link

Interpretability Tools Are an Attack Channel

Thane Ruthenis 17 Aug 2022 18:47 UTC

42 points

14 comments 1 min read LW link

Human Mimicry Mainly Works When We’re Already Close

johnswentworth 17 Aug 2022 18:41 UTC

83 points

16 comments 5 min read LW link

Thoughts on ‘List of Lethalities’

Alex Lawsen 17 Aug 2022 18:33 UTC

27 points

0 comments 10 min read LW link

The longest training run

Jsevillamol, Tamay, Owen D and anson.ho

17 Aug 2022 17:18 UTC

71 points

12 comments 9 min read LW link

(epochai.org)

Spoiler-Free Review: Across the Obelisk

Zvi 17 Aug 2022 14:30 UTC

17 points

0 comments 6 min read LW link

(thezvi.wordpress.com)

Autonomy as taking responsibility for reference maintenance

Ramana Kumar 17 Aug 2022 12:50 UTC

63 points

3 comments 5 min read LW link

Duplicating Rasberry Pi Images

jefftk 17 Aug 2022 12:10 UTC

9 points

4 comments 4 min read LW link

(www.jefftk.com)

ACX Meetup—Amsterdam

Pierre Vandenberghe 17 Aug 2022 9:56 UTC

2 points

1 comment 1 min read LW link

Insufficient awareness of how everything sucks

Flaglandbase 17 Aug 2022 8:01 UTC

−13 points

5 comments 1 min read LW link

Mesa-optimization for goals defined only within a training environment is dangerous

Rubi J. Hudson 17 Aug 2022 3:56 UTC

6 points

2 comments 4 min read LW link

ACX / SSC Meetup Singapore

DG 17 Aug 2022 2:08 UTC

2 points

1 comment 1 min read LW link

That-time-of-year Astral Codex Ten Meetup

Ben Smith 17 Aug 2022 0:02 UTC

3 points

2 comments 1 min read LW link

SSC Reno Meetup

Steven 16 Aug 2022 23:37 UTC

1 point

3 comments 1 min read LW link

My thoughts on direct work (and joining LessWrong)

RobertM 16 Aug 2022 18:53 UTC

58 points

4 comments 6 min read LW link

We can make the future a million years from now go better [video]

Writer 16 Aug 2022 13:03 UTC

7 points

1 comment 6 min read LW link

(youtu.be)

The Open Society and Its Enemies: Summary and Thoughts

matto 16 Aug 2022 11:44 UTC

12 points

4 comments 17 min read LW link

An introduction to signalling theory

Mvolz 16 Aug 2022 9:37 UTC

17 points

1 comment 5 min read LW link

Understanding differences between humans and intelligence-in-general to build safe AGI

Florian_Dietz 16 Aug 2022 8:27 UTC

7 points

8 comments 1 min read LW link

Against population ethics

jasoncrawford 16 Aug 2022 5:19 UTC

29 points

39 comments 3 min read LW link

Deception as the optimal: mesa-optimizers and inner alignment

Eleni Angelou 16 Aug 2022 4:49 UTC

11 points

0 comments 5 min read LW link

Crowdsourcing Anki Decks

Arden 16 Aug 2022 2:53 UTC

1 point

0 comments 1 min read LW link

What Makes an Idea Understandable? On Architecturally and Culturally Natural Ideas.

NickyP, Peter S. Park and Stephen Fowler

16 Aug 2022 2:09 UTC

21 points

2 comments 16 min read LW link

Dwarves & D.Sci: Data Fortress Evaluation & Ruleset

aphyer 16 Aug 2022 0:15 UTC

27 points

10 comments 8 min read LW link

I’m mildly skeptical that blindness prevents schizophrenia

Steven Byrnes 15 Aug 2022 23:36 UTC

96 points

9 comments 4 min read LW link

What’s General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?

johnswentworth 15 Aug 2022 22:48 UTC

160 points

18 comments 10 min read LW link

“What Mistakes Are You Making Right Now?”

David Udell 15 Aug 2022 21:19 UTC

13 points

2 comments 1 min read LW link

On Preference Manipulation in Reward Learning Processes

Felix Hofstätter 15 Aug 2022 19:32 UTC

8 points

0 comments 4 min read LW link

Cambist Booking: Discussing What We Value

Screwtape 15 Aug 2022 18:24 UTC

5 points

1 comment 1 min read LW link

Capital and inequality

NathanBarnard 15 Aug 2022 17:23 UTC

7 points

2 comments 5 min read LW link

[Question] Are there practical exercises for developing the Scout mindset?

ChristianKl 15 Aug 2022 17:23 UTC

15 points

2 comments 1 min read LW link

The Parable of the Boy Who Cried 5% Chance of Wolf

KatWoods 15 Aug 2022 14:33 UTC

141 points

24 comments 2 min read LW link

And the Revenues Are So Small

Zvi 15 Aug 2022 13:00 UTC

19 points

5 comments 11 min read LW link

(thezvi.wordpress.com)

Extreme Security

lc 15 Aug 2022 12:11 UTC

38 points

6 comments 5 min read LW link

No shortcuts to knowledge: Why AI needs to ease up on scaling and learn how to code

Yldedly 15 Aug 2022 8:42 UTC

5 points

0 comments 1 min read LW link

(deoxyribose.github.io)

Seeking Interns/RAs for Mechanistic Interpretability Projects

Neel Nanda 15 Aug 2022 7:11 UTC

61 points

0 comments 2 min read LW link

A Mechanistic Interpretability Analysis of Grokking

Neel Nanda and Tom Lieberum

15 Aug 2022 2:41 UTC

378 points

48 comments 36 min read LW link 1 review

(colab.research.google.com)

[Question] If a nuke is coming towards SF Bay can people bunker in BART tunnels?

Pee Doom 15 Aug 2022 1:56 UTC

15 points

2 comments 1 min read LW link

[Question] What is the probability that a superintelligent, sentient AGI is actually infeasible?

Nathan1123 14 Aug 2022 22:41 UTC

−3 points

6 comments 1 min read LW link

Dealing With Delusions

adrusi 14 Aug 2022 21:11 UTC

9 points

1 comment 1 min read LW link

All the posts I will never write

Alexander Gietelink Oldenziel 14 Aug 2022 18:29 UTC

55 points

8 comments 8 min read LW link

Brain-like AGI project “aintelope”

Gunnar_Zarncke 14 Aug 2022 16:33 UTC

54 points

2 comments 1 min read LW link

AI Transparency: Why it’s critical and how to obtain it.

Zohar Jackson 14 Aug 2022 10:31 UTC

6 points

1 comment 5 min read LW link

A brief note on Simplicity Bias

carboniferous_umbraculum 14 Aug 2022 2:05 UTC

20 points

0 comments 4 min read LW link