Mystery Hunt 2022

Scott Garrabrant · 13 Dec 2021 21:57 UTC
30 points
5 comments · 1 min read · LW link

Enabling More Feedback for AI Safety Researchers

frances_lorenz · 13 Dec 2021 20:10 UTC
17 points
0 comments · 3 min read · LW link

Language Model Alignment Research Internships

Ethan Perez · 13 Dec 2021 19:53 UTC
74 points
1 comment · 1 min read · LW link

Omicron Post #6

Zvi · 13 Dec 2021 18:00 UTC
89 points
30 comments · 8 min read · LW link
(thezvi.wordpress.com)

Analysis of Bird Box (2018)

TekhneMakre · 13 Dec 2021 17:30 UTC
11 points
3 comments · 5 min read · LW link

Solving Interpretability Week

Logan Riggs · 13 Dec 2021 17:09 UTC
11 points
5 comments · 1 min read · LW link

Understanding and controlling auto-induced distributional shift

L Rudolf L · 13 Dec 2021 14:59 UTC
32 points
4 comments · 16 min read · LW link

A fate worse than death?

RomanS · 13 Dec 2021 11:05 UTC
−25 points
26 comments · 2 min read · LW link

What’s the backward-forward FLOP ratio for Neural Networks?

13 Dec 2021 8:54 UTC
19 points
12 comments · 10 min read · LW link

Summary of the Acausal Attack Issue for AIXI

Diffractor · 13 Dec 2021 8:16 UTC
12 points
6 comments · 4 min read · LW link

Hard-Coding Neural Computation

MadHatter · 13 Dec 2021 4:35 UTC
34 points
8 comments · 27 min read · LW link

[Question] Is “gears-level” just a synonym for “mechanistic”?

David Scott Krueger (formerly: capybaralet) · 13 Dec 2021 4:11 UTC
48 points
29 comments · 1 min read · LW link

Baby Nicknames

jefftk · 13 Dec 2021 2:20 UTC
11 points
0 comments · 1 min read · LW link
(www.jefftk.com)

[Question] Why do governments refer to existential risks primarily in terms of national security?

Evan_Gaensbauer · 13 Dec 2021 1:05 UTC
3 points
3 comments · 1 min read · LW link

[Question] [Resolved] Who else prefers “AI alignment” to “AI safety?”

Evan_Gaensbauer · 13 Dec 2021 0:35 UTC
5 points
8 comments · 1 min read · LW link

Working through D&D.Sci, problem 1

Pablo Repetto · 12 Dec 2021 23:10 UTC
8 points
2 comments · 1 min read · LW link
(pabloernesto.github.io)

Teaser: Hard-coding Transformer Models

MadHatter · 12 Dec 2021 22:04 UTC
74 points
19 comments · 1 min read · LW link

The Three Mutations of Dark Rationality

DarkRationalist · 12 Dec 2021 22:01 UTC
−15 points
0 comments · 2 min read · LW link

Redwood’s Technique-Focused Epistemic Strategy

adamShimi · 12 Dec 2021 16:36 UTC
48 points
1 comment · 7 min read · LW link

For and Against Lotteries in Elite University Admissions

Sam Enright · 12 Dec 2021 13:41 UTC
10 points
2 comments · 3 min read · LW link

[Question] Nuclear war anthropics

smountjoy · 12 Dec 2021 4:54 UTC
11 points
7 comments · 1 min read · LW link

Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment

Rob Bensinger · 12 Dec 2021 2:08 UTC
70 points
35 comments · 7 min read · LW link

Magna Alta Doctrina

jacob_cannell · 11 Dec 2021 21:54 UTC
58 points
7 comments · 28 min read · LW link

EA Dinner Covid Logistics

jefftk · 11 Dec 2021 21:50 UTC
17 points
7 comments · 2 min read · LW link
(www.jefftk.com)

Transforming myopic optimization to ordinary optimization—Do we want to seek convergence for myopic optimization problems?

tailcalled · 11 Dec 2021 20:38 UTC
12 points
1 comment · 5 min read · LW link

What on Earth is a Series I savings bond?

rossry · 11 Dec 2021 12:18 UTC
11 points
7 comments · 7 min read · LW link

D&D.Sci GURPS Dec 2021: Hunters of Monsters

J Bostock · 11 Dec 2021 12:13 UTC
20 points
18 comments · 2 min read · LW link

Anxiety and computer architecture

Adam Zerner · 11 Dec 2021 10:37 UTC
13 points
8 comments · 3 min read · LW link

[Question] Reasons to act according to the free will paradigm?

Maciej Jałocha · 11 Dec 2021 8:44 UTC
−3 points
5 comments · 1 min read · LW link

Extrinsic and Intrinsic Moral Frameworks

lsusr · 11 Dec 2021 5:28 UTC
14 points
5 comments · 2 min read · LW link

Moore’s Law, AI, and the pace of progress

Veedrac · 11 Dec 2021 3:02 UTC
125 points
38 comments · 24 min read · LW link

What role should evolutionary analogies play in understanding AI takeoff speeds?

anson.ho · 11 Dec 2021 1:19 UTC
14 points
0 comments · 42 min read · LW link

[Question] Nonverbal thinkers: how do you experience your inner critic?

Phoenix Eliot · 11 Dec 2021 0:40 UTC
9 points
2 comments · 1 min read · LW link

The Plan

johnswentworth · 10 Dec 2021 23:41 UTC
254 points
78 comments · 14 min read · LW link · 1 review

[Linkpost] Chinese government’s guidelines on AI

RomanS · 10 Dec 2021 21:10 UTC
61 points
14 comments · 1 min read · LW link

Understanding Gradient Hacking

peterbarnett · 10 Dec 2021 15:58 UTC
41 points
5 comments · 30 min read · LW link

There is essentially one best-validated theory of cognition.

abramdemski · 10 Dec 2021 15:51 UTC
89 points
33 comments · 3 min read · LW link

The Promise and Peril of Finite Sets

davidad · 10 Dec 2021 12:29 UTC
42 points
4 comments · 6 min read · LW link

Are big brains for processing sensory input?

lsusr · 10 Dec 2021 7:08 UTC
43 points
20 comments · 3 min read · LW link

Combining Forecasts

jsteinhardt · 10 Dec 2021 2:10 UTC
10 points
1 comment · 6 min read · LW link
(bounded-regret.ghost.io)

Covid 12/9: Counting Down the Days

Zvi · 9 Dec 2021 21:40 UTC
59 points
12 comments · 11 min read · LW link
(thezvi.wordpress.com)

Conversation on technology forecasting and gradualism

9 Dec 2021 21:23 UTC
108 points
30 comments · 31 min read · LW link

Omicron Post #5

Zvi · 9 Dec 2021 21:10 UTC
102 points
18 comments · 14 min read · LW link
(thezvi.wordpress.com)

LessWrong discussed in New Ideas in Psychology article

rogersbacon · 9 Dec 2021 21:01 UTC
76 points
11 comments · 4 min read · LW link

[Question] What alignment-related concepts should be better known in the broader ML community?

Lauro Langosco · 9 Dec 2021 20:44 UTC
6 points
4 comments · 1 min read · LW link

The end of Victorian culture, part I: structural forces

David Hugh-Jones · 9 Dec 2021 19:25 UTC
22 points
0 comments · 4 min read · LW link
(wyclif.substack.com)

[MLSN #2]: Adversarial Training

Dan H · 9 Dec 2021 17:16 UTC
26 points
0 comments · 3 min read · LW link

Supervised learning and self-modeling: What’s “superhuman?”

Charlie Steiner · 9 Dec 2021 12:44 UTC
12 points
1 comment · 8 min read · LW link

Austin Winter Solstice

SilasBarta · 9 Dec 2021 5:01 UTC
9 points
1 comment · 1 min read · LW link

Stop arbitrarily limiting yourself

unoptimal · 9 Dec 2021 2:42 UTC
27 points
7 comments · 2 min read · LW link