The Natural Abstraction Hypothesis: Implications and Evidence

CallumMcDougall

14 Dec 2021 23:14 UTC

44 points

9 comments 19 min read LW link

Robin Hanson’s “Humans are Early”

Raemon

14 Dec 2021 22:07 UTC

11 points

0 comments 2 min read LW link

(www.overcomingbias.com)

Ngo’s view on alignment difficulty

Richard_Ngo and Eliezer Yudkowsky

14 Dec 2021 21:34 UTC

73 points

7 comments 17 min read LW link

A proposed system for ideas jumpstart

Valentin2026

14 Dec 2021 21:01 UTC

4 points

2 comments 3 min read LW link

Should we rely on the speed prior for safety?

Marc Carauleanu

14 Dec 2021 20:45 UTC

14 points

5 comments 5 min read LW link

ARC’s first technical report: Eliciting Latent Knowledge

paulfchristiano, Mark Xu and Ajeya Cotra

14 Dec 2021 20:09 UTC

230 points

90 comments 1 min read LW link 3 reviews

(docs.google.com)

ARC is hiring!

paulfchristiano and Mark Xu

14 Dec 2021 20:09 UTC

64 points

2 comments 1 min read LW link

Interlude: Agents as Automobiles

Daniel Kokotajlo

14 Dec 2021 18:49 UTC

26 points

6 comments 5 min read LW link

Zvi’s Thoughts on the Survival and Flourishing Fund (SFF)

Zvi

14 Dec 2021 14:30 UTC

193 points

65 comments 64 min read LW link 1 review

(thezvi.wordpress.com)

Consequentialism & corrigibility

Steven Byrnes

14 Dec 2021 13:23 UTC

72 points

35 comments 7 min read LW link

Mystery Hunt 2022

Scott Garrabrant

13 Dec 2021 21:57 UTC

30 points

5 comments 1 min read LW link

Enabling More Feedback for AI Safety Researchers

frances_lorenz

13 Dec 2021 20:10 UTC

17 points

0 comments 3 min read LW link

Language Model Alignment Research Internships

Ethan Perez

13 Dec 2021 19:53 UTC

74 points

1 comment 1 min read LW link

Omicron Post #6

Zvi

13 Dec 2021 18:00 UTC

89 points

30 comments 8 min read LW link

(thezvi.wordpress.com)

Analysis of Bird Box (2018)

TekhneMakre

13 Dec 2021 17:30 UTC

11 points

3 comments 5 min read LW link

Solving Interpretability Week

Logan Riggs

13 Dec 2021 17:09 UTC

11 points

5 comments 1 min read LW link

Understanding and controlling auto-induced distributional shift

L Rudolf L

13 Dec 2021 14:59 UTC

33 points

4 comments 16 min read LW link

A fate worse than death?

RomanS

13 Dec 2021 11:05 UTC

−25 points

26 comments 2 min read LW link

What’s the backward-forward FLOP ratio for Neural Networks?

Marius Hobbhahn and Jsevillamol

13 Dec 2021 8:54 UTC

20 points

12 comments 10 min read LW link

Summary of the Acausal Attack Issue for AIXI

Diffractor

13 Dec 2021 8:16 UTC

12 points

6 comments 4 min read LW link

Hard-Coding Neural Computation

MadHatter

13 Dec 2021 4:35 UTC

34 points

8 comments 27 min read LW link

[Question] Is “gears-level” just a synonym for “mechanistic”?

David Scott Krueger

13 Dec 2021 4:11 UTC

48 points

29 comments 1 min read LW link

Baby Nicknames

jefftk

13 Dec 2021 2:20 UTC

11 points

0 comments 1 min read LW link

(www.jefftk.com)

[Question] Why do governments refer to existential risks primarily in terms of national security?

Evan_Gaensbauer

13 Dec 2021 1:05 UTC

3 points

3 comments 1 min read LW link

[Question] [Resolved] Who else prefers “AI alignment” to “AI safety?”

Evan_Gaensbauer

13 Dec 2021 0:35 UTC

7 points

8 comments 1 min read LW link

Working through D&D.Sci, problem 1

Pablo Repetto

12 Dec 2021 23:10 UTC

8 points

2 comments 1 min read LW link

(pabloernesto.github.io)

Teaser: Hard-coding Transformer Models

MadHatter

12 Dec 2021 22:04 UTC

74 points

19 comments 1 min read LW link

The Three Mutations of Dark Rationality

DarkRationalist

12 Dec 2021 22:01 UTC

−15 points

0 comments 2 min read LW link

Redwood’s Technique-Focused Epistemic Strategy

adamShimi

12 Dec 2021 16:36 UTC

48 points

1 comment 7 min read LW link

For and Against Lotteries in Elite University Admissions

Sam Enright

12 Dec 2021 13:41 UTC

10 points

2 comments 3 min read LW link

[Question] Nuclear war anthropics

smountjoy

12 Dec 2021 4:54 UTC

11 points

7 comments 1 min read LW link

Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment

Rob Bensinger

12 Dec 2021 2:08 UTC

70 points

35 comments 7 min read LW link

Magna Alta Doctrina

jacob_cannell

11 Dec 2021 21:54 UTC

63 points

7 comments 28 min read LW link

EA Dinner Covid Logistics

jefftk

11 Dec 2021 21:50 UTC

17 points

7 comments 2 min read LW link

(www.jefftk.com)

Transforming myopic optimization to ordinary optimization—Do we want to seek convergence for myopic optimization problems?

tailcalled

11 Dec 2021 20:38 UTC

12 points

1 comment 5 min read LW link

What on Earth is a Series I savings bond?

rossry

11 Dec 2021 12:18 UTC

11 points

7 comments 7 min read LW link

D&D.Sci GURPS Dec 2021: Hunters of Monsters

J Bostock

11 Dec 2021 12:13 UTC

20 points

23 comments 2 min read LW link

Anxiety and computer architecture

Adam Zerner

11 Dec 2021 10:37 UTC

13 points

8 comments 3 min read LW link

[Question] Reasons to act according to the free will paradigm?

Maciej Jałocha

11 Dec 2021 8:44 UTC

−3 points

5 comments 1 min read LW link

Extrinsic and Intrinsic Moral Frameworks

lsusr

11 Dec 2021 5:28 UTC

14 points

5 comments 2 min read LW link

Moore’s Law, AI, and the pace of progress

Veedrac

11 Dec 2021 3:02 UTC

130 points

38 comments 24 min read LW link

What role should evolutionary analogies play in understanding AI takeoff speeds?

anson.ho

11 Dec 2021 1:19 UTC

14 points

0 comments 42 min read LW link

[Question] Nonverbal thinkers: how do you experience your inner critic?

Phoenix Eliot

11 Dec 2021 0:40 UTC

9 points

2 comments 1 min read LW link

The Plan

johnswentworth

10 Dec 2021 23:41 UTC

264 points

78 comments 14 min read LW link 1 review

[Linkpost] Chinese government’s guidelines on AI

RomanS

10 Dec 2021 21:10 UTC

62 points

14 comments 1 min read LW link

Understanding Gradient Hacking

peterbarnett

10 Dec 2021 15:58 UTC

41 points

5 comments 30 min read LW link

There is essentially one best-validated theory of cognition.

abramdemski

10 Dec 2021 15:51 UTC

90 points

33 comments 3 min read LW link

The Promise and Peril of Finite Sets

davidad

10 Dec 2021 12:29 UTC

42 points

5 comments 6 min read LW link

Are big brains for processing sensory input?

lsusr

10 Dec 2021 7:08 UTC

42 points

20 comments 3 min read LW link

Combining Forecasts

jsteinhardt

10 Dec 2021 2:10 UTC

10 points

1 comment 6 min read LW link

(bounded-regret.ghost.io)