My Overview of the AI Alignment Landscape: Threat Models

Neel Nanda · 25 Dec 2021 23:07 UTC

55 points

3 comments · 28 min read · LW link

[Question] What is a probabilistic physical theory?

Ege Erdil · 25 Dec 2021 16:30 UTC

15 points

36 comments · 2 min read · LW link

Belief-conditional things—things that only exist when you believe in them

Jan · 25 Dec 2021 10:49 UTC

7 points

3 comments · 5 min read · LW link

(universalprior.substack.com)

Tough Choices and Disappointment

maralorn · 24 Dec 2021 21:59 UTC

2 points

6 comments · 1 min read · LW link

Converging toward a Million Worlds

Joe Kwon · 24 Dec 2021 21:33 UTC

11 points

1 comment · 3 min read · LW link

Understanding the tensor product formulation in Transformer Circuits

Tom Lieberum · 24 Dec 2021 18:05 UTC

16 points

2 comments · 3 min read · LW link

[Question] How to select a long-term goal and align my mind towards it?

Alexander · 24 Dec 2021 11:40 UTC

19 points

8 comments · 2 min read · LW link

Prerequisite Skills

lsusr · 24 Dec 2021 10:11 UTC

17 points

4 comments · 1 min read · LW link

Mechanistic Interpretability for the MLP Layers (rough early thoughts)

MadHatter · 24 Dec 2021 7:24 UTC

12 points

3 comments · 1 min read · LW link

(www.youtube.com)

Risks from AI persuasion

Beth Barnes · 24 Dec 2021 1:48 UTC

76 points

15 comments · 31 min read · LW link

Prioritizing Information

jsteinhardt · 24 Dec 2021 0:00 UTC

18 points

0 comments · 7 min read · LW link

(bounded-regret.ghost.io)

Omicron Post #9

Zvi · 23 Dec 2021 21:50 UTC

89 points

11 comments · 19 min read · LW link

(thezvi.wordpress.com)

Reply to Eliezer on Biological Anchors

HoldenKarnofsky · 23 Dec 2021 16:15 UTC

155 points

46 comments · 15 min read · LW link

Get Set, Also Go

Zvi · 23 Dec 2021 15:00 UTC

62 points

21 comments · 16 min read · LW link

(thezvi.wordpress.com)

2021 AI Alignment Literature Review and Charity Comparison

Larks · 23 Dec 2021 14:06 UTC

168 points

28 comments · 73 min read · LW link

Testing, Testing, Hopefully

Zvi · 23 Dec 2021 12:30 UTC

41 points

8 comments · 4 min read · LW link

(thezvi.wordpress.com)

Physics Erotica

lsusr · 23 Dec 2021 11:01 UTC

6 points

12 comments · 1 min read · LW link

[Book Review] “The Most Powerful Idea in the World” by William Rosen

lsusr · 23 Dec 2021 8:27 UTC

41 points

4 comments · 8 min read · LW link

Specialization

DirectedEvolution · 23 Dec 2021 3:23 UTC

15 points

1 comment · 5 min read · LW link

Worst-case thinking in AI alignment

Buck · 23 Dec 2021 1:29 UTC

167 points

18 comments · 6 min read · LW link · 2 reviews

[Question] Hedging the Possibility of Russia invading Ukraine

Annapurna · 23 Dec 2021 1:13 UTC

32 points

8 comments · 1 min read · LW link

Gifts

George3d6 · 22 Dec 2021 23:50 UTC

13 points

1 comment · 9 min read · LW link

(www.epistem.ink)

A spreadsheet/template for doing an annual review

peterslattery · 22 Dec 2021 23:29 UTC

12 points

1 comment · 2 min read · LW link

[Question] What time in your life were you the most productive at learning and/or thinking and why?

Jack R · 22 Dec 2021 22:56 UTC

11 points

2 comments · 1 min read · LW link

Transformer Circuits

evhub · 22 Dec 2021 21:09 UTC

145 points

4 comments · 3 min read · LW link

(transformer-circuits.pub)

[Question] Help figuring out my sexuality?

Centhart · 22 Dec 2021 20:28 UTC

13 points

13 comments · 2 min read · LW link

DnD.Sci GURPS Evaluation and Ruleset

J Bostock · 22 Dec 2021 19:05 UTC

17 points

2 comments · 6 min read · LW link

Potential gears level explanations of smooth progress

ryan_greenblatt · 22 Dec 2021 18:05 UTC

4 points

2 comments · 2 min read · LW link

Random facts can come back to bite you

tailcalled · 22 Dec 2021 17:33 UTC

70 points

7 comments · 2 min read · LW link · 1 review

What’s Up With the CDC Nowcast?

Zvi · 22 Dec 2021 13:00 UTC

61 points

4 comments · 5 min read · LW link

(thezvi.wordpress.com)

Morality and constrained maximization, part 1

Joe Carlsmith · 22 Dec 2021 8:47 UTC

20 points

5 comments · 11 min read · LW link

Six Specializations Makes You World-Class

lsusr · 22 Dec 2021 8:03 UTC

53 points

23 comments · 1 min read · LW link

Worldbuilding exercise: The Highwayverse.

Yair Halberstadt · 22 Dec 2021 6:47 UTC

13 points

13 comments · 11 min read · LW link

Two (very different) kinds of donors

Duncan Sabien (Inactive) · 22 Dec 2021 1:43 UTC

108 points

19 comments · 3 min read · LW link

[Question] Confusion about Sequences and Review Sequences

Alex_Altair · 21 Dec 2021 18:13 UTC

14 points

3 comments · 1 min read · LW link

Working through D&D.Sci, problem 1 (solution)

Pablo Repetto · 21 Dec 2021 17:42 UTC

9 points

2 comments · 1 min read · LW link

(pabloernesto.github.io)

Demanding and Designing Aligned Cognitive Architectures

Koen.Holtman · 21 Dec 2021 17:32 UTC

8 points

5 comments · 5 min read · LW link

Experiences raising children in shared housing

juliawise · 21 Dec 2021 17:09 UTC

118 points

5 comments · 6 min read · LW link

[Question] What questions do you have about doing work on AI safety?

peterbarnett · 21 Dec 2021 16:36 UTC

13 points

8 comments · 1 min read · LW link

Perpetual Dickensian Poverty?

jefftk · 21 Dec 2021 13:30 UTC

121 points

18 comments · 1 min read · LW link

(www.jefftk.com)

On (Not) Reading Papers

Jan · 21 Dec 2021 9:57 UTC

53 points

10 comments · 7 min read · LW link

(universalprior.substack.com)

Quick Poll: Booster Reactions

Elizabeth · 21 Dec 2021 7:40 UTC

40 points

2 comments · 2 min read · LW link

(acesounderglass.com)

Book Launch: The Engines of Cognition

Ben Pace, the Vacationing Vagabond · 21 Dec 2021 7:24 UTC

174 points

56 comments · 5 min read · LW link

Researcher incentives cause smoother progress on benchmarks

ryan_greenblatt · 21 Dec 2021 4:13 UTC

20 points

4 comments · 1 min read · LW link

Omicron Post #8

Zvi · 20 Dec 2021 23:10 UTC

96 points

33 comments · 16 min read · LW link

(thezvi.wordpress.com)

[Question] Good complete views on motivation

Valdes · 20 Dec 2021 22:10 UTC

6 points

4 comments · 1 min read · LW link

Prizes for last year’s 2019 Review

Raemon · 20 Dec 2021 21:58 UTC

40 points

0 comments · 3 min read · LW link

Omicron Paths

jefftk · 20 Dec 2021 18:30 UTC

14 points

8 comments · 2 min read · LW link

(www.jefftk.com)

[Question] Is there a term / better way of phrasing the general case where an intervention helps certain individuals do better at zero-sum games but doesn’t provide any external value?

freedomandutility · 20 Dec 2021 17:35 UTC

4 points

8 comments · 1 min read · LW link

Bayesian Dharani, Great Dharani for Conserving Evidence

Gordon Seidoh Worley · 20 Dec 2021 16:32 UTC

9 points

5 comments · 1 min read · LW link