Article Review: Discovering Latent Knowledge (Burns, Ye, et al)

Robert_AIZI 22 Dec 2022 18:16 UTC

13 points

4 comments 6 min read LW link

(aizi.substack.com)

Let’s think about slowing down AI

KatjaGrace 22 Dec 2022 17:40 UTC

563 points

182 comments 38 min read LW link 3 reviews

(aiimpacts.org)

Some Notes on the mathematics of Toy Autoencoding Problems

carboniferous_umbraculum 22 Dec 2022 17:21 UTC

18 points

1 comment 12 min read LW link

December 2022 updates and fundraising

AI Impacts 22 Dec 2022 17:20 UTC

39 points

1 comment 3 min read LW link

(aiimpacts.org)

Covid 12/22/22: Reevaluating Past Options

Zvi 22 Dec 2022 16:50 UTC

30 points

2 comments 9 min read LW link

(thezvi.wordpress.com)

China Covid #4

Zvi 22 Dec 2022 16:30 UTC

50 points

2 comments 11 min read LW link

(thezvi.wordpress.com)

Racing through a minefield: the AI deployment problem

HoldenKarnofsky 22 Dec 2022 16:10 UTC

38 points

2 comments 13 min read LW link

(www.cold-takes.com)

Lead in Chocolate?

jefftk 22 Dec 2022 16:10 UTC

44 points

9 comments 2 min read LW link

(www.jefftk.com)

Response to Holden’s alignment plan

Alex Flint 22 Dec 2022 16:08 UTC

36 points

4 comments 6 min read LW link

Staring into the abyss as a core life skill

benkuhn 22 Dec 2022 15:30 UTC

380 points

24 comments 12 min read LW link 1 review

(www.benkuhn.net)

Secular Solstice for children

juliawise and denkenberger

22 Dec 2022 14:33 UTC

31 points

1 comment 3 min read LW link

Mental acceptance and reflection

remember and Gabriel Alfour

22 Dec 2022 14:32 UTC

34 points

1 comment 2 min read LW link

Against Diversification

Jack Malde 22 Dec 2022 13:29 UTC

4 points

0 comments 3 min read LW link

(ethicaleconomist.substack.com)

Notes on Meta’s Diplomacy-Playing AI

Erich_Grunewald 22 Dec 2022 11:34 UTC

19 points

2 comments 14 min read LW link

(www.erichgrunewald.com)

Take 13: RLHF bad, conditioning good.

Charlie Steiner 22 Dec 2022 10:44 UTC

54 points

4 comments 2 min read LW link

Applied Linear Algebra Lecture Series

johnswentworth 22 Dec 2022 6:57 UTC

103 points

8 comments 1 min read LW link

Naive Set Theory, Halmos

David Udell 22 Dec 2022 2:34 UTC

11 points

1 comment 8 min read LW link

Not Getting Hacked

jefftk 21 Dec 2022 21:40 UTC

41 points

14 comments 7 min read LW link

(www.jefftk.com)

[Question] How much is DQC (Dynamic Quantum Clustering) currently looked into in AI Capabilities Research?

macmillan 21 Dec 2022 20:46 UTC

1 point

0 comments 1 min read LW link

Think wider about the root causes of progress

jasoncrawford 21 Dec 2022 20:05 UTC

49 points

11 comments 4 min read LW link

(rootsofprogress.org)

[Question] What readings did you consider best for the happy parts of the secular solstice?

ChristianKl 21 Dec 2022 15:45 UTC

17 points

0 comments 1 min read LW link

Recreating logic in type theory

Thomas Kehrenberg 21 Dec 2022 15:19 UTC

19 points

0 comments 13 min read LW link

You become the UI you use

Viliam 21 Dec 2022 15:04 UTC

21 points

7 comments 2 min read LW link

Price’s equation for neural networks

tailcalled 21 Dec 2022 13:09 UTC

31 points

4 comments 2 min read LW link

Decisions: Ontologically Shifting to Determinism

Chris_Leong 21 Dec 2022 12:41 UTC

8 points

11 comments 6 min read LW link

A Comprehensive Mechanistic Interpretability Explainer & Glossary

Neel Nanda 21 Dec 2022 12:35 UTC

91 points

6 comments 2 min read LW link

(neelnanda.io)

Google Search loses to ChatGPT fair and square

Shmi 21 Dec 2022 8:11 UTC

14 points

17 comments 1 min read LW link

(www.surgehq.ai)

Sazen

Duncan Sabien (Inactive) 21 Dec 2022 7:54 UTC

305 points

87 comments 12 min read LW link 2 reviews

Podcast: What’s Wrong With LessWrong

Alfred 21 Dec 2022 7:06 UTC

−32 points

11 comments 1 min read LW link

(youtu.be)

New AI risk intro from Vox [link post]

JakubK 21 Dec 2022 6:00 UTC

5 points

1 comment 2 min read LW link

(www.vox.com)

Local Memes Against Geometric Rationality

Scott Garrabrant 21 Dec 2022 3:53 UTC

96 points

3 comments 6 min read LW link

Logging Shell History in Zsh

jefftk 21 Dec 2022 3:30 UTC

19 points

2 comments 1 min read LW link

(www.jefftk.com)

CIRL Corrigibility is Fragile

Rachel Freedman and AdamGleave

21 Dec 2022 1:40 UTC

58 points

8 comments 12 min read LW link

[Question] [DISC] Are Values Robust?

DragonGod 21 Dec 2022 1:00 UTC

12 points

9 comments 2 min read LW link

Performing an SVD on a time-series matrix of gradient updates on an MNIST network produces 92.5 singular values

Garrett Baker 21 Dec 2022 0:44 UTC

9 points

10 comments 5 min read LW link

Progress links and tweets, 2022-12-20

jasoncrawford 21 Dec 2022 0:35 UTC

12 points

0 comments 2 min read LW link

(rootsofprogress.org)

K-complexity is silly; use cross-entropy instead

So8res 20 Dec 2022 23:06 UTC

153 points

60 comments 14 min read LW link 2 reviews

Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic

Orpheus16 20 Dec 2022 21:39 UTC

19 points

2 comments 11 min read LW link

Discovering Language Model Behaviors with Model-Written Evaluations

evhub and Ethan Perez

20 Dec 2022 20:08 UTC

100 points

34 comments 1 min read LW link

(www.anthropic.com)

Reflections: Bureaucratic Hell

Haris Rashid 20 Dec 2022 19:22 UTC

−5 points

1 comment 1 min read LW link

(www.harisrab.com)

Proliferating Education

Haris Rashid 20 Dec 2022 19:22 UTC

−1 points

2 comments 5 min read LW link

(www.harisrab.com)

AGI is here, but nobody wants it. Why should we even care?

MGow 20 Dec 2022 19:14 UTC

−22 points

0 comments 17 min read LW link

Properties of current AIs and some predictions of the evolution of AI from the perspective of scale-free theories of agency and regulative development

Roman Leventov 20 Dec 2022 17:13 UTC

34 points

3 comments 36 min read LW link

I believe some AI doomers are overconfident

FTPickle 20 Dec 2022 17:09 UTC

8 points

15 comments 2 min read LW link

Note on algorithms with multiple trained components

Steven Byrnes 20 Dec 2022 17:08 UTC

23 points

4 comments 2 min read LW link

Marvel Snap: Phase 2

Zvi 20 Dec 2022 14:50 UTC

11 points

1 comment 13 min read LW link

(thezvi.wordpress.com)

(Extremely) Naive Gradient Hacking Doesn’t Work

ojorgensen 20 Dec 2022 14:35 UTC

17 points

0 comments 6 min read LW link

An Open Agency Architecture for Safe Transformative AI

davidad 20 Dec 2022 13:04 UTC

80 points

22 comments 4 min read LW link

Under-Appreciated Ways to Use Flashcards—Part I

Florence Hinder 20 Dec 2022 12:43 UTC

22 points

5 comments 5 min read LW link

(thoughtsaver.ghost.io)

EA & LW Forums Weekly Summary (12th Dec − 18th Dec 22′)

Zoe Williams 20 Dec 2022 9:49 UTC

10 points

0 comments 17 min read LW link