[Question] Examples of Highly Counterfactual Discoveries?

johnswentworth 23 Apr 2024 22:19 UTC

204 points

116 comments, 1 min read, LW link

[Question] Is there software to practice reading expressions?

lsusr 23 Apr 2024 21:53 UTC

37 points

11 comments, 1 min read, LW link

Let’s Design A School, Part 1

Sable 23 Apr 2024 21:50 UTC

57 points

5 comments, 11 min read, LW link

(affablyevil.substack.com)

On Minicircle

Metacelsus 23 Apr 2024 21:28 UTC

10 points

0 comments, 1 min read, LW link

(docs.google.com)

Simple probes can catch sleeper agents

Monte M, Carson Denison, Zac Hatfield-Dodds, David Duvenaud, Sam Bowman, Ethan Perez and evhub

23 Apr 2024 21:10 UTC

131 points

21 comments, 1 min read, LW link

(www.anthropic.com)

Manifold “exploring real cash prizes”

Rana Dexsin 23 Apr 2024 21:07 UTC

7 points

0 comments, 1 min read, LW link

(manifoldmarkets.notion.site)

[Question] (When) Should you work through the night when inspiration strikes you?

Chi Nguyen 23 Apr 2024 21:07 UTC

21 points

4 comments, 1 min read, LW link

Book review: Deep Utopia

PeterMcCluskey 23 Apr 2024 19:55 UTC

45 points

14 comments, 4 min read, LW link

(bayesianinvestor.com)

On what research policymakers actually need

MondSemmel 23 Apr 2024 19:50 UTC

38 points

0 comments, 3 min read, LW link

(www.slowboring.com)

Dequantifying first-order theories

jessicata 23 Apr 2024 19:04 UTC

40 points

9 comments, 8 min read, LW link

(unstableontology.com)

Vector Planning in a Lattice Graph

Johannes C. Mayer and Thomas Kehrenberg

23 Apr 2024 16:58 UTC

20 points

7 comments, 2 min read, LW link

ProLU: A Nonlinearity for Sparse Autoencoders

Glen Taggart 23 Apr 2024 14:09 UTC

44 points

4 comments, 9 min read, LW link

Subjective Questions Require Subjective information

Ben 23 Apr 2024 13:16 UTC

8 points

4 comments, 4 min read, LW link

Rejecting Television

Declan Molony 23 Apr 2024 4:59 UTC

91 points

10 comments, 6 min read, LW link

LW Frontpage Experiments! (aka “Take the wheel, Shoggoth!”)

Ruby and RobertM

23 Apr 2024 3:58 UTC

71 points

27 comments, 5 min read, LW link

Thoughts on Zero Points

depressurize 23 Apr 2024 2:22 UTC

34 points

2 comments, 4 min read, LW link

(sexandchicago.substack.com)

Funny Anecdote of Eliezer From His Sister

Noah Birnbaum 22 Apr 2024 22:05 UTC

235 points

7 comments, 2 min read, LW link

How LLMs Work, in the Style of The Economist

utilistrutil 22 Apr 2024 19:06 UTC

0 points

0 comments, 2 min read, LW link

Measuring Coherence and Goal-Directedness in RL Policies

Dylan Xu 22 Apr 2024 18:26 UTC

10 points

0 comments, 7 min read, LW link

AI Regulation is Unsafe

Maxwell Tabarrok 22 Apr 2024 16:37 UTC

40 points

41 comments, 4 min read, LW link

(www.maximum-progress.com)

Priors and Prejudice

MathiasKB 22 Apr 2024 15:00 UTC

157 points

32 comments, 7 min read, LW link, 1 review

Forget Everything (Statistical Mechanics Part 1)

J Bostock 22 Apr 2024 13:33 UTC

47 points

7 comments, 3 min read, LW link

On Llama-3 and Dwarkesh Patel’s Podcast with Zuckerberg

Zvi 22 Apr 2024 13:10 UTC

63 points

4 comments, 47 min read, LW link

(thezvi.wordpress.com)

Motivation gaps: Why so much EA criticism is hostile and lazy

titotal 22 Apr 2024 11:49 UTC

70 points

5 comments, 19 min read, LW link

(titotal.substack.com)

Should we break up Google DeepMind?

Hauke Hillebrandt 22 Apr 2024 9:16 UTC

3 points

0 comments, 4 min read, LW link

What should our containers do?

Richard Henage 22 Apr 2024 6:17 UTC

1 point

1 comment, 2 min read, LW link

Goal oriented cognition in “a single forward pass”

dxu and habryka

22 Apr 2024 5:03 UTC

20 points

15 comments, 26 min read, LW link

Time complexity for deterministic string machines

alcatal 21 Apr 2024 22:35 UTC

21 points

2 comments, 21 min read, LW link

Transfer Learning in Humans

niplav 21 Apr 2024 20:49 UTC

71 points

1 comment, 13 min read, LW link

I created an Asi Alignment Tier List

TimeGoat 21 Apr 2024 18:44 UTC

−6 points

0 comments, 1 min read, LW link

The losing identity of Twitter

Itay Dreyfus 21 Apr 2024 13:43 UTC

20 points

1 comment, 12 min read, LW link

(productidentity.co)

Good Bings copy, great Bings steal

dr_s 21 Apr 2024 9:52 UTC

31 points

6 comments, 9 min read, LW link

Paper: “The Ethics of Advanced AI Assistants” -Google DeepMind

Tristan Wegner 21 Apr 2024 6:45 UTC

20 points

0 comments, 1 min read, LW link

(storage.googleapis.com)

Contra Chord Simplification

jefftk 21 Apr 2024 2:30 UTC

9 points

0 comments, 1 min read, LW link

(www.jefftk.com)

A couple productivity tips for overthinkers

Steven Byrnes 20 Apr 2024 16:05 UTC

79 points

13 comments, 4 min read, LW link

“You’re the most beautiful girl in the world” and Wittgensteinian Language Games

Chris_Leong 20 Apr 2024 14:54 UTC

5 points

18 comments, 1 min read, LW link

Past Tense Features

Can 20 Apr 2024 14:34 UTC

12 points

0 comments, 4 min read, LW link

Thoughts on seed oil

dynomight 20 Apr 2024 12:29 UTC

367 points

131 comments, 17 min read, LW link, 1 review

(dynomight.net)

How to know whether you are an idealist or a physicalist/materialist

JackOfAllTrades 20 Apr 2024 11:53 UTC

−3 points

2 comments, 1 min read, LW link

How I Think, Part Four: Money is Weird

Richard Henage 20 Apr 2024 6:21 UTC

0 points

3 comments, 5 min read, LW link

The power of finite and the weakness of infinite binary point numbers

AxiomWriter 20 Apr 2024 6:03 UTC

−3 points

6 comments, 2 min read, LW link

WISDOMISM A Moral Theory for the Age of Information

Peter lawless 19 Apr 2024 23:06 UTC

3 points

0 comments, 9 min read, LW link

Inducing Unprompted Misalignment in LLMs

Sam Svenningsen, evhub and Henry Sleight

19 Apr 2024 20:00 UTC

38 points

7 comments, 16 min read, LW link

Introspection

A* 19 Apr 2024 19:10 UTC

7 points

0 comments, 1 min read, LW link

[Full Post] Progress Update #1 from the GDM Mech Interp Team

Neel Nanda, Arthur Conmy, lewis smith, Senthooran Rajamanoharan, Tom Lieberum, János Kramár and Vikrant Varma

19 Apr 2024 19:06 UTC

80 points

10 comments, 8 min read, LW link

[Summary] Progress Update #1 from the GDM Mech Interp Team

Neel Nanda, Arthur Conmy, lewis smith, Senthooran Rajamanoharan, Tom Lieberum, János Kramár and Vikrant Varma

19 Apr 2024 19:06 UTC

73 points

0 comments, 3 min read, LW link

Daniel Dennett has died (1942-2024)

kave 19 Apr 2024 16:17 UTC

151 points

5 comments, 1 min read, LW link

(dailynous.com)

Events Booking New Callers?

jefftk 19 Apr 2024 15:50 UTC

9 points

0 comments, 1 min read, LW link

(www.jefftk.com)

[Question] What is the best way to talk about probabilities you expect to change with evidence/experiments?

Will_Pearson 19 Apr 2024 15:35 UTC

14 points

11 comments, 1 min read, LW link

CTMU insight: maybe consciousness can affect quantum outcomes?

zhukeepa 19 Apr 2024 15:23 UTC

15 points

11 comments, 5 min read, LW link