[Question] Wouldn’t weak AI agents provide warning?

Mandatory Topic Apr 26, 2024, 7:34 PM

5 points

0 comments 1 min read LW link

World models

A* Apr 26, 2024, 7:11 PM

1 point

0 comments 1 min read LW link

Duct Tape security

Isaac King Apr 26, 2024, 6:57 PM

69 points

11 comments 5 min read LW link

Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter?

Gordon Seidoh Worley Apr 26, 2024, 6:10 PM

11 points

2 comments 32 min read LW link

Scaling of AI training runs will slow down after GPT-5

Maxime Riché Apr 26, 2024, 4:05 PM

40 points

5 comments 3 min read LW link

Spatial attention as a “tell” for empathetic simulation?

Steven Byrnes Apr 26, 2024, 3:10 PM

55 points

12 comments 8 min read LW link

Arch-anarchy

Peter lawless Apr 26, 2024, 3:05 PM

−1 points

1 comment 25 min read LW link

Breadboarding a Whistle Synth

jefftk Apr 26, 2024, 3:00 PM

9 points

2 comments 2 min read LW link

(www.jefftk.com)

An Introduction to AI Sandbagging

Teun van der Weij, Felix Hofstätter and Francis Rhys Ward

Apr 26, 2024, 1:40 PM

46 points

13 comments 8 min read LW link

LLMs seem (relatively) safe

JustisMills Apr 25, 2024, 10:13 PM

53 points

24 comments 7 min read LW link

(justismills.substack.com)

Losing Faith In Contrarianism

Bentham's Bulldog Apr 25, 2024, 8:53 PM

39 points

44 comments 5 min read LW link

Why I stopped being into basin broadness

tailcalled Apr 25, 2024, 8:47 PM

16 points

3 comments 2 min read LW link

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma

DanielFilan Apr 25, 2024, 7:10 PM

20 points

1 comment 63 min read LW link

Improving Dictionary Learning with Gated Sparse Autoencoders

Senthooran Rajamanoharan, Arthur Conmy, lewis smith, Tom Lieberum, Vikrant Varma, János Kramár, Rohin Shah and Neel Nanda

Apr 25, 2024, 6:43 PM

63 points

38 comments 1 min read LW link

(arxiv.org)

“Why I Write” by George Orwell (1946)

Arjun Panickssery Apr 25, 2024, 4:02 PM

59 points

2 comments 9 min read LW link

(www.orwellfoundation.com)

Knowledge Base 8: The truth as an attractor in the information space

iwis Apr 25, 2024, 3:28 PM

−8 points

0 comments 2 min read LW link

Cybersecurity of Frontier AI Models: A Regulatory Review

Deric Cheng and Elliot Mckernon

Apr 25, 2024, 2:51 PM

8 points

0 comments 8 min read LW link

The first future and the best future

KatjaGrace Apr 25, 2024, 6:40 AM

106 points

12 comments 1 min read LW link

(worldspiritsockpuppet.com)

NIH Cancer Myths Myths

belkarx Apr 25, 2024, 5:43 AM

15 points

1 comment 2 min read LW link

social lemon markets

bhauth Apr 25, 2024, 2:18 AM

22 points

6 comments 3 min read LW link

(www.bhauth.com)

Bayesian inference without priors

DanielFilan Apr 24, 2024, 11:50 PM

26 points

8 comments 8 min read LW link

(danielfilan.com)

The Inner Ring by C. S. Lewis

Saul Munn Apr 24, 2024, 10:48 PM

69 points

6 comments 13 min read LW link

(www.lewissociety.org)

This is Water by David Foster Wallace

Nathan Young Apr 24, 2024, 9:21 PM

60 points

16 comments 13 min read LW link

(fs.blog)

Betadine oral rinses for covid and other viral infections

Elizabeth Apr 24, 2024, 5:50 PM

22 points

3 comments 5 min read LW link

(acesounderglass.com)

At last! ChatGPT does, shall we say, interesting imitations of “Kubla Khan”

Bill Benzon Apr 24, 2024, 2:56 PM

−3 points

0 comments 4 min read LW link

Magic by forgetting

avturchin Apr 24, 2024, 2:32 PM

18 points

39 comments 4 min read LW link

Changes in College Admissions

Zvi Apr 24, 2024, 1:50 PM

50 points

11 comments 39 min read LW link

(thezvi.wordpress.com)

1-page outline of Carlsmith’s otherness and control series

Nathan Young Apr 24, 2024, 11:25 AM

22 points

3 comments 3 min read LW link

How to use and interpret activation patching

StefanHex and Neel Nanda

Apr 24, 2024, 8:35 AM

13 points

6 comments 18 min read LW link

AI Generated Music as a Method of Installing Essential Rationalist Skills

keltan Apr 24, 2024, 7:48 AM

18 points

4 comments 1 min read LW link

Electronic Harp Mandolin Prototype

jefftk Apr 24, 2024, 2:20 AM

9 points

0 comments 1 min read LW link

(www.jefftk.com)

[Question] Examples of Highly Counterfactual Discoveries?

johnswentworth Apr 23, 2024, 10:19 PM

197 points

108 comments 1 min read LW link

[Question] Is there software to practice reading expressions?

lsusr Apr 23, 2024, 9:53 PM

37 points

11 comments 1 min read LW link

Let’s Design A School, Part 1

Sable Apr 23, 2024, 9:50 PM

56 points

5 comments 11 min read LW link

(affablyevil.substack.com)

WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals

trevor Apr 23, 2024, 9:33 PM

37 points

5 comments 5 min read LW link

(www.wsj.com)

On Minicircle

Metacelsus Apr 23, 2024, 9:28 PM

10 points

0 comments 1 min read LW link

(docs.google.com)

Simple probes can catch sleeper agents

Monte M, Carson Denison, Zac Hatfield-Dodds, David Duvenaud, Sam Bowman, Ethan Perez and evhub

Apr 23, 2024, 9:10 PM

133 points

21 comments 1 min read LW link

(www.anthropic.com)

Manifold “exploring real cash prizes”

Rana Dexsin Apr 23, 2024, 9:07 PM

7 points

0 comments 1 min read LW link

(manifoldmarkets.notion.site)

[Question] (When) Should you work through the night when inspiration strikes you?

Chi Nguyen Apr 23, 2024, 9:07 PM

21 points

4 comments 1 min read LW link

Book review: Deep Utopia

PeterMcCluskey Apr 23, 2024, 7:55 PM

45 points

14 comments 4 min read LW link

(bayesianinvestor.com)

On what research policymakers actually need

MondSemmel Apr 23, 2024, 7:50 PM

38 points

0 comments 3 min read LW link

(www.slowboring.com)

Dequantifying first-order theories

jessicata Apr 23, 2024, 7:04 PM

40 points

9 comments 8 min read LW link

(unstableontology.com)

Vector Planning in a Lattice Graph

Johannes C. Mayer and Thomas Kehrenberg

Apr 23, 2024, 4:58 PM

20 points

7 comments 2 min read LW link

ProLU: A Nonlinearity for Sparse Autoencoders

Glen Taggart Apr 23, 2024, 2:09 PM

44 points

4 comments 9 min read LW link

Subjective Questions Require Subjective information

Ben Apr 23, 2024, 1:16 PM

7 points

4 comments 4 min read LW link

Rejecting Television

Declan Molony Apr 23, 2024, 4:59 AM

90 points

10 comments 6 min read LW link

LW Frontpage Experiments! (aka “Take the wheel, Shoggoth!”)

Ruby and RobertM

Apr 23, 2024, 3:58 AM

71 points

27 comments 5 min read LW link

Thoughts on Zero Points

depressurize Apr 23, 2024, 2:22 AM

31 points

1 comment 4 min read LW link

(sexandchicago.substack.com)

Funny Anecdote of Eliezer From His Sister

Noah Birnbaum Apr 22, 2024, 10:05 PM

207 points

6 comments 2 min read LW link

How LLMs Work, in the Style of The Economist

utilistrutil Apr 22, 2024, 7:06 PM

0 points

0 comments 2 min read LW link