Four visions of Transformative AI success

Steven Byrnes 17 Jan 2024 20:45 UTC

113 points

22 comments 15 min read LW link

AI Disclosure Ballot Initiative (and voting method)

Aaron Hamlin 17 Jan 2024 20:02 UTC

−6 points

3 comments 1 min read LW link

Hatching the Cosmic Egg (Hymn to Dionysus)

rogersbacon 17 Jan 2024 18:34 UTC

7 points

0 comments 9 min read LW link

(www.secretorum.life)

[Question] What do people colloquially mean by deep breathing? Slow, large, or diaphragmatic?

VipulNaik 17 Jan 2024 18:01 UTC

13 points

8 comments 2 min read LW link

AlphaGeometry: An Olympiad-level AI system for geometry

alyssavance 17 Jan 2024 17:17 UTC

45 points

9 comments 1 min read LW link

(deepmind.google)

On Anthropic’s Sleeper Agents Paper

Zvi 17 Jan 2024 16:10 UTC

54 points

5 comments 36 min read LW link

(thezvi.wordpress.com)

A Pedagogical Guide to Corrigibility

A.H. 17 Jan 2024 11:45 UTC

6 points

3 comments 16 min read LW link

An Introduction To The Mandelbrot Set That Doesn’t Mention Complex Numbers

Yitz 17 Jan 2024 9:48 UTC

83 points

11 comments 9 min read LW link

Vote in the LessWrong review! (LW 2022 Review voting phase)

habryka 17 Jan 2024 7:22 UTC

26 points

9 comments 2 min read LW link

Coalescer Models

DaemonicSigil 17 Jan 2024 6:39 UTC

16 points

2 comments 10 min read LW link

Maybe talking isn’t the best way to communicate with LLMs

mnvr 17 Jan 2024 6:24 UTC

3 points

1 comment 1 min read LW link

(mrmr.io)

D&D.Sci Hypersphere Analysis Part 3: Beat it with Linear Algebra

aphyer 16 Jan 2024 22:44 UTC

26 points

1 comment 5 min read LW link

Social media alignment test

amayhew 16 Jan 2024 20:56 UTC

1 point

0 comments 1 min read LW link

(naiveskepticblog.wordpress.com)

Medical Roundup #1

Zvi 16 Jan 2024 20:30 UTC

57 points

9 comments 29 min read LW link

(thezvi.wordpress.com)

Being nicer than Clippy

Joe Carlsmith 16 Jan 2024 19:44 UTC

110 points

32 comments 27 min read LW link

How polysemantic can one neuron be? Investigating features in TinyStories.

Evan Anders 16 Jan 2024 19:10 UTC

14 points

0 comments 8 min read LW link

(evanhanders.blog)

Applying AI Safety concepts to astronomy

Faris 16 Jan 2024 18:29 UTC

1 point

0 comments 12 min read LW link

Managing catastrophic misuse without robust AIs

ryan_greenblatt and Buck

16 Jan 2024 17:27 UTC

63 points

17 comments 11 min read LW link

[Question] What are the most common social insecurities?

Chris Lakin 16 Jan 2024 17:24 UTC

9 points

6 comments 1 min read LW link

Why wasn’t preservation with the goal of potential future revival started earlier in history?

Andy_McKenzie 16 Jan 2024 16:15 UTC

31 points

1 comment 6 min read LW link

[Question] Why are people unkeen to immortality that would come from technological advancements and/or AI?

Gabi QUENE 16 Jan 2024 14:23 UTC

12 points

42 comments 1 min read LW link

Dealing with Awkwardness

Jonathan Moregård 16 Jan 2024 12:32 UTC

13 points

0 comments 4 min read LW link

(honestliving.substack.com)

The impossible problem of due process

mingyuan 16 Jan 2024 5:18 UTC

231 points

71 comments 14 min read LW link 3 reviews

[Retracted] Newton’s law of cooling from first principles

Nisan 16 Jan 2024 4:21 UTC

9 points

15 comments 2 min read LW link

Sparse Autoencoders Work on Attention Layer Outputs

Connor Kissane, robertzk, Arthur Conmy and Neel Nanda

16 Jan 2024 0:26 UTC

85 points

9 comments 18 min read LW link

Goals selected from learned knowledge: an alternative to RL alignment

Seth Herd 15 Jan 2024 21:52 UTC

45 points

17 comments 7 min read LW link

Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols

Arjun Panickssery and agg

15 Jan 2024 21:21 UTC

33 points

0 comments 1 min read LW link

Live Sound: Big-O Improvements

jefftk 15 Jan 2024 19:50 UTC

8 points

0 comments 1 min read LW link

(www.jefftk.com)

Investigating Bias Representations in LLMs via Activation Steering

DawnLu 15 Jan 2024 19:39 UTC

29 points

4 comments 5 min read LW link

Sparse MLP Distillation

slavachalnev 15 Jan 2024 19:39 UTC

34 points

3 comments 6 min read LW link

Review of Alignment Plan Critiques- December AI-Plans Critique-a-Thon Results

Iknownothing 15 Jan 2024 19:37 UTC

24 points

0 comments 25 min read LW link

(aiplans.substack.com)

[Question] What does it look like for AI to significantly improve human coordination, before superintelligence?

Bird Concept 15 Jan 2024 19:22 UTC

23 points

2 comments 1 min read LW link

Now Accepting Player Applications for Band of Blades

Joe Rogero 15 Jan 2024 17:58 UTC

2 points

0 comments 3 min read LW link

Three Types of Constraints in the Space of Agents

Nora_Ammann and Mateusz Bagiński

15 Jan 2024 17:27 UTC

26 points

3 comments 17 min read LW link

The case for training frontier AIs on Sumerian-only corpus

Alexandre Variengien, Charbel-Raphaël and Jonathan Claybrough

15 Jan 2024 16:40 UTC

143 points

16 comments 3 min read LW link

How to Promote More Productive Dialogue Outside of LessWrong

sweenesm 15 Jan 2024 14:16 UTC

18 points

4 comments 2 min read LW link

[Question] Come and daydream with me about science reform

TeaTieAndHat 15 Jan 2024 11:09 UTC

9 points

1 comment 1 min read LW link

AI doing philosophy = AI generating hands?

Wei Dai 15 Jan 2024 9:04 UTC

51 points

26 comments 3 min read LW link

Even if we lose, we win

Morphism 15 Jan 2024 2:15 UTC

25 points

17 comments 4 min read LW link

Detachment vs attachment [AI risk and mental health]

Neil 15 Jan 2024 0:41 UTC

15 points

4 comments 3 min read LW link

Making up statistics to establish priority on Land Value Tax vs Earned Income Tax Credit vs Social Media Dynamic Regulation

Canucklug 14 Jan 2024 23:57 UTC

−5 points

2 comments 7 min read LW link

Is the universe all there is? ‘Evidence’ for objects outside the universe...

JonathanHall 14 Jan 2024 23:56 UTC

−4 points

27 comments 11 min read LW link

[Question] What is the minimum amount of time travel and resources needed to secure the future?

Perhaps 14 Jan 2024 22:01 UTC

−3 points

5 comments 1 min read LW link

Gothenburg LW / ACX meetup

Stefan 14 Jan 2024 21:21 UTC

1 point

0 comments 1 min read LW link

Gothenburg LW / ACX meetup

Stefan 14 Jan 2024 21:20 UTC

1 point

1 comment 1 min read LW link

D&D.Sci Hypersphere Analysis Part 2: Nonlinear Effects & Interactions

aphyer 14 Jan 2024 19:59 UTC

24 points

0 comments 7 min read LW link

Gender Exploration

sapphire 14 Jan 2024 18:57 UTC

124 points

27 comments 5 min read LW link 1 review

(open.substack.com)

List of projects that seem impactful for AI Governance

JaimeRV and Teun van der Weij

14 Jan 2024 16:53 UTC

14 points

0 comments 13 min read LW link

The Leeroy Jenkins principle: How faulty AI could guarantee “warning shots”

titotal 14 Jan 2024 15:03 UTC

48 points

6 comments 21 min read LW link

(titotal.substack.com)

Notice When People Are Directionally Correct

Chris_Leong 14 Jan 2024 14:12 UTC

157 points

15 comments 2 min read LW link