Power Lies Trembling: a three-book review

Richard_Ngo · 22 Feb 2025 22:57 UTC
214 points
29 comments · 15 min read · LW link
(www.mindthefuture.info)

Transformer Dynamics: a neuro-inspired approach to MechInterp

22 Feb 2025 21:33 UTC
11 points
0 comments · 5 min read · LW link

Recursive Cognitive Refinement (RCR): A Self-Correcting Approach for LLM Hallucinations

mxTheo · 22 Feb 2025 21:32 UTC
0 points
0 comments · 2 min read · LW link

Gradual Disempowerment: Simplified

Annapurna · 22 Feb 2025 16:59 UTC
10 points
1 comment · 1 min read · LW link
(jorgevelez.substack.com)

AI Apocalypse and the Buddha

pchvykov · 22 Feb 2025 16:33 UTC
−17 points
6 comments · 9 min read · LW link

Unaligned AGI & Brief History of Inequality

ank · 22 Feb 2025 16:26 UTC
−20 points
4 comments · 7 min read · LW link

HPMOR Anniversary Guide

Screwtape · 22 Feb 2025 16:17 UTC
63 points
7 comments · 3 min read · LW link

Forecasting Uncontrolled Spread of AI

Alvin Ånestrand · 22 Feb 2025 13:05 UTC
2 points
0 comments · 10 min read · LW link
(forecastingaifutures.substack.com)

Seeing Through the Eyes of the Algorithm

silentbob · 22 Feb 2025 11:54 UTC
18 points
3 comments · 10 min read · LW link

Proselytizing

lsusr · 22 Feb 2025 11:54 UTC
50 points
3 comments · 2 min read · LW link

Workshop: Interpretability in LLMs using Geometric and Statistical Methods

Karthik Viswanathan · 22 Feb 2025 9:39 UTC
17 points
0 comments · 8 min read · LW link

Information throughput of biological humans and frontier LLMs

benwr · 22 Feb 2025 7:15 UTC
12 points
0 comments · 1 min read · LW link

Inefficiencies in Pharmaceutical Research Practices

ErioirE · 22 Feb 2025 4:43 UTC
20 points
2 comments · 5 min read · LW link

Build a Metaculus Forecasting Bot in 30 Minutes: A Practical Guide

ChristianWilliams · 22 Feb 2025 3:52 UTC
7 points
0 comments · 1 min read · LW link

Intelligence–Agency Equivalence ≈ Mass–Energy Equivalence: On Static Nature of Intelligence & Physicalization of Ethics

ank · 22 Feb 2025 0:12 UTC
1 point
0 comments · 6 min read · LW link

Alignment can be the ‘clean energy’ of AI

22 Feb 2025 0:08 UTC
68 points
8 comments · 8 min read · LW link

The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better

Thane Ruthenis · 21 Feb 2025 20:15 UTC
152 points
53 comments · 6 min read · LW link

ParaScopes: Do Language Models Plan the Upcoming Paragraph?

NickyP · 21 Feb 2025 16:50 UTC
36 points
2 comments · 20 min read · LW link

Linguistic Imperialism in AI: Enforcing Human-Readable Chain-of-Thought

Lukas Petersson · 21 Feb 2025 15:45 UTC
5 points
0 comments · 5 min read · LW link
(lukaspetersson.com)

On OpenAI’s Model Spec 2.0

Zvi · 21 Feb 2025 14:10 UTC
52 points
4 comments · 43 min read · LW link
(thezvi.wordpress.com)

Longtermist implications of aliens Space-Faring Civilizations—Introduction

Maxime Riché · 21 Feb 2025 12:08 UTC
21 points
0 comments · 6 min read · LW link

MAISU—Minimal AI Safety Unconference

Linda Linsefors · 21 Feb 2025 11:36 UTC
19 points
2 comments · 2 min read · LW link

The case for the death penalty

Yair Halberstadt · 21 Feb 2025 8:30 UTC
26 points
80 comments · 5 min read · LW link

Make Superintelligence Loving

Davey Morse · 21 Feb 2025 6:07 UTC
8 points
9 comments · 5 min read · LW link

Fun, endless art debates v. morally charged art debates that are intrinsically endless

d_el_ez · 21 Feb 2025 4:44 UTC
6 points
2 comments · 2 min read · LW link

The Takeoff Speeds Model Predicts We May Be Entering Crunch Time

johncrox · 21 Feb 2025 2:26 UTC
45 points
3 comments · 18 min read · LW link
(readtheoom.substack.com)

Humans are Just Self Aware Intelligent Biological Machines

asksathvik · 21 Feb 2025 1:03 UTC
3 points
9 comments · 2 min read · LW link
(asksathvik.substack.com)

Pre-ASI: The case for an enlightened mind, capital, and AI literacy in maximizing the good life

Noahh · 21 Feb 2025 0:03 UTC
5 points
5 comments · 6 min read · LW link
(open.substack.com)

Timaeus in 2024

20 Feb 2025 23:54 UTC
99 points
1 comment · 8 min read · LW link

Biological humans collectively exert at most 400 gigabits/s of control over the world.

benwr · 20 Feb 2025 23:44 UTC
15 points
3 comments · 1 min read · LW link

The first RCT for GLP-1 drugs and alcoholism isn’t what we hoped

dynomight · 20 Feb 2025 22:30 UTC
62 points
4 comments · 6 min read · LW link
(dynomight.net)

Published report: Pathways to short TAI timelines

Zershaaneh Qureshi · 20 Feb 2025 22:10 UTC
22 points
0 comments · 17 min read · LW link
(www.convergenceanalysis.org)

Neural Scaling Laws Rooted in the Data Distribution

aribrill · 20 Feb 2025 21:22 UTC
8 points
0 comments · 1 min read · LW link
(arxiv.org)

Demonstrating specification gaming in reasoning models

Matrice Jacobine · 20 Feb 2025 19:26 UTC
4 points
0 comments · 1 min read · LW link
(arxiv.org)

What makes a theory of intelligence useful?

Cole Wyeth · 20 Feb 2025 19:22 UTC
16 points
0 comments · 11 min read · LW link

AI #104: American State Capacity on the Brink

Zvi · 20 Feb 2025 14:50 UTC
37 points
9 comments · 44 min read · LW link
(thezvi.wordpress.com)

US AI Safety Institute will be ‘gutted,’ Axios reports

Matrice Jacobine · 20 Feb 2025 14:40 UTC
11 points
1 comment · 1 min read · LW link
(www.zdnet.com)

Human-AI Relationality is Already Here

bridgebot · 20 Feb 2025 7:08 UTC
17 points
0 comments · 15 min read · LW link

Safe Distillation With a Powerful Untrusted AI

Alek Westover · 20 Feb 2025 3:14 UTC
5 points
1 comment · 5 min read · LW link

Modularity and assembly: AI safety via thinking smaller

D Wong · 20 Feb 2025 0:58 UTC
2 points
0 comments · 11 min read · LW link
(criticalreason.substack.com)

Eliezer’s Lost Alignment Articles / The Arbital Sequence

20 Feb 2025 0:48 UTC
207 points
10 comments · 5 min read · LW link

Arbital has been imported to LessWrong

20 Feb 2025 0:47 UTC
281 points
30 comments · 5 min read · LW link

The Dilemma’s Dilemma

James Stephen Brown · 19 Feb 2025 23:50 UTC
9 points
12 comments · 7 min read · LW link
(nonzerosum.games)

[Question] Why do we have the NATO logo?

KvmanThinking · 19 Feb 2025 22:59 UTC
1 point
4 comments · 1 min read · LW link

Metaculus Q4 AI Benchmarking: Bots Are Closing The Gap

19 Feb 2025 22:42 UTC
13 points
0 comments · 13 min read · LW link
(www.metaculus.com)

Several Arguments Against the Mathematical Universe Hypothesis

Vittu Perkele · 19 Feb 2025 22:13 UTC
−4 points
6 comments · 3 min read · LW link
(open.substack.com)

Literature Review of Text AutoEncoders

NickyP · 19 Feb 2025 21:54 UTC
20 points
5 comments · 8 min read · LW link

DeepSeek Made it Even Harder for US AI Companies to Ever Reach Profitability

garrison · 19 Feb 2025 21:02 UTC
10 points
1 comment · 3 min read · LW link
(garrisonlovely.substack.com)

Won’t vs. Can’t: Sandbagging-like Behavior from Claude Models

19 Feb 2025 20:47 UTC
15 points
1 comment · 1 min read · LW link
(alignment.anthropic.com)

AI Alignment and the Financial War Against Narcissistic Manipulation

henophilia · 19 Feb 2025 20:42 UTC
−17 points
2 comments · 3 min read · LW link