Power Lies Trembling: a three-book review

Richard_Ngo · Feb 22, 2025, 10:57 PM
214 points
29 comments · 15 min read · LW link
(www.mindthefuture.info)

Transformer Dynamics: a neuro-inspired approach to MechInterp

Feb 22, 2025, 9:33 PM
11 points
0 comments · 5 min read · LW link

Recursive Cognitive Refinement (RCR): A Self-Correcting Approach for LLM Hallucinations

mxTheo · Feb 22, 2025, 9:32 PM
0 points
0 comments · 2 min read · LW link

Gradual Disempowerment: Simplified

Annapurna · Feb 22, 2025, 4:59 PM
10 points
1 comment · 1 min read · LW link
(jorgevelez.substack.com)

AI Apocalypse and the Buddha

pchvykov · Feb 22, 2025, 4:33 PM
−17 points
6 comments · 9 min read · LW link

Unaligned AGI & Brief History of Inequality

ank · Feb 22, 2025, 4:26 PM
−20 points
4 comments · 7 min read · LW link

HPMOR Anniversary Guide

Screwtape · Feb 22, 2025, 4:17 PM
63 points
7 comments · 3 min read · LW link

Forecasting Uncontrolled Spread of AI

Alvin Ånestrand · Feb 22, 2025, 1:05 PM
2 points
0 comments · 10 min read · LW link
(forecastingaifutures.substack.com)

Seeing Through the Eyes of the Algorithm

silentbob · Feb 22, 2025, 11:54 AM
18 points
3 comments · 10 min read · LW link

Proselytizing

lsusr · Feb 22, 2025, 11:54 AM
49 points
3 comments · 2 min read · LW link

Workshop: Interpretability in LLMs using Geometric and Statistical Methods

Karthik Viswanathan · Feb 22, 2025, 9:39 AM
17 points
0 comments · 8 min read · LW link

Information throughput of biological humans and frontier LLMs

benwr · Feb 22, 2025, 7:15 AM
12 points
0 comments · 1 min read · LW link

Inefficiencies in Pharmaceutical Research Practices

ErioirE · Feb 22, 2025, 4:43 AM
20 points
2 comments · 5 min read · LW link

Build a Metaculus Forecasting Bot in 30 Minutes: A Practical Guide

ChristianWilliams · Feb 22, 2025, 3:52 AM
6 points
0 comments · LW link

Intelligence–Agency Equivalence ≈ Mass–Energy Equivalence: On Static Nature of Intelligence & Physicalization of Ethics

ank · Feb 22, 2025, 12:12 AM
1 point
0 comments · 6 min read · LW link

Alignment can be the ‘clean energy’ of AI

Feb 22, 2025, 12:08 AM
67 points
8 comments · 8 min read · LW link

The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better

Thane Ruthenis · Feb 21, 2025, 8:15 PM
148 points
51 comments · 6 min read · LW link

ParaScopes: Do Language Models Plan the Upcoming Paragraph?

NickyP · Feb 21, 2025, 4:50 PM
36 points
2 comments · 20 min read · LW link

Linguistic Imperialism in AI: Enforcing Human-Readable Chain-of-Thought

Lukas Petersson · Feb 21, 2025, 3:45 PM
5 points
0 comments · 5 min read · LW link
(lukaspetersson.com)

On OpenAI’s Model Spec 2.0

Zvi · Feb 21, 2025, 2:10 PM
52 points
4 comments · 43 min read · LW link
(thezvi.wordpress.com)

Longtermist implications of aliens Space-Faring Civilizations - Introduction

Maxime Riché · Feb 21, 2025, 12:08 PM
21 points
0 comments · 6 min read · LW link

MAISU - Minimal AI Safety Unconference

Linda Linsefors · Feb 21, 2025, 11:36 AM
19 points
2 comments · 2 min read · LW link

The case for the death penalty

Yair Halberstadt · Feb 21, 2025, 8:30 AM
26 points
80 comments · 5 min read · LW link

Make Superintelligence Loving

Davey Morse · Feb 21, 2025, 6:07 AM
8 points
9 comments · 5 min read · LW link

Fun, endless art debates v. morally charged art debates that are intrinsically endless

danielechlin · Feb 21, 2025, 4:44 AM
6 points
2 comments · 2 min read · LW link

The Takeoff Speeds Model Predicts We May Be Entering Crunch Time

johncrox · Feb 21, 2025, 2:26 AM
44 points
3 comments · 18 min read · LW link
(readtheoom.substack.com)

Humans are Just Self Aware Intelligent Biological Machines

asksathvik · Feb 21, 2025, 1:03 AM
3 points
9 comments · 2 min read · LW link
(asksathvik.substack.com)

Pre-ASI: The case for an enlightened mind, capital, and AI literacy in maximizing the good life

Noahh · Feb 21, 2025, 12:03 AM
5 points
5 comments · 6 min read · LW link
(open.substack.com)

Timaeus in 2024

Feb 20, 2025, 11:54 PM
99 points
1 comment · 8 min read · LW link

Biological humans collectively exert at most 400 gigabits/s of control over the world.

benwr · Feb 20, 2025, 11:44 PM
15 points
3 comments · 1 min read · LW link

The first RCT for GLP-1 drugs and alcoholism isn’t what we hoped

dynomight · Feb 20, 2025, 10:30 PM
60 points
4 comments · 6 min read · LW link
(dynomight.net)

Published report: Pathways to short TAI timelines

Zershaaneh Qureshi · Feb 20, 2025, 10:10 PM
22 points
0 comments · LW link
(www.convergenceanalysis.org)

Neural Scaling Laws Rooted in the Data Distribution

aribrill · Feb 20, 2025, 9:22 PM
7 points
0 comments · 1 min read · LW link
(arxiv.org)

Demonstrating specification gaming in reasoning models

Matrice Jacobine · Feb 20, 2025, 7:26 PM
4 points
0 comments · 1 min read · LW link
(arxiv.org)

What makes a theory of intelligence useful?

Cole Wyeth · Feb 20, 2025, 7:22 PM
16 points
0 comments · 11 min read · LW link

AI #104: American State Capacity on the Brink

Zvi · Feb 20, 2025, 2:50 PM
37 points
9 comments · 44 min read · LW link
(thezvi.wordpress.com)

US AI Safety Institute will be ‘gutted,’ Axios reports

Matrice Jacobine · Feb 20, 2025, 2:40 PM
11 points
1 comment · LW link
(www.zdnet.com)

Human-AI Relationality is Already Here

bridgebot · Feb 20, 2025, 7:08 AM
17 points
0 comments · 15 min read · LW link

Safe Distillation With a Powerful Untrusted AI

Alek Westover · Feb 20, 2025, 3:14 AM
5 points
1 comment · 5 min read · LW link

Modularity and assembly: AI safety via thinking smaller

D Wong · Feb 20, 2025, 12:58 AM
2 points
0 comments · 11 min read · LW link
(criticalreason.substack.com)

Eliezer’s Lost Alignment Articles / The Arbital Sequence

Feb 20, 2025, 12:48 AM
207 points
10 comments · 5 min read · LW link

Arbital has been imported to LessWrong

Feb 20, 2025, 12:47 AM
281 points
30 comments · 5 min read · LW link

The Dilemma’s Dilemma

James Stephen Brown · Feb 19, 2025, 11:50 PM
6 points
11 comments · 7 min read · LW link
(nonzerosum.games)

[Question] Why do we have the NATO logo?

KvmanThinking · Feb 19, 2025, 10:59 PM
1 point
4 comments · 1 min read · LW link

Metaculus Q4 AI Benchmarking: Bots Are Closing The Gap

Feb 19, 2025, 10:42 PM
13 points
0 comments · 13 min read · LW link
(www.metaculus.com)

Several Arguments Against the Mathematical Universe Hypothesis

Vittu Perkele · Feb 19, 2025, 10:13 PM
−4 points
6 comments · 3 min read · LW link
(open.substack.com)

Literature Review of Text AutoEncoders

NickyP · Feb 19, 2025, 9:54 PM
20 points
5 comments · 8 min read · LW link

DeepSeek Made it Even Harder for US AI Companies to Ever Reach Profitability

garrison · Feb 19, 2025, 9:02 PM
10 points
1 comment · LW link
(garrisonlovely.substack.com)

Won’t vs. Can’t: Sandbagging-like Behavior from Claude Models

Feb 19, 2025, 8:47 PM
15 points
1 comment · 1 min read · LW link
(alignment.anthropic.com)

AI Alignment and the Financial War Against Narcissistic Manipulation

henophilia · Feb 19, 2025, 8:42 PM
−17 points
2 comments · 3 min read · LW link