Improved visualizations of METR Time Horizons paper.

LDJ · Mar 19, 2025, 11:36 PM
20 points
4 comments · 2 min read · LW link

Is CCP au­thor­i­tar­i­anism good for build­ing safe AI?

HrussMar 19, 2025, 11:13 PM
1 point
0 comments1 min readLW link

The case against “The case against AI alignment”

KvmanThinking · Mar 19, 2025, 10:40 PM
2 points
0 comments · 1 min read · LW link

[Question] Superintelligence Strategy: A Pragmatic Path to… Doom?

Mr Beastly · Mar 19, 2025, 10:30 PM
6 points
0 comments · 3 min read · LW link

SHIFT relies on token-level features to de-bias Bias in Bios probes

Tim Hua · Mar 19, 2025, 9:29 PM
39 points
2 comments · 6 min read · LW link

Janet must die

Shmi · Mar 19, 2025, 8:35 PM
12 points
3 comments · 2 min read · LW link

[Question] Why am I getting downvoted on Lesswrong?

Oxidize · Mar 19, 2025, 6:32 PM
7 points
14 comments · 1 min read · LW link

Forecasting AI Futures Resource Hub

Alvin Ånestrand · Mar 19, 2025, 5:26 PM
2 points
0 comments · 2 min read · LW link
(forecastingaifutures.substack.com)

TBC episode w Dave Kasten from Control AI on AI Policy

Eneasz · Mar 19, 2025, 5:09 PM
8 points
0 comments · 1 min read · LW link
(www.thebayesianconspiracy.com)

Prioritizing threats for AI control

ryan_greenblatt · Mar 19, 2025, 5:09 PM
58 points
2 comments · 10 min read · LW link

The Illusion of Transparency as a Trust-Building Mechanism

Priyanka Bharadwaj · Mar 19, 2025, 5:09 PM
2 points
0 comments · 1 min read · LW link

How Do We Govern AI Well?

kaime · Mar 19, 2025, 5:08 PM
2 points
0 comments · 25 min read · LW link

METR: Measuring AI Ability to Complete Long Tasks

Zach Stein-Perlman · Mar 19, 2025, 4:00 PM
241 points
104 comments · 5 min read · LW link
(metr.org)

Why I think AI will go poorly for humanity

Alek Westover · Mar 19, 2025, 3:52 PM
13 points
0 comments · 30 min read · LW link

The principle of genomic liberty

TsviBT · Mar 19, 2025, 2:27 PM
76 points
51 comments · 17 min read · LW link

Going Nova

Zvi · Mar 19, 2025, 1:30 PM
64 points
14 comments · 15 min read · LW link
(thezvi.wordpress.com)

Equations Mean Things

abstractapplic · Mar 19, 2025, 8:16 AM
46 points
10 comments · 3 min read · LW link

Elite Coordination via the Consensus of Power

Richard_Ngo · Mar 19, 2025, 6:56 AM
92 points
15 comments · 12 min read · LW link
(www.mindthefuture.info)

What I am working on right now and why: representation engineering edition

Lukasz G Bartoszcze · Mar 18, 2025, 10:37 PM
3 points
0 comments · 3 min read · LW link

Boots theory and Sybil Ramkin

philh · Mar 18, 2025, 10:10 PM
37 points
17 comments · 11 min read · LW link
(reasonableapproximation.net)

Schmidt Sciences Technical AI Safety RFP on Inference-Time Compute – Deadline: April 30

Ryan Gajarawala · Mar 18, 2025, 6:05 PM
18 points
0 comments · 2 min read · LW link
(www.schmidtsciences.org)

PRISM: Perspective Reasoning for Integrated Synthesis and Mediation (Interactive Demo)

Anthony Diamond · Mar 18, 2025, 6:03 PM
10 points
2 comments · 1 min read · LW link

Subspace Rerouting: Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models

Le magicien quantique · Mar 18, 2025, 5:55 PM
6 points
1 comment · 10 min read · LW link

Progress links and short notes, 2025-03-18

jasoncrawford · Mar 18, 2025, 5:14 PM
8 points
0 comments · 3 min read · LW link
(newsletter.rootsofprogress.org)

The Convergent Path to the Stars

Maxime Riché · Mar 18, 2025, 5:09 PM
6 points
0 comments · 20 min read · LW link

Sapir-Whorf Ego Death

Jonathan Moregård · Mar 18, 2025, 4:57 PM
8 points
7 comments · 2 min read · LW link
(honestliving.substack.com)

Smelling Nice is Good, Actually

Gordon Seidoh Worley · Mar 18, 2025, 4:54 PM
28 points
8 comments · 3 min read · LW link
(uncertainupdates.substack.com)

A Taxonomy of Jobs Deeply Resistant to TAI Automation

Deric Cheng · Mar 18, 2025, 4:25 PM
9 points
0 comments · 12 min read · LW link
(www.convergenceanalysis.org)

Why Are The Human Sciences Hard? Two New Hypotheses

Mar 18, 2025, 3:45 PM
39 points
14 comments · 9 min read · LW link

Go home GPT-4o, you’re drunk: emergent misalignment as lowered inhibitions

Mar 18, 2025, 2:48 PM
79 points
12 comments · 5 min read · LW link

[Question] What is the theory of change behind writing papers about AI safety?

Kajus · Mar 18, 2025, 12:51 PM
7 points
1 comment · 1 min read · LW link

OpenAI #11: America Action Plan

Zvi · Mar 18, 2025, 12:50 PM
83 points
3 comments · 6 min read · LW link
(thezvi.wordpress.com)

I changed my mind about orca intelligence

Towards_Keeperhood · Mar 18, 2025, 10:15 AM
46 points
24 comments · 5 min read · LW link

[Question] Is Peano arithmetic trying to kill us? Do we care?

Q Home · Mar 18, 2025, 8:22 AM
17 points
2 comments · 2 min read · LW link

Do What the Mammals Do

CrimsonChin · Mar 18, 2025, 3:57 AM
2 points
6 comments · 4 min read · LW link

What Actually Matters Until We Reach the Singularity

Lexius · Mar 18, 2025, 2:17 AM
−1 points
0 comments · 9 min read · LW link

Meaning as a cognitive substitute for survival instincts: A thought experiment

Ovidijus Šimkus · Mar 18, 2025, 1:53 AM
0 points
0 comments · 2 min read · LW link

Against Yudkowsky’s evolution analogy for AI x-risk [unfinished]

Fiora Sunshine · Mar 18, 2025, 1:41 AM
50 points
18 comments · 11 min read · LW link

An “AI researcher” has written a paper on optimizing AI architecture and optimized a language model to several orders of magnitude more efficiency.

Y B · Mar 18, 2025, 1:15 AM
3 points
1 comment · 1 min read · LW link

LessOnline 2025: Early Bird Tickets On Sale

Ben Pace · Mar 18, 2025, 12:22 AM
37 points
5 comments · 5 min read · LW link

Feedback loops for exercise (VO2Max)

Elizabeth · Mar 18, 2025, 12:10 AM
63 points
12 comments · 8 min read · LW link
(acesounderglass.com)

FrontierMath Score of o3-mini Much Lower Than Claimed

YafahEdelman · Mar 17, 2025, 10:41 PM
61 points
7 comments · 1 min read · LW link

Proof-of-Concept Debugger for a Small LLM

Mar 17, 2025, 10:27 PM
27 points
0 comments · 11 min read · LW link

Effectively Communicating with DC Policymakers

PolicyTakes · Mar 17, 2025, 10:11 PM
14 points
0 comments · 2 min read · LW link

Mind the Gap

Bridgett Kay · Mar 17, 2025, 9:59 PM
8 points
0 comments · 5 min read · LW link
(dxmrevealed.wordpress.com)

EIS XV: A New Proof of Concept for Useful Interpretability

scasper · Mar 17, 2025, 8:05 PM
30 points
2 comments · 3 min read · LW link

Sentinel’s Global Risks Weekly Roundup #11/2025. Trump invokes Alien Enemies Act, Chinese invasion barges deployed in exercise.

NunoSempere · Mar 17, 2025, 7:34 PM
59 points
3 comments · 6 min read · LW link
(blog.sentinel-team.org)

Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations

Mar 17, 2025, 7:11 PM
182 points
9 comments · 6 min read · LW link

Things Look Bleak for White-Collar Jobs Due to AI Acceleration

Declan Molony · Mar 17, 2025, 5:03 PM
15 points
0 comments · 10 min read · LW link

Three Types of Intelligence Explosion

Mar 17, 2025, 2:47 PM
39 points
8 comments · 3 min read · LW link
(www.forethought.org)