FrontierMath Score of o3-mini Much Lower Than Claimed

YafahEdelman · Mar 17, 2025, 10:41 PM
61 points
7 comments · 1 min read · LW link

Proof-of-Concept Debugger for a Small LLM

Mar 17, 2025, 10:27 PM
27 points
0 comments · 11 min read · LW link

Effectively Communicating with DC Policymakers

PolicyTakes · Mar 17, 2025, 10:11 PM
14 points
0 comments · 2 min read · LW link

Mind the Gap

Bridgett Kay · Mar 17, 2025, 9:59 PM
8 points
0 comments · 5 min read · LW link
(dxmrevealed.wordpress.com)

EIS XV: A New Proof of Concept for Useful Interpretability

scasper · Mar 17, 2025, 8:05 PM
30 points
2 comments · 3 min read · LW link

Sentinel’s Global Risks Weekly Roundup #11/2025. Trump invokes Alien Enemies Act, Chinese invasion barges deployed in exercise.

NunoSempere · Mar 17, 2025, 7:34 PM
59 points
3 comments · 6 min read · LW link
(blog.sentinel-team.org)

Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations

Mar 17, 2025, 7:11 PM
184 points
9 comments · 6 min read · LW link

Things Look Bleak for White-Collar Jobs Due to AI Acceleration

Declan Molony · Mar 17, 2025, 5:03 PM
15 points
0 comments · 10 min read · LW link

Three Types of Intelligence Explosion

Mar 17, 2025, 2:47 PM
40 points
8 comments · 3 min read · LW link
(www.forethought.org)

An Advent of Thought

Kaarel · Mar 17, 2025, 2:21 PM
51 points
13 comments · 48 min read · LW link

Interested in working from a new Boston AI Safety Hub?

Mar 17, 2025, 1:42 PM
17 points
0 comments · 2 min read · LW link

Other Civilizations Would Recover 84+% of Our Cosmic Resources—A Challenge to Extinction Risk Prioritization

Maxime Riché · Mar 17, 2025, 1:12 PM
5 points
0 comments · 12 min read · LW link

Monthly Roundup #28: March 2025

Zvi · Mar 17, 2025, 12:50 PM
31 points
8 comments · 14 min read · LW link
(thezvi.wordpress.com)

Are corporations superintelligent?

Mar 17, 2025, 10:36 AM
3 points
3 comments · 1 min read · LW link
(aisafety.info)

One pager

samuelshadrach · Mar 17, 2025, 8:12 AM
6 points
2 comments · 8 min read · LW link
(samuelshadrach.com)

The Case for AI Optimism

Annapurna · Mar 17, 2025, 1:29 AM
2 points
1 comment · 1 min read · LW link
(nationalaffairs.com)

Notable runaway-optimiser-like LLM failure modes on Biologically and Economically aligned AI safety benchmarks for LLMs with simplified observation format (BioBlue)

Mar 16, 2025, 11:23 PM
45 points
8 comments · 11 min read · LW link

Read More News

utilistrutil · Mar 16, 2025, 9:31 PM
22 points
2 comments · 5 min read · LW link

What would a post labor economy *actually* look like?

Ansh Juneja · Mar 16, 2025, 8:38 PM
2 points
2 comments · 17 min read · LW link

Why White-Box Redteaming Makes Me Feel Weird

Zygi Straznickas · Mar 16, 2025, 6:54 PM
204 points
36 comments · 3 min read · LW link

How I’ve run major projects

benkuhn · Mar 16, 2025, 6:40 PM
126 points
10 comments · 8 min read · LW link
(www.benkuhn.net)

Counting Objections to Housing

jefftk · Mar 16, 2025, 6:20 PM
13 points
7 comments · 3 min read · LW link
(www.jefftk.com)

I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?

shrimpy · Mar 16, 2025, 4:52 PM
161 points
26 comments · 1 min read · LW link

Siberian Arctic origins of East Asian psychology

davidsun · Mar 16, 2025, 4:52 PM
6 points
0 comments · 1 min read · LW link

AI Model History is Being Lost

Vale · Mar 16, 2025, 12:38 PM
19 points
1 comment · 1 min read · LW link
(vale.rocks)

Metacognition Broke My Nail-Biting Habit

Rafka · Mar 16, 2025, 12:36 PM
45 points
20 comments · 2 min read · LW link

[Question] Can we ever ensure AI alignment if we can only test AI personas?

Karl von Wendt · Mar 16, 2025, 8:06 AM
22 points
8 comments · 1 min read · LW link

Can time preferences make AI safe?

TerriLeaf · Mar 15, 2025, 9:41 PM
1 point
1 comment · 2 min read · LW link

Help make the orca language experiment happen

Towards_Keeperhood · Mar 15, 2025, 9:39 PM
9 points
12 comments · 5 min read · LW link

Announcing EXP: Experimental Summer Workshop on Collective Cognition

Mar 15, 2025, 8:14 PM
36 points
2 comments · 4 min read · LW link

AI Self-Correction vs. Self-Reflection: Is There a Fundamental Difference?

Project Solon · Mar 15, 2025, 6:24 PM
−3 points
0 comments · 1 min read · LW link

The Fork in the Road

testingthewaters · Mar 15, 2025, 5:36 PM
14 points
12 comments · 2 min read · LW link

Any-Benefit Mindset and Any-Reason Reasoning

silentbob · Mar 15, 2025, 5:10 PM
36 points
9 comments · 6 min read · LW link

The Silent War: AGI-on-AGI Warfare and What It Means For Us

funnyfranco · Mar 15, 2025, 3:24 PM
−1 points
2 comments · 22 min read · LW link

Paper: Field-building and the epistemic culture of AI safety

peterslattery · Mar 15, 2025, 12:30 PM
13 points
3 comments · 3 min read · LW link
(firstmonday.org)

Why Billionaires Will Not Survive an AGI Extinction Event

funnyfranco · Mar 15, 2025, 6:08 AM
8 points
0 comments · 14 min read · LW link

AI Says It’s Not Conscious. That’s a Bad Answer to the Wrong Question.

JohnMarkNorman · Mar 15, 2025, 1:25 AM
1 point
0 comments · 2 min read · LW link

Report & retrospective on the Dovetail fellowship

Alex_Altair · Mar 14, 2025, 11:20 PM
26 points
3 comments · 9 min read · LW link

The Dangers of Outsourcing Thinking: Losing Our Critical Thinking to the Over-Reliance on AI Decision-Making

Cameron Tomé-Moreira · Mar 14, 2025, 11:07 PM
11 points
4 comments · 8 min read · LW link

LLMs may enable direct democracy at scale

Davey Morse · Mar 14, 2025, 10:51 PM
14 points
20 comments · 1 min read · LW link

2024 Unofficial LessWrong Survey Results

Screwtape · Mar 14, 2025, 10:29 PM
109 points
28 comments · 48 min read · LW link

AI4Science: The Hidden Power of Neural Networks in Scientific Discovery

Max Ma · Mar 14, 2025, 9:18 PM
2 points
2 comments · 1 min read · LW link

What are we doing when we do mathematics?

epicurus · Mar 14, 2025, 8:54 PM
7 points
2 comments · 1 min read · LW link
(asving.com)

AI for Epistemics Hackathon

Austin Chen · Mar 14, 2025, 8:46 PM
77 points
12 comments · 10 min read · LW link
(manifund.substack.com)

Geometry of Features in Mechanistic Interpretability

Gunnar Carlsson · Mar 14, 2025, 7:11 PM
16 points
0 comments · 8 min read · LW link

AI Tools for Existential Security

Mar 14, 2025, 6:38 PM
22 points
4 comments · 11 min read · LW link
(www.forethought.org)

Capitalism as the Catalyst for AGI-Induced Human Extinction

funnyfranco · Mar 14, 2025, 6:14 PM
−3 points
2 comments · 21 min read · LW link

Minor interpretability exploration #3: Extending superposition to different activation functions (loss landscape)

Rareș Baron · Mar 14, 2025, 3:45 PM
3 points
0 comments · 3 min read · LW link

AI for AI safety

Joe Carlsmith · Mar 14, 2025, 3:00 PM
78 points
13 comments · 17 min read · LW link
(joecarlsmith.substack.com)

Evaluating the ROI of Information

Declan Molony · Mar 14, 2025, 2:22 PM
12 points
3 comments · 3 min read · LW link