InsanityBench: Cryptic Puzzles as a Probe for Lateral Thinking

RobinHa · 22 Feb 2026 14:20 UTC
15 points
0 comments · 4 min read · LW link
(www.robinhaselhorst.com)

The world won’t end, but we should be ashamed for trying

George3d6 · 22 Feb 2026 13:01 UTC
−2 points
0 comments · 12 min read · LW link
(cerebralab.com)

Multiple Independent Semantic Axes in Gemma 3 270M

CharlesL · 22 Feb 2026 1:55 UTC
12 points
0 comments · 3 min read · LW link

A Taxonomy of Traces

aleph_four · 22 Feb 2026 1:28 UTC
0 points
0 comments · 10 min read · LW link

Hierarchical Goal Induction With Ethics

aleph_four · 22 Feb 2026 0:53 UTC
5 points
0 comments · 4 min read · LW link

Did Claude 3 Opus align itself via gradient hacking?

Fiora Starlight · 21 Feb 2026 22:24 UTC
171 points
9 comments · 19 min read · LW link

If you don’t feel deeply confused about AGI risk, something’s wrong

Dave Banerjee · 21 Feb 2026 15:34 UTC
62 points
8 comments · 5 min read · LW link
(open.substack.com)

Ponzi schemes as a demonstration of out-of-distribution generalization

TFD · 21 Feb 2026 13:19 UTC
9 points
2 comments · 6 min read · LW link
(www.thefloatingdroid.com)

LLMs and Literature: Where Value Actually Comes From

derelict5432 · 21 Feb 2026 13:16 UTC
7 points
7 comments · 4 min read · LW link

The Spectre haunting the “AI Safety” Community

Gabriel Alfour · 21 Feb 2026 11:14 UTC
142 points
9 comments · 6 min read · LW link
(cognition.cafe)

Alignment to Evil

Matrice Jacobine · 21 Feb 2026 3:29 UTC
54 points
2 comments · 1 min read · LW link
(tetraspace.substack.com)

Robert Sapolsky Is Simply Not Talking About Compatibilism

Julius · 21 Feb 2026 1:27 UTC
11 points
4 comments · 8 min read · LW link
(thegreymatter.substack.com)

How will we do SFT on models with opaque reasoning?

21 Feb 2026 0:00 UTC
31 points
11 comments · 7 min read · LW link

Agent-first context menus

Surya Kasturi · 20 Feb 2026 23:45 UTC
3 points
1 comment · 2 min read · LW link

Hodoscope: Visualization for Efficient Human Supervision

20 Feb 2026 23:41 UTC
7 points
0 comments · 2 min read · LW link
(hodoscope.dev)

Can Current AI Match (or Outmatch) Professionals in Economically Valuable Tasks?

saahir.vazirani · 20 Feb 2026 21:38 UTC
6 points
0 comments · 5 min read · LW link

METR’s 14h 50% Horizon Impacts The Economy More Than ASI Timelines

Michaël Trazzi · 20 Feb 2026 21:08 UTC
40 points
11 comments · 2 min read · LW link

New video from Palisade Research: No One Understands Why AI Works

peterbarnett · 20 Feb 2026 20:29 UTC
55 points
2 comments · 1 min read · LW link
(www.youtube.com)

Militaries are going autonomous. But will AI lead to new wars? A tour of recent research

Mordechai Rorvig · 20 Feb 2026 18:26 UTC
1 point
0 comments · 2 min read · LW link
(www.foommagazine.org)

Unprecedented Catastrophes Have Non-Canonical Probabilities

E.G. Blee-Goldman · 20 Feb 2026 18:23 UTC
7 points
2 comments · 14 min read · LW link