METR: Measuring AI Ability to Complete Long Tasks

Zach Stein-Perlman 19 Mar 2025 16:00 UTC
173 points
44 comments 1 min read LW link
(metr.org)

Why abortion looks more okay to us than killing babies

cousin_it 24 Nov 2010 10:08 UTC
25 points
67 comments 1 min read LW link

[Question] How far along Metr’s law can AI start automating or helping with alignment research?

Christopher King 20 Mar 2025 15:58 UTC
18 points
17 comments 1 min read LW link

Blues, Greens and abortion

Snowyowl 5 Mar 2011 19:15 UTC
17 points
158 comments 1 min read LW link

Why White-Box Redteaming Makes Me Feel Weird

Zygi Straznickas 16 Mar 2025 18:54 UTC
171 points
28 comments 3 min read LW link

[Question] Any mistakes in my understanding of Transformers?

Kallistos 21 Mar 2025 0:34 UTC
1 point
0 comments 1 min read LW link

The principle of genomic liberty

TsviBT 19 Mar 2025 14:27 UTC
87 points
16 comments 17 min read LW link

A Critique of “Utility”

Zero Contradictions 20 Mar 2025 23:21 UTC
−6 points
1 comment 2 min read LW link
(thewaywardaxolotl.blogspot.com)

Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs

23 Aug 2024 18:52 UTC
42 points
8 comments 16 min read LW link

[Question] Why am I getting downvoted on Lesswrong?

Oxidize 19 Mar 2025 18:32 UTC
4 points
13 comments 1 min read LW link

Counter-theses on Sleep

Natália 21 Mar 2022 23:21 UTC
447 points
135 comments 15 min read LW link 1 review

Algebraic Linguistics

abstractapplic 7 Dec 2024 19:18 UTC
35 points
29 comments 5 min read LW link

How AI Takeover Might Happen in 2 Years

joshc 7 Feb 2025 17:10 UTC
391 points
131 comments 29 min read LW link
(x.com)

Intention to Treat

Alicorn 20 Mar 2025 20:01 UTC
70 points
3 comments 2 min read LW link

FrontierMath Score of o3-mini Much Lower Than Claimed

YafahEdelman 17 Mar 2025 22:41 UTC
48 points
7 comments 1 min read LW link

Why Are The Human Sciences Hard? Two New Hypotheses

18 Mar 2025 15:45 UTC
20 points
7 comments 9 min read LW link

A Path out of Insufficient Views

Unreal 24 Sep 2024 20:00 UTC
40 points
53 comments 9 min read LW link

Anthropic: Progress from our Frontier Red Team

UnofficialLinkpostBot 20 Mar 2025 19:12 UTC
2 points
0 comments 6 min read LW link
(www.anthropic.com)

The Geometry of Linear Regression versus PCA

criticalpoints 23 Feb 2025 21:01 UTC
20 points
5 comments 6 min read LW link
(eregis.github.io)

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

25 Feb 2025 17:39 UTC
321 points
88 comments 4 min read LW link