The Future of Aligning Deep Learning systems will probably look like “training on interp”

williawa, 20 Mar 2026 23:06 UTC
8 points
1 comment, 4 min read, LW link

An agent autonomously builds a 1.5 GHz Linux-capable RISC-V CPU

sanxiyn, 20 Mar 2026 23:03 UTC
16 points
0 comments, 2 min read, LW link
(arxiv.org)

Untrusted monitoring: extra bits

Morgan S, 20 Mar 2026 21:32 UTC
7 points
0 comments, 15 min read, LW link

Finding features in Transformers: Contrastive directions elicit stronger low-level perturbation responses than baselines

20 Mar 2026 21:09 UTC
18 points
1 comment, 6 min read, LW link

ARENA 7.0 Impact Report

20 Mar 2026 17:09 UTC
7 points
0 comments, 21 min read, LW link

The Federal AI Policy Framework: An Improvement, But My Offer Is (Still Almost) Nothing

Zvi, 20 Mar 2026 16:51 UTC
20 points
0 comments, 8 min read, LW link
(thezvi.wordpress.com)

Confusion around the term reward hacking

ariana_azarbal, 20 Mar 2026 16:13 UTC
29 points
5 comments, 5 min read, LW link

The Distaff Texts

Tomás B., 20 Mar 2026 15:05 UTC
44 points
1 comment, 14 min read, LW link

It’s a Good Thing to Respond to Internet Trolls

Bowl of Cereal, 20 Mar 2026 14:22 UTC
1 point
2 comments, 2 min read, LW link

Untrusted Monitoring is Default; Trusted Monitoring is not

J Bostock, 20 Mar 2026 14:10 UTC
19 points
0 comments, 4 min read, LW link

Against Messianic AI: Why Optimizing the Environment Doesn’t Optimize the Agent

Nathan Heath, 20 Mar 2026 12:40 UTC
−3 points
0 comments, 3 min read, LW link

2nd (Unofficial) ACX Weekend

Fernand0, 20 Mar 2026 12:13 UTC
1 point
0 comments, 1 min read, LW link

Why I am not buying IPv4 addresses as an investment

samuelshadrach, 20 Mar 2026 9:02 UTC
5 points
1 comment, 5 min read, LW link
(samuelshadrach.com)

Hundred ways a superintelligence could kill you (non-serious exercise)

samuelshadrach, 20 Mar 2026 8:58 UTC
2 points
1 comment, 6 min read, LW link
(samuelshadrach.com)

Internet anonymity without Tor

samuelshadrach, 20 Mar 2026 8:52 UTC
3 points
0 comments, 3 min read, LW link
(samuelshadrach.com)

No, You Don’t Need Self-Locating Evidence.

Ape in the coat, 20 Mar 2026 5:38 UTC
9 points
2 comments, 5 min read, LW link
(substack.com)

The Low Hanging Fruit of AI Self Improvement

HunterJay, 20 Mar 2026 4:09 UTC
1 point
0 comments, 5 min read, LW link

Nullius in Verba

Aurelia, 20 Mar 2026 3:19 UTC
90 points
10 comments, 12 min read, LW link

Does Hebrew Have Verbs?

Benquo, 20 Mar 2026 3:04 UTC
27 points
4 comments, 6 min read, LW link

Positive-sum interactions between players with linear utility in resources

Cleo Nardo, 20 Mar 2026 0:42 UTC
10 points
0 comments, 2 min read, LW link