Intention to Treat

Alicorn · 20 Mar 2025 20:01 UTC
68 points
3 comments · 2 min read · LW link

Everything’s An Emergency

omnizoid · 20 Mar 2025 17:12 UTC
7 points
0 comments · 2 min read · LW link

Non-Consensual Consent: The Performance of Choice in a Coercive World

Alex_Steiner · 20 Mar 2025 17:12 UTC
9 points
0 comments · 13 min read · LW link

[Question] How far along Metr’s law can AI start automating or helping with alignment research?

Christopher King · 20 Mar 2025 15:58 UTC
17 points
13 comments · 1 min read · LW link

What is an alignment tax?

20 Mar 2025 13:06 UTC
2 points
0 comments · 1 min read · LW link
(aisafety.info)

Longtermist Implications of the Existence Neutrality Hypothesis

Maxime Riché · 20 Mar 2025 12:20 UTC
2 points
2 comments · 21 min read · LW link

Defense Against The Super-Worms

viemccoy · 20 Mar 2025 7:24 UTC
11 points
0 comments · 2 min read · LW link

Socially Graceful Degradation

Screwtape · 20 Mar 2025 4:03 UTC
37 points
0 comments · 9 min read · LW link

Daniel Dennett, the Unity of Consciousness, and Animal Minds

stormykat · 20 Mar 2025 3:43 UTC
1 point
0 comments · 6 min read · LW link

Improved visualizations of METR Time Horizons paper.

LDJ · 19 Mar 2025 23:36 UTC
20 points
4 comments · 2 min read · LW link

The case against “The case against AI alignment”

KvmanThinking · 19 Mar 2025 22:40 UTC
2 points
0 comments · 1 min read · LW link

[Question] Superintelligence Strategy: A Pragmatic Path to… Doom?

Mr Beastly · 19 Mar 2025 22:30 UTC
6 points
0 comments · 3 min read · LW link

SHIFT relies on token-level features to de-bias Bias in Bios probes

Tim Hua · 19 Mar 2025 21:29 UTC
28 points
2 comments · 6 min read · LW link

Forecasting AI Futures Resource Hub

Alvin Ånestrand · 19 Mar 2025 17:26 UTC
2 points
0 comments · 2 min read · LW link
(forecastingaifutures.substack.com)

TBC episode w Dave Kasten from Control AI on AI Policy

Eneasz · 19 Mar 2025 17:09 UTC
8 points
0 comments · 1 min read · LW link
(www.thebayesianconspiracy.com)

Prioritizing threats for AI control

ryan_greenblatt · 19 Mar 2025 17:09 UTC
47 points
2 comments · 10 min read · LW link

The Illusion of Transparency as a Trust-Building Mechanism

Priyanka Bharadwaj · 19 Mar 2025 17:09 UTC
1 point
0 comments · 1 min read · LW link

How Do We Govern AI Well?

kaime · 19 Mar 2025 17:08 UTC
2 points
0 comments · 25 min read · LW link

METR: Measuring AI Ability to Complete Long Tasks

Zach Stein-Perlman · 19 Mar 2025 16:00 UTC
173 points
42 comments · 1 min read · LW link
(metr.org)

Why I think AI will go poorly for humanity

Alek Westover · 19 Mar 2025 15:52 UTC
11 points
0 comments · 30 min read · LW link