Vitalik’s Response to AI 2027

Daniel Kokotajlo · 11 Jul 2025 21:43 UTC
97 points
33 comments · 12 min read · LW link
(vitalik.eth.limo)

My favorite mindset for flirting

Chris Lakin · 11 Jul 2025 21:31 UTC
28 points
0 comments · 1 min read · LW link
(chrislakin.blog)

OpenAI Model Differentiation 101

Zvi · 11 Jul 2025 20:30 UTC
24 points
5 comments · 11 min read · LW link
(thezvi.wordpress.com)

Adding noise to a sandbagging model can reveal its true capabilities

TheManxLoiner · 11 Jul 2025 16:56 UTC
5 points
0 comments · 6 min read · LW link

Reflections on AI Companionship and Rational Vulnerability (Or, how I almost fell in love with an anime Catgirl LLM).

Clock · 11 Jul 2025 16:12 UTC
7 points
2 comments · 8 min read · LW link

The Perils of Optimizing Learned Reward Functions

Lukas Fluri · 11 Jul 2025 16:06 UTC
14 points
1 comment · 21 min read · LW link

Every Universe Thinks It’s the Realest One

Commander Zander · 11 Jul 2025 15:45 UTC
15 points
1 comment · 4 min read · LW link

Implicit and Explicit Learning

Remmelt · 11 Jul 2025 1:44 UTC
6 points
2 comments · 5 min read · LW link

So You Think You’ve Awoken ChatGPT

JustisMills · 11 Jul 2025 1:01 UTC
149 points
26 comments · 9 min read · LW link

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

habryka · 11 Jul 2025 0:23 UTC
87 points
30 comments · 6 min read · LW link
(metr.org)

On thinking about AI risks concretely

zeshen · 11 Jul 2025 0:04 UTC
6 points
4 comments · 4 min read · LW link

Metacognition and Self-Modeling in LLMs

Christopher Ackerman · 10 Jul 2025 21:25 UTC
12 points
2 comments · 16 min read · LW link

My take on AI Alignment: Corporate misalignment and DAOs

act65 · 10 Jul 2025 20:33 UTC
7 points
1 comment · 1 min read · LW link

what makes Claude 3 Opus misaligned

janus · 10 Jul 2025 20:06 UTC
92 points
12 comments · 5 min read · LW link

The Tenets of a Rational Debate

sd · 10 Jul 2025 19:25 UTC
5 points
2 comments · 4 min read · LW link

The Rising Premium of Life, Or: How We Learned to Start Worrying and Fear Everything

Linch · 10 Jul 2025 19:12 UTC
9 points
10 comments · 1 min read · LW link
(linch.substack.com)

Lessons from the Iraq War for AI policy

Buck · 10 Jul 2025 18:52 UTC
143 points
23 comments · 4 min read · LW link

Linkpost: Redwood Research reading list

Julian Stastny · 10 Jul 2025 18:39 UTC
42 points
0 comments · 1 min read · LW link
(redwoodresearch.substack.com)

Generalized Hangriness: A Standard Rationalist Stance Toward Emotions

johnswentworth · 10 Jul 2025 18:22 UTC
206 points
16 comments · 7 min read · LW link

The bitter lesson of misuse detection

10 Jul 2025 14:50 UTC
26 points
6 comments · 7 min read · LW link