Vitalik’s Response to AI 2027

Daniel Kokotajlo · 11 Jul 2025 21:43 UTC
116 points
53 comments · 12 min read · LW link
(vitalik.eth.limo)

the jackpot age

thiccythot · 11 Jul 2025 21:05 UTC
263 points
17 comments · 4 min read · LW link

OpenAI Model Differentiation 101

Zvi · 11 Jul 2025 20:30 UTC
31 points
5 comments · 11 min read · LW link
(thezvi.wordpress.com)

ACX Cape Town

teegs · 11 Jul 2025 18:46 UTC
1 point
0 comments · 1 min read · LW link

Adding noise to a sandbagging model can reveal its true capabilities

TheManxLoiner · 11 Jul 2025 16:56 UTC
16 points
1 comment · 6 min read · LW link

Reflections on AI Companionship and Rational Vulnerability (Or, how I almost fell in love with an anime Catgirl LLM).

Noah Weinberger · 11 Jul 2025 16:12 UTC
11 points
2 comments · 8 min read · LW link

The Perils of Optimizing Learned Reward Functions

Lukas Fluri · 11 Jul 2025 16:06 UTC
17 points
1 comment · 21 min read · LW link

Every Universe Thinks It’s the Realest One

Commander Zander · 11 Jul 2025 15:45 UTC
15 points
1 comment · 4 min read · LW link

On open-science research labs on discord, and getting more people.

Seon Gunness · 11 Jul 2025 12:05 UTC
9 points
2 comments · 34 min read · LW link

Memory Decoding Journal Club: Binary and analog variation of synapses between cortical pyramidal neurons

Devin Ward · 11 Jul 2025 4:47 UTC
1 point
0 comments · 1 min read · LW link

Deconfusing ‘AI’ and ‘evolution’

Remmelt · 11 Jul 2025 1:44 UTC
12 points
9 comments · 26 min read · LW link

So You Think You’ve Awoken ChatGPT

JustisMills · 11 Jul 2025 1:01 UTC
310 points
87 comments · 9 min read · LW link

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

habryka · 11 Jul 2025 0:23 UTC
97 points
43 comments · 6 min read · LW link
(metr.org)

On thinking about AI risks concretely

zeshen · 11 Jul 2025 0:04 UTC
9 points
4 comments · 4 min read · LW link

Metacognition and Self-Modeling in LLMs

Christopher Ackerman · 10 Jul 2025 21:25 UTC
19 points
2 comments · 16 min read · LW link

My take on AI Alignment: Corporate misalignment and DAOs

act65 · 10 Jul 2025 20:33 UTC
7 points
3 comments · 1 min read · LW link

what makes Claude 3 Opus misaligned

janus · 10 Jul 2025 20:06 UTC
104 points
11 comments · 5 min read · LW link

The Rising Premium of Life, Or: How We Learned to Start Worrying and Fear Everything

Linch · 10 Jul 2025 19:12 UTC
10 points
10 comments · 1 min read · LW link
(linch.substack.com)

Lessons from the Iraq War for AI policy

Buck · 10 Jul 2025 18:52 UTC
190 points
25 comments · 4 min read · LW link

Linkpost: Redwood Research reading list

Julian Stastny · 10 Jul 2025 18:39 UTC
50 points
0 comments · 1 min read · LW link
(redwoodresearch.substack.com)

Generalized Hangriness: A Standard Rationalist Stance Toward Emotions

johnswentworth · 10 Jul 2025 18:22 UTC
359 points
69 comments · 7 min read · LW link

The bitter lesson of misuse detection

10 Jul 2025 14:50 UTC
37 points
6 comments · 7 min read · LW link

Evaluating and monitoring for AI scheming

10 Jul 2025 14:24 UTC
52 points
9 comments · 5 min read · LW link
(deepmindsafetyresearch.medium.com)

White Box Control at UK AISI—Update on Sandbagging Investigations

10 Jul 2025 13:37 UTC
78 points
10 comments · 18 min read · LW link

AI #124: Grokless Interlude

Zvi · 10 Jul 2025 12:40 UTC
28 points
5 comments · 43 min read · LW link
(thezvi.wordpress.com)

How many OOMs of compute span the human range?

tickybob · 10 Jul 2025 11:51 UTC
12 points
6 comments · 1 min read · LW link

The anti-Kardashev scale is a better measure of civilizational power

RussellThor · 10 Jul 2025 10:02 UTC
5 points
2 comments · 3 min read · LW link

If Anyone Builds It, Everyone Dies: A Conversation with Nate Soares and Tim Urban

10 Jul 2025 8:00 UTC
23 points
2 comments · 1 min read · LW link

80,000 Hours is producing AI in Context — a new YouTube channel. Our first video, about the AI 2027 scenario, is up!

chanamessinger · 9 Jul 2025 23:58 UTC
54 points
3 comments · 3 min read · LW link

Asking for a Friend (AI Research Protocols)

The Dao of Bayes · 9 Jul 2025 23:41 UTC
11 points
33 comments · 2 min read · LW link

Demons, Simulators and Gremlins

J Bostock · 9 Jul 2025 20:22 UTC
10 points
1 comment · 3 min read · LW link

Investigating Priming in Alignment Faking

Wayne · 9 Jul 2025 17:08 UTC
13 points
0 comments · 4 min read · LW link

No, Grok, No

Zvi · 9 Jul 2025 15:10 UTC
92 points
3 comments · 17 min read · LW link
(thezvi.wordpress.com)

The Asteroid Setup That Demands an Explanation

David Björling · 9 Jul 2025 14:55 UTC
−2 points
32 comments · 5 min read · LW link

What’s worse, spies or schemers?

9 Jul 2025 14:37 UTC
51 points
2 comments · 5 min read · LW link

Anthropic reasoning intro (notes on Bostrom)

jchan · 9 Jul 2025 14:24 UTC
7 points
0 comments · 7 min read · LW link

No, We’re Not Getting Meaningful Oversight of AI

Davidmanheim · 9 Jul 2025 11:10 UTC
41 points
4 comments · 1 min read · LW link
(arxiv.org)

Hybrid model reveals people act less rationally in complex games, more predictably in simple ones

Gunnar_Zarncke · 9 Jul 2025 10:15 UTC
9 points
0 comments · 1 min read · LW link
(arxiv.org)

Subway Particle Levels Aren’t That High

jefftk · 9 Jul 2025 2:30 UTC
80 points
4 comments · 1 min read · LW link
(www.jefftk.com)

TT Self Study Journal # 2

TristanTrim · 9 Jul 2025 2:16 UTC
3 points
0 comments · 7 min read · LW link

AI Agent Benchmarks Are Broken

Sasha Cui · 8 Jul 2025 22:11 UTC
10 points
0 comments · 1 min read · LW link
(ddkang.substack.com)

Why Do Some Language Models Fake Alignment While Others Don’t?

8 Jul 2025 21:49 UTC
158 points
14 comments · 5 min read · LW link
(arxiv.org)

A Medium Scenario

Chapin Lenthall-Cleary · 8 Jul 2025 20:09 UTC
18 points
12 comments · 20 min read · LW link

An Opinionated Guide to Using Anki Correctly

Luise · 8 Jul 2025 20:01 UTC
156 points
58 comments · 27 min read · LW link

Lenses, Metaphors, and Meaning

8 Jul 2025 19:46 UTC
7 points
0 comments · 4 min read · LW link

Applying right-wing frames to AGI (geo)politics

Richard_Ngo · 8 Jul 2025 18:03 UTC
64 points
25 comments · 3 min read · LW link
(x.com)

The Unjournal’s “Pivotal Questions” project

david reinstein · 8 Jul 2025 15:55 UTC
6 points
1 comment · 1 min read · LW link
(forum.effectivealtruism.org)

Balsa Update: Springtime in DC

Zvi · 8 Jul 2025 15:00 UTC
61 points
6 comments · 10 min read · LW link
(thezvi.wordpress.com)

MIT FutureTech are hiring a Postdoctoral Associate to work on AI Performance and Safety

peterslattery · 8 Jul 2025 14:02 UTC
3 points
0 comments · 4 min read · LW link

Energy-Based Transformers are Scalable Learners and Thinkers

Matrice Jacobine · 8 Jul 2025 13:44 UTC
7 points
5 comments · 1 min read · LW link
(energy-based-transformers.github.io)