AI risk was not invented by AI CEOs to hype their companies

KatjaGrace · 30 Apr 2026 23:10 UTC
60 points
0 comments · 3 min read · LW link
(worldspiritsockpuppet.com)

How much should the ideal person cry wolf?

KatjaGrace · 30 Apr 2026 23:10 UTC
37 points
7 comments · 2 min read · LW link
(worldspiritsockpuppet.com)

Cambridge: the kettle

KatjaGrace · 30 Apr 2026 23:10 UTC
19 points
1 comment · 4 min read · LW link
(worldspiritsockpuppet.com)

AI unemployment and AI extinction are often the same

KatjaGrace · 30 Apr 2026 23:10 UTC
61 points
6 comments · 2 min read · LW link
(worldspiritsockpuppet.com)

San Francisco: self driving

KatjaGrace · 30 Apr 2026 23:10 UTC
8 points
0 comments · 1 min read · LW link
(worldspiritsockpuppet.com)

SFF’s HSEE grant round; human intelligence amplification projects I’d like to see

TsviBT · 30 Apr 2026 21:41 UTC
33 points
0 comments · 11 min read · LW link

To what extent is Qwen3-32B predicting its persona?

30 Apr 2026 21:09 UTC
85 points
3 comments · 10 min read · LW link

Projects that might help accelerate strong reprogenetics

TsviBT · 30 Apr 2026 20:55 UTC
22 points
1 comment · 12 min read · LW link

Exploring the capabilities spike with METR’s time horizon data: no clear signal

Ben_Snodin · 30 Apr 2026 20:54 UTC
19 points
0 comments · 5 min read · LW link
(www.bensnodin.com)

Alignment Faking in DeepSeek V4

Amina Keldibek · 30 Apr 2026 20:23 UTC
23 points
1 comment · 5 min read · LW link

Upcoming Workshop on Post-AGI Civilizational Equilibria

30 Apr 2026 19:51 UTC
28 points
1 comment · 1 min read · LW link

Cyborg evals

30 Apr 2026 17:31 UTC
33 points
2 comments · 5 min read · LW link

AI #166: Google Sells Out

Zvi · 30 Apr 2026 15:40 UTC
33 points
2 comments · 55 min read · LW link
(thezvi.wordpress.com)

Copycat Inkhaven 2 retrospective

Dentosal · 30 Apr 2026 13:33 UTC
4 points
0 comments · 1 min read · LW link

Open internship position + call for collaborations on threat model-dependent alignment, governance, and offense/defense balance

otto.barten · 30 Apr 2026 12:40 UTC
7 points
0 comments · 1 min read · LW link

Maybe I was too harsh on deep learning theory (three days ago)

LawrenceC · 30 Apr 2026 6:57 UTC
109 points
13 comments · 4 min read · LW link

On today’s panel with Bernie Sanders

David Scott Krueger · 30 Apr 2026 5:00 UTC
199 points
3 comments · 2 min read · LW link
(therealartificialintelligence.substack.com)

Red vs blue: The parable of the feud within a feud

Joe Rogero · 30 Apr 2026 4:01 UTC
25 points
22 comments · 5 min read · LW link
(subatomicarticles.com)

Scaffolding vs Reinforcement Finetuning for AI Forecasting

Ram Potham · 30 Apr 2026 2:51 UTC
15 points
0 comments · 4 min read · LW link

What Do You Mean by a Two-Year AGI Timeline?

Koby Lewis · 30 Apr 2026 1:58 UTC
6 points
1 comment · 1 min read · LW link

No Strong Orthogonality From Selection Pressure

lumpenspace · 30 Apr 2026 1:56 UTC
55 points
192 comments · 10 min read · LW link

Computation in Superposition: Two Handcrafted Models

30 Apr 2026 0:58 UTC
17 points
0 comments · 7 min read · LW link

Research Sabotage in ML Codebases

30 Apr 2026 0:26 UTC
62 points
3 comments · 6 min read · LW link

The fall of the theorem economy (David Bessis)

Caleb Biddulph · 29 Apr 2026 19:35 UTC
32 points
8 comments · 4 min read · LW link
(davidbessis.substack.com)

Probe-Based Data Attribution: Surfacing and Mitigating Undesirable Behaviors in LLM Post-Training

29 Apr 2026 19:30 UTC
16 points
0 comments · 13 min read · LW link

Book review: The Infinity Machine

PeterMcCluskey · 29 Apr 2026 18:59 UTC
24 points
1 comment · 6 min read · LW link

Lorxus Does Budget Inkhaven Again: 04/22~04/28

Lorxus · 29 Apr 2026 17:07 UTC
6 points
2 comments · 3 min read · LW link
(tiled-with-pentagons.blogspot.com)

Poisoning Fine-tuning Datasets of Constitutional Classifiers

29 Apr 2026 17:04 UTC
28 points
2 comments · 11 min read · LW link
(alignment.anthropic.com)

AGI is Probably Inevitable: A Model of Societal Ruptures

Mira Kennard · 29 Apr 2026 16:00 UTC
4 points
0 comments · 5 min read · LW link

Final research agenda #2: first sketch of a plan

Mitchell_Porter · 29 Apr 2026 15:19 UTC
22 points
0 comments · 4 min read · LW link

Bridging the Gap on AI Safety Policy

James Newport · 29 Apr 2026 14:53 UTC
7 points
0 comments · 4 min read · LW link
(forum.effectivealtruism.org)

The Enneagram is a Useful Fake Framework

Gordon Seidoh Worley · 29 Apr 2026 14:30 UTC
8 points
0 comments · 3 min read · LW link
(www.uncertainupdates.com)

The Most Important Charts In The World

Zvi · 29 Apr 2026 14:10 UTC
69 points
1 comment · 2 min read · LW link
(thezvi.wordpress.com)

Let Kids Keep More Productivity Gains

jefftk · 29 Apr 2026 14:00 UTC
66 points
5 comments · 1 min read · LW link
(www.jefftk.com)

Pears

TylerH · 29 Apr 2026 13:17 UTC
−1 points
0 comments · 4 min read · LW link

Goblin Mode, 24 Hours Later

Dylan Bowman · 29 Apr 2026 12:19 UTC
52 points
10 comments · 4 min read · LW link

Learning zero, and what SLT gets wrong about it

Dmitry Vaintrob · 29 Apr 2026 6:41 UTC
37 points
6 comments · 13 min read · LW link

Are LLMs not getting better?

kqr · 29 Apr 2026 6:27 UTC
24 points
4 comments · 2 min read · LW link

llm assistant personas seem increasingly incoherent (some subjective observations)

nostalgebraist · 29 Apr 2026 3:53 UTC
343 points
84 comments · 9 min read · LW link

The AI x-risk lawsuit waiting to happen

David Scott Krueger · 29 Apr 2026 3:50 UTC
12 points
0 comments · 2 min read · LW link
(therealartificialintelligence.substack.com)

Not a Paper: “Frontier Lab CEOs are Capable of In-Context Scheming”

LawrenceC · 29 Apr 2026 3:00 UTC
226 points
8 comments · 7 min read · LW link

Notes on Transformer Consciousness

slavachalnev · 29 Apr 2026 0:00 UTC
36 points
2 comments · 2 min read · LW link

SecureMaxx: A Lightweight Sequence Screening Tool for Agents

Austin Morrissey · 28 Apr 2026 23:47 UTC
11 points
0 comments · 8 min read · LW link

Will whole brain emulation matter for the AI transition?

djbinder · 28 Apr 2026 23:04 UTC
38 points
2 comments · 41 min read · LW link
(defensesindepth.bio)

Causal inference diary: skiing causes snow

Gretta Duleba · 28 Apr 2026 22:21 UTC
28 points
2 comments · 8 min read · LW link

Is AI welfare work puntable?

Oscar · 28 Apr 2026 21:17 UTC
15 points
2 comments · 7 min read · LW link

The Problem in the “Nerd Sniping” xkcd Comic

peralice · 28 Apr 2026 20:40 UTC
72 points
6 comments · 12 min read · LW link

Comment on “Forecasting is Way Overrated, and We Should Stop Funding It”

Josh Rosenberg · 28 Apr 2026 20:16 UTC
22 points
0 comments · 9 min read · LW link

Strategy matters when someone implements it. Astra is cultivating people to do both.

28 Apr 2026 19:58 UTC
18 points
0 comments · 4 min read · LW link

ML Safety Newsletter #20: AI Wellbeing, Classifier Jailbreaking and Honest Pushback Benchmarking

28 Apr 2026 19:16 UTC
16 points
0 comments · 5 min read · LW link