The Russell Conjugation Illuminator

TimmyM
Apr 17, 2025, 7:33 PM
51 points
14 comments
1 min read
(russellconjugations.com)

Announcing Progress Conference 2025

jasoncrawford
Apr 17, 2025, 5:12 PM
12 points
0 comments
1 min read
(newsletter.rootsofprogress.org)

The Mirror Paradox

Jeremy Kraybill
Apr 17, 2025, 4:23 PM
−6 points
0 comments
1 min read

Memory Decoding Journal Club

Devin Ward
Apr 17, 2025, 4:19 PM
1 point
0 comments
1 min read

Host Keys and SSHing to EC2

jefftk
Apr 17, 2025, 3:10 PM
10 points
6 comments
1 min read
(www.jefftk.com)

AI #112: Release the Everything

Zvi
Apr 17, 2025, 3:10 PM
41 points
6 comments
40 min read
(thezvi.wordpress.com)

On AI personhood

p.b.
Apr 17, 2025, 12:31 PM
4 points
7 comments
1 min read

8 PRIME IDENTITIES - An analysis

P. João
Apr 17, 2025, 11:36 AM
−5 points
0 comments
2 min read

8 LATENT VALUES - A simplified construction from MaxEnt Informational Efficiency in 4 questions

P. João
Apr 17, 2025, 11:04 AM
3 points
5 comments
3 min read

Automating Mechanistic Interpretability via Program Synthesis

Edy Nastase
Apr 17, 2025, 10:58 AM
1 point
1 comment
1 min read

Understanding and overcoming AGI apathy

Dhruv Sumathi
Apr 17, 2025, 1:04 AM
25 points
1 comment
13 min read
(dhruvsumathi.substack.com)

ALLFED emergency appeal: Help us raise $800,000 to avoid cutting half of programs

denkenberger
Apr 16, 2025, 9:47 PM
43 points
9 comments
3 min read

Prodromes and Biomarkers in Chronic Disease

sarahconstantin
Apr 16, 2025, 9:30 PM
23 points
2 comments
3 min read
(sarahconstantin.substack.com)

The Practical Imperative for AI Control Research

Archana Vaidheeswaran
Apr 16, 2025, 8:27 PM
1 point
0 comments
4 min read

METR’s preliminary evaluation of o3 and o4-mini

Christopher King
Apr 16, 2025, 8:23 PM
14 points
7 comments
1 min read
(metr.github.io)

Mass Exposure Paradox

max-sixty
Apr 16, 2025, 8:18 PM
6 points
2 comments
2 min read

GPT-4.5 is Cognitive Empathy, Sonnet 3.5 is Affective Empathy

Jack
Apr 16, 2025, 7:12 PM
15 points
2 comments
4 min read

GPT-4.1 Is a Mini Upgrade

Zvi
Apr 16, 2025, 7:00 PM
31 points
6 comments
8 min read
(thezvi.wordpress.com)

Doing Prioritization Better

arvomm
Apr 16, 2025, 6:46 PM
3 points
1 comment
19 min read
(forum.effectivealtruism.org)

Kamelo: A Rule-Based Constructed Language for Universal, Logical Communication

Saif Khan
Apr 16, 2025, 6:44 PM
12 points
7 comments
2 min read

Understanding Trust: Overview Presentations

abramdemski
Apr 16, 2025, 6:08 PM
22 points
0 comments
1 min read

Understanding Trust—Overview Presentations

abramdemski
Apr 16, 2025, 6:05 PM
13 points
0 comments
1 min read

Telescoping

za3k
Apr 16, 2025, 5:05 PM
13 points
1 comment
1 min read
(blog.za3k.com)

Finance and AI Timelines

DAL
Apr 16, 2025, 4:55 PM
5 points
2 comments
3 min read

FROM IA CODE TO HUMAN VALUES – A construction from MaxEnt Informational Efficiency in 4 questions

P. João
Apr 16, 2025, 4:53 PM
3 points
0 comments
6 min read

AI-enabled coups: a small group could use AI to seize power

Apr 16, 2025, 4:51 PM
129 points
18 comments
7 min read

Ctrl-Z: Controlling AI Agents via Resampling

Apr 16, 2025, 4:21 PM
124 points
0 comments
20 min read

Gamify life from BayesianMind

P. João
Apr 16, 2025, 4:17 PM
6 points
2 comments
1 min read

Top OpenAI Catastrophic Risk Official Steps Down Abruptly

garrison
Apr 16, 2025, 4:04 PM
14 points
0 comments
5 min read
(garrisonlovely.substack.com)

An artistic illustration of Scalable Oversight—“A world apart, neither gods nor mortals”

Marius Adrian Nicoară
Apr 16, 2025, 12:41 PM
1 point
0 comments
1 min read

Can LLM-based models do model-based planning?

jylin04
Apr 16, 2025, 12:38 PM
11 points
1 comment
2 min read
(docs.google.com)

The road from human-level to superintelligent AI may be short

Apr 16, 2025, 8:35 AM
10 points
0 comments
2 min read
(aisafety.info)

Human-level is not the limit

Apr 16, 2025, 8:33 AM
23 points
2 comments
2 min read
(aisafety.info)

AI may attain human-level soon

Apr 16, 2025, 8:28 AM
11 points
0 comments
2 min read
(aisafety.info)

AI is advancing fast

Apr 16, 2025, 8:17 AM
11 points
0 comments
2 min read
(aisafety.info)

How Logic “Really” Works: An Engineering Perspective

Daniil Strizhov
Apr 16, 2025, 5:34 AM
6 points
0 comments
11 min read

Opportunity to learn more about AI Innovation & Security Policy

PolicyTakes
Apr 16, 2025, 1:35 AM
2 points
0 comments
1 min read

D&D.Sci Tax Day: Adventurers and Assessments

aphyer
Apr 15, 2025, 11:43 PM
46 points
14 comments
2 min read

Should AIs be Encouraged to Cooperate?

PeterMcCluskey
Apr 15, 2025, 9:57 PM
13 points
2 comments
5 min read
(bayesianinvestor.com)

OpenAI rewrote its Preparedness Framework

Zach Stein-Perlman
Apr 15, 2025, 8:00 PM
36 points
1 comment
6 min read

ASI existential risk: Reconsidering Alignment as a Goal

habryka
Apr 15, 2025, 7:57 PM
91 points
14 comments
19 min read
(michaelnotebook.com)

Nucleic Acid Observatory Updates, April 2025

jefftk
Apr 15, 2025, 6:58 PM
27 points
0 comments
4 min read
(naobservatory.org)

Some OthelloGPT Circuits

Alfred Wong
Apr 15, 2025, 6:41 PM
7 points
0 comments
7 min read

The Mirror Problem in AI: Why Language Models Say Whatever You Want

RobT
Apr 15, 2025, 6:40 PM
9 points
2 comments
3 min read

What happens when LLMs learn new things? & Continual learning forever.

sunchipsster
Apr 15, 2025, 6:38 PM
4 points
1 comment
7 min read

To be legible, evidence of misalignment probably has to be behavioral

ryan_greenblatt
Apr 15, 2025, 6:14 PM
55 points
19 comments
3 min read

AISN #51: AI Frontiers

Apr 15, 2025, 4:01 PM
6 points
1 comment
5 min read
(newsletter.safe.ai)

Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI

Kaj_Sotala
Apr 15, 2025, 3:56 PM
174 points
51 comments
18 min read

OpenAI #13: Altman at TED and OpenAI Cutting Corners on Safety Testing

Zvi
Apr 15, 2025, 3:30 PM
48 points
3 comments
12 min read
(thezvi.wordpress.com)

The real reason AI benchmarks haven’t reflected economic impacts

Noosphere89
Apr 15, 2025, 1:44 PM
15 points
0 comments
1 min read
(epoch.ai)