The Russell Conjugation Illuminator

TimmyM 17 Apr 2025 19:33 UTC
51 points
14 comments 1 min read LW link
(russellconjugations.com)

Announcing Progress Conference 2025

jasoncrawford 17 Apr 2025 17:12 UTC
12 points
0 comments 1 min read LW link
(newsletter.rootsofprogress.org)

The Mirror Paradox

Jeremy Kraybill 17 Apr 2025 16:23 UTC
−6 points
0 comments 1 min read LW link

Memory Decoding Journal Club

Devin Ward 17 Apr 2025 16:19 UTC
1 point
0 comments 1 min read LW link

Host Keys and SSHing to EC2

jefftk 17 Apr 2025 15:10 UTC
10 points
6 comments 1 min read LW link
(www.jefftk.com)

AI #112: Release the Everything

Zvi 17 Apr 2025 15:10 UTC
41 points
6 comments 40 min read LW link
(thezvi.wordpress.com)

On AI personhood

p.b. 17 Apr 2025 12:31 UTC
4 points
7 comments 1 min read LW link

8 PRIME IDENTITIES - An analisis

P. João 17 Apr 2025 11:36 UTC
−5 points
0 comments 2 min read LW link

8 LATENT VALUES - A simplified construction from MaxEnt Informational Efficiency in 4 questions

P. João 17 Apr 2025 11:04 UTC
3 points
5 comments 3 min read LW link

Automating Mechanistic Interpretability via Program Synthesis

Edy Nastase 17 Apr 2025 10:58 UTC
1 point
1 comment 1 min read LW link

Understanding and overcoming AGI apathy

Dhruv Sumathi 17 Apr 2025 1:04 UTC
25 points
1 comment 13 min read LW link
(dhruvsumathi.substack.com)

ALLFED emergency appeal: Help us raise $800,000 to avoid cutting half of programs

denkenberger 16 Apr 2025 21:47 UTC
49 points
9 comments 3 min read LW link

Prodromes and Biomarkers in Chronic Disease

sarahconstantin 16 Apr 2025 21:30 UTC
23 points
2 comments 3 min read LW link
(sarahconstantin.substack.com)

The Practical Imperative for AI Control Research

Archana Vaidheeswaran 16 Apr 2025 20:27 UTC
1 point
0 comments 4 min read LW link

METR’s preliminary evaluation of o3 and o4-mini

Christopher King 16 Apr 2025 20:23 UTC
14 points
7 comments 1 min read LW link
(metr.github.io)

Mass Exposure Paradox

max-sixty 16 Apr 2025 20:18 UTC
6 points
2 comments 2 min read LW link

GPT-4.5 is Cognitive Empathy, Sonnet 3.5 is Affective Empathy

Jack 16 Apr 2025 19:12 UTC
15 points
2 comments 4 min read LW link

GPT-4.1 Is a Mini Upgrade

Zvi 16 Apr 2025 19:00 UTC
31 points
6 comments 8 min read LW link
(thezvi.wordpress.com)

Doing Prioritization Better

arvomm 16 Apr 2025 18:46 UTC
3 points
1 comment 19 min read LW link
(forum.effectivealtruism.org)

Kamelo: A Rule-Based Constructed Language for Universal, Logical Communication

Saif Khan 16 Apr 2025 18:44 UTC
12 points
7 comments 2 min read LW link

Understanding Trust: Overview Presentations

abramdemski 16 Apr 2025 18:08 UTC
22 points
0 comments 1 min read LW link

Understanding Trust—Overview Presentations

abramdemski 16 Apr 2025 18:05 UTC
13 points
0 comments 1 min read LW link

Telescoping

za3k 16 Apr 2025 17:05 UTC
13 points
1 comment 1 min read LW link
(blog.za3k.com)

Finance and AI Timelines

DAL 16 Apr 2025 16:55 UTC
5 points
2 comments 3 min read LW link

FROM IA CODE TO HUMAN VALUES – A construction from MaxEnt Informational Efficiency in 4 questions

P. João 16 Apr 2025 16:53 UTC
3 points
0 comments 7 min read LW link

AI-enabled coups: a small group could use AI to seize power

16 Apr 2025 16:51 UTC
132 points
23 comments 7 min read LW link

Ctrl-Z: Controlling AI Agents via Resampling

16 Apr 2025 16:21 UTC
124 points
0 comments 20 min read LW link

Gamify life from BayesianMind

P. João 16 Apr 2025 16:17 UTC
6 points
2 comments 1 min read LW link

Top OpenAI Catastrophic Risk Official Steps Down Abruptly

garrison 16 Apr 2025 16:04 UTC
14 points
0 comments 5 min read LW link
(garrisonlovely.substack.com)

An artistic illustration of Scalable Oversight—“A world apart, neither gods nor mortals”

Marius Adrian Nicoară 16 Apr 2025 12:41 UTC
1 point
0 comments 1 min read LW link

Can LLM-based models do model-based planning?

jylin04 16 Apr 2025 12:38 UTC
11 points
1 comment 2 min read LW link
(docs.google.com)

The road from human-level to superintelligent AI may be short

16 Apr 2025 8:35 UTC
10 points
0 comments 2 min read LW link
(aisafety.info)

Human-level is not the limit

16 Apr 2025 8:33 UTC
23 points
2 comments 2 min read LW link
(aisafety.info)

AI may attain human-level soon

16 Apr 2025 8:28 UTC
11 points
0 comments 2 min read LW link
(aisafety.info)

AI is advancing fast

16 Apr 2025 8:17 UTC
11 points
0 comments 2 min read LW link
(aisafety.info)

How Logic “Really” Works: An Engineering Perspective

Daniil Strizhov 16 Apr 2025 5:34 UTC
6 points
0 comments 11 min read LW link

Opportunity to to learn more about AI Innovation & Security Policy

PolicyTakes 16 Apr 2025 1:35 UTC
2 points
0 comments 1 min read LW link

D&D.Sci Tax Day: Adventurers and Assessments

aphyer 15 Apr 2025 23:43 UTC
47 points
14 comments 2 min read LW link

Should AIs be Encouraged to Cooperate?

PeterMcCluskey 15 Apr 2025 21:57 UTC
13 points
2 comments 5 min read LW link
(bayesianinvestor.com)

OpenAI rewrote its Preparedness Framework

Zach Stein-Perlman 15 Apr 2025 20:00 UTC
36 points
1 comment 6 min read LW link

ASI existential risk: Reconsidering Alignment as a Goal

habryka 15 Apr 2025 19:57 UTC
93 points
14 comments 19 min read LW link
(michaelnotebook.com)

Nucleic Acid Observatory Updates, April 2025

jefftk 15 Apr 2025 18:58 UTC
27 points
0 comments 4 min read LW link
(naobservatory.org)

Some OthelloGPT Circuits

Alfred Wong 15 Apr 2025 18:41 UTC
7 points
0 comments 7 min read LW link

The Mirror Problem in AI: Why Language Models Say Whatever You Want

RobT 15 Apr 2025 18:40 UTC
9 points
2 comments 3 min read LW link

What happens when LLMs learn new things? & Continual learning forever.

sunchipsster 15 Apr 2025 18:38 UTC
4 points
1 comment 7 min read LW link

To be legible, evidence of misalignment probably has to be behavioral

ryan_greenblatt 15 Apr 2025 18:14 UTC
57 points
19 comments 3 min read LW link

AISN #51: AI Frontiers

15 Apr 2025 16:01 UTC
8 points
1 comment 5 min read LW link
(newsletter.safe.ai)

Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI

Kaj_Sotala 15 Apr 2025 15:56 UTC
174 points
52 comments 18 min read LW link

OpenAI #13: Altman at TED and OpenAI Cutting Corners on Safety Testing

Zvi 15 Apr 2025 15:30 UTC
48 points
3 comments 12 min read LW link
(thezvi.wordpress.com)

The real reason AI benchmarks haven’t reflected economic impacts

Noosphere89 15 Apr 2025 13:44 UTC
15 points
0 comments 1 min read LW link
(epoch.ai)