Is Gemini now better than Claude at Pokémon?

Julian Bradshaw
Apr 19, 2025, 11:34 PM
90 points
12 comments
5 min read

Impact, agency, and taste

benkuhn
Apr 19, 2025, 9:10 PM
204 points
10 comments
8 min read
(www.benkuhn.net)

Moral patienthood of simulated minds allows uncountabe infinity of value on finite hardware

Luck
Apr 19, 2025, 8:41 PM
−2 points
12 comments
2 min read

When the Model Starts Talking Like Me: A User-Induced Structural Adaptation Case Study

Junxi
Apr 19, 2025, 7:40 PM
3 points
1 comment
4 min read

A Block-Based Regularization Proposal for Neural Networks

Otto.Dev
Apr 19, 2025, 6:56 PM
−8 points
0 comments
1 min read

How Close We Are to a Complete List of Imprinted Genes

Morpheus
Apr 19, 2025, 6:37 PM
30 points
3 comments
14 min read
(www.tassiloneubauer.com)

Novel Idea Generation in LLMs: Judgment as Bottleneck

Davey Morse
Apr 19, 2025, 3:37 PM
6 points
1 comment
1 min read

Why Should I Assume CCP AGI is Worse Than USG AGI?

Tomás B.
Apr 19, 2025, 2:47 PM
252 points
87 comments
1 min read

An Introduction to SAEs and their Variants for Mech Interp

Adam Newgas
Apr 19, 2025, 2:09 PM
17 points
0 comments
10 min read

Approaches to Mitigating AI Image-Generation Risks through Regulation

scronkfinkle
Apr 19, 2025, 1:54 PM
−2 points
3 comments
4 min read

AI Advances and Detection Strategy

jefftk
Apr 19, 2025, 11:40 AM
11 points
0 comments
1 min read
(www.jefftk.com)

Emotional Theory for a Disorder Manual on How Not to Freeze Completely

P. João
Apr 19, 2025, 9:12 AM
13 points
0 comments
2 min read

The System Didn’t, and Doesn’t Need to be This Way ~ Thomas Paine on Economic Justice

James Stephen Brown
Apr 19, 2025, 5:16 AM
2 points
3 comments
4 min read
(nonzerosum.games)

SecureDrop review

samuelshadrach
Apr 19, 2025, 4:29 AM
2 points
0 comments
5 min read
(samuelshadrach.com)

AI, Alignment & the Art of Relationship Design

Priyanka Bharadwaj
Apr 19, 2025, 12:47 AM
6 points
4 comments
2 min read

Measuring Beliefs of Language Models During Chain-of-Thought Reasoning

Apr 18, 2025, 10:56 PM
9 points
0 comments
13 min read

LLM-based Fact Checking for Popular Posts?

azergante
Apr 18, 2025, 9:26 PM
1 point
2 comments
62 min read

o3 Will Use Its Tools For You

Zvi
Apr 18, 2025, 9:20 PM
46 points
3 comments
45 min read
(thezvi.wordpress.com)

AI Control Methods Literature Review

Ram Potham
Apr 18, 2025, 9:15 PM
10 points
1 comment
9 min read

Consequentialists should have a comprehensive set of deontological beliefs they adhere to

Jay95
Apr 18, 2025, 8:50 PM
3 points
2 comments
1 min read

What Makes an AI Startup “Net Positive” for Safety?

jacquesthibs
Apr 18, 2025, 8:33 PM
80 points
23 comments
2 min read

Alignment Does Not Need to Be Opaque! An Introduction to Feature Steering with Reinforcement Learning

Jeremias Ferrao
Apr 18, 2025, 7:34 PM
10 points
0 comments
10 min read

Evaluating Collaborative AI Performance Subject to Sabotage

Matthew Khoriaty
Apr 18, 2025, 7:33 PM
2 points
0 comments
19 min read

Inside OpenAI’s Controversial Plan to Abandon its Nonprofit Roots

garrison
Apr 18, 2025, 6:46 PM
21 points
0 comments
11 min read
(garrisonlovely.substack.com)

Could LLMs Learn to Detect Bias Autonomously, Like Tesla’s Self-Driving Cars?

Omnipheasant
Apr 18, 2025, 6:45 PM
0 points
0 comments
3 min read

Scaffolding Skills

Screwtape
Apr 18, 2025, 5:39 PM
35 points
9 comments
4 min read

The Case for White Box Control

J Rosser
Apr 18, 2025, 4:10 PM
5 points
1 comment
5 min read

[Rockville] Rationalist Shabbat

maia
Apr 18, 2025, 3:38 PM
8 points
0 comments
1 min read

Handling schemers if shutdown is not an option

Buck
Apr 18, 2025, 2:39 PM
39 points
2 comments
14 min read

British and American Connotations

jefftk
Apr 18, 2025, 1:00 PM
14 points
4 comments
1 min read
(www.jefftk.com)

Towards Understanding the Representation of Belief State Geometry in Transformers

Karthik Viswanathan
Apr 18, 2025, 12:39 PM
3 points
0 comments
12 min read

Training AGI in Secret would be Unsafe and Unethical

Daniel Kokotajlo
Apr 18, 2025, 12:27 PM
139 points
15 comments
6 min read

Karma Tests in Logical Counterfactual Simulations motivates strong agents to protect weak agents

Knight Lee
Apr 18, 2025, 11:11 AM
9 points
8 comments
3 min read

What If Galaxies Are Alive and Atoms Have Minds? A Thought Experiment on Life Across Scales

Saif Khan
Apr 18, 2025, 10:01 AM
−2 points
5 comments
3 min read

Three Months In, Evaluating Three Rationalist Cases for Trump

Arjun Panickssery
Apr 18, 2025, 8:27 AM
115 points
33 comments
4 min read

[Question] Comprehensive up-to-date resources on the Chinese Communist Party’s AI strategy, etc?

Mateusz Bagiński
Apr 18, 2025, 4:58 AM
14 points
6 comments
1 min read

Conditional Forecasting as Model Parameterization

Molly
Apr 18, 2025, 2:35 AM
15 points
0 comments
7 min read
(cuttyshark.substack.com)

One Night in Delphi

Eggs
Apr 18, 2025, 2:17 AM
4 points
2 comments
3 min read

0 Motivation Mapping through Information Theory

P. João
Apr 18, 2025, 12:53 AM
7 points
0 comments
26 min read

The Russell Conjugation Illuminator

TimmyM
Apr 17, 2025, 7:33 PM
51 points
14 comments
1 min read
(russellconjugations.com)

Announcing Progress Conference 2025

jasoncrawford
Apr 17, 2025, 5:12 PM
12 points
0 comments
1 min read
(newsletter.rootsofprogress.org)

The Mirror Paradox

Jeremy Kraybill
Apr 17, 2025, 4:23 PM
−6 points
0 comments
1 min read

Memory Decoding Journal Club

Devin Ward
Apr 17, 2025, 4:19 PM
1 point
0 comments
1 min read

Host Keys and SSHing to EC2

jefftk
Apr 17, 2025, 3:10 PM
10 points
6 comments
1 min read
(www.jefftk.com)

AI #112: Release the Everything

Zvi
Apr 17, 2025, 3:10 PM
41 points
6 comments
40 min read
(thezvi.wordpress.com)

On AI personhood

p.b.
Apr 17, 2025, 12:31 PM
4 points
7 comments
1 min read

8 PRIME IDENTITIES - An analisis

P. João
Apr 17, 2025, 11:36 AM
−5 points
0 comments
2 min read

8 LATENT VALUES - A simplified construction from MaxEnt Informational Efficiency in 4 questions

P. João
Apr 17, 2025, 11:04 AM
3 points
5 comments
3 min read

Automating Mechanistic Interpretability via Program Synthesis

Edy Nastase
Apr 17, 2025, 10:58 AM
1 point
1 comment
1 min read

Understanding and overcoming AGI apathy

Dhruv Sumathi
Apr 17, 2025, 1:04 AM
25 points
1 comment
13 min read
(dhruvsumathi.substack.com)