King Lear—A Reinterpretation

Kailuo Wang · Jan 21, 2025, 11:54 PM
2 points
1 comment · 14 min read · LW link

Using the probabilistic method to bound the performance of toy transformers

Alex Gibson · Jan 21, 2025, 11:01 PM
1 point
0 comments · 3 min read · LW link

Training on Documents About Reward Hacking Induces Reward Hacking

Jan 21, 2025, 9:32 PM
131 points
15 comments · 2 min read · LW link
(alignment.anthropic.com)

Veo-2 Can Produce Realistic Ads

Logan Riggs · Jan 21, 2025, 7:13 PM
14 points
0 comments · 1 min read · LW link

Computational Limits on Efficiency

vibhumeh · Jan 21, 2025, 6:29 PM
8 points
1 comment · 5 min read · LW link

Democratizing AI Governance: Balancing Expertise and Public Participation

Lucile Ter-Minassian · Jan 21, 2025, 6:29 PM
1 point
0 comments · 15 min read · LW link

Hitler was not a monster

halgir · Jan 21, 2025, 6:21 PM
−11 points
5 comments · 1 min read · LW link

Natural Intelligence is Overhyped

Collisteru · Jan 21, 2025, 6:09 PM
15 points
0 comments · 7 min read · LW link

14+ AI Safety Advisors You Can Speak to – New AISafety.com Resource

Jan 21, 2025, 5:34 PM
24 points
0 comments · 1 min read · LW link

[Linkpost] Why AI Safety Camp struggles with fundraising (FBB #2)

gergogaspar · Jan 21, 2025, 5:27 PM
3 points
0 comments · 1 min read · LW link

The Manhattan Trap: Why a Race to Artificial Superintelligence is Self-Defeating

Jan 21, 2025, 4:57 PM
87 points
11 comments · LW link
(www.convergenceanalysis.org)

Links and short notes, 2025-01-20

jasoncrawford · Jan 21, 2025, 4:10 PM
8 points
0 comments · 1 min read · LW link
(newsletter.rootsofprogress.org)

The Case Against AI Control Research

johnswentworth · Jan 21, 2025, 4:03 PM
353 points
81 comments · 6 min read · LW link

Will AI Resilience protect Developing Nations?

ejk64 · Jan 21, 2025, 3:31 PM
4 points
0 comments · 8 min read · LW link

Sleep, Diet, Exercise and GLP-1 Drugs

Zvi · Jan 21, 2025, 12:20 PM
41 points
5 comments · 18 min read · LW link
(thezvi.wordpress.com)

We don’t want to post again “This might be the last AI Safety Camp”

Jan 21, 2025, 12:03 PM
36 points
17 comments · 1 min read · LW link
(manifund.org)

On Responsibility

silentbob · Jan 21, 2025, 10:47 AM
9 points
2 comments · 6 min read · LW link

The ‘anti woke’ are positioned to win but can they capitalize?

Hzn · Jan 21, 2025, 9:52 AM
−8 points
0 comments · 2 min read · LW link

Almost all growth is exponential growth

lemonhope · Jan 21, 2025, 7:16 AM
19 points
7 comments · 1 min read · LW link

Arbitrage Drains Worse Markets to Feeds Better Ones

Cedar · Jan 21, 2025, 3:44 AM
25 points
1 comment · 1 min read · LW link

On Contact, Part 1

james.lucassen · Jan 21, 2025, 3:10 AM
14 points
1 comment · 11 min read · LW link

Retrospective: 12 [sic] Months Since MIRI

james.lucassen · Jan 21, 2025, 2:52 AM
67 points
0 comments · 9 min read · LW link

Easily Evaluate SAE-Steered Models with EleutherAI Evaluation Harness

Matthew Khoriaty · Jan 21, 2025, 2:02 AM
4 points
0 comments · 3 min read · LW link

Why We Need More Shovel-Ready AI Notkilleveryoneism Megaproject Proposals

Peter Berggren · Jan 20, 2025, 10:38 PM
36 points
1 comment · 6 min read · LW link

Tips and Code for Empirical Research Workflows

Jan 20, 2025, 10:31 PM
94 points
14 comments · 20 min read · LW link

Lecture Series on Tiling Agents #2

abramdemski · Jan 20, 2025, 9:02 PM
16 points
0 comments · 1 min read · LW link

Announcement: Learning Theory Online Course

Jan 20, 2025, 7:55 PM
63 points
33 comments · 4 min read · LW link

The Hidden Status Game in Hospital Slacking

EpistemicExplorer · Jan 20, 2025, 6:35 PM
2 points
4 comments · 3 min read · LW link

Monthly Roundup #26: January 2025

Zvi · Jan 20, 2025, 3:30 PM
34 points
15 comments · 43 min read · LW link
(thezvi.wordpress.com)

Things I have been using LLMs for

Kaj_Sotala · Jan 20, 2025, 2:20 PM
50 points
6 comments · 7 min read · LW link
(kajsotala.fi)

[Question] What are the chances that Superhuman Agents are already being tested on the internet?

artemium · Jan 20, 2025, 11:09 AM
3 points
1 comment · 1 min read · LW link

Detroit Lions—over confidence is over rated?

Hzn · Jan 20, 2025, 10:53 AM
6 points
0 comments · 1 min read · LW link

Logits, log-odds, and loss for parallel circuits

Dmitry Vaintrob · Jan 20, 2025, 9:56 AM
57 points
4 comments · 11 min read · LW link

Worries about latent reasoning in LLMs

Caleb Biddulph · Jan 20, 2025, 9:09 AM
45 points
6 comments · 7 min read · LW link

SIGMI Certification Criteria

a littoral wizard · Jan 20, 2025, 2:41 AM
6 points
0 comments · 1 min read · LW link

AXRP Episode 38.5 - Adrià Garriga-Alonso on Detecting AI Scheming

DanielFilan · Jan 20, 2025, 12:40 AM
9 points
0 comments · 16 min read · LW link

The Monster in Our Heads

testingthewaters · Jan 19, 2025, 11:58 PM
33 points
4 comments · 5 min read · LW link

AI: How We Got Here—A Neuroscience Perspective

Mordechai Rorvig · Jan 19, 2025, 11:51 PM
5 points
0 comments · 2 min read · LW link
(www.kickstarter.com)

Agent Foundations 2025 at CMU

Jan 19, 2025, 11:48 PM
90 points
10 comments · 1 min read · LW link

Who is marketing AI alignment?

MrThink · Jan 19, 2025, 9:37 PM
23 points
4 comments · 1 min read · LW link

Some lessons from the OpenAI-FrontierMath debacle

7vik · Jan 19, 2025, 9:09 PM
70 points
9 comments · 4 min read · LW link

Maximally Eggy Crepes

jefftk · Jan 19, 2025, 8:40 PM
12 points
0 comments · 1 min read · LW link
(www.jefftk.com)

The second bitter lesson — there’s a fundamental problem with aligning distributed AI

aelwood · Jan 19, 2025, 7:00 PM
−5 points
0 comments · 5 min read · LW link
(pursuingreality.substack.com)

The Gentle Romance

Richard_Ngo · Jan 19, 2025, 6:29 PM
242 points
46 comments · 15 min read · LW link
(www.asimov.press)

Is theory good or bad for AI safety?

Dmitry Vaintrob · Jan 19, 2025, 10:32 AM
27 points
1 comment · 5 min read · LW link

[Question] What’s the Right Way to think about Information Theoretic quantities in Neural Networks?

Dalcy · Jan 19, 2025, 8:04 AM
45 points
13 comments · 3 min read · LW link

Per Tribalismum ad Astra

Martin Sustrik · Jan 19, 2025, 6:50 AM
30 points
5 comments · 2 min read · LW link
(250bpm.substack.com)

Five Recent AI Tutoring Studies

Arjun Panickssery · Jan 19, 2025, 3:53 AM
93 points
0 comments · 2 min read · LW link
(arjunpanickssery.substack.com)

Shut Up and Calculate: Gambling, Divination, and the Abacus as Tantra

leebriskCyrano · Jan 19, 2025, 3:03 AM
−1 points
0 comments · 5 min read · LW link
(leebriskcyrano.com)

Does Society need a cultural outlet in turbulent political times?

Freya Mcneill · Jan 19, 2025, 2:45 AM
−3 points
0 comments · 7 min read · LW link