[Question] How familiar is the LessWrong community as a whole with the concept of Reward-modelling?

Oxidize · Apr 9, 2025, 11:33 PM
1 point
8 comments · 1 min read · LW link

What can we learn from expert AGI forecasts?

Benjamin_Todd · Apr 9, 2025, 9:34 PM
5 points
0 comments · 5 min read · LW link
(80000hours.org)

Thoughts on AI 2027

Max Harms · Apr 9, 2025, 9:26 PM
222 points
61 comments · 21 min read · LW link
(intelligence.org)

The case for AGI by 2030

Benjamin_Todd · Apr 9, 2025, 8:35 PM
40 points
6 comments · 42 min read · LW link
(80000hours.org)

Anti-automation policy as a bottleneck to economic growth

mhampton · Apr 9, 2025, 8:12 PM
4 points
0 comments · 4 min read · LW link

Reasoning models don’t always say what they think

Apr 9, 2025, 7:48 PM
28 points
4 comments · 1 min read · LW link
(www.anthropic.com)

Reverse engineering the memory layout of GPU inference

Paul Bricman · Apr 9, 2025, 3:40 PM
5 points
0 comments · 6 min read · LW link
(noemaresearch.com)

How to defeat superintelligence, the Sta-Hi way

kilgoar · Apr 9, 2025, 1:58 PM
−8 points
0 comments · 3 min read · LW link

Llama Does Not Look Good 4 Anything

Zvi · Apr 9, 2025, 1:20 PM
31 points
1 comment · 16 min read · LW link
(thezvi.wordpress.com)

Learned pain as a leading cause of chronic pain

SoerenMind · Apr 9, 2025, 11:57 AM
203 points
38 comments · 9 min read · LW link

Does the universe’s recognition of measurement provide stronger evidence for being in a simulation than universal fine-tuning?

amelia · Apr 9, 2025, 8:20 AM
0 points
2 comments · 4 min read · LW link

Taxonomy of possibility

dkl9 · Apr 9, 2025, 4:24 AM
13 points
1 comment · 5 min read · LW link
(dkl9.net)

Short Timelines Don’t Devalue Long Horizon Research

Vladimir_Nesov · Apr 9, 2025, 12:42 AM
167 points
24 comments · 1 min read · LW link

A Platform for Falsifiable Conjectures and Public Refutation — Would This Be Useful?

PetrusNonius · Apr 8, 2025, 9:09 PM
1 point
1 comment · 1 min read · LW link

Quantifying SAE Quality with Feature Steerability Metrics

phenomanon · Apr 8, 2025, 8:55 PM
2 points
0 comments · 4 min read · LW link

MATS is hiring!

Apr 8, 2025, 8:45 PM
8 points
0 comments · 6 min read · LW link

birds and mammals independently evolved intelligence

bhauth · Apr 8, 2025, 8:00 PM
73 points
23 comments · 1 min read · LW link
(www.quantamagazine.org)

Alignment Faking Revisited: Improved Classifiers and Open Source Extensions

Apr 8, 2025, 5:32 PM
146 points
20 comments · 12 min read · LW link

London Working Group for Short/Medium Term AI Risks

scronkfinkle · Apr 8, 2025, 5:32 PM
5 points
0 comments · 2 min read · LW link

Thinking Machines

Knight Lee · Apr 8, 2025, 5:27 PM
3 points
0 comments · 6 min read · LW link

Digital Error Correction and Lock-In

alamerton · Apr 8, 2025, 3:46 PM
1 point
0 comments · 5 min read · LW link
(alfielamerton.substack.com)

[Question] What faithfulness metrics should general claims about CoT faithfulness be based upon?

Rauno Arike · Apr 8, 2025, 3:27 PM
24 points
0 comments · 4 min read · LW link

AI 2027: Responses

Zvi · Apr 8, 2025, 12:50 PM
109 points
3 comments · 30 min read · LW link
(thezvi.wordpress.com)
(thezvi.wordpress.com)

The first AI war will be in your computer

Viliam · Apr 8, 2025, 9:28 AM
43 points
10 comments · 3 min read · LW link

Who wants to bet me $25k at 1:7 odds that there won’t be an AI market crash in the next year?

Remmelt · Apr 8, 2025, 8:31 AM
32 points
19 comments · 1 min read · LW link

A Pathway to Fully Autonomous Therapists

Declan Molony · Apr 8, 2025, 4:10 AM
5 points
2 comments · 6 min read · LW link

Rethinking Friction: Equity and Motivation Across Domains

eltimbalino · Apr 8, 2025, 3:58 AM
−1 points
0 comments · 2 min read · LW link
(www.lesswrong.com)

On different discussion traditions

Eugene Shcherbinin · Apr 7, 2025, 11:00 PM
1 point
0 comments · 2 min read · LW link

Misinformation is the default, and information is the government telling you your tap water is safe to drink

danielechlin · Apr 7, 2025, 10:28 PM
10 points
2 comments · 9 min read · LW link

Log-linear Scaling is Worth the Cost due to Gains in Long-Horizon Tasks

shash42 · Apr 7, 2025, 9:50 PM
16 points
2 comments · 1 min read · LW link

Paper Highlights, March ’25

gasteigerjo · Apr 7, 2025, 8:17 PM
8 points
0 comments · 9 min read · LW link
(aisafetyfrontier.substack.com)

Factory farming intelligent minds

Odd anon · Apr 7, 2025, 8:05 PM
2 points
5 comments · 20 min read · LW link

What alignment-relevant abilities might Terence Tao lack?

Towards_Keeperhood · Apr 7, 2025, 7:44 PM
12 points
2 comments · 3 min read · LW link

[Question] Are there any (semi-)detailed future scenarios where we win?

Jan Betley · Apr 7, 2025, 7:13 PM
15 points
3 comments · 1 min read · LW link

Austin Chen on Winning, Risk-Taking, and FTX

Elizabeth · Apr 7, 2025, 7:00 PM
35 points
3 comments · 1 min read · LW link
(acesounderglass.com)

An Unbiased Evaluation of My Debate with Thane Ruthenis—Run It Yourself

funnyfranco · Apr 7, 2025, 6:56 PM
−24 points
14 comments · 2 min read · LW link

American College Admissions Doesn’t Need to Be So Competitive

Arjun Panickssery · Apr 7, 2025, 5:35 PM
48 points
20 comments · 6 min read · LW link
(arjunpanickssery.substack.com)

Coupling for Decouplers

Jacob Falkovich · Apr 7, 2025, 3:40 PM
15 points
3 comments · 8 min read · LW link

Moonlight Reflected

Jacob Falkovich · Apr 7, 2025, 3:35 PM
11 points
0 comments · 9 min read · LW link

Navigation by Moonlight

Jacob Falkovich · Apr 7, 2025, 3:32 PM
24 points
39 comments · 8 min read · LW link

You Are Not a Thought Experiment

Jacob Falkovich · Apr 7, 2025, 3:27 PM
5 points
0 comments · 9 min read · LW link

Love is Love, Science is Fake

Jacob Falkovich · Apr 7, 2025, 3:19 PM
17 points
2 comments · 10 min read · LW link

Coupling for Decouplers — Intro

Jacob Falkovich · Apr 7, 2025, 3:12 PM
9 points
0 comments · 1 min read · LW link

The world according to ChatGPT

Richard_Kennaway · Apr 7, 2025, 1:44 PM UTC
11 points
0 comments · 2 min read · LW link

AI 2027: Dwarkesh’s Podcast with Daniel Kokotajlo and Scott Alexander

Zvi · Apr 7, 2025, 1:40 PM UTC
67 points
2 comments · 26 min read · LW link
(thezvi.wordpress.com)

Arguing all sides with ChatGPT 4.5

Richard_Kennaway · Apr 7, 2025, 1:10 PM UTC
6 points
0 comments · 8 min read · LW link

The Same Heaven

Lukas Petersson · Apr 7, 2025, 12:57 PM UTC
3 points
1 comment · 5 min read · LW link
(lukaspetersson.com)

Breaking down the MEAT of Alignment

JasonBrown · Apr 7, 2025, 8:47 AM UTC
7 points
2 comments · 11 min read · LW link

Well-foundedness as an organizing principle of healthy minds and societies

Richard_Ngo · Apr 7, 2025, 12:31 AM UTC
35 points
7 comments · 6 min read · LW link
(www.mindthefuture.info)

Arusha Perpetual Chicken—an unlikely iterated game

James Stephen Brown · Apr 6, 2025, 10:56 PM UTC
15 points
1 comment · 5 min read · LW link
(nonzerosum.games)