A Platform for Falsifiable Conjectures and Public Refutation — Would This Be Useful?

PetrusNonius · Apr 8, 2025, 9:09 PM
1 point
1 comment · 1 min read · LW link

Quantifying SAE Quality with Feature Steerability Metrics

phenomanon · Apr 8, 2025, 8:55 PM
2 points
0 comments · 4 min read · LW link

MATS is hiring!

Apr 8, 2025, 8:45 PM
8 points
0 comments · 6 min read · LW link

birds and mammals independently evolved intelligence

bhauth · Apr 8, 2025, 8:00 PM
73 points
23 comments · 1 min read · LW link
(www.quantamagazine.org)

Alignment Faking Revisited: Improved Classifiers and Open Source Extensions

Apr 8, 2025, 5:32 PM
146 points
20 comments · 12 min read · LW link

London Working Group for Short/Medium Term AI Risks

scronkfinkle · Apr 8, 2025, 5:32 PM
5 points
0 comments · 2 min read · LW link

Thinking Machines

Knight Lee · Apr 8, 2025, 5:27 PM
3 points
0 comments · 6 min read · LW link

Digital Error Correction and Lock-In

alamerton · Apr 8, 2025, 3:46 PM
1 point
0 comments · 5 min read · LW link
(alfielamerton.substack.com)

[Question] What faithfulness metrics should general claims about CoT faithfulness be based upon?

Rauno Arike · Apr 8, 2025, 3:27 PM
24 points
0 comments · 4 min read · LW link

AI 2027: Responses

Zvi · Apr 8, 2025, 12:50 PM
109 points
3 comments · 30 min read · LW link
(thezvi.wordpress.com)

The first AI war will be in your computer

Viliam · Apr 8, 2025, 9:28 AM
43 points
10 comments · 3 min read · LW link

Who wants to bet me $25k at 1:7 odds that there won’t be an AI market crash in the next year?

Remmelt · Apr 8, 2025, 8:31 AM
32 points
19 comments · 1 min read · LW link

A Pathway to Fully Autonomous Therapists

Declan Molony · Apr 8, 2025, 4:10 AM
5 points
2 comments · 6 min read · LW link

Rethinking Friction: Equity and Motivation Across Domains

eltimbalino · Apr 8, 2025, 3:58 AM
−1 points
0 comments · 2 min read · LW link
(www.lesswrong.com)

On different discussion traditions

Eugene Shcherbinin · Apr 7, 2025, 11:00 PM
1 point
0 comments · 2 min read · LW link

Misinformation is the default, and information is the government telling you your tap water is safe to drink

d_el_ez · Apr 7, 2025, 10:28 PM
10 points
2 comments · 9 min read · LW link

Log-linear Scaling is Worth the Cost due to Gains in Long-Horizon Tasks

shash42 · Apr 7, 2025, 9:50 PM
16 points
2 comments · 1 min read · LW link

AI Safety at the Frontier: Paper Highlights, March ’25

gasteigerjo · Apr 7, 2025, 8:17 PM
9 points
0 comments · 9 min read · LW link
(aisafetyfrontier.substack.com)

Factory farming intelligent minds

Odd anon · Apr 7, 2025, 8:05 PM
2 points
5 comments · 20 min read · LW link

What alignment-relevant abilities might Terence Tao lack?

Towards_Keeperhood · Apr 7, 2025, 7:44 PM
12 points
2 comments · 3 min read · LW link

[Question] Are there any (semi-)detailed future scenarios where we win?

Jan Betley · Apr 7, 2025, 7:13 PM
15 points
3 comments · 1 min read · LW link

Austin Chen on Winning, Risk-Taking, and FTX

Elizabeth · Apr 7, 2025, 7:00 PM
35 points
3 comments · 1 min read · LW link
(acesounderglass.com)

An Unbiased Evaluation of My Debate with Thane Ruthenis—Run It Yourself

funnyfranco · Apr 7, 2025, 6:56 PM
−24 points
14 comments · 2 min read · LW link

American College Admissions Doesn’t Need to Be So Competitive

Arjun Panickssery · Apr 7, 2025, 5:35 PM
48 points
20 comments · 6 min read · LW link
(arjunpanickssery.substack.com)

Coupling for Decouplers

Jacob Falkovich · Apr 7, 2025, 3:40 PM
15 points
3 comments · 8 min read · LW link

Moonlight Reflected

Jacob Falkovich · Apr 7, 2025, 3:35 PM
11 points
0 comments · 9 min read · LW link

Navigation by Moonlight

Jacob Falkovich · Apr 7, 2025, 3:32 PM
24 points
39 comments · 8 min read · LW link

You Are Not a Thought Experiment

Jacob Falkovich · Apr 7, 2025, 3:27 PM
5 points
0 comments · 9 min read · LW link

Love is Love, Science is Fake

Jacob Falkovich · Apr 7, 2025, 3:19 PM
17 points
2 comments · 10 min read · LW link

Coupling for Decouplers — Intro

Jacob Falkovich · Apr 7, 2025, 3:12 PM
9 points
0 comments · 1 min read · LW link

The world according to ChatGPT

Richard_Kennaway · Apr 7, 2025, 1:44 PM
11 points
0 comments · 2 min read · LW link

AI 2027: Dwarkesh’s Podcast with Daniel Kokotajlo and Scott Alexander

Zvi · Apr 7, 2025, 1:40 PM
67 points
2 comments · 26 min read · LW link
(thezvi.wordpress.com)

Arguing all sides with ChatGPT 4.5

Richard_Kennaway · Apr 7, 2025, 1:10 PM
6 points
0 comments · 8 min read · LW link

The Same Heaven

Lukas Petersson · Apr 7, 2025, 12:57 PM
3 points
1 comment · 5 min read · LW link
(lukaspetersson.com)

Breaking down the MEAT of Alignment

JasonBrown · Apr 7, 2025, 8:47 AM
7 points
2 comments · 11 min read · LW link

Well-foundedness as an organizing principle of healthy minds and societies

Richard_Ngo · Apr 7, 2025, 12:31 AM
35 points
7 comments · 6 min read · LW link
(www.mindthefuture.info)

Arusha Perpetual Chicken—an unlikely iterated game

James Stephen Brown · Apr 6, 2025, 10:56 PM
15 points
1 comment · 5 min read · LW link
(nonzerosum.games)

How Gay is the Vatican?

rba · Apr 6, 2025, 9:27 PM
58 points
32 comments · 7 min read · LW link

RFC: a tool to create a ranked list of projects in explainable AI

eamag · Apr 6, 2025, 9:18 PM
2 points
0 comments · 1 min read · LW link
(eamag.me)

Australia’s AI Crossroads: Election 2025 Town Hall

Peter Horniak · Apr 6, 2025, 9:17 PM
1 point
0 comments · 1 min read · LW link

The Lizardman and the Black Hat Bobcat

Screwtape · Apr 6, 2025, 7:02 PM
107 points
15 comments · 9 min read · LW link

Would this solve the (outer) alignment problem, or at least help?

Wes R · Apr 6, 2025, 6:49 PM
−2 points
1 comment · 13 min read · LW link

[Question] What are the fundamental differences between teaching the AIs and humans?

StanislavKrym · Apr 6, 2025, 6:17 PM
3 points
0 comments · 1 min read · LW link

An “Optimistic” 2027 Timeline

Yitz · Apr 6, 2025, 4:39 PM
13 points
16 comments · 9 min read · LW link

Thoughts on Creating a Good Language

Towards_Keeperhood · Apr 6, 2025, 3:57 PM
1 point
2 comments · 7 min read · LW link

The REPHRASE Circuit: How Fine-Tuning Enhances LLMs to REPHRASE Text

Karthik Viswanathan · Apr 6, 2025, 3:02 PM
4 points
0 comments · 5 min read · LW link

[Research sprint] Single-model crosscoder feature ablation and steering

Thomas Read · Apr 6, 2025, 2:42 PM
8 points
0 comments · 12 min read · LW link

Ferrer, Pilar, and Me

Askwho · Apr 6, 2025, 11:22 AM
21 points
1 comment · 4 min read · LW link
(open.substack.com)

FlexChunk: Enabling 100M×100M Out-of-Core SpMV (~1.8 min, ~1.7 GB RAM) with Near-Linear Scaling

Daniil Strizhov · Apr 6, 2025, 5:27 AM
1 point
0 comments · 7 min read · LW link

A collection of approaches to confronting doom, and my thoughts on them

Ruby · Apr 6, 2025, 2:11 AM
48 points
18 comments · 12 min read · LW link