LW/ACX Social Meetup

Stefan · Mar 12, 2025, 11:13 PM
2 points
0 comments · 1 min read · LW link

I grade every NBA basketball game I watch based on enjoyability

proshowersinger · Mar 12, 2025, 9:46 PM
24 points
2 comments · 4 min read · LW link

Kairos is hiring a Head of Operations/Founding Generalist

agucova · Mar 12, 2025, 8:58 PM
6 points
0 comments · LW link

USAID Outlook: A Metaculus Forecasting Series

ChristianWilliams · Mar 12, 2025, 8:34 PM
9 points
0 comments · LW link
(www.metaculus.com)

What is instrumental convergence?

Mar 12, 2025, 8:28 PM
2 points
0 comments · 2 min read · LW link
(aisafety.info)

Revising Stages-Oversight Reveals Greater Situational Awareness in LLMs

Sanyu Rajakumar · Mar 12, 2025, 5:56 PM
16 points
0 comments · 13 min read · LW link

Why Obedient AI May Be the Real Catastrophe

G~ · Mar 12, 2025, 5:50 PM
5 points
2 comments · 3 min read · LW link

Your Communication Preferences Aren’t Law

Jonathan Moregård · Mar 12, 2025, 5:20 PM
25 points
4 comments · 1 min read · LW link
(honestliving.substack.com)

Reflections on Neuralese

Alice Blair · Mar 12, 2025, 4:29 PM
28 points
0 comments · 5 min read · LW link

Field tests of semi-rationality in Brazilian military training

P. João · Mar 12, 2025, 4:14 PM
31 points
0 comments · 2 min read · LW link

Many life-saving drugs fail for lack of funding. But there’s a solution: desperate rich people

Mvolz · Mar 12, 2025, 3:24 PM
17 points
0 comments · 1 min read · LW link
(www.theguardian.com)

The Most Forbidden Technique

Zvi · Mar 12, 2025, 1:20 PM
143 points
9 comments · 17 min read · LW link
(thezvi.wordpress.com)

You don’t actually need a physical multiverse to explain anthropic fine-tuning.

Fraser · Mar 12, 2025, 7:33 AM
7 points
8 comments · 3 min read · LW link
(frvser.com)

AI Can’t Write Good Fiction

JustisMills · Mar 12, 2025, 6:11 AM
38 points
24 comments · 7 min read · LW link
(justismills.substack.com)

Existing UDTs test the limits of Bayesianism (and consistency)

Cole Wyeth · Mar 12, 2025, 4:09 AM
28 points
21 comments · 7 min read · LW link

(Anti)Aging 101

George3d6 · Mar 12, 2025, 3:59 AM
5 points
2 comments · 3 min read · LW link
(cerebralab.com)

The Grapes of Hardness

adamShimi · Mar 11, 2025, 9:01 PM
8 points
0 comments · 5 min read · LW link
(formethods.substack.com)

Don’t over-update on FrontierMath results

David Matolcsi · Mar 11, 2025, 8:44 PM
51 points
7 comments · 9 min read · LW link

Response to Scott Alexander on Imprisonment

Zvi · Mar 11, 2025, 8:40 PM
40 points
4 comments · 9 min read · LW link
(thezvi.wordpress.com)

Paths and waystations in AI safety

Joe Carlsmith · Mar 11, 2025, 6:52 PM
41 points
1 comment · 11 min read · LW link
(joecarlsmith.substack.com)

Meridian Cambridge Visiting Researcher Programme: Turn AI safety ideas into funded projects in one week!

Meridian Cambridge · Mar 11, 2025, 5:46 PM
13 points
0 comments · 2 min read · LW link

Elon Musk May Be Transitioning to Bipolar Type I

Cyborg25 · Mar 11, 2025, 5:45 PM
83 points
22 comments · 4 min read · LW link

Scaling AI Regulation: Realistically, what Can (and Can’t) Be Regulated?

Katalina Hernandez · Mar 11, 2025, 4:51 PM
3 points
1 comment · 3 min read · LW link

How Language Models Understand Nullability

Mar 11, 2025, 3:57 PM
5 points
0 comments · 2 min read · LW link
(dmodel.ai)

Forethought: a new AI macrostrategy group

Mar 11, 2025, 3:39 PM
18 points
0 comments · 3 min read · LW link

Preparing for the Intelligence Explosion

Mar 11, 2025, 3:38 PM
78 points
17 comments · 1 min read · LW link
(www.forethought.org)

stop solving problems that have already been solved

dhruvmethi · Mar 11, 2025, 3:30 PM
10 points
3 comments · 8 min read · LW link

AI Control May Increase Existential Risk

Jan_Kulveit · Mar 11, 2025, 2:30 PM
98 points
13 comments · 1 min read · LW link

When is it Better to Train on the Alignment Proxy?

dil-leik-og · Mar 11, 2025, 1:35 PM
14 points
0 comments · 9 min read · LW link

A different take on the Musk v OpenAI preliminary injunction order

TFD · Mar 11, 2025, 12:46 PM
8 points
0 comments · 20 min read · LW link
(www.thefloatingdroid.com)

Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases

Fabien Roger · Mar 11, 2025, 11:52 AM
121 points
23 comments · 11 min read · LW link
(alignment.anthropic.com)

A Hogwarts Guide to Citizenship

WillPetillo · Mar 11, 2025, 5:50 AM
7 points
1 comment · 3 min read · LW link

Cognitive Reframing—How to Overcome Negative Thought Patterns and Behaviors

Declan Molony · Mar 11, 2025, 4:56 AM
11 points
0 comments · 4 min read · LW link

Trojan Sky

Richard_Ngo · Mar 11, 2025, 3:14 AM
245 points
39 comments · 12 min read · LW link
(www.narrativeark.xyz)

OpenAI: Detecting misbehavior in frontier reasoning models

Daniel Kokotajlo · Mar 11, 2025, 2:17 AM
183 points
26 comments · 4 min read · LW link
(openai.com)

HPMOR Anniversary Parties: Coordination, Resources, and Discussion

Screwtape · Mar 11, 2025, 1:30 AM
52 points
6 comments · 7 min read · LW link

Positional kernels of attention heads

Alex Gibson · Mar 10, 2025, 11:17 PM
9 points
0 comments · 12 min read · LW link

Progress links and short notes, 2025-03-10

jasoncrawford · Mar 10, 2025, 8:27 PM
8 points
0 comments · 4 min read · LW link
(newsletter.rootsofprogress.org)

The Manus Marketing Madness

Zvi · Mar 10, 2025, 8:10 PM
54 points
0 comments · 24 min read · LW link
(thezvi.wordpress.com)

You can just play

aswath krishnan · Mar 10, 2025, 8:00 PM
−5 points
0 comments · 2 min read · LW link

How to Use Prompt Engineering to Rewire Your Brain

aswath krishnan · Mar 10, 2025, 8:00 PM
1 point
0 comments · 5 min read · LW link
(www.aswathkrishnan.com)

When Independent Optimization Is Worse Than Randomness

Chaotic rationalist · Mar 10, 2025, 7:46 PM
−4 points
0 comments · 2 min read · LW link

Stress exists only where the Mind makes it

Noahh · Mar 10, 2025, 7:44 PM
5 points
2 comments · 4 min read · LW link

Counterargument to Gödel’s Modal Ontological Argument

Wynn · Mar 10, 2025, 7:38 PM
−1 points
0 comments · 4 min read · LW link

[Question] How much do frontier LLMs code and browse while in training?

Joe Rogero · Mar 10, 2025, 7:34 PM
7 points
0 comments · 1 min read · LW link

Observations on self-supervised Learning for vision

Dinkar Juyal · Mar 10, 2025, 7:31 PM
3 points
0 comments · 5 min read · LW link

Introducing 11 New AI Safety Organizations—Catalyze’s Winter 24/25 London Incubation Program Cohort

Alexandra Bos · Mar 10, 2025, 7:26 PM
70 points
0 comments · LW link

The Jackpot Jinx (or why “Superintelligence Strategy” is wrong)

E.G. Blee-Goldman · Mar 10, 2025, 7:18 PM
13 points
0 comments · 5 min read · LW link

Effective AI Outreach | A Data Driven Approach

NoahCWilson · Mar 10, 2025, 7:18 PM
1 point
0 comments · 15 min read · LW link

Emergent AI Society. Tasks, Scarcity, Talks

Andrey Seryakov · Mar 10, 2025, 7:18 PM
1 point
0 comments · 5 min read · LW link