The Dilemma’s Dilemma

James Stephen Brown
19 Feb 2025 23:50 UTC
9 points
12 comments
7 min read
(nonzerosum.games)

[Question] Why do we have the NATO logo?

KvmanThinking
19 Feb 2025 22:59 UTC
1 point
4 comments
1 min read

Metaculus Q4 AI Benchmarking: Bots Are Closing The Gap

19 Feb 2025 22:42 UTC
13 points
0 comments
13 min read
(www.metaculus.com)

Several Arguments Against the Mathematical Universe Hypothesis

Vittu Perkele
19 Feb 2025 22:13 UTC
−4 points
6 comments
3 min read
(open.substack.com)

Literature Review of Text AutoEncoders

NickyP
19 Feb 2025 21:54 UTC
20 points
5 comments
8 min read

DeepSeek Made it Even Harder for US AI Companies to Ever Reach Profitability

garrison
19 Feb 2025 21:02 UTC
10 points
1 comment
3 min read
(garrisonlovely.substack.com)

Won’t vs. Can’t: Sandbagging-like Behavior from Claude Models

19 Feb 2025 20:47 UTC
15 points
1 comment
1 min read
(alignment.anthropic.com)

AI Alignment and the Financial War Against Narcissistic Manipulation

henophilia
19 Feb 2025 20:42 UTC
−17 points
2 comments
3 min read

How to Make Superbabies

19 Feb 2025 20:39 UTC
625 points
358 comments
31 min read

The Newbie’s Guide to Navigating AI Futures

keithjmenezes
19 Feb 2025 20:37 UTC
−1 points
0 comments
40 min read

Against Unlimited Genius for Baby-Killers

ggggg
19 Feb 2025 20:33 UTC
−7 points
1 comment
3 min read
(ggggggggggggggggggggggg.substack.com)

New LLM Scaling Law

wrmedford
19 Feb 2025 20:21 UTC
2 points
0 comments
1 min read
(github.com)

Go Grok Yourself

Zvi
19 Feb 2025 20:20 UTC
57 points
2 comments
17 min read
(thezvi.wordpress.com)

[Question] Take over my project: do computable agents plan against the universal distribution pessimistically?

Cole Wyeth
19 Feb 2025 20:17 UTC
25 points
3 comments
3 min read

When should we worry about AI power-seeking?

Joe Carlsmith
19 Feb 2025 19:44 UTC
22 points
0 comments
18 min read
(joecarlsmith.substack.com)

SuperBabies podcast with Gene Smith

Eneasz
19 Feb 2025 19:36 UTC
35 points
1 comment
1 min read
(thebayesianconspiracy.substack.com)

Undesirable Conclusions and Origin Adjustment

Jerdle
19 Feb 2025 18:35 UTC
3 points
0 comments
5 min read

How might we safely pass the buck to AI?

joshc
19 Feb 2025 17:48 UTC
83 points
58 comments
31 min read

Using Prompt Evaluation to Combat Bio-Weapon Research

19 Feb 2025 12:39 UTC
11 points
2 comments
3 min read

Intelligence Is Jagged

Adam Train
19 Feb 2025 7:08 UTC
6 points
1 comment
3 min read

Closed-ended questions aren’t as hard as you think

electroswing
19 Feb 2025 3:53 UTC
6 points
0 comments
3 min read

Undergrad AI Safety Conference

JoNeedsSleep
19 Feb 2025 3:43 UTC
19 points
0 comments
1 min read

Permanent properties of things are a self-fulfilling prophecy

YanLyutnev
19 Feb 2025 0:08 UTC
4 points
0 comments
9 min read

Places of Loving Grace [Story]

ank
18 Feb 2025 23:49 UTC
−1 points
0 comments
4 min read

Are SAE features from the Base Model still meaningful to LLaVA?

Shan23Chen
18 Feb 2025 22:16 UTC
8 points
2 comments
10 min read
(www.lesswrong.com)

Sparse Autoencoder Features for Classifications and Transferability

Shan23Chen
18 Feb 2025 22:14 UTC
5 points
0 comments
1 min read
(arxiv.org)

A fable on AI x-risk

bgaesop
18 Feb 2025 20:15 UTC
8 points
4 comments
1 min read

The Unearned Privilege We Rarely Discuss: Cognitive Capability

DiegoRojas
18 Feb 2025 20:06 UTC
−21 points
7 comments
3 min read

Call for Applications: XLab Summer Research Fellowship

JoNeedsSleep
18 Feb 2025 19:19 UTC
9 points
0 comments
1 min read

AISN #48: Utility Engineering and EnigmaEval

18 Feb 2025 19:15 UTC
4 points
0 comments
4 min read
(newsletter.safe.ai)

Abstract Mathematical Concepts vs. Abstractions Over Real-World Systems

Thane Ruthenis
18 Feb 2025 18:04 UTC
32 points
10 comments
4 min read

How accurate was my “Altered Traits” book review?

lsusr
18 Feb 2025 17:00 UTC
43 points
3 comments
3 min read

Medical Roundup #4

Zvi
18 Feb 2025 13:40 UTC
24 points
3 comments
10 min read
(thezvi.wordpress.com)

Dear AGI,

Nathan Young
18 Feb 2025 10:48 UTC
88 points
11 comments
3 min read

There are a lot of upcoming retreats/conferences between March and July (2025)

18 Feb 2025 9:30 UTC
6 points
0 comments
1 min read

Sea Change

Charlie Sanders
18 Feb 2025 6:03 UTC
−2 points
2 comments
5 min read
(www.dailymicrofiction.com)

Born on Third Base: The Case for Inheriting Nothing and Building Everything

charlieoneill
18 Feb 2025 0:47 UTC
−24 points
16 comments
2 min read

Do models know when they are being evaluated?

17 Feb 2025 23:13 UTC
57 points
9 comments
12 min read

AGI Safety & Alignment @ Google DeepMind is hiring

Rohin Shah
17 Feb 2025 21:11 UTC
102 points
19 comments
10 min read

The Peeperi (unfinished) - By Katja Grace

Nathan Young
17 Feb 2025 19:33 UTC
22 points
0 comments
3 min read
(docs.google.com)

Progress links and short notes, 2025-02-17

jasoncrawford
17 Feb 2025 19:18 UTC
8 points
0 comments
7 min read
(newsletter.rootsofprogress.org)

Claude 3.5 Sonnet (New)’s AGI scenario

Nathan Young
17 Feb 2025 18:47 UTC
5 points
2 comments
5 min read

Talking to laymen about AI development

David Steel
17 Feb 2025 18:42 UTC
8 points
0 comments
1 min read

On the Rebirth of Aristocracy in the American Regime

shawkisukkar
17 Feb 2025 16:18 UTC
−16 points
3 comments
9 min read
(shawkisukkar.substack.com)

Ascetic hedonism

dkl9
17 Feb 2025 15:56 UTC
15 points
9 comments
2 min read
(dkl9.net)

AIS Ber­lin, events, op­por­tu­ni­ties and the flipped game­board—Field­builders Newslet­ter, Fe­bru­ary 2025

17 Feb 2025 14:16 UTC
6 points
0 comments
3 min read

Monthly Roundup #27: February 2025

Zvi
17 Feb 2025 14:10 UTC
27 points
3 comments
44 min read
(thezvi.wordpress.com)

What new x- or s-risk fieldbuilding organisations would you like to see? An EOI form. (FBB #3)

gergogaspar
17 Feb 2025 12:39 UTC
6 points
0 comments
2 min read

A History of the Future, 2025-2040

L Rudolf L
17 Feb 2025 12:03 UTC
238 points
42 comments
75 min read
(nosetgauge.substack.com)

Thermodynamic entropy = Kolmogorov complexity

Aram Ebtekar
17 Feb 2025 5:56 UTC
76 points
14 comments
1 min read
(doi.org)