Timaeus in 2024

20 Feb 2025 23:54 UTC
99 points
1 comment · 8 min read · LW link

Biological humans collectively exert at most 400 gigabits/s of control over the world.

benwr · 20 Feb 2025 23:44 UTC
15 points
3 comments · 1 min read · LW link

The first RCT for GLP-1 drugs and alcoholism isn’t what we hoped

dynomight · 20 Feb 2025 22:30 UTC
62 points
4 comments · 6 min read · LW link
(dynomight.net)

Published report: Pathways to short TAI timelines

Zershaaneh Qureshi · 20 Feb 2025 22:10 UTC
22 points
0 comments · 17 min read · LW link
(www.convergenceanalysis.org)

Neural Scaling Laws Rooted in the Data Distribution

aribrill · 20 Feb 2025 21:22 UTC
8 points
0 comments · 1 min read · LW link
(arxiv.org)

Demonstrating specification gaming in reasoning models

Matrice Jacobine · 20 Feb 2025 19:26 UTC
4 points
0 comments · 1 min read · LW link
(arxiv.org)

What makes a theory of intelligence useful?

Cole Wyeth · 20 Feb 2025 19:22 UTC
16 points
0 comments · 11 min read · LW link

AI #104: American State Capacity on the Brink

Zvi · 20 Feb 2025 14:50 UTC
37 points
9 comments · 44 min read · LW link
(thezvi.wordpress.com)

US AI Safety Institute will be ‘gutted,’ Axios reports

Matrice Jacobine · 20 Feb 2025 14:40 UTC
11 points
1 comment · 1 min read · LW link
(www.zdnet.com)

Human-AI Relationality is Already Here

bridgebot · 20 Feb 2025 7:08 UTC
17 points
0 comments · 15 min read · LW link

Safe Distillation With a Powerful Untrusted AI

Alek Westover · 20 Feb 2025 3:14 UTC
5 points
1 comment · 5 min read · LW link

Modularity and assembly: AI safety via thinking smaller

D Wong · 20 Feb 2025 0:58 UTC
2 points
0 comments · 11 min read · LW link
(criticalreason.substack.com)

Eliezer’s Lost Alignment Articles / The Arbital Sequence

20 Feb 2025 0:48 UTC
207 points
10 comments · 5 min read · LW link

Arbital has been imported to LessWrong

20 Feb 2025 0:47 UTC
281 points
30 comments · 5 min read · LW link

The Dilemma’s Dilemma

James Stephen Brown · 19 Feb 2025 23:50 UTC
9 points
12 comments · 7 min read · LW link
(nonzerosum.games)

[Question] Why do we have the NATO logo?

KvmanThinking · 19 Feb 2025 22:59 UTC
1 point
4 comments · 1 min read · LW link

Metaculus Q4 AI Benchmarking: Bots Are Closing The Gap

19 Feb 2025 22:42 UTC
13 points
0 comments · 13 min read · LW link
(www.metaculus.com)

Several Arguments Against the Mathematical Universe Hypothesis

Vittu Perkele · 19 Feb 2025 22:13 UTC
−4 points
6 comments · 3 min read · LW link
(open.substack.com)

Literature Review of Text AutoEncoders

NickyP · 19 Feb 2025 21:54 UTC
20 points
5 comments · 8 min read · LW link

DeepSeek Made it Even Harder for US AI Companies to Ever Reach Profitability

garrison · 19 Feb 2025 21:02 UTC
10 points
1 comment · 3 min read · LW link
(garrisonlovely.substack.com)

Won’t vs. Can’t: Sandbagging-like Behavior from Claude Models

19 Feb 2025 20:47 UTC
15 points
1 comment · 1 min read · LW link
(alignment.anthropic.com)

AI Alignment and the Financial War Against Narcissistic Manipulation

henophilia · 19 Feb 2025 20:42 UTC
−17 points
2 comments · 3 min read · LW link

How to Make Superbabies

19 Feb 2025 20:39 UTC
625 points
358 comments · 31 min read · LW link

The Newbie’s Guide to Navigating AI Futures

keithjmenezes · 19 Feb 2025 20:37 UTC
−1 points
0 comments · 40 min read · LW link

Against Unlimited Genius for Baby-Killers

ggggg · 19 Feb 2025 20:33 UTC
−7 points
1 comment · 3 min read · LW link
(ggggggggggggggggggggggg.substack.com)

New LLM Scaling Law

wrmedford · 19 Feb 2025 20:21 UTC
2 points
0 comments · 1 min read · LW link
(github.com)

Go Grok Yourself

Zvi · 19 Feb 2025 20:20 UTC
57 points
2 comments · 17 min read · LW link
(thezvi.wordpress.com)

[Question] Take over my project: do computable agents plan against the universal distribution pessimistically?

Cole Wyeth · 19 Feb 2025 20:17 UTC
25 points
3 comments · 3 min read · LW link

When should we worry about AI power-seeking?

Joe Carlsmith · 19 Feb 2025 19:44 UTC
22 points
0 comments · 18 min read · LW link
(joecarlsmith.substack.com)

SuperBabies podcast with Gene Smith

Eneasz · 19 Feb 2025 19:36 UTC
35 points
1 comment · 1 min read · LW link
(thebayesianconspiracy.substack.com)

Undesirable Conclusions and Origin Adjustment

Jerdle · 19 Feb 2025 18:35 UTC
3 points
0 comments · 5 min read · LW link

How might we safely pass the buck to AI?

joshc · 19 Feb 2025 17:48 UTC
83 points
58 comments · 31 min read · LW link

Using Prompt Evaluation to Combat Bio-Weapon Research

19 Feb 2025 12:39 UTC
11 points
2 comments · 3 min read · LW link

Intelligence Is Jagged

Adam Train · 19 Feb 2025 7:08 UTC
6 points
1 comment · 3 min read · LW link

Closed-ended questions aren’t as hard as you think

electroswing · 19 Feb 2025 3:53 UTC
6 points
0 comments · 3 min read · LW link

Undergrad AI Safety Conference

JoNeedsSleep · 19 Feb 2025 3:43 UTC
19 points
0 comments · 1 min read · LW link

Permanent properties of things are a self-fulfilling prophecy

YanLyutnev · 19 Feb 2025 0:08 UTC
4 points
0 comments · 9 min read · LW link

Places of Loving Grace [Story]

ank · 18 Feb 2025 23:49 UTC
−1 points
0 comments · 4 min read · LW link

Are SAE features from the Base Model still meaningful to LLaVA?

Shan23Chen · 18 Feb 2025 22:16 UTC
8 points
2 comments · 10 min read · LW link
(www.lesswrong.com)

Sparse Autoencoder Features for Classifications and Transferability

Shan23Chen · 18 Feb 2025 22:14 UTC
5 points
0 comments · 1 min read · LW link
(arxiv.org)

A fable on AI x-risk

bgaesop · 18 Feb 2025 20:15 UTC
8 points
4 comments · 1 min read · LW link

The Unearned Privilege We Rarely Discuss: Cognitive Capability

DiegoRojas · 18 Feb 2025 20:06 UTC
−21 points
7 comments · 3 min read · LW link

Call for Applications: XLab Summer Research Fellowship

JoNeedsSleep · 18 Feb 2025 19:19 UTC
9 points
0 comments · 1 min read · LW link

AISN #48: Utility Engineering and EnigmaEval

18 Feb 2025 19:15 UTC
4 points
0 comments · 4 min read · LW link
(newsletter.safe.ai)

Abstract Mathematical Concepts vs. Abstractions Over Real-World Systems

Thane Ruthenis · 18 Feb 2025 18:04 UTC
32 points
10 comments · 4 min read · LW link

How accurate was my “Altered Traits” book review?

lsusr · 18 Feb 2025 17:00 UTC
43 points
3 comments · 3 min read · LW link

Medical Roundup #4

Zvi · 18 Feb 2025 13:40 UTC
24 points
3 comments · 10 min read · LW link
(thezvi.wordpress.com)

Dear AGI,

Nathan Young · 18 Feb 2025 10:48 UTC
88 points
11 comments · 3 min read · LW link

There are a lot of upcoming retreats/conferences between March and July (2025)

18 Feb 2025 9:30 UTC
6 points
0 comments · 1 min read · LW link

Sea Change

Charlie Sanders · 18 Feb 2025 6:03 UTC
−2 points
2 comments · 5 min read · LW link
(www.dailymicrofiction.com)