The Dilemma’s Dilemma

James Stephen Brown

19 Feb 2025 23:50 UTC

9 points

12 comments

7 min read

(nonzerosum.games)

[Question] Why do we have the NATO logo?

KvmanThinking

19 Feb 2025 22:59 UTC

1 point

4 comments

1 min read

Metaculus Q4 AI Benchmarking: Bots Are Closing The Gap

Molly and Tom Liptay

19 Feb 2025 22:42 UTC

14 points

0 comments

13 min read

(www.metaculus.com)

Several Arguments Against the Mathematical Universe Hypothesis

Vittu Perkele

19 Feb 2025 22:13 UTC

−4 points

6 comments

3 min read

(open.substack.com)

Literature Review of Text AutoEncoders

NickyP

19 Feb 2025 21:54 UTC

22 points

5 comments

8 min read

DeepSeek Made it Even Harder for US AI Companies to Ever Reach Profitability

garrison

19 Feb 2025 21:02 UTC

10 points

1 comment

3 min read

(garrisonlovely.substack.com)

Won’t vs. Can’t: Sandbagging-like Behavior from Claude Models

Joe Benton and Zachary Witten

19 Feb 2025 20:47 UTC

15 points

1 comment

1 min read

(alignment.anthropic.com)

AI Alignment and the Financial War Against Narcissistic Manipulation

henophilia

19 Feb 2025 20:42 UTC

−17 points

2 comments

3 min read

How to Make Superbabies

GeneSmith and kman

19 Feb 2025 20:39 UTC

649 points

363 comments

31 min read

The Newbie’s Guide to Navigating AI Futures

keithjmenezes

19 Feb 2025 20:37 UTC

−1 points

0 comments

40 min read

Against Unlimited Genius for Baby-Killers

ggggg

19 Feb 2025 20:33 UTC

−7 points

1 comment

3 min read

(ggggggggggggggggggggggg.substack.com)

New LLM Scaling Law

wrmedford

19 Feb 2025 20:21 UTC

2 points

0 comments

1 min read

(github.com)

Go Grok Yourself

Zvi

19 Feb 2025 20:20 UTC

57 points

2 comments

17 min read

(thezvi.wordpress.com)

[Question] Take over my project: do computable agents plan against the universal distribution pessimistically?

Cole Wyeth

19 Feb 2025 20:17 UTC

25 points

4 comments

3 min read

When should we worry about AI power-seeking?

Joe Carlsmith

19 Feb 2025 19:44 UTC

30 points

0 comments

18 min read

(joecarlsmith.substack.com)

SuperBabies podcast with Gene Smith

Eneasz

19 Feb 2025 19:36 UTC

35 points

1 comment

1 min read

(thebayesianconspiracy.substack.com)

Undesirable Conclusions and Origin Adjustment

Jerdle

19 Feb 2025 18:35 UTC

3 points

0 comments

5 min read

How might we safely pass the buck to AI?

joshc

19 Feb 2025 17:48 UTC

91 points

58 comments

31 min read

Using Prompt Evaluation to Combat Bio-Weapon Research

Stuart_Armstrong and rgorman

19 Feb 2025 12:39 UTC

11 points

2 comments

3 min read

Intelligence Is Jagged

Adam Train

19 Feb 2025 7:08 UTC

6 points

1 comment

3 min read

Closed-ended questions aren’t as hard as you think

electroswing

19 Feb 2025 3:53 UTC

6 points

0 comments

3 min read

Undergrad AI Safety Conference

Jo Jiao

19 Feb 2025 3:43 UTC

19 points

0 comments

1 min read

Permanent properties of things are a self-fulfilling prophecy

YanLyutnev

19 Feb 2025 0:08 UTC

4 points

0 comments

9 min read

Places of Loving Grace [Story]

ank

18 Feb 2025 23:49 UTC

−1 points

0 comments

4 min read

Are SAE features from the Base Model still meaningful to LLaVA?

Shan23Chen

18 Feb 2025 22:16 UTC

8 points

2 comments

10 min read

(www.lesswrong.com)

Sparse Autoencoder Features for Classifications and Transferability

Shan23Chen

18 Feb 2025 22:14 UTC

5 points

0 comments

1 min read

(arxiv.org)

A fable on AI x-risk

bgaesop

18 Feb 2025 20:15 UTC

8 points

4 comments

1 min read

The Unearned Privilege We Rarely Discuss: Cognitive Capability

DiegoRojas

18 Feb 2025 20:06 UTC

−21 points

7 comments

3 min read

Call for Applications: XLab Summer Research Fellowship

Jo Jiao

18 Feb 2025 19:19 UTC

12 points

0 comments

1 min read

AISN #48: Utility Engineering and EnigmaEval

Corin Katzke and Dan H

18 Feb 2025 19:15 UTC

4 points

0 comments

4 min read

(newsletter.safe.ai)

Abstract Mathematical Concepts vs. Abstractions Over Real-World Systems

Thane Ruthenis

18 Feb 2025 18:04 UTC

40 points

11 comments

4 min read

How accurate was my “Altered Traits” book review?

lsusr

18 Feb 2025 17:00 UTC

43 points

6 comments

3 min read

Medical Roundup #4

Zvi

18 Feb 2025 13:40 UTC

24 points

3 comments

10 min read

(thezvi.wordpress.com)

Dear AGI,

Nathan Young

18 Feb 2025 10:48 UTC

90 points

11 comments

3 min read

There are a lot of upcoming retreats/conferences between March and July (2025)

gergogaspar and ENAIS

18 Feb 2025 9:30 UTC

6 points

0 comments

1 min read

Sea Change

Charlie Sanders

18 Feb 2025 6:03 UTC

−2 points

2 comments

5 min read

(www.dailymicrofiction.com)

Born on Third Base: The Case for Inheriting Nothing and Building Everything

charlieoneill

18 Feb 2025 0:47 UTC

−24 points

16 comments

2 min read

Do models know when they are being evaluated?

fidgetsinner, Giles, Joe Needham and Marius Hobbhahn

17 Feb 2025 23:13 UTC

57 points

9 comments

12 min read

AGI Safety & Alignment @ Google DeepMind is hiring

Rohin Shah

17 Feb 2025 21:11 UTC

103 points

19 comments

10 min read

The Peeperi (unfinished) - By Katja Grace

Nathan Young

17 Feb 2025 19:33 UTC

22 points

0 comments

3 min read

(docs.google.com)

Progress links and short notes, 2025-02-17

jasoncrawford

17 Feb 2025 19:18 UTC

8 points

0 comments

7 min read

(newsletter.rootsofprogress.org)

Claude 3.5 Sonnet (New)’s AGI scenario

Nathan Young

17 Feb 2025 18:47 UTC

5 points

2 comments

5 min read

Talking to laymen about AI development

David Steel

17 Feb 2025 18:42 UTC

8 points

0 comments

1 min read

On the Rebirth of Aristocracy in the American Regime

shawkisukkar

17 Feb 2025 16:18 UTC

−16 points

3 comments

9 min read

(shawkisukkar.substack.com)

Ascetic hedonism

dkl9

17 Feb 2025 15:56 UTC

15 points

9 comments

2 min read

(dkl9.net)

AIS Berlin, events, opportunities and the flipped gameboard—Fieldbuilders Newsletter, February 2025

gergogaspar and ENAIS

17 Feb 2025 14:16 UTC

6 points

0 comments

3 min read

Monthly Roundup #27: February 2025

Zvi

17 Feb 2025 14:10 UTC

27 points

3 comments

44 min read

(thezvi.wordpress.com)

What new x- or s-risk fieldbuilding organisations would you like to see? An EOI form. (FBB #3)

gergogaspar

17 Feb 2025 12:39 UTC

6 points

0 comments

2 min read

A History of the Future, 2025-2040

L Rudolf L

17 Feb 2025 12:03 UTC

253 points

42 comments

75 min read

(nosetgauge.substack.com)

Thermodynamic entropy = Kolmogorov complexity

Aram Ebtekar

17 Feb 2025 5:56 UTC

77 points

14 comments

1 min read

(doi.org)