The Russell Conjugation Illuminator

TimmyM 17 Apr 2025 19:33 UTC

51 points

14 comments · 1 min read · LW link

(russellconjugations.com)

Announcing Progress Conference 2025

jasoncrawford 17 Apr 2025 17:12 UTC

12 points

0 comments · 1 min read · LW link

(newsletter.rootsofprogress.org)

The Mirror Paradox

Jeremy Kraybill 17 Apr 2025 16:23 UTC

−6 points

0 comments · 1 min read · LW link

Memory Decoding Journal Club

Devin Ward 17 Apr 2025 16:19 UTC

1 point

0 comments · 1 min read · LW link

Host Keys and SSHing to EC2

jefftk 17 Apr 2025 15:10 UTC

10 points

6 comments · 1 min read · LW link

(www.jefftk.com)

AI #112: Release the Everything

Zvi 17 Apr 2025 15:10 UTC

41 points

6 comments · 40 min read · LW link

(thezvi.wordpress.com)

On AI personhood

p.b. 17 Apr 2025 12:31 UTC

4 points

7 comments · 1 min read · LW link

Automating Mechanistic Interpretability via Program Synthesis

Edy Nastase 17 Apr 2025 10:58 UTC

1 point

1 comment · 1 min read · LW link

Understanding and overcoming AGI apathy

Dhruv Sumathi 17 Apr 2025 1:04 UTC

25 points

1 comment · 13 min read · LW link

(dhruvsumathi.substack.com)

ALLFED emergency appeal: Help us raise $800,000 to avoid cutting half of programs

denkenberger 16 Apr 2025 21:47 UTC

49 points

9 comments · 3 min read · LW link

Prodromes and Biomarkers in Chronic Disease

sarahconstantin 16 Apr 2025 21:30 UTC

23 points

2 comments · 3 min read · LW link

(sarahconstantin.substack.com)

The Practical Imperative for AI Control Research

Archana Vaidheeswaran 16 Apr 2025 20:27 UTC

1 point

0 comments · 4 min read · LW link

METR’s preliminary evaluation of o3 and o4-mini

Christopher King 16 Apr 2025 20:23 UTC

14 points

7 comments · 1 min read · LW link

(metr.github.io)

Mass Exposure Paradox

max-sixty 16 Apr 2025 20:18 UTC

6 points

2 comments · 2 min read · LW link

GPT-4.5 is Cognitive Empathy, Sonnet 3.5 is Affective Empathy

Jack 16 Apr 2025 19:12 UTC

15 points

2 comments · 4 min read · LW link

GPT-4.1 Is a Mini Upgrade

Zvi 16 Apr 2025 19:00 UTC

31 points

6 comments · 8 min read · LW link

(thezvi.wordpress.com)

Doing Prioritization Better

arvomm 16 Apr 2025 18:46 UTC

3 points

1 comment · 19 min read · LW link

(forum.effectivealtruism.org)

Kamelo: A Rule-Based Constructed Language for Universal, Logical Communication

Saif Khan 16 Apr 2025 18:44 UTC

13 points

8 comments · 2 min read · LW link

Understanding Trust: Overview Presentations

abramdemski 16 Apr 2025 18:08 UTC

22 points

0 comments · 1 min read · LW link

Understanding Trust—Overview Presentations

abramdemski 16 Apr 2025 18:05 UTC

13 points

0 comments · 1 min read · LW link

Telescoping

za3k 16 Apr 2025 17:05 UTC

13 points

1 comment · 1 min read · LW link

(blog.za3k.com)

Finance and AI Timelines

DAL 16 Apr 2025 16:55 UTC

5 points

2 comments · 3 min read · LW link

AI-enabled coups: a small group could use AI to seize power

Tom Davidson, Lukas Finnveden and rosehadshar

16 Apr 2025 16:51 UTC

138 points

23 comments · 7 min read · LW link

Ctrl-Z: Controlling AI Agents via Resampling

Aryan Bhatt, Buck, Adam Kaufman and Tyler Tracy

16 Apr 2025 16:21 UTC

128 points

0 comments · 20 min read · LW link

Gamify life from BayesianMind

Fire Brito de S, Gabriel 16 Apr 2025 16:17 UTC

6 points

2 comments · 1 min read · LW link

Top OpenAI Catastrophic Risk Official Steps Down Abruptly

garrison 16 Apr 2025 16:04 UTC

14 points

0 comments · 5 min read · LW link

(garrisonlovely.substack.com)

An artistic illustration of Scalable Oversight—“A world apart, neither gods nor mortals”

Marius Adrian Nicoară 16 Apr 2025 12:41 UTC

1 point

0 comments · 1 min read · LW link

Can LLM-based models do model-based planning?

Jennifer Lin 16 Apr 2025 12:38 UTC

11 points

1 comment · 2 min read · LW link

(docs.google.com)

The road from human-level to superintelligent AI may be short

Vishakha, Algon and steven0461

16 Apr 2025 8:35 UTC

10 points

0 comments · 2 min read · LW link

(aisafety.info)

Human-level is not the limit

Vishakha, Algon and steven0461

16 Apr 2025 8:33 UTC

23 points

2 comments · 2 min read · LW link

(aisafety.info)

AI may attain human-level soon

Vishakha, Algon and steven0461

16 Apr 2025 8:28 UTC

11 points

0 comments · 2 min read · LW link

(aisafety.info)

AI is advancing fast

Vishakha, Algon and steven0461

16 Apr 2025 8:17 UTC

11 points

0 comments · 2 min read · LW link

(aisafety.info)

How Logic “Really” Works: An Engineering Perspective

Daniil Strizhov 16 Apr 2025 5:34 UTC

6 points

0 comments · 11 min read · LW link

Opportunity to learn more about AI Innovation & Security Policy

PolicyTakes 16 Apr 2025 1:35 UTC

2 points

0 comments · 1 min read · LW link

D&D.Sci Tax Day: Adventurers and Assessments

aphyer 15 Apr 2025 23:43 UTC

47 points

14 comments · 2 min read · LW link

Should AIs be Encouraged to Cooperate?

PeterMcCluskey 15 Apr 2025 21:57 UTC

13 points

2 comments · 5 min read · LW link

(bayesianinvestor.com)

OpenAI rewrote its Preparedness Framework

Zach Stein-Perlman 15 Apr 2025 20:00 UTC

37 points

1 comment · 6 min read · LW link

ASI existential risk: Reconsidering Alignment as a Goal

habryka 15 Apr 2025 19:57 UTC

96 points

14 comments · 19 min read · LW link

(michaelnotebook.com)

Nucleic Acid Observatory Updates, April 2025

jefftk 15 Apr 2025 18:58 UTC

27 points

0 comments · 4 min read · LW link

(naobservatory.org)

Some OthelloGPT Circuits

Alfred Wong 15 Apr 2025 18:41 UTC

7 points

0 comments · 7 min read · LW link

The Mirror Problem in AI: Why Language Models Say Whatever You Want

RobT 15 Apr 2025 18:40 UTC

9 points

2 comments · 3 min read · LW link

What happens when LLMs learn new things? & Continual learning forever.

sunchipsster 15 Apr 2025 18:38 UTC

4 points

1 comment · 7 min read · LW link

To be legible, evidence of misalignment probably has to be behavioral

ryan_greenblatt 15 Apr 2025 18:14 UTC

58 points

19 comments · 3 min read · LW link

AISN #51: AI Frontiers

Corin Katzke and Dan H

15 Apr 2025 16:01 UTC

8 points

1 comment · 5 min read · LW link

(newsletter.safe.ai)

Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI

Kaj_Sotala 15 Apr 2025 15:56 UTC

174 points

52 comments · 18 min read · LW link

OpenAI #13: Altman at TED and OpenAI Cutting Corners on Safety Testing

Zvi 15 Apr 2025 15:30 UTC

48 points

3 comments · 12 min read · LW link

(thezvi.wordpress.com)

The real reason AI benchmarks haven’t reflected economic impacts

Noosphere89 15 Apr 2025 13:44 UTC

15 points

0 comments · 1 min read · LW link

(epoch.ai)

Map of AI Safety v2

Bryce Robertson, Søren Elverlin and honeybee

15 Apr 2025 13:04 UTC

64 points

4 comments · 1 min read · LW link

3M Subscriber YouTube Account ‘Channel 5’ Reporting On Rationalism

sakraf 15 Apr 2025 13:02 UTC

4 points

0 comments · 1 min read · LW link

(youtu.be)

Can SAE steering reveal sandbagging?

jordinne, Hoang Khiem, Felix Hofstätter and Cleo Nardo

15 Apr 2025 12:33 UTC

36 points

3 comments · 4 min read · LW link