Your AI Safety focus is downstream of your AGI timeline

Michael Flood

17 Jan 2025 21:24 UTC

9 points

0 comments, 4 min read, LW link

Thoughts on the conservative assumptions in AI control

Buck

17 Jan 2025 19:23 UTC

91 points

5 comments, 13 min read, LW link

Timaeus is hiring researchers & engineers

Jesse Hoogland and Stan van Wingerden

17 Jan 2025 19:13 UTC

65 points

4 comments, 4 min read, LW link

Model Amnesty Project

themis

17 Jan 2025 18:53 UTC

3 points

2 comments, 3 min read, LW link

Addressing doubts of AI progress: Why GPT-5 is not late, and why data scarcity isn’t a fundamental limiter near term.

LDJ

17 Jan 2025 18:53 UTC

2 points

0 comments, 2 min read, LW link

Playing Dixit with AI: How Well LLMs Detect ‘Me-ness’

Mariia Koroliuk

17 Jan 2025 18:52 UTC

5 points

0 comments, 2 min read, LW link

Doing a self-randomized study of the impacts of glycine on sleep (Science is hard)

thedissonance.net

17 Jan 2025 18:49 UTC

11 points

5 comments, 11 min read, LW link

How sci-fi can have drama without dystopia or doomerism

jasoncrawford

17 Jan 2025 15:22 UTC

19 points

3 comments, 3 min read, LW link

(newsletter.rootsofprogress.org)

[Question] What do you mean with ‘alignment is solvable in principle’?

Remmelt

17 Jan 2025 15:03 UTC

3 points

9 comments, 1 min read, LW link

Meta Pivots on Content Moderation

Zvi

17 Jan 2025 14:20 UTC

47 points

3 comments, 10 min read, LW link

(thezvi.wordpress.com)

Tax Price Gouging?

jefftk

17 Jan 2025 14:10 UTC

55 points

22 comments, 3 min read, LW link

(www.jefftk.com)

The quantum red pill or: They lied to you, we live in the (density) matrix

Dmitry Vaintrob

17 Jan 2025 13:58 UTC

37 points

34 comments, 12 min read, LW link

Bednets -- 4 longer malaria studies

Hzn

17 Jan 2025 8:47 UTC

4 points

0 comments, 4 min read, LW link

Patent Trolling to Save the World

Double

17 Jan 2025 4:13 UTC

23 points

7 comments, 3 min read, LW link

Call Booth External Monitor

jefftk

17 Jan 2025 3:10 UTC

15 points

0 comments, 1 min read, LW link

(www.jefftk.com)

[Cross-post] Welcome to the Essay Meta

davekasten

16 Jan 2025 23:36 UTC

14 points

2 comments, 8 min read, LW link

AI for Resolving Forecasting Questions: An Early Exploration

ozziegooen

16 Jan 2025 21:41 UTC

10 points

2 comments, 9 min read, LW link

[Question] How Do You Interpret the Goal of LessWrong and Its Community?

ashen8461

16 Jan 2025 19:08 UTC

−2 points

2 comments, 1 min read, LW link

Experts’ AI timelines are longer than you have been told?

Vasco Grilo

16 Jan 2025 18:03 UTC

10 points

4 comments, 3 min read, LW link

(bayes.net)

Numberwang: LLMs Doing Autonomous Research, and a Call for Input

eggsyntax and ncase

16 Jan 2025 17:20 UTC

71 points

30 comments, 31 min read, LW link

Topological Debate Framework

lunatic_at_large

16 Jan 2025 17:19 UTC

10 points

5 comments, 9 min read, LW link

AI #99: Farewell to Biden

Zvi

16 Jan 2025 14:20 UTC

54 points

5 comments, 58 min read, LW link

(thezvi.wordpress.com)

Deceptive Alignment and Homuncularity

Oliver Sourbut and TurnTrout

16 Jan 2025 13:55 UTC

26 points

12 comments, 22 min read, LW link

Introducing the WeirdML Benchmark

Håvard Tveit Ihle

16 Jan 2025 11:38 UTC

57 points

13 comments, 11 min read, LW link

The Mathematical Reason You should have 9 Kids

Zero Contradictions

16 Jan 2025 11:24 UTC

−9 points

6 comments, 1 min read, LW link

(eternalanglo.com)

Quantum without complication

Optimization Process and Orborde

16 Jan 2025 8:53 UTC

30 points

2 comments, 10 min read, LW link

Permanents: much more than you wanted to know

Dmitry Vaintrob

16 Jan 2025 8:04 UTC

17 points

2 comments, 15 min read, LW link

Gaming TruthfulQA: Simple Heuristics Exposed Dataset Weaknesses

TurnTrout

16 Jan 2025 2:14 UTC

65 points

3 comments, 1 min read, LW link

(turntrout.com)

What Is The Alignment Problem?

johnswentworth

16 Jan 2025 1:20 UTC

181 points

49 comments, 25 min read, LW link

Improving Our Safety Cases Using Upper and Lower Bounds

Yonatan Cale

16 Jan 2025 0:01 UTC

23 points

0 comments, 3 min read, LW link

Unregulated Peptides: Does BPC-157 hold its promises?

ChristianKl

15 Jan 2025 23:36 UTC

28 points

7 comments, 4 min read, LW link

New, improved multiple-choice TruthfulQA

Owain_Evans, James Chua and Steph Lin

15 Jan 2025 23:32 UTC

72 points

1 comment, 3 min read, LW link

The Difference Between Prediction Markets and Debate (Argument) Maps

Jamie Joyce

15 Jan 2025 23:19 UTC

7 points

3 comments, 3 min read, LW link

A Novel Emergence of Meta-Awareness in LLM Fine-Tuning

rife

15 Jan 2025 22:59 UTC

57 points

32 comments, 2 min read, LW link

Six Small Cohabitive Games

Screwtape

15 Jan 2025 21:59 UTC

40 points

7 comments, 13 min read, LW link

LLMs are really good at k-order thinking (where k is even)

charlieoneill

15 Jan 2025 20:43 UTC

7 points

0 comments, 2 min read, LW link

Everywhere I Look, I See Kat Woods

just_browsing

15 Jan 2025 19:29 UTC

19 points

45 comments, 5 min read, LW link

[untitled post]

Emre

15 Jan 2025 18:52 UTC

−1 points

0 comments, 1 min read, LW link

“Pick Two” AI Trilemma: Generality, Agency, Alignment.

Black Flag

15 Jan 2025 18:52 UTC

7 points

0 comments, 2 min read, LW link

Myths about Nonduality and Science by Gary Weber

Vadim Golub

15 Jan 2025 18:33 UTC

2 points

0 comments, 23 min read, LW link

Marx and the Machine

DAL

15 Jan 2025 18:33 UTC

5 points

2 comments, 9 min read, LW link

Code4Compassion 2025: a hackathon transforming animal advocacy through technology

superbeneficiary

15 Jan 2025 18:31 UTC

3 points

0 comments, 1 min read, LW link

Applications Open for the Cooperative AI Summer School 2025!

JesseClifton

15 Jan 2025 18:16 UTC

7 points

0 comments, 1 min read, LW link

List of AI safety papers from companies, 2023–2024

Zach Stein-Perlman

15 Jan 2025 18:00 UTC

11 points

0 comments, 1 min read, LW link

AI Alignment Meme Viruses

RationalDino

15 Jan 2025 15:55 UTC

5 points

0 comments, 2 min read, LW link

Looking for humanness in the world wide social

Itay Dreyfus

15 Jan 2025 14:50 UTC

11 points

0 comments, 6 min read, LW link

(productidentity.co)

On the OpenAI Economic Blueprint

Zvi

15 Jan 2025 14:30 UTC

81 points

2 comments, 9 min read, LW link

(thezvi.wordpress.com)

A problem shared by many different alignment targets

ThomasCederborg

15 Jan 2025 14:22 UTC

13 points

18 comments, 36 min read, LW link

LLMs for language learning

Benquo

15 Jan 2025 14:08 UTC

10 points

2 comments, 7 min read, LW link

(benjaminrosshoffman.com)

Feature request: comment bookmarks

dirk

15 Jan 2025 6:45 UTC

18 points

2 comments, 1 min read, LW link