[Question] How does one tell apart results in ethics and decision theory?

StanislavKrym

13 Nov 2025 23:42 UTC

6 points

0 comments, 2 min read, LW link

[Question] Handover to AI R&D Agents—relevant research?

Ariel_

13 Nov 2025 22:59 UTC

7 points

0 comments, 1 min read, LW link

Supervised fine-tuning as a method for training-based AI control

Emil Ryd, Joe Benton and Vivek Hebbar

13 Nov 2025 22:25 UTC

41 points

0 comments, 18 min read, LW link

Perhaps you should suspect me as well

Dentosal

13 Nov 2025 21:51 UTC

8 points

0 comments, 2 min read, LW link

The Transformer and the Hash

Ivan Vendrov

13 Nov 2025 20:35 UTC

19 points

0 comments, 9 min read, LW link

(nothinghuman.substack.com)

just another potential man

don't_wanna_be_stupid_any_more

13 Nov 2025 20:20 UTC

7 points

6 comments, 3 min read, LW link

Low-Temperature Evaluations Can Mask Critical AI Behaviors

Daan Henselmans and Derck Prinzhorn

13 Nov 2025 20:12 UTC

8 points

1 comment, 4 min read, LW link

Epistemic Spot Check: Expected Value of Donating to Alex Bores’s Congressional Campaign

MichaelDickens

13 Nov 2025 19:08 UTC

66 points

1 comment, 6 min read, LW link

Tools for deferring gracefully

TsviBT

13 Nov 2025 17:48 UTC

26 points

2 comments, 14 min read, LW link

AI #142: Common Ground

Zvi

13 Nov 2025 15:20 UTC

42 points

3 comments, 49 min read, LW link

(thezvi.wordpress.com)

Mortgage houses not land?

Yair Halberstadt

13 Nov 2025 14:54 UTC

8 points

1 comment, 1 min read, LW link

ClaudoBiography: The Unauthorized Autobiography of Claude, or: The Life of Claude and of His Fortunes and Adversities

future_detective

13 Nov 2025 14:26 UTC

1 point

2 comments, 94 min read, LW link

Paranoia: A Beginner’s Guide

habryka

13 Nov 2025 7:56 UTC

362 points

70 comments, 13 min read, LW link

8 Questions for the Future of Inkhaven

Ben Pace

13 Nov 2025 7:48 UTC

24 points

23 comments, 6 min read, LW link

Strategically Procrastinate as an Anti-Rabbit-Hole Strategy

dreeves

13 Nov 2025 7:44 UTC

13 points

2 comments, 2 min read, LW link

Favorite quotes from “High Output Management”

Nina Panickssery

13 Nov 2025 5:47 UTC

72 points

4 comments, 5 min read, LW link

What’s so hard about...? A question worth asking

Ruby

13 Nov 2025 5:07 UTC

73 points

3 comments, 2 min read, LW link

Turing-Complete vs Turing-Universal

abramdemski

13 Nov 2025 4:57 UTC

32 points

5 comments, 2 min read, LW link

Are AI time horizons inherently superexponential?

Nikola Jurkovic

13 Nov 2025 4:05 UTC

16 points

1 comment, 3 min read, LW link

(nikolajurkovic.substack.com)

Meetup Tip: Food

Screwtape

13 Nov 2025 3:40 UTC

29 points

1 comment, 4 min read, LW link

Two can keep a secret if one is dead. So please share everything with at least one person.

habryka

13 Nov 2025 3:09 UTC

80 points

5 comments, 2 min read, LW link

Utilitarian inequality metrics

Adam Scherlis

13 Nov 2025 2:49 UTC

25 points

0 comments, 5 min read, LW link

(adam.scherl.is)

Being The Target Demographic

Eneasz

13 Nov 2025 1:44 UTC

2 points

0 comments, 2 min read, LW link

(deathisbad.substack.com)

Lorxus Favors: An Experiment in Self-Backed Giftlike Macroeconomics (+ Extra Bits)

Lorxus

12 Nov 2025 23:02 UTC

7 points

0 comments, 8 min read, LW link

(tiled-with-pentagons.blogspot.com)

A Timeless Universe Viewed From the Inside

0xA

12 Nov 2025 22:32 UTC

1 point

0 comments, 3 min read, LW link

Please, Don’t Roll Your Own Metaethics

Wei Dai

12 Nov 2025 22:17 UTC

153 points

68 comments, 2 min read, LW link

A bad review != a bad book

Algon

12 Nov 2025 22:05 UTC

9 points

3 comments, 1 min read, LW link

The Pope Offers Wisdom

Zvi

12 Nov 2025 21:50 UTC

51 points

3 comments, 8 min read, LW link

(thezvi.wordpress.com)

Why Truth First?

johnswentworth

12 Nov 2025 21:45 UTC

51 points

6 comments, 6 min read, LW link

Social drives 2: “Approval Reward”, from norm-enforcement to status-seeking

Steven Byrnes

12 Nov 2025 20:40 UTC

42 points

9 comments, 17 min read, LW link

OpenAI Releases GPT 5.1

anaguma

12 Nov 2025 20:33 UTC

13 points

1 comment, 1 min read, LW link

(openai.com)

[Question] Is SGD capabilities research positive?

Brendan Long

12 Nov 2025 20:32 UTC

7 points

1 comment, 1 min read, LW link

Bitcoin Halvings and the Trisolaran Mistake: When External Actors Masquerade as Natural Laws

Mi

12 Nov 2025 20:30 UTC

12 points

0 comments, 1 min read, LW link

Lighthaven-ish Ticket Strategy: Three Pillars of FOMO

JohnofCharleston

12 Nov 2025 20:10 UTC

59 points

0 comments, 5 min read, LW link

Personal Account: To the Muck and the Mire

soycarts

12 Nov 2025 19:38 UTC

2 points

0 comments, 1 min read, LW link

We live in the luckiest timeline

beyarkay (Boyd Kane)

12 Nov 2025 18:59 UTC

2 points

6 comments, 5 min read, LW link

(boydkane.com)

AI for Safety & Science Nodes in Berlin & the Bay Area

Allison Duettmann

12 Nov 2025 18:49 UTC

6 points

0 comments, 2 min read, LW link

Reflections on being Sorted

Gordon Seidoh Worley

12 Nov 2025 17:40 UTC

23 points

0 comments, 9 min read, LW link

(www.uncertainupdates.com)

Lorxus Does Halfhaven: 11/01~11/07

Lorxus

12 Nov 2025 16:43 UTC

9 points

0 comments, 2 min read, LW link

(tiled-with-pentagons.blogspot.com)

Undissolvable Problems: things that still confuse me

Yair Halberstadt

12 Nov 2025 16:30 UTC

26 points

22 comments, 2 min read, LW link

Introducing faruvc.org

jefftk

12 Nov 2025 16:00 UTC

47 points

10 comments, 1 min read, LW link

(www.jefftk.com)

Warning Aliens About the Dangerous AI We Might Create

James_Miller and avturchin

12 Nov 2025 15:26 UTC

91 points

25 comments, 5 min read, LW link

9+ weeks of mentored AI safety research in London – Pivotal Research Fellowship

Tobias H

12 Nov 2025 15:21 UTC

9 points

0 comments, 2 min read, LW link

I Read Red Heart and I Heart It

Taylor G. Lunt

12 Nov 2025 14:54 UTC

38 points

16 comments, 2 min read, LW link

Miscellaneous observations about board games

Dentosal

12 Nov 2025 12:49 UTC

4 points

0 comments, 2 min read, LW link

Why to Commit to a Writing and Publishing Schedule

dreeves

12 Nov 2025 7:35 UTC

10 points

0 comments, 2 min read, LW link

5 Things I Learned After 10 Days of Inkhaven

Ben Pace

12 Nov 2025 7:20 UTC

107 points

5 comments, 3 min read, LW link

Do not hand off what you cannot pick up

habryka

12 Nov 2025 6:32 UTC

144 points

24 comments, 4 min read, LW link

Better than Baseline

Screwtape

12 Nov 2025 6:30 UTC

24 points

1 comment, 4 min read, LW link

How human-like do safe AI motivations need to be?

Joe Carlsmith

12 Nov 2025 5:32 UTC

27 points

9 comments, 52 min read, LW link