Evaluating Prediction in Acausal Mixed-Motive Settings

Tim Chan · 31 Aug 2025 22:58 UTC
14 points
0 comments · 6 min read · LW link

My AI Predictions for 2027

Taylor G. Lunt · 31 Aug 2025 22:00 UTC
37 points
73 comments · 16 min read · LW link

Hedonium is AI Alignment

31 Aug 2025 19:46 UTC
−16 points
0 comments · 6 min read · LW link

To Raemon: bet in My (personal) Goals

P. João · 31 Aug 2025 15:48 UTC
3 points
0 comments · 3 min read · LW link

Legal Personhood—The First Amendment (Part 2)

Stephen Martin · 31 Aug 2025 12:06 UTC
2 points
0 comments · 2 min read · LW link

A quantum equivalent to Bayes’ rule

dr_s · 31 Aug 2025 10:06 UTC
51 points
17 comments · 8 min read · LW link

ACX Meetup Wellington

NotEvil · 31 Aug 2025 5:13 UTC
1 point
1 comment · 1 min read · LW link

Sleeping Experts in the (reflective) Solomonoff Prior

31 Aug 2025 4:55 UTC
16 points
0 comments · 3 min read · LW link

Hacking The Spectrum For Profit (Maybe Fun)

Elek Szid · 31 Aug 2025 4:49 UTC
7 points
3 comments · 3 min read · LW link

AI agents and painted facades

30 Aug 2025 23:13 UTC
38 points
3 comments · 2 min read · LW link
(fulcrumresearch.ai)

ACX Everywhere fall 2025 - Newton, MA

duck_master · 30 Aug 2025 22:02 UTC
1 point
1 comment · 1 min read · LW link

[via bsky, found paper] “AI Consciousness: A Centrist Manifesto”

the gears to ascension · 30 Aug 2025 21:05 UTC
13 points
0 comments · 1 min read · LW link
(philpapers.org)

Female sexual attractiveness seems more egalitarian than people acknowledge

lc · 30 Aug 2025 18:09 UTC
53 points
27 comments · 3 min read · LW link

AI Sleeper Agents: How Anthropic Trains and Catches Them—Video

Writer · 30 Aug 2025 17:53 UTC
9 points
0 comments · 7 min read · LW link
(youtu.be)

Understanding LLMs: Insights from Mechanistic Interpretability

Stephen McAleese · 30 Aug 2025 16:50 UTC
40 points
2 comments · 30 min read · LW link

Legal Personhood—The First Amendment (Part 1)

Stephen Martin · 30 Aug 2025 13:20 UTC
4 points
0 comments · 3 min read · LW link

Method Iteration: An LLM Prompting Technique

Davey Morse · 30 Aug 2025 0:08 UTC
−12 points
1 comment · 2 min read · LW link

[Question] How to bet on myself? From expectations to robust goals

P. João · 29 Aug 2025 18:33 UTC
4 points
3 comments · 1 min read · LW link

AI Security London Hackathon

Prince Kumar · 29 Aug 2025 18:23 UTC
4 points
0 comments · 1 min read · LW link

Summary of our Workshop on Post-AGI Outcomes

29 Aug 2025 17:14 UTC
96 points
3 comments · 3 min read · LW link

Wikipedia, but written by AIs

Viliam · 29 Aug 2025 16:37 UTC
32 points
9 comments · 4 min read · LW link

60 U.K. Lawmakers Accuse Google of Breaking AI Safety Pledge

Joseph Miller · 29 Aug 2025 16:09 UTC
50 points
1 comment · 1 min read · LW link
(time.com)

AI #131 Part 2: Various Misaligned Things

Zvi · 29 Aug 2025 15:00 UTC
34 points
7 comments · 41 min read · LW link
(thezvi.wordpress.com)

The Gabian History of Mathematics

29 Aug 2025 13:48 UTC
21 points
9 comments · 2 min read · LW link
(cognition.cafe)

Qualified rights for AI agents

Gauraventh · 29 Aug 2025 12:42 UTC
4 points
1 comment · 5 min read · LW link
(robertandgaurav.substack.com)

I am trying to write the history of transhumanism-related communities

Ihor Kendiukhov · 29 Aug 2025 11:37 UTC
7 points
4 comments · 1 min read · LW link

Claude Plays… Whatever it Wants

Adam B · 29 Aug 2025 10:57 UTC
37 points
4 comments · 7 min read · LW link

Not stepping on bugs

Gauraventh · 29 Aug 2025 10:08 UTC
1 point
6 comments · 2 min read · LW link
(y1d2.com)

Defensiveness does not equal guilt

Kaj_Sotala · 29 Aug 2025 6:14 UTC
60 points
16 comments · 3 min read · LW link

Truth

Kabir Kumar · 28 Aug 2025 20:53 UTC
6 points
0 comments · 2 min read · LW link
(kkumar97.blogspot.com)

Here’s 18 Applications of Deception Probes

28 Aug 2025 18:59 UTC
38 points
0 comments · 22 min read · LW link

LW@Dragoncon Meetup

Error · 28 Aug 2025 18:40 UTC
7 points
0 comments · 1 min read · LW link

If we can educate AIs, why not apply that education to people? - A Simulation with Claude

P. João · 28 Aug 2025 16:37 UTC
3 points
0 comments · 7 min read · LW link

AI #131 Part 1: Gemini 2.5 Flash Image is Cool

Zvi · 28 Aug 2025 16:20 UTC
39 points
4 comments · 30 min read · LW link
(thezvi.wordpress.com)

Von Neumann’s Fallacy and You

incident-recipient · 28 Aug 2025 15:52 UTC
98 points
29 comments · 4 min read · LW link

AI misbehaviour in the wild from Andon Labs’ Safety Report

Lukas Petersson · 28 Aug 2025 15:10 UTC
39 points
0 comments · 1 min read · LW link
(andonlabs.com)

The Other Alignment Problems: How epistemic, moral and aesthetic norms get entangled

James Diacoumis · 28 Aug 2025 11:26 UTC
3 points
0 comments · 5 min read · LW link

We should think about the pivotal act again. Here’s a better version of it.

otto.barten · 28 Aug 2025 9:29 UTC
11 points
2 comments · 3 min read · LW link

Elaborative reading

DirectedEvolution · 28 Aug 2025 8:55 UTC
20 points
0 comments · 9 min read · LW link

Profanity causes emergent misalignment, but with qualitatively different results than insecure code

megasilverfist · 28 Aug 2025 8:22 UTC
21 points
2 comments · 8 min read · LW link

Using Psycholinguistic Signals to Improve AI Safety

Jkreindler · 27 Aug 2025 22:30 UTC
−2 points
0 comments · 4 min read · LW link

Transition and Social Dynamics of a post-coordination world

Lessbroken · 27 Aug 2025 22:23 UTC
1 point
0 comments · 7 min read · LW link

Technical AI Safety research taxonomy attempt (2025)

Benjamin Plaut · 27 Aug 2025 22:17 UTC
2 points
0 comments · 2 min read · LW link

The Future of AI Agents

kavya · 27 Aug 2025 21:58 UTC
6 points
8 comments · 5 min read · LW link

Against “Model Welfare” in 2025

Haley Moller · 27 Aug 2025 21:56 UTC
−10 points
8 comments · 4 min read · LW link

Are They Starting To Take Our Jobs?

Zvi · 27 Aug 2025 18:50 UTC
44 points
6 comments · 5 min read · LW link
(thezvi.wordpress.com)

Will Any Crap Cause Emergent Misalignment?

J Bostock · 27 Aug 2025 18:20 UTC
192 points
37 comments · 3 min read · LW link

Open Global Investment as a Governance Model for AGI

Nick Bostrom · 27 Aug 2025 17:42 UTC
152 points
47 comments · 39 min read · LW link
(nickbostrom.com)

Uncertain Updates August 2025

Gordon Seidoh Worley · 27 Aug 2025 17:31 UTC
11 points
1 comment · 2 min read · LW link
(uncertainupdates.substack.com)

Attaching requirements to model releases has serious downsides (relative to a different deadline for these requirements)

ryan_greenblatt · 27 Aug 2025 17:04 UTC
99 points
2 comments · 3 min read · LW link