What You Can Give Instead of Advice

Karl Faulks · 24 Oct 2024 23:10 UTC

13 points

2 comments · 1 min read · LW link

[Question] is it possible to comment anonymously on a post?

KvmanThinking · 24 Oct 2024 22:24 UTC

3 points

2 comments · 1 min read · LW link

Logical Proof for the Emergence and Substrate Independence of Sentience

rife · 24 Oct 2024 21:08 UTC

4 points

31 comments · 1 min read · LW link

(awakenmoon.ai)

Against Job Boards: Human Capital and the Legibility Trap

vaishnav92 · 24 Oct 2024 20:50 UTC

6 points

1 comment · 5 min read · LW link

IAPS: Mapping Technical Safety Research at AI Companies

Zach Stein-Perlman · 24 Oct 2024 20:30 UTC

42 points

13 comments · 1 min read · LW link

(www.iaps.ai)

Our Digital and Biological Children

Eneasz · 24 Oct 2024 18:36 UTC

28 points

0 comments · 3 min read · LW link

(deathisbad.substack.com)

Reflections on the Metastrategies Workshop

gw · 24 Oct 2024 18:30 UTC

41 points

5 comments · 11 min read · LW link

How Should We Measure Intelligence Models: Why Use Frequency of Elemental Information Operations

hwj20 · 24 Oct 2024 16:54 UTC

1 point

0 comments · 5 min read · LW link

Meta AI (FAIR) latest paper integrates system-1 and system-2 thinking into reasoning models.

happy friday · 24 Oct 2024 16:54 UTC

8 points

0 comments · 1 min read · LW link

Balancing Label Quantity and Quality for Scalable Elicitation

Alex Mallen · 24 Oct 2024 16:49 UTC

31 points

1 comment · 2 min read · LW link

Claude Sonnet 3.5.1 and Haiku 3.5

Zvi · 24 Oct 2024 14:50 UTC

51 points

9 comments · 16 min read · LW link

(thezvi.wordpress.com)

Big tech transitions are slow (with implications for AI)

jasoncrawford · 24 Oct 2024 14:25 UTC

36 points

16 comments · 4 min read · LW link

(blog.rootsofprogress.org)

Derivative AT a discontinuity

Alok Singh · 24 Oct 2024 2:48 UTC

10 points

5 comments · 10 min read · LW link

how to rapidly assimilate new information

dhruvmethi · 24 Oct 2024 2:18 UTC

9 points

3 comments · 8 min read · LW link

Ex-OpenAI researcher says OpenAI mass-violated copyright law

Remmelt · 24 Oct 2024 1:00 UTC

0 points

0 comments · 1 min read · LW link

(suchir.net)

Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded

garrison · 23 Oct 2024 23:40 UTC

118 points

1 comment · 7 min read · LW link

(garrisonlovely.substack.com)

A metaphor: what “green lights” for AGI would look like

Lorec · 23 Oct 2024 23:24 UTC

−1 points

6 comments · 2 min read · LW link

Motte-and-Bailey: a Short Explanation

Lorec · 23 Oct 2024 22:29 UTC

12 points

0 comments · 1 min read · LW link

Self-prediction acts as an emergent regularizer

Cameron Berg, Kvee, Mike Vaiana, Diogo de Lucena, florin_pop and Trent Hodgeson

23 Oct 2024 22:27 UTC

92 points

9 comments · 4 min read · LW link

Technical Risks of (Lethal) Autonomous Weapons Systems

Heramb · 23 Oct 2024 20:41 UTC

2 points

0 comments · 1 min read · LW link

(encodejustice.org)

Appealing to the Public

jefftk · 23 Oct 2024 19:00 UTC

16 points

0 comments · 5 min read · LW link

(www.jefftk.com)

Introducing Transluce — A Letter from the Founders

jsteinhardt · 23 Oct 2024 18:10 UTC

74 points

3 comments · 3 min read · LW link

(bounded-regret.ghost.io)

Are we dropping the ball on Recommendation AIs?

Charbel-Raphaël · 23 Oct 2024 17:48 UTC

53 points

17 comments · 6 min read · LW link

A bird’s eye view of ARC’s research

Jacob_Hilton · 23 Oct 2024 15:50 UTC

121 points

12 comments · 7 min read · LW link

(www.alignment.org)

AI safety tax dynamics

owencb · 23 Oct 2024 12:18 UTC

22 points

0 comments · 6 min read · LW link

(strangecities.substack.com)

What is malevolence? On the nature, measurement, and distribution of dark traits

David Althaus, Chi Nguyen and Clare

23 Oct 2024 8:41 UTC

94 points

22 comments · 52 min read · LW link

Join a LessWrong Team for the Unaging System Challenge

Crissman · 23 Oct 2024 6:01 UTC

15 points

5 comments · 1 min read · LW link

Word Spaghetti

Gordon Seidoh Worley · 23 Oct 2024 5:39 UTC

20 points

9 comments · 3 min read · LW link

Monosemanticity & Quantization

Rahul Chand · 22 Oct 2024 22:57 UTC

1 point

0 comments · 9 min read · LW link

[Question] What is the alpha in one bit of evidence?

J Bostock · 22 Oct 2024 21:57 UTC

20 points

13 comments · 1 min read · LW link

Catastrophic sabotage as a major threat model for human-level AI systems

evhub · 22 Oct 2024 20:57 UTC

97 points

13 comments · 15 min read · LW link

Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)

Elizabeth · 22 Oct 2024 18:20 UTC

76 points

82 comments · 1 min read · LW link

(acesounderglass.com)

Decision-Making Under Uncertainty: Lessons From AI

Jonasb · 22 Oct 2024 17:54 UTC

−1 points

0 comments · 5 min read · LW link

(www.denominations.io)

Testing Genetic Engineering Detection with Spike-Ins

jefftk · 22 Oct 2024 17:20 UTC

9 points

0 comments · 4 min read · LW link

(naobservatory.org)

Predictions as Public Works Project — What Metaculus Is Building Next

ChristianWilliams · 22 Oct 2024 16:35 UTC

5 points

0 comments · 5 min read · LW link

(www.metaculus.com)

Gorges of gender on a terrain of traits

dkl9 · 22 Oct 2024 16:18 UTC

−7 points

1 comment · 3 min read · LW link

(dkl9.net)

A Defense of Peer Review

Niko_McCarty and delton137

22 Oct 2024 16:16 UTC

24 points

2 comments · 22 min read · LW link

(www.asimov.press)

BIG-Bench Canary Contamination in GPT-4

Jozdien · 22 Oct 2024 15:40 UTC

141 points

19 comments · 4 min read · LW link · 1 review

[Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF

Leon Lang · 22 Oct 2024 13:57 UTC

51 points

2 comments · 18 min read · LW link

(arxiv.org)

[Intuitive self-models] 6. Awakening / Enlightenment / PNSE

Steven Byrnes · 22 Oct 2024 13:23 UTC

78 points

14 comments · 20 min read · LW link · 2 reviews

Resolving von Neumann-Morgenstern Inconsistent Preferences

niplav · 22 Oct 2024 11:45 UTC

39 points

5 comments · 58 min read · LW link

Lenses of Control

WillPetillo · 22 Oct 2024 7:51 UTC

15 points

0 comments · 9 min read · LW link

A Brief Explanation of AI Control

Aaron_Scher · 22 Oct 2024 7:00 UTC

8 points

1 comment · 6 min read · LW link

Longevity, AI, and Cognitive Research Hackathon @ MIT

ekkolápto · 22 Oct 2024 6:19 UTC

1 point

0 comments · 1 min read · LW link

Conversational Signposts—How to stop having boring social interactions

Declan Molony · 22 Oct 2024 5:37 UTC

14 points

6 comments · 2 min read · LW link

I got dysentery so you don’t have to

eukaryote · 22 Oct 2024 4:55 UTC

340 points

8 comments · 17 min read · LW link · 2 reviews

(eukaryotewritesblog.com)

Transformers Explained (Again)

RohanS · 22 Oct 2024 4:06 UTC

4 points

0 comments · 18 min read · LW link

Sleeping on Stage

jefftk · 22 Oct 2024 0:50 UTC

26 points

3 comments · 1 min read · LW link

(www.jefftk.com)

The Mask Comes Off: At What Price?

Zvi · 21 Oct 2024 23:50 UTC

72 points

16 comments · 8 min read · LW link

(thezvi.wordpress.com)

Distinguishing ways AI can be “concentrated”

Matthew Barnett · 21 Oct 2024 22:21 UTC

34 points

2 comments · 4 min read · LW link