MonoPoly Restricted Trust

ymeskhout 2 Jan 2024 23:02 UTC

31 points

37 comments, 9 min read, LW link

Agent membranes and causal distance

Chris Lakin 2 Jan 2024 22:43 UTC

20 points

3 comments, 3 min read, LW link

Focusing on Mal-Alignment

John Fisher 2 Jan 2024 19:51 UTC

1 point

0 comments, 1 min read, LW link

Gentleness and the artificial Other

Joe Carlsmith 2 Jan 2024 18:21 UTC

321 points

34 comments, 11 min read, LW link, 1 review

Otherness and control in the age of AGI

Joe Carlsmith 2 Jan 2024 18:15 UTC

51 points

1 comment, 7 min read, LW link, 1 review

Apologizing is a Core Rationalist Skill

johnswentworth 2 Jan 2024 17:47 UTC

157 points

42 comments, 5 min read, LW link

Cortés, AI Risk, and the Dynamics of Competing Conquerors

James_Miller 2 Jan 2024 16:37 UTC

14 points

3 comments, 3 min read, LW link

OpenAI’s Preparedness Framework: Praise & Recommendations

Orpheus16 2 Jan 2024 16:20 UTC

66 points

1 comment, 7 min read, LW link

Dating Roundup #2: If At First You Don’t Succeed

Zvi 2 Jan 2024 16:00 UTC

54 points

29 comments, 47 min read, LW link

(thezvi.wordpress.com)

Looking for Reading Recommendations: Content Moderation, Power & Censorship

Joerg Weiss 2 Jan 2024 11:37 UTC

2 points

7 comments, 1 min read, LW link

AI Is Not Software

Davidmanheim 2 Jan 2024 7:58 UTC

63 points

29 comments, 5 min read, LW link

Are Metaculus AI Timelines Inconsistent?

Chris_Leong 2 Jan 2024 6:47 UTC

17 points

7 comments, 2 min read, LW link

Boston Solstice 2023 Retrospective

jefftk 2 Jan 2024 3:10 UTC

33 points

0 comments, 6 min read, LW link

(www.jefftk.com)

Steering Llama-2 with contrastive activation additions

Nina Panickssery, Wuschel Schulz, NickGabs, Meg, evhub and TurnTrout

2 Jan 2024 0:47 UTC

125 points

29 comments, 8 min read, LW link

(arxiv.org)

Twin Cities ACX Meetup—January 2024

Timothy M. 1 Jan 2024 21:13 UTC

1 point

2 comments, 1 min read, LW link

San Francisco ACX Meetup “First Saturday”

guenael 1 Jan 2024 20:58 UTC

1 point

1 comment, 1 min read, LW link

Mech Interp Challenge: January—Deciphering the Caesar Cipher Model

CallumMcDougall 1 Jan 2024 18:03 UTC

17 points

0 comments, 3 min read, LW link

Aldix and the Book of Life

ville 1 Jan 2024 17:23 UTC

1 point

0 comments, 4 min read, LW link

(medium.com)

Metaculus Hosts ACX 2024 Prediction Contest

ChristianWilliams 1 Jan 2024 16:38 UTC

4 points

0 comments, 1 min read, LW link

(www.metaculus.com)

The Act Itself: Exceptionless Moral Norms

SebastianG 1 Jan 2024 16:06 UTC

5 points

11 comments, 6 min read, LW link

Deception Chess

Chris Land 1 Jan 2024 15:40 UTC

7 points

2 comments, 4 min read, LW link

Stop talking about p(doom)

Isaac King 1 Jan 2024 10:57 UTC

42 points

22 comments, 3 min read, LW link

[Question] What should a non-genius do in the face of rapid progress in GAI to ensure a decent life?

kaler 1 Jan 2024 8:22 UTC

12 points

16 comments, 1 min read, LW link

A hermeneutic net for agency

TsviBT 1 Jan 2024 8:06 UTC

60 points

4 comments, 30 min read, LW link

2023 in AI predictions

jessicata 1 Jan 2024 5:23 UTC

109 points

35 comments, 5 min read, LW link

Rhythm Stage Setup Components

jefftk 1 Jan 2024 3:10 UTC

10 points

4 comments, 2 min read, LW link

(www.jefftk.com)

Bayesian updating in real life is mostly about understanding your hypotheses

Max H 1 Jan 2024 0:10 UTC

70 points

4 comments, 11 min read, LW link

Dark Art: Inception

Abu Ibrahim 31 Dec 2023 21:09 UTC

11 points

0 comments, 3 min read, LW link

A case for AI alignment being difficult

jessicata 31 Dec 2023 19:55 UTC

106 points

59 comments, 15 min read, LW link, 1 review

(unstableontology.com)

The Roots of Progress 2023 in review

jasoncrawford 31 Dec 2023 18:16 UTC

22 points

0 comments, 11 min read, LW link

(rootsofprogress.org)

Extended Navel-Gazing On My 2023 Donations

jenn 31 Dec 2023 18:10 UTC

8 points

0 comments, 8 min read, LW link

(jenn.site)

aisafety.info, the Table of Content

Charbel-Raphaël 31 Dec 2023 13:57 UTC

23 points

1 comment, 11 min read, LW link

AIOS

samhealy 31 Dec 2023 13:23 UTC

−3 points

5 comments, 6 min read, LW link

AI Alignment Metastrategy

Vanessa Kosoy 31 Dec 2023 12:06 UTC

127 points

13 comments, 7 min read, LW link

[Question] Does the hardness of AI alignment undermine FOOM?

TruePath 31 Dec 2023 11:05 UTC

8 points

14 comments, 1 min read, LW link

Speed of Failing

nano_brasca 31 Dec 2023 10:39 UTC

8 points

0 comments, 2 min read, LW link

[Question] Estimating Returns to Intelligence vs Numbers, Strength and Looks

TruePath 31 Dec 2023 10:03 UTC

3 points

6 comments, 1 min read, LW link

Planning to build a cryptographic box with perfect secrecy

Lysandre Terrisse 31 Dec 2023 9:31 UTC

40 points

6 comments, 11 min read, LW link

Does ChatGPT know what a tragedy is?

Bill Benzon 31 Dec 2023 7:10 UTC

2 points

4 comments, 5 min read, LW link

Taking responsibility and partial derivatives

Ruby 31 Dec 2023 4:33 UTC

42 points

1 comment, 3 min read, LW link

The proper response to mistakes that have harmed others?

Ruby 31 Dec 2023 4:06 UTC

59 points

12 comments, 4 min read, LW link

Conversation Visualizer

nomagicpill and niplav

31 Dec 2023 1:18 UTC

28 points

4 comments, 5 min read, LW link

(210ethan.github.io)

Paper Summary: The Koha Code—A Biological Theory of Memory

jakej 30 Dec 2023 22:37 UTC

24 points

2 comments, 3 min read, LW link

shoes with springs

bhauth 30 Dec 2023 21:46 UTC

72 points

9 comments, 4 min read, LW link, 2 reviews

(www.bhauth.com)

[Question] Techniques to fix incorrect memorization?

Brendan Long 30 Dec 2023 21:32 UTC

19 points

4 comments, 1 min read, LW link

How to develop a photographic memory 2/3

PhilosophicalSoul 30 Dec 2023 20:18 UTC

28 points

7 comments, 12 min read, LW link

If Clarity Seems Like Death to Them

Zack_M_Davis 30 Dec 2023 17:40 UTC

50 points

192 comments, 87 min read, LW link, 1 review

(unremediatedgender.space)

When Can Optimization Be Done Safely?

StrivingForLegibility 30 Dec 2023 1:24 UTC

12 points

0 comments, 3 min read, LW link

Optimization Markets

StrivingForLegibility 30 Dec 2023 1:24 UTC

13 points

2 comments, 2 min read, LW link

The Plan − 2023 Version

johnswentworth 29 Dec 2023 23:34 UTC

153 points

40 comments, 31 min read, LW link, 1 review