My Effortless Weightloss Story: A Quick Runthrough

CuoreDiVetro 30 Sep 2023 23:02 UTC

124 points

78 comments, 9 min read, LW link

Arguments for moral indefinability

Richard_Ngo 30 Sep 2023 22:40 UTC

47 points

16 comments, 7 min read, LW link

(www.thinkingcomplete.com)

Conditionals All The Way Down

Alexander_Heckett 30 Sep 2023 21:06 UTC

33 points

2 comments, 3 min read, LW link

Focusing your impact on short vs long TAI timelines

kuhanj 30 Sep 2023 19:34 UTC

4 points

0 comments, 10 min read, LW link

How model editing could help with the alignment problem

Michael Ripa 30 Sep 2023 17:47 UTC

12 points

1 comment, 15 min read, LW link

My submission to the ALTER Prize

Lorxus 30 Sep 2023 16:07 UTC

11 points

0 comments, 1 min read, LW link

(www.docdroid.net)

Anki deck for learning the main AI safety orgs, projects, and programs

Bryce Robertson 30 Sep 2023 16:06 UTC

2 points

0 comments, 1 min read, LW link

The Lighthaven Campus is open for bookings

habryka 30 Sep 2023 1:08 UTC

210 points

18 comments, 4 min read, LW link

(www.lighthaven.space)

Headphones hook

philh 29 Sep 2023 22:50 UTC

21 points

1 comment, 3 min read, LW link

(reasonableapproximation.net)

Paul Christiano’s views on “doom” (video explainer)

Michaël Trazzi 29 Sep 2023 21:56 UTC

15 points

0 comments, 1 min read, LW link

(youtu.be)

The Retroactive Funding Landscape: Innovations for Donors and Grantmakers

Dawn Drescher 29 Sep 2023 17:39 UTC

13 points

0 comments, 19 min read, LW link

(impactmarkets.substack.com)

Bids To Defer On Value Judgements

johnswentworth 29 Sep 2023 17:07 UTC

58 points

6 comments, 3 min read, LW link

Announcing FAR Labs, an AI safety coworking space

Ben Goldhaber 29 Sep 2023 16:52 UTC

95 points

0 comments, 1 min read, LW link

A tool for searching rationalist & EA webs

Daniel_Friedrich 29 Sep 2023 15:23 UTC

4 points

0 comments, 1 min read, LW link

(ratsearch.blogspot.com)

Basic Mathematics of Predictive Coding

Adam Shai 29 Sep 2023 14:38 UTC

52 points

6 comments, 9 min read, LW link

“Diamondoid bacteria” nanobots: deadly threat or dead-end? A nanotech investigation

titotal 29 Sep 2023 14:01 UTC

161 points

81 comments, 20 min read, LW link

(titotal.substack.com)

Steering subsystems: capabilities, agency, and alignment

Seth Herd 29 Sep 2023 13:45 UTC

31 points

0 comments, 8 min read, LW link

Apply to Usable Security Prize by September 30

Allison Duettmann 29 Sep 2023 13:39 UTC

4 points

0 comments, 1 min read, LW link

List of how people have become more hard-working

Chi Nguyen 29 Sep 2023 11:30 UTC

72 points

7 comments, 3 min read, LW link

Resolving moral uncertainty with randomization

B Jacobs and Jobst Heitzig

29 Sep 2023 11:23 UTC

7 points

1 comment, 11 min read, LW link

EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem

Elizabeth 28 Sep 2023 23:30 UTC

334 points

250 comments, 22 min read, LW link, 2 reviews

(acesounderglass.com)

Competitive, Cooperative, and Cohabitive

Screwtape 28 Sep 2023 23:25 UTC

51 points

13 comments, 5 min read, LW link, 1 review

The Coming Wave

PeterMcCluskey 28 Sep 2023 22:59 UTC

27 points

1 comment, 6 min read, LW link

(bayesianinvestor.com)

High-level interpretability: detecting an AI’s objectives

Paul Colognese and Jozdien

28 Sep 2023 19:30 UTC

72 points

4 comments, 21 min read, LW link

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions

JanB, Owain_Evans and SoerenMind

28 Sep 2023 18:53 UTC

187 points

39 comments, 3 min read, LW link, 1 review

Responsible scaling policy TLDR

lemonhope 28 Sep 2023 18:51 UTC

9 points

0 comments, 1 min read, LW link

Alignment Workshop talks

Richard_Ngo 28 Sep 2023 18:26 UTC

37 points

1 comment, 1 min read, LW link

(www.alignment-workshop.com)

My Current Thoughts on the AI Strategic Landscape

Jeffrey Heninger 28 Sep 2023 17:59 UTC

11 points

28 comments, 14 min read, LW link

My Arrogant Plan for Alignment

MrArrogant 28 Sep 2023 17:51 UTC

2 points

6 comments, 6 min read, LW link

Discursive Competence in ChatGPT, Part 2: Memory for Texts

Bill Benzon 28 Sep 2023 16:34 UTC

1 point

0 comments, 3 min read, LW link

Different views of alignment have different consequences for imperfect methods

Stuart_Armstrong 28 Sep 2023 16:31 UTC

33 points

0 comments, 1 min read, LW link

AI #31: It Can Do What Now?

Zvi 28 Sep 2023 16:00 UTC

90 points

6 comments, 40 min read, LW link

(thezvi.wordpress.com)

The point of a game is not to win, and you shouldn’t even pretend that it is

mako yass 28 Sep 2023 15:54 UTC

53 points

27 comments, 4 min read, LW link

(makopool.com)

Cohabitive Games so Far

mako yass 28 Sep 2023 15:41 UTC

140 points

146 comments, 19 min read, LW link, 2 reviews

(makopool.com)

Wobbly Table Theorem in Practice

Morpheus 28 Sep 2023 14:33 UTC

25 points

0 comments, 2 min read, LW link

Weighing Animal Worth

jefftk 28 Sep 2023 13:50 UTC

25 points

11 comments, 2 min read, LW link

(www.jefftk.com)

ARC Evals: Responsible Scaling Policies

Zach Stein-Perlman 28 Sep 2023 4:30 UTC

40 points

10 comments, 2 min read, LW link, 1 review

(evals.alignment.org)

Petrov Day Retrospective, 2023 (re: the most important virtue of Petrov Day & unilaterally promoting it)

Ruby 28 Sep 2023 2:48 UTC

66 points

73 comments, 6 min read, LW link

Alibaba Group releases Qwen, 14B parameter LLM

Nikola Jurkovic 28 Sep 2023 0:12 UTC

5 points

1 comment, 1 min read, LW link

(qianwen-res.oss-cn-beijing.aliyuncs.com)

Metaculus Launches 2023/2024 FluSight Challenge Supporting CDC, $5K in Prizes

ChristianWilliams 27 Sep 2023 21:35 UTC

5 points

0 comments, 1 min read, LW link

(www.metaculus.com)

Projects I would like to see (possibly at AI Safety Camp)

Linda Linsefors 27 Sep 2023 21:27 UTC

22 points

12 comments, 4 min read, LW link

Towards Better Milestones for Monitoring AI Capabilities

snewman 27 Sep 2023 21:18 UTC

11 points

0 comments, 14 min read, LW link

[Question] Is Bjorn Lomborg roughly right about climate change policy?

yhoiseth 27 Sep 2023 20:06 UTC

29 points

14 comments, 2 min read, LW link

(www.sciencedirect.com)

Commonsense Good, Creative Good

jefftk 27 Sep 2023 19:50 UTC

70 points

11 comments, 3 min read, LW link

(www.jefftk.com)

Petrov Day [Spoiler Warning]

lsusr 27 Sep 2023 19:20 UTC

6 points

5 comments, 1 min read, LW link

The Hidden Complexity of Wishes—The Animation

Writer 27 Sep 2023 17:59 UTC

33 points

0 comments, 1 min read, LW link

(youtu.be)

MMLU’s Moral Scenarios Benchmark Doesn’t Measure What You Think it Measures

corey morris 27 Sep 2023 17:54 UTC

18 points

3 comments, 4 min read, LW link

(medium.com)

[Question] What’s your standard for good work performance?

Chi Nguyen 27 Sep 2023 16:58 UTC

30 points

3 comments, 1 min read, LW link

The Role of Groups in the Progression of Human Understanding

Chris_Leong 27 Sep 2023 15:09 UTC

11 points

0 comments, 2 min read, LW link

The Great Disembedding

rogersbacon 27 Sep 2023 14:53 UTC

16 points

6 comments, 16 min read, LW link

(www.secretorum.life)