Archive: Page 2
“Absence of Evidence is Not Evidence of Absence” As a Limit · transhumanist_atom_understander · Oct 1, 2023, 8:15 AM · 16 points · 1 comment · 2 min read · LW link
New Tool: the Residual Stream Viewer · AdamYedidia · Oct 1, 2023, 12:49 AM · 32 points · 7 comments · 4 min read · LW link · (tinyurl.com)
My Effortless Weightloss Story: A Quick Runthrough · CuoreDiVetro · Sep 30, 2023, 11:02 PM · 124 points · 78 comments · 9 min read · LW link
Arguments for moral indefinability · Richard_Ngo · Sep 30, 2023, 10:40 PM · 47 points · 16 comments · 7 min read · LW link · (www.thinkingcomplete.com)
Conditionals All The Way Down · lunatic_at_large · Sep 30, 2023, 9:06 PM · 33 points · 2 comments · 3 min read · LW link
Focusing your impact on short vs long TAI timelines · kuhanj · Sep 30, 2023, 7:34 PM · 4 points · 0 comments · 10 min read · LW link
How model editing could help with the alignment problem · Michael Ripa · Sep 30, 2023, 5:47 PM · 12 points · 1 comment · 15 min read · LW link
My submission to the ALTER Prize · Lorxus · Sep 30, 2023, 4:07 PM · 6 points · 0 comments · 1 min read · LW link · (www.docdroid.net)
Anki deck for learning the main AI safety orgs, projects, and programs · Bryce Robertson · Sep 30, 2023, 4:06 PM · 2 points · 0 comments · 1 min read · LW link
The Lighthaven Campus is open for bookings · habryka · Sep 30, 2023, 1:08 AM · 209 points · 18 comments · 4 min read · LW link · (www.lighthaven.space)
Headphones hook · philh · Sep 29, 2023, 10:50 PM · 21 points · 1 comment · 3 min read · LW link · (reasonableapproximation.net)
Paul Christiano’s views on “doom” (video explainer) · Michaël Trazzi · Sep 29, 2023, 9:56 PM · 15 points · 0 comments · 1 min read · LW link · (youtu.be)
The Retroactive Funding Landscape: Innovations for Donors and Grantmakers · Dawn Drescher · Sep 29, 2023, 5:39 PM · 13 points · 0 comments · LW link · (impactmarkets.substack.com)
Bids To Defer On Value Judgements · johnswentworth · Sep 29, 2023, 5:07 PM · 58 points · 6 comments · 3 min read · LW link
Announcing FAR Labs, an AI safety coworking space · Ben Goldhaber · Sep 29, 2023, 4:52 PM · 95 points · 0 comments · 1 min read · LW link
A tool for searching rationalist & EA webs · Daniel_Friedrich · Sep 29, 2023, 3:23 PM · 4 points · 0 comments · 1 min read · LW link · (ratsearch.blogspot.com)
Basic Mathematics of Predictive Coding · Adam Shai · Sep 29, 2023, 2:38 PM · 49 points · 6 comments · 9 min read · LW link
“Diamondoid bacteria” nanobots: deadly threat or dead-end? A nanotech investigation · titotal · Sep 29, 2023, 2:01 PM · 160 points · 79 comments · LW link · (titotal.substack.com)
Steering subsystems: capabilities, agency, and alignment · Seth Herd · Sep 29, 2023, 1:45 PM · 31 points · 0 comments · 8 min read · LW link
Apply to Usable Security Prize by September 30 · Allison Duettmann · Sep 29, 2023, 1:39 PM · 4 points · 0 comments · 1 min read · LW link
List of how people have become more hard-working · Chi Nguyen · Sep 29, 2023, 11:30 AM · 69 points · 7 comments · LW link
Resolving moral uncertainty with randomization · B Jacobs and Jobst Heitzig · Sep 29, 2023, 11:23 AM · 7 points · 1 comment · 11 min read · LW link
EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem · Elizabeth · Sep 28, 2023, 11:30 PM · 323 points · 250 comments · 22 min read · LW link · 2 reviews · (acesounderglass.com)
Competitive, Cooperative, and Cohabitive · Screwtape · Sep 28, 2023, 11:25 PM · 49 points · 13 comments · 5 min read · LW link · 1 review
The Coming Wave · PeterMcCluskey · Sep 28, 2023, 10:59 PM · 27 points · 1 comment · 6 min read · LW link · (bayesianinvestor.com)
High-level interpretability: detecting an AI’s objectives · Paul Colognese and Jozdien · Sep 28, 2023, 7:30 PM · 72 points · 4 comments · 21 min read · LW link
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions · JanB, Owain_Evans and SoerenMind · Sep 28, 2023, 6:53 PM · 187 points · 39 comments · 3 min read · LW link · 1 review
Responsible scaling policy TLDR · lemonhope · Sep 28, 2023, 6:51 PM · 9 points · 0 comments · 1 min read · LW link
Alignment Workshop talks · Richard_Ngo · Sep 28, 2023, 6:26 PM · 37 points · 1 comment · 1 min read · LW link · (www.alignment-workshop.com)
My Current Thoughts on the AI Strategic Landscape · Jeffrey Heninger · Sep 28, 2023, 5:59 PM · 11 points · 28 comments · 14 min read · LW link
My Arrogant Plan for Alignment · MrArrogant · Sep 28, 2023, 5:51 PM · 2 points · 6 comments · 6 min read · LW link
Discursive Competence in ChatGPT, Part 2: Memory for Texts · Bill Benzon · Sep 28, 2023, 4:34 PM · 1 point · 0 comments · 3 min read · LW link
Different views of alignment have different consequences for imperfect methods · Stuart_Armstrong · Sep 28, 2023, 4:31 PM · 31 points · 0 comments · 1 min read · LW link
AI #31: It Can Do What Now? · Zvi · Sep 28, 2023, 4:00 PM · 90 points · 6 comments · 40 min read · LW link · (thezvi.wordpress.com)
The point of a game is not to win, and you shouldn’t even pretend that it is · mako yass · Sep 28, 2023, 3:54 PM · 51 points · 27 comments · 4 min read · LW link · (makopool.com)
Cohabitive Games so Far · mako yass · Sep 28, 2023, 3:41 PM · 131 points · 146 comments · 19 min read · LW link · 2 reviews · (makopool.com)
Wobbly Table Theorem in Practice · Morpheus · Sep 28, 2023, 2:33 PM · 24 points · 0 comments · 2 min read · LW link
Weighing Animal Worth · jefftk · Sep 28, 2023, 1:50 PM · 25 points · 11 comments · 2 min read · LW link · (www.jefftk.com)
ARC Evals: Responsible Scaling Policies · Zach Stein-Perlman · Sep 28, 2023, 4:30 AM · 40 points · 10 comments · 2 min read · LW link · 1 review · (evals.alignment.org)
Petrov Day Retrospective, 2023 (re: the most important virtue of Petrov Day & unilaterally promoting it) · Ruby · Sep 28, 2023, 2:48 AM · 66 points · 73 comments · 6 min read · LW link
Jimmy Apples, source of the rumor that OpenAI has achieved AGI internally, is a credible insider. · Jorterder · Sep 28, 2023, 1:20 AM · −6 points · 2 comments · 1 min read · LW link · (twitter.com)
Investigating the rumors of OpenAI achieving AGI · Jorterder · 28 Sep 2023 1:17 UTC · −4 points · 1 comment · 1 min read · LW link
Alibaba Group releases Qwen, 14B parameter LLM · Nikola Jurkovic · 28 Sep 2023 0:12 UTC · 5 points · 1 comment · 1 min read · LW link · (qianwen-res.oss-cn-beijing.aliyuncs.com)
Metaculus Launches 2023/2024 FluSight Challenge Supporting CDC, $5K in Prizes · ChristianWilliams · 27 Sep 2023 21:35 UTC · 5 points · 0 comments · LW link · (www.metaculus.com)
Projects I would like to see (possibly at AI Safety Camp) · Linda Linsefors · 27 Sep 2023 21:27 UTC · 22 points · 12 comments · 4 min read · LW link
Towards Better Milestones for Monitoring AI Capabilities · snewman · 27 Sep 2023 21:18 UTC · 11 points · 0 comments · 14 min read · LW link
[Question] Is Bjorn Lomborg roughly right about climate change policy? · yhoiseth · 27 Sep 2023 20:06 UTC · 29 points · 14 comments · 2 min read · LW link · (www.sciencedirect.com)
Commonsense Good, Creative Good · jefftk · 27 Sep 2023 19:50 UTC · 44 points · 11 comments · 3 min read · LW link · (www.jefftk.com)
Petrov Day [Spoiler Warning] · lsusr · 27 Sep 2023 19:20 UTC · 6 points · 6 comments · 1 min read · LW link
The Hidden Complexity of Wishes—The Animation · Writer · 27 Sep 2023 17:59 UTC · 33 points · 0 comments · 1 min read · LW link · (youtu.be)