Thoughts On (Solving) Deep Deception

Jozdien 21 Oct 2023 22:40 UTC

72 points

6 comments 6 min read LW link

Best effort beliefs

Adam Zerner 21 Oct 2023 22:05 UTC

14 points

9 comments 4 min read LW link

How toy models of ontology changes can be misleading

Stuart_Armstrong 21 Oct 2023 21:13 UTC

42 points

0 comments 2 min read LW link

Soups as Spreads

jefftk 21 Oct 2023 20:30 UTC

22 points

0 comments 1 min read LW link

(www.jefftk.com)

Which COVID booster to get?

Sameerishere 21 Oct 2023 19:43 UTC

8 points

0 comments 2 min read LW link

Alignment Implications of LLM Successes: a Debate in One Act

Zack_M_Davis 21 Oct 2023 15:22 UTC

266 points

56 comments 13 min read LW link 2 reviews

How to find a good moving service

Ziyue Wang 21 Oct 2023 4:59 UTC

8 points

0 comments 3 min read LW link

Apply for MATS Winter 2023-24!

utilistrutil, Ryan Kidd and LauraVaughan

21 Oct 2023 2:27 UTC

104 points

6 comments 5 min read LW link

[Question] Can we isolate neurons that recognize features vs. those which have some other role?

Joshua Clancy 21 Oct 2023 0:30 UTC

4 points

2 comments 3 min read LW link

Muddling Along Is More Likely Than Dystopia

Jeffrey Heninger 20 Oct 2023 21:25 UTC

88 points

10 comments 8 min read LW link

What’s Hard About The Shutdown Problem

johnswentworth 20 Oct 2023 21:13 UTC

98 points

33 comments 4 min read LW link

Holly Elmore and Rob Miles dialogue on AI Safety Advocacy

Bird Concept, Robert Miles and Holly_Elmore

20 Oct 2023 21:04 UTC

163 points

30 comments 27 min read LW link

TOMORROW: the largest AI Safety protest ever!

Holly_Elmore 20 Oct 2023 18:15 UTC

105 points

26 comments 2 min read LW link

The Overkill Conspiracy Hypothesis

ymeskhout 20 Oct 2023 16:51 UTC

27 points

9 comments 7 min read LW link

I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines

307th 20 Oct 2023 16:37 UTC

125 points

33 comments 9 min read LW link

Internal Target Information for AI Oversight

Paul Colognese 20 Oct 2023 14:53 UTC

15 points

0 comments 5 min read LW link

On the proper date for solstice celebrations

jchan 20 Oct 2023 13:55 UTC

16 points

0 comments 4 min read LW link

Are (at least some) Large Language Models Holographic Memory Stores?

Bill Benzon 20 Oct 2023 13:07 UTC

11 points

4 comments 6 min read LW link

Mechanistic interpretability of LLM analogy-making

Sergii 20 Oct 2023 12:53 UTC

2 points

0 comments 4 min read LW link

(grgv.xyz)

How To Socialize With Psycho(logist)s

Sable 20 Oct 2023 11:33 UTC

37 points

11 comments 3 min read LW link

(affablyevil.substack.com)

Revealing Intentionality In Language Models Through AdaVAE Guided Sampling

jdp 20 Oct 2023 7:32 UTC

119 points

15 comments 22 min read LW link

Features and Adversaries in MemoryDT

Joseph Bloom and Jay Bailey

20 Oct 2023 7:32 UTC

31 points

6 comments 25 min read LW link

AI Safety Hub Serbia Soft Launch

DusanDNesic 20 Oct 2023 7:11 UTC

64 points

1 comment 3 min read LW link

(forum.effectivealtruism.org)

Announcing new round of “Key Phenomena in AI Risk” Reading Group

DusanDNesic and Nora_Ammann

20 Oct 2023 7:11 UTC

15 points

2 comments 1 min read LW link

Unpacking the dynamics of AGI conflict that suggest the necessity of a premptive pivotal act

Eli Tyre 20 Oct 2023 6:48 UTC

63 points

2 comments 8 min read LW link

Genocide isn’t Decolonization

robotelvis 20 Oct 2023 4:14 UTC

33 points

20 comments 5 min read LW link

(messyprogress.substack.com)

Trying to understand John Wentworth’s research agenda

johnswentworth, habryka and David Lorell

20 Oct 2023 0:05 UTC

96 points

13 comments 12 min read LW link

Boost your productivity, happiness and health with this one weird trick

ajc586 19 Oct 2023 23:30 UTC

9 points

9 comments 1 min read LW link

A Good Explanation of Differential Gears

Johannes C. Mayer 19 Oct 2023 23:07 UTC

48 points

4 comments 1 min read LW link

(youtu.be)

Evening Wiki(pedia) Workout

mcint 19 Oct 2023 21:29 UTC

1 point

1 comment 1 min read LW link

New roles on my team: come build Open Phil’s technical AI safety program with me!

Ajeya Cotra 19 Oct 2023 16:47 UTC

83 points

6 comments 4 min read LW link

[Question] Infinite tower of meta-probability

fryolysis 19 Oct 2023 16:44 UTC

6 points

5 comments 3 min read LW link

A NotKillEveryoneIsm Argument for Accelerating Deep Learning Research

Logan Zoellner 19 Oct 2023 16:28 UTC

−6 points

6 comments 5 min read LW link

(midwitalignment.substack.com)

Knowledge Base 5: Business model

iwis 19 Oct 2023 16:06 UTC

−4 points

2 comments 1 min read LW link

AI #34: Chipping Away at Chip Exports

Zvi 19 Oct 2023 15:00 UTC

36 points

19 comments 59 min read LW link

(thezvi.wordpress.com)

Is Yann LeCun strawmanning AI x-risks?

Chris_Leong 19 Oct 2023 11:35 UTC

26 points

4 comments 1 min read LW link

[Video] Too much Empiricism kills you

Johannes C. Mayer 19 Oct 2023 5:08 UTC

19 points

0 comments 1 min read LW link

(youtu.be)

Are humans misaligned with evolution?

TekhneMakre and jacob_cannell

19 Oct 2023 3:14 UTC

42 points

13 comments 18 min read LW link

Brains, Planes, Blimps, and Algorithms

ai dan 18 Oct 2023 21:26 UTC

1 point

0 comments 6 min read LW link

The (partial) fallacy of dumb superintelligence

Seth Herd 18 Oct 2023 21:25 UTC

38 points

5 comments 4 min read LW link

[Question] Does AI governance needs a “Federalist papers” debate?

azsantosk 18 Oct 2023 21:08 UTC

40 points

4 comments 1 min read LW link

Metaculus Launches Conditional Cup to Explore Linked Forecasts

ChristianWilliams 18 Oct 2023 20:41 UTC

9 points

0 comments 1 min read LW link

(www.metaculus.com)

AI Safety 101 : Reward Misspecification

markov 18 Oct 2023 20:39 UTC

32 points

4 comments 31 min read LW link

2023 East Coast Rationalist Megameetup

Screwtape 18 Oct 2023 20:33 UTC

8 points

0 comments 1 min read LW link

Superforecasting the premises in “Is power-seeking AI an existential risk?”

Joe Carlsmith 18 Oct 2023 20:23 UTC

31 points

3 comments 5 min read LW link

The Real Fanfic Is The Friends We Made Along The Way

Eneasz 18 Oct 2023 19:21 UTC

92 points

1 comment 27 min read LW link 1 review

(deathisbad.substack.com)

AISN #24: Kissinger Urges US-China Cooperation on AI, China’s New AI Law, US Export Controls, International Institutions, and Open Source AI

Dan H and Corin Katzke

18 Oct 2023 17:06 UTC

14 points

0 comments 6 min read LW link

(newsletter.safe.ai)

Back to the Past to the Future

Prometheus 18 Oct 2023 16:51 UTC

5 points

0 comments 1 min read LW link

How to Eradicate Global Extreme Poverty [RA video with fundraiser!]

aggliu and Writer

18 Oct 2023 15:51 UTC

50 points

5 comments 9 min read LW link

(youtu.be)

On Interpretability’s Robustness

WCargo 18 Oct 2023 13:18 UTC

11 points

0 comments 4 min read LW link