“25 Lessons from 25 Years of Marriage” by honorary rationalist Ferrett Steinmetz

CronoDAS

2 Oct 2024 22:42 UTC

24 points

2 comments, 1 min read, LW link

(theferrett.substack.com)

MIT FutureTech are hiring for a Head of Operations role

peterslattery

2 Oct 2024 17:11 UTC

8 points

0 comments, 4 min read, LW link

Can AI Quantity beat AI Quality?

Gianluca Calcagni

2 Oct 2024 15:21 UTC

2 points

0 comments, 5 min read, LW link

[Intuitive self-models] 3. The Active Self

Steven Byrnes

2 Oct 2024 15:20 UTC

80 points

46 comments, 27 min read, LW link

AI Safety University Organizing: Early Takeaways from Thirteen Groups

agucova

2 Oct 2024 15:14 UTC

33 points

0 comments, 9 min read, LW link

Three main arguments that AI will save humans and one meta-argument

avturchin

2 Oct 2024 11:39 UTC

9 points

8 comments, 2 min read, LW link

Should we abstain from voting? (In nondeterministic elections)

B Jacobs

2 Oct 2024 10:07 UTC

5 points

8 comments, 4 min read, LW link

(bobjacobs.substack.com)

AI Safety at the Frontier: Paper Highlights, September ’24

gasteigerjo

2 Oct 2024 9:49 UTC

13 points

0 comments, 7 min read, LW link

(aisafetyfrontier.substack.com)

Self-Help Corner: Loop Detection

adamShimi

2 Oct 2024 8:33 UTC

88 points

6 comments, 2 min read, LW link

(formethods.substack.com)

The murderous shortcut: a toy model of instrumental convergence

Thomas Kwa

2 Oct 2024 6:48 UTC

37 points

0 comments, 2 min read, LW link

Switching to a Yamaha P-121 Keyboard

jefftk

2 Oct 2024 2:20 UTC

11 points

0 comments, 2 min read, LW link

(www.jefftk.com)

Foresight Vision Weekend 2024

Allison Duettmann

1 Oct 2024 21:59 UTC

8 points

0 comments, 1 min read, LW link

Happy simulations

FateGrinder

1 Oct 2024 21:05 UTC

−5 points

0 comments, 2 min read, LW link

Three Subtle Examples of Data Leakage

abstractapplic

1 Oct 2024 20:45 UTC

183 points

17 comments, 4 min read, LW link, 1 review

AI Safety Newsletter #42: Newsom Vetoes SB 1047 Plus, OpenAI’s o1, and AI Governance Summary

Corin Katzke, Corin Katzke, Julius, Alexa Pan, andrewz and Dan H

1 Oct 2024 20:35 UTC

8 points

0 comments, 6 min read, LW link

(newsletter.safe.ai)

Retrieval Augmented Genesis

João Ribeiro Medeiros

1 Oct 2024 20:18 UTC

6 points

0 comments, 29 min read, LW link

Likelihood calculation with duobels

Martin Gerdes

1 Oct 2024 16:21 UTC

5 points

0 comments, 6 min read, LW link

Is Text Watermarking a lost cause?

egor.timatkov

1 Oct 2024 16:20 UTC

17 points

13 comments, 10 min read, LW link

Information dark matter

Logan Kieller

1 Oct 2024 15:05 UTC

36 points

4 comments, 28 min read, LW link

(logankieller.substack.com)

Conventional footnotes considered harmful

dkl9

1 Oct 2024 14:54 UTC

25 points

16 comments, 1 min read, LW link

(dkl9.net)

Newsom Vetoes SB 1047

Zvi

1 Oct 2024 12:20 UTC

85 points

6 comments, 32 min read, LW link

(thezvi.wordpress.com)

Will AI and Humanity Go to War?

Simon Goldstein

1 Oct 2024 6:35 UTC

17 points

4 comments, 6 min read, LW link

[Question] AMA: International School Student in China

Novice

1 Oct 2024 6:00 UTC

5 points

0 comments, 1 min read, LW link

AGI Farm

Rahul Chand

1 Oct 2024 4:29 UTC

1 point

0 comments, 8 min read, LW link

Why comparative advantage does not help horses

Sherrinford

30 Sep 2024 22:27 UTC

111 points

17 comments, 3 min read, LW link, 2 reviews

Intelligence explosion: a rational assessment.

p4rziv4l

30 Sep 2024 21:17 UTC

1 point

0 comments, 1 min read, LW link

(docs.google.com)

Peak Human Capital

PeterMcCluskey

30 Sep 2024 21:13 UTC

72 points

3 comments, 5 min read, LW link

(bayesianinvestor.com)

Sam Altman’s Business Negging

Julian Bradshaw

30 Sep 2024 21:06 UTC

13 points

0 comments, 1 min read, LW link

(www.bloomberg.com)

In-Context Learning: An Alignment Survey

Alfie Lamerton

30 Sep 2024 18:44 UTC

8 points

0 comments, 20 min read, LW link

(docs.google.com)

Not Just For Therapy Chatbots: The Case For Compassion In AI Moral Alignment Research

kenneth_diao

30 Sep 2024 18:37 UTC

2 points

0 comments, 12 min read, LW link

Exploring Decomposability of SAE Features

Vikram_N

30 Sep 2024 18:28 UTC

1 point

0 comments, 3 min read, LW link

Knowledge Base 1: Could it increase intelligence and make it safer?

iwis

30 Sep 2024 16:00 UTC

−4 points

0 comments, 4 min read, LW link

Point of Failure: Semiconductor-Grade Quartz

Annapurna

30 Sep 2024 15:57 UTC

41 points

8 comments, 2 min read, LW link

(jorgevelez.substack.com)

on bacteria, on teeth

bhauth

30 Sep 2024 15:56 UTC

62 points

9 comments, 6 min read, LW link

(bhauth.com)

SB 1047 gets vetoed

ryan_b

30 Sep 2024 15:49 UTC

25 points

1 comment, 1 min read, LW link

(www.reuters.com)

Of Birds and Bees

RussellThor

30 Sep 2024 10:52 UTC

7 points

9 comments, 2 min read, LW link

A new process for mapping discussions

Nathan Young

30 Sep 2024 8:57 UTC

29 points

8 comments, 6 min read, LW link

(open.substack.com)

MATS Alumni Impact Analysis

utilistrutil, Juan Gil, yams, LauraVaughan, K Richards and Ryan Kidd

30 Sep 2024 2:35 UTC

62 points

7 comments, 11 min read, LW link

[Question] Most capable publicly available agents?

Gabe

30 Sep 2024 0:04 UTC

2 points

0 comments, 1 min read, LW link

the case for CoT unfaithfulness is overstated

nostalgebraist

29 Sep 2024 22:07 UTC

272 points

45 comments, 11 min read, LW link, 1 review

0.836 Bits of Evidence In Favor of Futarchy

niplav and Claude+

29 Sep 2024 21:57 UTC

39 points

0 comments, 3 min read, LW link

Pomodoro Method Randomized Self Experiment

niplav

29 Sep 2024 21:55 UTC

16 points

2 comments, 1 min read, LW link

Toy Models of Superposition: Simplified by Hand

Axel Sorensen

29 Sep 2024 21:19 UTC

9 points

3 comments, 8 min read, LW link

LLMs are likely not conscious

research_prime_space

29 Sep 2024 20:57 UTC

6 points

9 comments, 1 min read, LW link

A Policy Proposal

phdead

29 Sep 2024 20:45 UTC

10 points

4 comments, 4 min read, LW link

Do Sparse Autoencoders (SAEs) transfer across base and finetuned language models?

Taras Kutsyk, Tommaso Mencattini and Ciprian Florea

29 Sep 2024 19:37 UTC

28 points

8 comments, 25 min read, LW link

Models of life

Abhishaike Mahajan

29 Sep 2024 19:24 UTC

8 points

0 comments, 16 min read, LW link

(www.asimov.press)

Interpreting the effects of Jailbreak Prompts in LLMs

Harsh Raj

29 Sep 2024 19:01 UTC

9 points

0 comments, 5 min read, LW link

New Capabilities, New Risks? - Evaluating Agentic General Assistants using Elements of GAIA & METR Frameworks

Tej Lander

29 Sep 2024 18:58 UTC

5 points

0 comments, 29 min read, LW link

Developmental Stages in Multi-Problem Grokking

James Sullivan

29 Sep 2024 18:58 UTC

5 points

0 comments, 6 min read, LW link