October 2024 Progress in Guaranteed Safe AI

Quinn

28 Oct 2024 23:34 UTC

7 points

0 comments · 1 min read · LW link

(gsai.substack.com)

5 homegrown EA projects, seeking small donors

Austin Chen

28 Oct 2024 23:24 UTC

85 points

4 comments · 2 min read · LW link

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)

Joe Carlsmith

28 Oct 2024 21:57 UTC

54 points

5 comments · 32 min read · LW link

Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations

ozziegooen

28 Oct 2024 21:44 UTC

7 points

0 comments · 15 min read · LW link

AI & wisdom 3: AI effects on amortised optimisation

L Rudolf L

28 Oct 2024 21:08 UTC

18 points

0 comments · 14 min read · LW link

(rudolf.website)

AI & wisdom 2: growth and amortised optimisation

L Rudolf L

28 Oct 2024 21:07 UTC

18 points

0 comments · 8 min read · LW link

(rudolf.website)

AI & wisdom 1: wisdom, amortised optimisation, and AI

L Rudolf L

28 Oct 2024 21:02 UTC

29 points

0 comments · 15 min read · LW link

(rudolf.website)

Finishing The SB-1047 Documentary In 6 Weeks

Michaël Trazzi

28 Oct 2024 20:17 UTC

94 points

7 comments · 4 min read · LW link

(manifund.org)

Towards the Operationalization of Philosophy & Wisdom

Thane Ruthenis

28 Oct 2024 19:45 UTC

20 points

2 comments · 33 min read · LW link

(aiimpacts.org)

Quantitative Trading Bootcamp [Nov 6-10]

Ricki Heicklen

28 Oct 2024 18:39 UTC

7 points

0 comments · 1 min read · LW link

Winners of the Essay competition on the Automation of Wisdom and Philosophy

owencb and AI Impacts

28 Oct 2024 17:10 UTC

40 points

3 comments · 30 min read · LW link

(blog.aiimpacts.org)

Miles Brundage: Finding Ways to Credibly Signal the Benignness of AI Development and Deployment is an Urgent Priority

Zach Stein-Perlman

28 Oct 2024 17:00 UTC

22 points

4 comments · 3 min read · LW link

(milesbrundage.substack.com)

[Question] somebody explain the word “epistemic” to me

KvmanThinking

28 Oct 2024 16:40 UTC

7 points

8 comments · 1 min read · LW link

~80 Interesting Questions about Foundation Model Agent Safety

RohanS and Govind Pimpale

28 Oct 2024 16:37 UTC

48 points

4 comments · 15 min read · LW link

AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels

Corin Katzke, Alexa Pan and Dan H

28 Oct 2024 16:03 UTC

6 points

0 comments · 6 min read · LW link

(newsletter.safe.ai)

Death notes - 7 thoughts on death

Nathan Young

28 Oct 2024 15:01 UTC

26 points

1 comment · 5 min read · LW link

(nathanpmyoung.substack.com)

SAEs you can See: Applying Sparse Autoencoders to Clustering

Robert_AIZI

28 Oct 2024 14:48 UTC

27 points

0 comments · 10 min read · LW link

Bridging the VLM and mech interp communities for multimodal interpretability

Sonia Joseph

28 Oct 2024 14:41 UTC

19 points

5 comments · 15 min read · LW link

How Likely Are Various Precursors of Existential Risk?

NunoSempere

28 Oct 2024 13:27 UTC

55 points

4 comments · 15 min read · LW link

(blog.sentinel-team.org)

Care Doesn’t Scale

stavros

28 Oct 2024 11:57 UTC

27 points

1 comment · 1 min read · LW link

(stevenscrawls.com)

Your memory eventually drives confidence in each hypothesis to 1 or 0

Crazy philosopher

28 Oct 2024 9:00 UTC

3 points

6 comments · 1 min read · LW link

Nerdtrition: simple diets via spreadsheet abuse

dkl9

27 Oct 2024 21:45 UTC

9 points

0 comments · 3 min read · LW link

(dkl9.net)

AGI Fermi Paradox

jrincayc

27 Oct 2024 20:14 UTC

0 points

2 comments · 2 min read · LW link

Substituting Talkbox for Breath Controller

jefftk

27 Oct 2024 19:10 UTC

11 points

0 comments · 1 min read · LW link

(www.jefftk.com)

Open Source Replication of Anthropic’s Crosscoder paper for model-diffing

Connor Kissane, robertzk, Arthur Conmy and Neel Nanda

27 Oct 2024 18:46 UTC

48 points

4 comments · 5 min read · LW link

Hiring a writer to co-author with me (Spencer Greenberg for ClearerThinking.org)

spencerg

27 Oct 2024 17:34 UTC

16 points

0 comments · 1 min read · LW link

Interview with Bill O’Rourke—Russian Corruption, Putin, Applied Ethics, and More

JohnGreer

27 Oct 2024 17:11 UTC

2 points

0 comments · 6 min read · LW link

On Shifgrethor

JustisMills

27 Oct 2024 15:30 UTC

67 points

18 comments · 2 min read · LW link

(justismills.substack.com)

The hostile telepaths problem

Valentine

27 Oct 2024 15:26 UTC

398 points

92 comments · 15 min read · LW link

[Question] What are some good ways to form opinions on controversial subjects in the current and upcoming era?

Terence Coelho

27 Oct 2024 14:33 UTC

9 points

21 comments · 1 min read · LW link

Video lectures on the learning-theoretic agenda

Vanessa Kosoy

27 Oct 2024 12:01 UTC

75 points

0 comments · 1 min read · LW link

(www.youtube.com)

Dario Amodei’s “Machines of Loving Grace” sound incredibly dangerous, for Humans

Super AGI

27 Oct 2024 5:05 UTC

8 points

1 comment · 1 min read · LW link

Electrostatic Airships?

DaemonicSigil

27 Oct 2024 4:32 UTC

64 points

14 comments · 3 min read · LW link

(pbement.com)

A suite of Vision Sparse Autoencoders

Louka Ewington-Pitsos and RRGoyal

27 Oct 2024 4:05 UTC

25 points

0 comments · 1 min read · LW link

Ways to think about alignment

Abhimanyu Pallavi Sudhir

27 Oct 2024 1:40 UTC

6 points

0 comments · 4 min read · LW link

[Question] Is there a CFAR handbook audio option?

FinalFormal2

26 Oct 2024 17:08 UTC

16 points

0 comments · 1 min read · LW link

Retrieval Augmented Genesis II — Holy Texts Semantics Analysis

João Ribeiro Medeiros

26 Oct 2024 17:00 UTC

−1 points

0 comments · 11 min read · LW link

A superficially plausible promising alternate Earth without lockstep

Lorec

26 Oct 2024 16:04 UTC

−2 points

3 comments · 4 min read · LW link

Galatea and the windup toy

Nicolas Villarreal

26 Oct 2024 14:52 UTC

−3 points

0 comments · 13 min read · LW link

(nicolasdvillarreal.substack.com)

Why is there Nothing rather than Something?

Logan Zoellner

26 Oct 2024 12:37 UTC

27 points

3 comments · 4 min read · LW link

The Summoned Heroine’s Prediction Markets Keep Providing Financial Services To The Demon King!

abstractapplic

26 Oct 2024 12:34 UTC

167 points

16 comments · 7 min read · LW link

AI Safety Camp 10

Robert Kralisch, Linda Linsefors and Remmelt

26 Oct 2024 11:08 UTC

38 points

9 comments · 18 min read · LW link

Arithmetic Models: Better Than You Think

kqr

26 Oct 2024 9:42 UTC

28 points

4 comments · 11 min read · LW link

(entropicthoughts.com)

The Case For Bullying

Alexej Gerstmaier

26 Oct 2024 4:56 UTC

−50 points

8 comments · 1 min read · LW link

(lexposedtruth.com)

Is the Power Grid Sustainable?

jefftk

26 Oct 2024 2:30 UTC

36 points

38 comments · 2 min read · LW link

(www.jefftk.com)

[Question] (i no longer endorse this post) - cryonics is a pascal’s mugging?

KvmanThinking

25 Oct 2024 23:24 UTC

−12 points

4 comments · 1 min read · LW link

A Case for Conscious Significance rather than Free Will.

James Stephen Brown

25 Oct 2024 23:20 UTC

10 points

2 comments · 6 min read · LW link

Introducing Kairos: a new AI safety fieldbuilding organization (the new home for SPAR and FSP)

agucova

25 Oct 2024 21:59 UTC

19 points

0 comments · 2 min read · LW link

Brief analysis of OP Technical AI Safety Funding

22tom

25 Oct 2024 19:37 UTC

76 points

5 comments · 1 min read · LW link

UK AISI: Early lessons from evaluating frontier AI systems

Zach Stein-Perlman

25 Oct 2024 19:00 UTC

26 points

0 comments · 2 min read · LW link

(www.aisi.gov.uk)