The slingshot helps with learning

Wilson Wu · Oct 31, 2024, 11:18 PM
33 points
0 comments · 8 min read · LW link

Toward Safety Case Inspired Basic Research

Oct 31, 2024, 11:06 PM
55 points
3 comments · 13 min read · LW link

Spooky Recommendation System Scaling

phdead · Oct 31, 2024, 10:00 PM
11 points
0 comments · 4 min read · LW link

‘Meta’, ‘mesa’, and mountains

Lorec · Oct 31, 2024, 5:25 PM
1 point
0 comments · 3 min read · LW link

Toward Safety Cases For AI Scheming

Oct 31, 2024, 5:20 PM
60 points
1 comment · 2 min read · LW link

AI #88: Thanks for the Memos

Zvi · Oct 31, 2024, 3:00 PM
46 points
5 comments · 77 min read · LW link
(thezvi.wordpress.com)

The Compendium, A full argument about extinction risk from AGI

Oct 31, 2024, 12:01 PM
195 points
52 comments · 2 min read · LW link
(www.thecompendium.ai)

Some Preliminary Notes on the Promise of a Wisdom Explosion

Chris_Leong · Oct 31, 2024, 9:21 AM
2 points
0 comments · 1 min read · LW link
(aiimpacts.org)

What TMS is like

Sable · Oct 31, 2024, 12:44 AM
208 points
23 comments · 6 min read · LW link
(affablyevil.substack.com)

AI Safety at the Frontier: Paper Highlights, October ’24

gasteigerjo · Oct 31, 2024, 12:09 AM
3 points
0 comments · 9 min read · LW link
(aisafetyfrontier.substack.com)

Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution

Kola Ayonrinde · Oct 30, 2024, 10:50 PM
27 points
0 comments · 12 min read · LW link

Generic advice caveats

Saul Munn · Oct 30, 2024, 9:03 PM
27 points
1 comment · 3 min read · LW link
(www.brasstacks.blog)

I turned decision theory problems into memes about trolleys

Tapatakt · Oct 30, 2024, 8:13 PM
104 points
23 comments · 1 min read · LW link

AI as a powerful meme, via CGP Grey

TheManxLoiner · Oct 30, 2024, 6:31 PM
46 points
8 comments · 4 min read · LW link

[Question] How might language influence how an AI “thinks”?

bodry · Oct 30, 2024, 5:41 PM
3 points
0 comments · 1 min read · LW link

Motivation control

Joe Carlsmith · Oct 30, 2024, 5:15 PM
45 points
7 comments · 52 min read · LW link

Updating the NAO Simulator

jefftk · Oct 30, 2024, 1:50 PM
11 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Occupational Licensing Roundup #1

Zvi · Oct 30, 2024, 11:00 AM
65 points
11 comments · 11 min read · LW link
(thezvi.wordpress.com)

Three Notions of “Power”

johnswentworth · Oct 30, 2024, 6:10 AM
92 points
44 comments · 4 min read · LW link

Introduction to Choice set Misspecification in Reward Inference

Rahul Chand · Oct 29, 2024, 10:57 PM
1 point
0 comments · 8 min read · LW link

Gothenburg LW/ACX meetup

Stefan · Oct 29, 2024, 8:40 PM
2 points
0 comments · 1 min read · LW link

The Alignment Trap: AI Safety as Path to Power

crispweed · Oct 29, 2024, 3:21 PM
57 points
17 comments · 5 min read · LW link
(upcoder.com)

Housing Roundup #10

Zvi · Oct 29, 2024, 1:50 PM
32 points
2 comments · 32 min read · LW link
(thezvi.wordpress.com)

[Intuitive self-models] 7. Hearing Voices, and Other Hallucinations

Steven Byrnes · Oct 29, 2024, 1:36 PM
51 points
2 comments · 16 min read · LW link

Review: “The Case Against Reality”

David Gross · Oct 29, 2024, 1:13 PM
20 points
9 comments · 5 min read · LW link

A Poem Is All You Need: Jailbreaking ChatGPT, Meta & More

Sharat Jacob Jacob · Oct 29, 2024, 12:41 PM
12 points
0 comments · 9 min read · LW link

Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence

EuanMcLean · Oct 29, 2024, 12:16 PM
45 points
9 comments · 26 min read · LW link

AI #87: Staying in Character

Zvi · Oct 29, 2024, 7:10 AM
57 points
3 comments · 33 min read · LW link
(thezvi.wordpress.com)

A path to human autonomy

Nathan Helm-Burger · Oct 29, 2024, 3:02 AM
53 points
16 comments · 20 min read · LW link

D&D.Sci Coliseum: Arena of Data Evaluation and Ruleset

aphyer · Oct 29, 2024, 1:21 AM
47 points
13 comments · 6 min read · LW link

Gwern: Why So Few Matt Levines?

kave · Oct 29, 2024, 1:07 AM
78 points
10 comments · 1 min read · LW link
(gwern.net)

October 2024 Progress in Guaranteed Safe AI

Quinn · Oct 28, 2024, 11:34 PM
7 points
0 comments · 1 min read · LW link
(gsai.substack.com)

5 homegrown EA projects, seeking small donors

Austin Chen · Oct 28, 2024, 11:24 PM
85 points
4 comments · LW link

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)

Joe Carlsmith · Oct 28, 2024, 9:57 PM
54 points
5 comments · 32 min read · LW link

Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations

ozziegooen · Oct 28, 2024, 9:44 PM
7 points
0 comments · LW link

AI & wisdom 3: AI effects on amortised optimisation

L Rudolf L · Oct 28, 2024, 9:08 PM
18 points
0 comments · 14 min read · LW link
(rudolf.website)

AI & wisdom 2: growth and amortised optimisation

L Rudolf L · Oct 28, 2024, 9:07 PM
18 points
0 comments · 8 min read · LW link
(rudolf.website)

AI & wisdom 1: wisdom, amortised optimisation, and AI

L Rudolf L · Oct 28, 2024, 9:02 PM
29 points
0 comments · 15 min read · LW link
(rudolf.website)

Finishing The SB-1047 Documentary In 6 Weeks

Michaël Trazzi · Oct 28, 2024, 8:17 PM
94 points
7 comments · 4 min read · LW link
(manifund.org)

Towards the Operationalization of Philosophy & Wisdom

Thane Ruthenis · Oct 28, 2024, 7:45 PM
20 points
2 comments · 33 min read · LW link
(aiimpacts.org)

Quantitative Trading Bootcamp [Nov 6-10]

Ricki Heicklen · Oct 28, 2024, 6:39 PM
7 points
0 comments · 1 min read · LW link

Winners of the Essay competition on the Automation of Wisdom and Philosophy

Oct 28, 2024, 5:10 PM
40 points
3 comments · 30 min read · LW link
(blog.aiimpacts.org)

Miles Brundage: Finding Ways to Credibly Signal the Benignness of AI Development and Deployment is an Urgent Priority

Zach Stein-Perlman · Oct 28, 2024, 5:00 PM
22 points
4 comments · 3 min read · LW link
(milesbrundage.substack.com)

[Question] somebody explain the word “epistemic” to me

KvmanThinking · Oct 28, 2024, 4:40 PM
7 points
8 comments · 1 min read · LW link

~80 Interesting Questions about Foundation Model Agent Safety

Oct 28, 2024, 4:37 PM
46 points
4 comments · 15 min read · LW link

AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels

Oct 28, 2024, 4:03 PM
6 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

Death notes − 7 thoughts on death

Nathan Young · Oct 28, 2024, 3:01 PM
26 points
1 comment · 5 min read · LW link
(nathanpmyoung.substack.com)

SAEs you can See: Applying Sparse Autoencoders to Clustering

Robert_AIZI · Oct 28, 2024, 2:48 PM
27 points
0 comments · 10 min read · LW link

Bridging the VLM and mech interp communities for multimodal interpretability

Sonia Joseph · Oct 28, 2024, 2:41 PM UTC
19 points
5 comments · 15 min read · LW link

How Likely Are Various Precursors of Existential Risk?

NunoSempere · Oct 28, 2024, 1:27 PM UTC
55 points
4 comments · 15 min read · LW link
(blog.sentinel-team.org)