Gothenburg LW / ACX meetup

Stefan

8 Jan 2025 21:39 UTC

2 points

0 comments · 1 min read · LW link

Aristocracy and Hostage Capital

Arjun Panickssery

8 Jan 2025 19:38 UTC

108 points

7 comments · 3 min read · LW link

(arjunpanickssery.substack.com)

[Question] What is the most impressive game LLMs can play well?

Cole Wyeth

8 Jan 2025 19:38 UTC

19 points

20 comments · 1 min read · LW link

The Type of Writing that Pushes Women Away

Dahlia

8 Jan 2025 18:54 UTC

25 points

4 comments · 2 min read · LW link

Ann Altman has filed a lawsuit in US federal court alleging that she was sexually abused by Sam Altman

quanticle

8 Jan 2025 14:59 UTC

7 points

4 comments · 1 min read · LW link

AI Safety Outreach Seminar & Social (online)

Linda Linsefors

8 Jan 2025 13:25 UTC

9 points

0 comments · 1 min read · LW link

XX by Rian Hughes: Pretentious Bullshit

Yair Halberstadt

8 Jan 2025 13:02 UTC

33 points

5 comments · 5 min read · LW link

Activation space interpretability may be doomed

bilalchughtai and Lucius Bushnaq

8 Jan 2025 12:49 UTC

154 points

34 comments · 8 min read · LW link

AI Safety as a YC Startup

Lukas Petersson

8 Jan 2025 10:46 UTC

58 points

9 comments · 5 min read · LW link

The absolute basics of representation theory of finite groups

Dmitry Vaintrob

8 Jan 2025 9:47 UTC

21 points

1 comment · 10 min read · LW link

Implications of the AI Security Gap

Dan Braun

8 Jan 2025 8:31 UTC

48 points

0 comments · 9 min read · LW link

What are polysemantic neurons?

Vishakha and Algon

8 Jan 2025 7:35 UTC

9 points

0 comments · 4 min read · LW link

(aisafety.info)

Tips On Empirical Research Slides

James Chua, John Hughes, Ethan Perez and Owain_Evans

8 Jan 2025 5:06 UTC

117 points

4 comments · 6 min read · LW link

On Eating the Sun

jessicata

8 Jan 2025 4:57 UTC

96 points

99 comments · 3 min read · LW link

(unstablerontology.substack.com)

Book review: Range by David Epstein

PatrickDFarley

8 Jan 2025 4:27 UTC

14 points

0 comments · 15 min read · LW link

Can we have Epiphanies and Eureka moments more frequently?

CstineSublime

8 Jan 2025 2:20 UTC

2 points

0 comments · 4 min read · LW link

Job Opening: SWE to help improve grant-making software

Ethan Ashkie

8 Jan 2025 0:54 UTC

22 points

1 comment · 2 min read · LW link

(survivalandflourishing.com)

Markov’s Inequality Explained

criticalpoints

8 Jan 2025 0:31 UTC

13 points

2 comments · 3 min read · LW link

(eregis.github.io)

Stream Entry

lsusr

7 Jan 2025 23:56 UTC

86 points

12 comments · 4 min read · LW link

Don’t fall for ontology pyramid schemes

Lorec

7 Jan 2025 23:29 UTC

16 points

8 comments · 2 min read · LW link

Bridgewater x Metaculus Forecasting Contest Goes Global — Feb 3, $25k, Opportunities

ChristianWilliams

7 Jan 2025 21:40 UTC

10 points

0 comments · 1 min read · LW link

(www.metaculus.com)

A Principled Cartoon Guide to NVC

plex and Espedair Street

7 Jan 2025 21:01 UTC

52 points

9 comments · 5 min read · LW link

Disagreement on AGI Suggests It’s Near

tangerine

7 Jan 2025 20:42 UTC

34 points

15 comments · 1 min read · LW link

Role embeddings: making authorship more salient to LLMs

Nina Panickssery and Christopher Ackerman

7 Jan 2025 20:13 UTC

50 points

0 comments · 8 min read · LW link

Will bird flu be the next Covid? “Little chance” says my dashboard.

Nathan Young

7 Jan 2025 20:10 UTC

27 points

0 comments · 1 min read · LW link

[Fiction] [Comic] Effective Altruism and Rationality meet at a Secular Solstice afterparty

tandem

7 Jan 2025 19:11 UTC

164 points

9 comments · 1 min read · LW link

Predicting AI Releases Through Side Channels

Reworr R

7 Jan 2025 19:06 UTC

16 points

2 comments · 1 min read · LW link

Responses to ~all criticisms of AIXI

Cole Wyeth

7 Jan 2025 17:41 UTC

26 points

17 comments · 14 min read · LW link

OpenAI #10: Reflections

Zvi

7 Jan 2025 17:00 UTC

149 points

7 comments · 11 min read · LW link

(thezvi.wordpress.com)

Some implications of radical empathy

MichaelStJules

7 Jan 2025 16:10 UTC

3 points

0 comments · 7 min read · LW link

Actualism, asymmetry and extinction

MichaelStJules

7 Jan 2025 16:02 UTC

8 points

4 comments · 9 min read · LW link

Meditation insights as phase shifts in your self-model

Jonas Hallgren

7 Jan 2025 10:09 UTC

15 points

3 comments · 3 min read · LW link

D&D.Sci Dungeonbuilding: the Dungeon Tournament Evaluation & Ruleset

aphyer

7 Jan 2025 5:02 UTC

34 points

8 comments · 5 min read · LW link

Incredibow

jefftk

7 Jan 2025 3:30 UTC

17 points

3 comments · 1 min read · LW link

(www.jefftk.com)

Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety

Lauren Greenspan

7 Jan 2025 3:08 UTC

37 points

2 comments · 12 min read · LW link

My Experience With A Magnet Implant

Vale

7 Jan 2025 3:01 UTC

9 points

2 comments · 1 min read · LW link

(vale.rocks)

You should delay engineering-heavy research in light of R&D automation

Daniel Paleka

7 Jan 2025 2:11 UTC

44 points

3 comments · 5 min read · LW link

(newsletter.danielpaleka.com)

Testing for Scheming with Model Deletion

Guive

7 Jan 2025 1:54 UTC

59 points

21 comments · 21 min read · LW link

(guive.substack.com)

Guilt, Shame, and Depravity

Benquo

7 Jan 2025 1:16 UTC

22 points

12 comments · 4 min read · LW link

Turning up the Heat on Deceptively-Misaligned AI

J Bostock

7 Jan 2025 0:13 UTC

19 points

16 comments · 4 min read · LW link

(My) self-referential reason to believe in free will

jacek

6 Jan 2025 23:35 UTC

12 points

6 comments · 1 min read · LW link

Definition of alignment science I like

quetzal_rainbow

6 Jan 2025 20:40 UTC

21 points

0 comments · 3 min read · LW link

How will we update about scheming?

ryan_greenblatt

6 Jan 2025 20:21 UTC

177 points

21 comments · 37 min read · LW link

What Indicators Should We Watch to Disambiguate AGI Timelines?

snewman

6 Jan 2025 19:57 UTC

144 points

57 comments · 13 min read · LW link

Generating Cognateful Sentences with Large Language Models

vkethana

6 Jan 2025 18:40 UTC

11 points

1 comment · 10 min read · LW link

Really radical empathy

MichaelStJules

6 Jan 2025 17:46 UTC

19 points

0 comments · 10 min read · LW link

Independent research article analyzing consistent self-reports of experience in ChatGPT and Claude

rife

6 Jan 2025 17:34 UTC

4 points

20 comments · 1 min read · LW link

(awakenmoon.ai)

[Question] Meal Replacements in 2025?

alkjash

6 Jan 2025 15:37 UTC

30 points

11 comments · 1 min read · LW link

AI safety content you could create

Adam Jones

6 Jan 2025 15:35 UTC

19 points

0 comments · 5 min read · LW link

(adamjones.me)

Childhood and Education #8: Dealing with the Internet

Zvi

6 Jan 2025 14:00 UTC

42 points

7 comments · 13 min read · LW link

(thezvi.wordpress.com)