(My) self-referential reason to believe in free will

jacek

6 Jan 2025 23:35 UTC

12 points

6 comments 1 min read LW link

Definition of alignment science I like

quetzal_rainbow

6 Jan 2025 20:40 UTC

21 points

0 comments 3 min read LW link

How will we update about scheming?

ryan_greenblatt

6 Jan 2025 20:21 UTC

177 points

21 comments 37 min read LW link

What Indicators Should We Watch to Disambiguate AGI Timelines?

snewman

6 Jan 2025 19:57 UTC

144 points

57 comments 13 min read LW link

Generating Cognateful Sentences with Large Language Models

vkethana

6 Jan 2025 18:40 UTC

11 points

1 comment 10 min read LW link

Really radical empathy

MichaelStJules

6 Jan 2025 17:46 UTC

19 points

0 comments 10 min read LW link

Independent research article analyzing consistent self-reports of experience in ChatGPT and Claude

rife

6 Jan 2025 17:34 UTC

4 points

20 comments 1 min read LW link

(awakenmoon.ai)

[Question] Meal Replacements in 2025?

alkjash

6 Jan 2025 15:37 UTC

30 points

11 comments 1 min read LW link

AI safety content you could create

Adam Jones

6 Jan 2025 15:35 UTC

19 points

0 comments 5 min read LW link

(adamjones.me)

Childhood and Education #8: Dealing with the Internet

Zvi

6 Jan 2025 14:00 UTC

42 points

7 comments 13 min read LW link

(thezvi.wordpress.com)

Latent Adversarial Training (LAT) Improves the Representation of Refusal

alexandraabbas, nlpet and hal2k

6 Jan 2025 10:24 UTC

21 points

6 comments 10 min read LW link

Alternative Cancer Care As Biohacking & Book Review: Surviving “Terminal” Cancer

DenizT

6 Jan 2025 7:43 UTC

34 points

8 comments 15 min read LW link

Estimating the benefits of a new flu drug (BXM)

DirectedEvolution

6 Jan 2025 4:31 UTC

41 points

2 comments 3 min read LW link

Measuring Nonlinear Feature Interactions in Sparse Crosscoders [Project Proposal]

Jason Gross and rajashree

6 Jan 2025 4:22 UTC

19 points

0 comments 12 min read LW link

“We know how to build AGI”—Sam Altman

Nikola Jurkovic

6 Jan 2025 2:05 UTC

62 points

5 comments 1 min read LW link

(blog.samaltman.com)

[Question] Is “hidden complexity of wishes problem” solved?

Roman Malov

5 Jan 2025 22:59 UTC

11 points

4 comments 1 min read LW link

A Ground-Level Perspective on Capacity Building in International Development

Sean Aubin

5 Jan 2025 20:36 UTC

12 points

1 comment 8 min read LW link

Why Linear AI Safety Hits a Wall and How Fractal Intelligence Unlocks Non-Linear Solutions

Andy E Williams

5 Jan 2025 17:08 UTC

−5 points

6 comments 5 min read LW link

How to Do a PhD (in AI Safety)

Lewis Hammond

5 Jan 2025 16:57 UTC

15 points

0 comments 18 min read LW link

(lewishammond.com)

Reasons for and against working on technical AI safety at a frontier AI lab

bilalchughtai

5 Jan 2025 14:49 UTC

101 points

12 comments 12 min read LW link

Oppression and production are competing explanations for wealth inequality.

Benquo

5 Jan 2025 14:13 UTC

45 points

16 comments 8 min read LW link

(benjaminrosshoffman.com)

Maximizing Communication, not Traffic

jefftk

5 Jan 2025 13:00 UTC

162 points

10 comments 1 min read LW link

(www.jefftk.com)

Policymakers don’t have access to paywalled articles

Adam Jones

5 Jan 2025 10:56 UTC

73 points

11 comments 2 min read LW link

(adamjones.me)

Capital Ownership Will Not Prevent Human Disempowerment

beren

5 Jan 2025 6:00 UTC

164 points

21 comments 14 min read LW link

Chinese Researchers Crack ChatGPT: Replicating OpenAI’s Advanced AI Model

Evan_Gaensbauer

5 Jan 2025 3:50 UTC

−8 points

1 comment 1 min read LW link

(www.geeky-gadgets.com)

Orange and Strawberry Truffles

jefftk

5 Jan 2025 1:50 UTC

10 points

1 comment 1 min read LW link

(www.jefftk.com)

AXRP Episode 38.4 - Shakeel Hashim on AI Journalism

DanielFilan

5 Jan 2025 0:20 UTC

11 points

0 comments 12 min read LW link

How i’m building my ai system, how it’s going so far, and my thoughts on it

ollie_

4 Jan 2025 18:20 UTC

−9 points

3 comments 5 min read LW link

Parkinson’s Law and the Ideology of Statistics

Benquo

4 Jan 2025 15:49 UTC

130 points

7 comments 8 min read LW link

(benjaminrosshoffman.com)

The Laws of Large Numbers

Dmitry Vaintrob

4 Jan 2025 11:54 UTC

38 points

11 comments 12 min read LW link

The Golden Opportunity for American AI

Annapurna

4 Jan 2025 10:26 UTC

2 points

8 comments 1 min read LW link

(blogs.microsoft.com)

A Generalization of the Good Regulator Theorem

Alfred Harwood

4 Jan 2025 9:55 UTC

21 points

6 comments 10 min read LW link

Logic vs intuition ⇔ algorithm vs ML

pchvykov

4 Jan 2025 9:06 UTC

5 points

0 comments 7 min read LW link

debating buying NVDA in 2019

bhauth

4 Jan 2025 5:06 UTC

28 points

3 comments 3 min read LW link

(bhauth.com)

Making progress bars for Alignment

Kabir Kumar

3 Jan 2025 21:25 UTC

2 points

0 comments 1 min read LW link

(lu.ma)

The Intelligence Curse

lukedrago

3 Jan 2025 19:07 UTC

155 points

27 comments 18 min read LW link

(lukedrago.substack.com)

Introducing Squiggle AI

ozziegooen

3 Jan 2025 17:53 UTC

92 points

15 comments 8 min read LW link

Human study on AI spear phishing campaigns

Simon Lermen, Fred Heiding and Andrew Kao

3 Jan 2025 15:11 UTC

81 points

8 comments 5 min read LW link

Mearsheimer’s Double Standard: Realism for Russia, Idealism for Israel

Ghdz

3 Jan 2025 13:52 UTC

−15 points

2 comments 4 min read LW link

The subset parity learning problem: much more than you wanted to know

Dmitry Vaintrob

3 Jan 2025 9:13 UTC

107 points

19 comments 11 min read LW link

Building AI safety benchmark environments on themes of universal human values

Roland Pihlakas and Three Laws

3 Jan 2025 4:24 UTC

18 points

3 comments 12 min read LW link

(docs.google.com)

Emotional Superrationality

nullproxy

2 Jan 2025 22:54 UTC

−6 points

4 comments 11 min read LW link

Playing with Otamatones

jefftk

2 Jan 2025 19:50 UTC

12 points

0 comments 1 min read LW link

(www.jefftk.com)

7. Iterate the Game: Racing Where?

Allison Duettmann

2 Jan 2025 19:06 UTC

11 points

0 comments 9 min read LW link

6. Increase Intelligence: Welcome AI Players

Allison Duettmann

2 Jan 2025 19:06 UTC

6 points

1 comment 19 min read LW link

5. Uphold Voluntarism: Digital Defense

Allison Duettmann

2 Jan 2025 19:05 UTC

3 points

0 comments 18 min read LW link

4. Uphold Voluntarism: Physical Defense

Allison Duettmann

2 Jan 2025 19:04 UTC

6 points

2 comments 23 min read LW link

3. Improve Cooperation: Better Technologies

Allison Duettmann

2 Jan 2025 19:03 UTC

5 points

2 comments 23 min read LW link

2. Skim the Manual: Intelligent Voluntary Cooperation

Allison Duettmann

2 Jan 2025 19:02 UTC

13 points

3 comments 18 min read LW link

1. Meet the Players: Value Diversity

Allison Duettmann

2 Jan 2025 19:00 UTC

32 points

2 comments 10 min read LW link