Lecture Series on Tiling Agents

abramdemski · 14 Jan 2025 21:34 UTC
38 points
14 comments · 1 min read · LW link

Is AI Physical?

Lauren Greenspan · 14 Jan 2025 21:21 UTC
23 points
6 comments · 7 min read · LW link

Heritability: Five Battles

Steven Byrnes · 14 Jan 2025 18:21 UTC
94 points
23 comments · 60 min read · LW link

The Philosophical Glossary of AI

David Gross · 14 Jan 2025 17:36 UTC
11 points
0 comments · 1 min read · LW link
(www.aiglossary.co.uk)

I’m offering free math consultations!

Gurkenglas · 14 Jan 2025 16:30 UTC
83 points
7 comments · 1 min read · LW link

Why abandon “probability is in the mind” when it comes to quantum dynamics?

Maxwell Peterson · 14 Jan 2025 15:53 UTC
23 points
24 comments · 1 min read · LW link

How do you deal w/ Super Stimuli?

Logan Riggs · 14 Jan 2025 15:14 UTC
112 points
25 comments · 3 min read · LW link

curate

technicalities · 14 Jan 2025 14:40 UTC
12 points
0 comments · 2 min read · LW link

Our new video about goal misgeneralization, plus an apology

Writer · 14 Jan 2025 14:07 UTC
33 points
0 comments · 7 min read · LW link
(youtu.be)

NYC Congestion Pricing: Early Days

Zvi · 14 Jan 2025 14:00 UTC
29 points
0 comments · 15 min read · LW link
(thezvi.wordpress.com)

Do humans really learn from “little” data?

Alice Wanderland · 14 Jan 2025 10:46 UTC
14 points
5 comments · 1 min read · LW link
(aliceandbobinwanderland.substack.com)

Basics of Bayesian learning

Dmitry Vaintrob · 14 Jan 2025 10:00 UTC
12 points
0 comments · 13 min read · LW link

[Question] Why do futurists care about the culture war?

Knight Lee · 14 Jan 2025 7:35 UTC
23 points
22 comments · 2 min read · LW link

Don’t Legalize Drugs

Mr. Keating · 14 Jan 2025 6:51 UTC
38 points
10 comments · 9 min read · LW link

Mini Go: Gateway Game

jefftk · 14 Jan 2025 3:30 UTC
32 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Finding Features Causally Upstream of Refusal

14 Jan 2025 2:30 UTC
54 points
5 comments · 12 min read · LW link

Implications of the inference scaling paradigm for AI safety

Ryan Kidd · 14 Jan 2025 2:14 UTC
96 points
70 comments · 5 min read · LW link

Biden administration unveils global AI export controls aimed at China

Chris_Leong · 14 Jan 2025 1:01 UTC
9 points
0 comments · 1 min read · LW link
(www.axios.com)

My latest attempt to understand decision theory: I asked ChatGPT to debate me.

bokov · 13 Jan 2025 19:37 UTC
−8 points
5 comments · 19 min read · LW link

AI models inherently alter “human values.” So, alignment-based AI safety approaches must better account for value drift

bfitzgerald3132 · 13 Jan 2025 19:22 UTC
5 points
2 comments · 13 min read · LW link

Chance is in the Map, not the Territory

13 Jan 2025 19:17 UTC
67 points
18 comments · 7 min read · LW link

Progress links and short notes, 2025-01-13

jasoncrawford · 13 Jan 2025 18:35 UTC
13 points
2 comments · 3 min read · LW link
(newsletter.rootsofprogress.org)

Better antibodies by engineering targets, not engineering antibodies (Nabla Bio)

Abhishaike Mahajan · 13 Jan 2025 15:05 UTC
4 points
0 comments · 14 min read · LW link
(www.owlposting.com)

Zvi’s 2024 In Movies

Zvi · 13 Jan 2025 13:40 UTC
44 points
4 comments · 15 min read · LW link
(thezvi.wordpress.com)

Paper club: He et al. on modular arithmetic (part I)

Dmitry Vaintrob · 13 Jan 2025 11:18 UTC
14 points
0 comments · 8 min read · LW link

Cast it into the fire! Destroy it!

Aram Panasenco · 13 Jan 2025 7:30 UTC
6 points
9 comments · 2 min read · LW link

Moderately More Than You Wanted To Know: Depressive Realism

JustisMills · 13 Jan 2025 2:57 UTC
73 points
4 comments · 6 min read · LW link
(justismills.substack.com)

Applying traditional economic thinking to AGI: a trilemma

Steven Byrnes · 13 Jan 2025 1:23 UTC
153 points
32 comments · 3 min read · LW link

Building AI Research Fleets

12 Jan 2025 18:23 UTC
132 points
11 comments · 5 min read · LW link

Do Antidepressants work? (First Take)

Jacob Goldsmith · 12 Jan 2025 17:11 UTC
7 points
9 comments · 7 min read · LW link

A Novel Idea for Harnessing Magnetic Reconnection as an Energy Source

resonova · 12 Jan 2025 17:11 UTC
0 points
8 comments · 3 min read · LW link

How quickly could robots scale up?

Benjamin_Todd · 12 Jan 2025 17:01 UTC
46 points
25 comments · 1 min read · LW link
(benjamintodd.substack.com)

AGI Will Not Make Labor Worthless

Maxwell Tabarrok · 12 Jan 2025 15:09 UTC
−8 points
16 comments · 5 min read · LW link
(www.maximum-progress.com)

The purposeful drunkard

Dmitry Vaintrob · 12 Jan 2025 12:27 UTC
98 points
13 comments · 6 min read · LW link

No one has the ball on 1500 Russian olympiad winners who’ve received HPMOR

Mikhail Samin · 12 Jan 2025 11:43 UTC
81 points
21 comments · 1 min read · LW link

Why modelling multi-objective homeostasis is essential for AI alignment (and how it helps with AI safety as well). Subtleties and Open Challenges.

Roland Pihlakas · 12 Jan 2025 3:37 UTC
47 points
7 comments · 12 min read · LW link

Extending control evaluations to non-scheming threats

joshc · 12 Jan 2025 1:42 UTC
30 points
1 comment · 12 min read · LW link

Rolling Thresholds for AGI Scaling Regulation

Larks · 12 Jan 2025 1:30 UTC
40 points
6 comments · 6 min read · LW link

AI Safety at the Frontier: Paper Highlights, December ’24

gasteigerjo · 11 Jan 2025 22:54 UTC
7 points
2 comments · 7 min read · LW link
(aisafetyfrontier.substack.com)

Fluoridation: The RCT We Still Haven’t Run (But Should)

ChristianKl · 11 Jan 2025 21:02 UTC
22 points
5 comments · 2 min read · LW link

In Defense of a Butlerian Jihad

sloonz · 11 Jan 2025 19:30 UTC
10 points
25 comments · 9 min read · LW link

Near term discussions need something smaller and more concrete than AGI

ryan_b · 11 Jan 2025 18:24 UTC
13 points
0 comments · 6 min read · LW link

A proposal for iterated interpretability with known-interpretable narrow AIs

Peter Berggren · 11 Jan 2025 14:43 UTC
6 points
0 comments · 2 min read · LW link

Have frontier AI systems surpassed the self-replicating red line?

nsage · 11 Jan 2025 5:31 UTC
4 points
0 comments · 4 min read · LW link

We need a universal definition of ‘agency’ and related words

CstineSublime · 11 Jan 2025 3:22 UTC
18 points
1 comment · 5 min read · LW link

[Question] AI for medical care for hard-to-treat diseases?

CronoDAS · 10 Jan 2025 23:55 UTC
12 points
1 comment · 1 min read · LW link

Beliefs and state of mind into 2025

RussellThor · 10 Jan 2025 22:07 UTC
18 points
10 comments · 7 min read · LW link

Recommendations for Technical AI Safety Research Directions

Sam Marks · 10 Jan 2025 19:34 UTC
64 points
1 comment · 17 min read · LW link
(alignment.anthropic.com)

Is AI Alignment Enough?

Aram Panasenco · 10 Jan 2025 18:57 UTC
30 points
6 comments · 6 min read · LW link

[Question] What are some scenarios where an aligned AGI actually helps humanity, but many/most people don’t like it?

RomanS · 10 Jan 2025 18:13 UTC
14 points
6 comments · 3 min read · LW link