Metacompilation

Donald Hobson

24 Feb 2025 22:58 UTC

11 points

1 comment · 4 min read · LW link

The manifest manifesto

dkl9

24 Feb 2025 22:13 UTC

6 points

2 comments · 2 min read · LW link

(dkl9.net)

Credit Suisse collapse obfuscated Parreaux, Thiébaud & Partners scandal

pocock

24 Feb 2025 21:28 UTC

3 points

0 comments · 1 min read · LW link

(juristgate.com)

Topological Data Analysis and Mechanistic Interpretability

Gunnar Carlsson

24 Feb 2025 19:56 UTC

17 points

4 comments · 7 min read · LW link

Zizian comparisons / connections in the open source & Linux communities

pocock

24 Feb 2025 19:55 UTC

−17 points

0 comments · 1 min read · LW link

Local Trust

ben_levinstein, Daniel Herrmann and Aydin Mohseni

24 Feb 2025 19:53 UTC

21 points

4 comments · 5 min read · LW link

Nationwide Action Workshop: Contact Congress about AI safety!

Felix De Simone

24 Feb 2025 19:36 UTC

7 points

0 comments · 1 min read · LW link

Anthropic releases Claude 3.7 Sonnet with extended thinking mode

LawrenceC

24 Feb 2025 19:32 UTC

88 points

8 comments · 4 min read · LW link

(www.anthropic.com)

Training AI to do alignment research we don’t already know how to do

joshc

24 Feb 2025 19:19 UTC

45 points

24 comments · 7 min read · LW link

Conference Report: Threshold 2030 - Modeling AI Economic Futures

Deric Cheng, Justin Bullock, Deger Turan and Elliot Mckernon

24 Feb 2025 18:56 UTC

52 points

0 comments · 10 min read · LW link

(www.convergenceanalysis.org)

Evaluating “What 2026 Looks Like” So Far

Jonny Spicer

24 Feb 2025 18:55 UTC

79 points

7 comments · 7 min read · LW link

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

Yoshua Bengio, Jesse Richardson, dwk and mattmacdermott

24 Feb 2025 18:31 UTC

45 points

15 comments · 11 min read · LW link

Understanding Agent Preferences

martinkunev

24 Feb 2025 17:46 UTC

6 points

2 comments · 14 min read · LW link

What We Can Do to Prevent Extinction by AI

Joe Rogero

24 Feb 2025 17:15 UTC

13 points

0 comments · 11 min read · LW link

Dream, Truth, & Good

abramdemski

24 Feb 2025 16:59 UTC

50 points

11 comments · 4 min read · LW link

Forecasting Frontier Language Model Agent Capabilities

fidgetsinner, Axel Højmark, Jérémy Scheurer and Marius Hobbhahn

24 Feb 2025 16:51 UTC

35 points

0 comments · 5 min read · LW link

(www.apolloresearch.ai)

A City Within a City

Declan Molony

24 Feb 2025 15:51 UTC

67 points

2 comments · 7 min read · LW link

Grok Grok

Zvi

24 Feb 2025 14:20 UTC

36 points

2 comments · 19 min read · LW link

(thezvi.wordpress.com)

if you’re not happy single, you won’t be happy immortal

daijin

24 Feb 2025 13:23 UTC

2 points

1 comment · 1 min read · LW link

[NSFW] The Fuzzy Handcuffs of Liberation

lsusr

24 Feb 2025 13:05 UTC

24 points

11 comments · 2 min read · LW link

Dayton, Ohio, HPMOR 10 year Anniversary meetup

Lunawarrior

24 Feb 2025 12:55 UTC

1 point

0 comments · 1 min read · LW link

An Alternate History of the Future, 2025-2040

Mr Beastly

24 Feb 2025 5:53 UTC

5 points

11 comments · 10 min read · LW link

Export Surplusses

lsusr

24 Feb 2025 5:53 UTC

29 points

21 comments · 3 min read · LW link

AI alignment for mental health supports

hiki_t

24 Feb 2025 4:21 UTC

1 point

1 comment · 1 min read · LW link

The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research

Arthur Conmy and Neel Nanda

24 Feb 2025 2:17 UTC

48 points

1 comment · 7 min read · LW link

Poll on AI opinions.

Niclas Kupper

23 Feb 2025 22:39 UTC

1 point

2 comments · 1 min read · LW link

The Geometry of Linear Regression versus PCA

criticalpoints

23 Feb 2025 21:01 UTC

20 points

7 comments · 6 min read · LW link

(eregis.github.io)

Judgements: Merging Prediction & Evidence

abramdemski

23 Feb 2025 19:35 UTC

109 points

9 comments · 6 min read · LW link

Intelligence as Privilege Escalation

Cole Wyeth

23 Feb 2025 19:31 UTC

29 points

2 comments · 5 min read · LW link

[Question] Have LLMs Generated Novel Insights?

abramdemski and Cole Wyeth

23 Feb 2025 18:22 UTC

171 points

45 comments · 2 min read · LW link

The case for corporal punishment

Yair Halberstadt

23 Feb 2025 15:05 UTC

28 points

5 comments · 2 min read · LW link

Reflections on the state of the race to superintelligence, February 2025

Mitchell_Porter

23 Feb 2025 13:58 UTC

22 points

7 comments · 4 min read · LW link

List of most interesting ideas I encountered in my life, ranked

Lucien

23 Feb 2025 12:36 UTC

21 points

6 comments · 1 min read · LW link

Test of the Bene Gesserit

lsusr

23 Feb 2025 11:51 UTC

19 points

3 comments · 3 min read · LW link

Moral gauge theory: A speculative suggestion for AI alignment

James Diacoumis

23 Feb 2025 11:42 UTC

6 points

3 comments · 8 min read · LW link

[Question] Does human (mis)alignment pose a significant and imminent existential threat?

jr

23 Feb 2025 10:03 UTC

6 points

3 comments · 1 min read · LW link

Deep sparse autoencoders yield interpretable features too

Armaan A. Abraham

23 Feb 2025 5:46 UTC

31 points

8 comments · 8 min read · LW link

New Report: Multi-Agent Risks from Advanced AI

Lewis Hammond

23 Feb 2025 0:32 UTC

26 points

0 comments · 2 min read · LW link

(www.cooperativeai.com)

Power Lies Trembling: a three-book review

Richard_Ngo

22 Feb 2025 22:57 UTC

218 points

29 comments · 15 min read · LW link

(www.mindthefuture.info)

Transformer Dynamics: a neuro-inspired approach to MechInterp

guitchounts and jfernando

22 Feb 2025 21:33 UTC

11 points

0 comments · 5 min read · LW link

Recursive Cognitive Refinement (RCR): A Self-Correcting Approach for LLM Hallucinations

mxTheo

22 Feb 2025 21:32 UTC

0 points

0 comments · 2 min read · LW link

Gradual Disempowerment: Simplified

Annapurna

22 Feb 2025 16:59 UTC

10 points

1 comment · 1 min read · LW link

(jorgevelez.substack.com)

AI Apocalypse and the Buddha

pchvykov

22 Feb 2025 16:33 UTC

−17 points

6 comments · 9 min read · LW link

Unaligned AGI & Brief History of Inequality

ank

22 Feb 2025 16:26 UTC

−20 points

4 comments · 7 min read · LW link

HPMOR Anniversary Guide

Screwtape

22 Feb 2025 16:17 UTC

64 points

7 comments · 3 min read · LW link

Forecasting Uncontrolled Spread of AI

Alvin Ånestrand

22 Feb 2025 13:05 UTC

2 points

0 comments · 10 min read · LW link

(forecastingaifutures.substack.com)

Seeing Through the Eyes of the Algorithm

silentbob

22 Feb 2025 11:54 UTC

18 points

3 comments · 10 min read · LW link

Proselytizing

lsusr

22 Feb 2025 11:54 UTC

49 points

3 comments · 2 min read · LW link

Workshop: Interpretability in LLMs using Geometric and Statistical Methods

Karthik Viswanathan

22 Feb 2025 9:39 UTC

17 points

0 comments · 8 min read · LW link

Information throughput of biological humans and frontier LLMs

benwr

22 Feb 2025 7:15 UTC

12 points

0 comments · 1 min read · LW link