Memory Decoding Journal Club: Motor learning selectively strengthens cortical and striatal synapses of motor engram neurons

Devin Ward 1 May 2025 23:52 UTC

1 point

0 comments 1 min read LW link

My Research Process: Understanding and Cultivating Research Taste

Neel Nanda 1 May 2025 23:08 UTC

36 points

3 comments 9 min read LW link

AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions

peterbarnett and Aaron_Scher

1 May 2025 22:46 UTC

109 points

7 comments 8 min read LW link

(techgov.intelligence.org)

How to specify an alignment target

Richard Juggins 1 May 2025 21:11 UTC

14 points

2 comments 12 min read LW link

Obstacles in ARC’s agenda: Mechanistic Anomaly Detection

David Matolcsi 1 May 2025 20:51 UTC

43 points

1 comment 11 min read LW link

AI-Generated GitHub repo backdated with junk then filled with my systems work. Has anyone seen this before?

rgunther 1 May 2025 20:14 UTC

7 points

1 comment 1 min read LW link

What is Inadequate about Bayesianism for AI Alignment: Motivating Infra-Bayesianism

Brittany Gelb 1 May 2025 19:06 UTC

61 points

3 comments 7 min read LW link

Can LLMs Simulate Internal Evaluation? A Case Study in Self-Generated Recommendations

The Neutral Mind 1 May 2025 19:04 UTC

4 points

0 comments 2 min read LW link

Superhuman Coders in AI 2027 - Not So Fast

dschwarz and FutureSearch

1 May 2025 18:56 UTC

68 points

0 comments 5 min read LW link

AI #114: Liars, Sycophants and Cheaters

Zvi 1 May 2025 14:00 UTC

40 points

6 comments 63 min read LW link

(thezvi.wordpress.com)

Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall

Vladimir_Nesov 1 May 2025 13:54 UTC

202 points

35 comments 5 min read LW link

Anthropomorphizing AI might be good, actually

Seth Herd 1 May 2025 13:50 UTC

35 points

6 comments 3 min read LW link

Dont focus on updating P doom

Algon 1 May 2025 11:10 UTC

7 points

3 comments 2 min read LW link

Prioritizing Work

jefftk 1 May 2025 2:00 UTC

110 points

11 comments 1 min read LW link

(www.jefftk.com)

Don’t rely on a “race to the top”

sjadler 1 May 2025 0:33 UTC

10 points

0 comments 1 min read LW link

Meta-Technicalities: Safeguarding Values in Formal Systems

LTM 30 Apr 2025 23:43 UTC

2 points

0 comments 3 min read LW link

(routecause.substack.com)

Obstacles in ARC’s agenda: Finding explanations

David Matolcsi 30 Apr 2025 23:03 UTC

128 points

10 comments 17 min read LW link

GPT-4o Responds to Negative Feedback

Zvi 30 Apr 2025 20:20 UTC

45 points

2 comments 18 min read LW link

(thezvi.wordpress.com)

State of play of AI progress (and related brakes on an intelligence explosion) [Linkpost]

Noosphere89 30 Apr 2025 19:58 UTC

7 points

0 comments 5 min read LW link

(www.interconnects.ai)

Don’t accuse your interlocutor of being insufficiently truth-seeking

TFD 30 Apr 2025 19:38 UTC

31 points

15 comments 2 min read LW link

(www.thefloatingdroid.com)

How can we solve diffuse threats like research sabotage with AI control?

Vivek Hebbar 30 Apr 2025 19:23 UTC

54 points

1 comment 8 min read LW link

[Question] Can Narrowing One’s Reference Class Undermine the Doomsday Argument?

Iannoose n. 30 Apr 2025 18:24 UTC

2 points

1 comment 1 min read LW link

[Question] Does there exist an interactive reasoning map tool that lets users visually lay out claims, assign probabilities and confidence levels, and dynamically adjust their beliefs based on weighted influences between connected assertions?

Zack Friedman 30 Apr 2025 18:22 UTC

5 points

4 comments 1 min read LW link

Distilling the Internal Model Principle part II

JoseFaustino 30 Apr 2025 17:56 UTC

15 points

0 comments 19 min read LW link

Research Priorities for Hardware-Enabled Mechanisms (HEMs)

aog 30 Apr 2025 17:43 UTC

18 points

3 comments 15 min read LW link

(www.longview.org)

Video and transcript of talk on automating alignment research

Joe Carlsmith 30 Apr 2025 17:43 UTC

31 points

0 comments 24 min read LW link

(joecarlsmith.com)

Can we safely automate alignment research?

Joe Carlsmith 30 Apr 2025 17:37 UTC

65 points

30 comments 48 min read LW link

(joecarlsmith.com)

Investigating task-specific prompts and sparse autoencoders for activation monitoring

Henk Tillman 30 Apr 2025 17:09 UTC

23 points

0 comments 1 min read LW link

(arxiv.org)

European Links (30.04.25)

Martin Sustrik 30 Apr 2025 15:40 UTC

15 points

1 comment 8 min read LW link

(250bpm.substack.com)

Scaling Laws for Scalable Oversight

Subhash Kantamneni, Josh Engels, David Baek and Max Tegmark

30 Apr 2025 12:13 UTC

38 points

1 comment 9 min read LW link

Early Chinese Language Media Coverage of the AI 2027 Report: A Qualitative Analysis

jeanne_ and eeeee

30 Apr 2025 11:06 UTC

219 points

11 comments 11 min read LW link

[Paper] Automated Feature Labeling with Token-Space Gradient Descent

Wuschel Schulz 30 Apr 2025 10:22 UTC

4 points

0 comments 4 min read LW link

A single principle related to many Alignment subproblems?

Q Home 30 Apr 2025 9:49 UTC

43 points

34 comments 17 min read LW link

What if Brain Computer Interfaces went exponential?

Stephen Martin 30 Apr 2025 5:07 UTC

−1 points

0 comments 12 min read LW link

Interpreting the METR Time Horizons Post

snewman 30 Apr 2025 3:03 UTC

70 points

13 comments 10 min read LW link

(amistrongeryet.substack.com)

Should we expect the future to be good?

Neil Crawford 30 Apr 2025 0:36 UTC

15 points

0 comments 14 min read LW link

Judging types of consequentialism by influence and normativity

Cole Wyeth 29 Apr 2025 23:25 UTC

19 points

0 comments 2 min read LW link

Bandwidth Rules Everything Around Me: Oliver Habryka on OpenPhil and GoodVentures

Elizabeth 29 Apr 2025 20:40 UTC

81 points

15 comments 1 min read LW link

(acesounderglass.com)

The Grand Encyclopedia of Eponymous Laws

rogersbacon 29 Apr 2025 19:30 UTC

29 points

9 comments 16 min read LW link

(www.secretorum.life)

Misrepresentation as a Barrier for Interp (Part I)

johnswentworth and Steve Petersen

29 Apr 2025 17:07 UTC

113 points

12 comments 7 min read LW link

AISN #53: An Open Letter Attempts to Block OpenAI Restructuring

Corin Katzke and Dan H

29 Apr 2025 16:13 UTC

7 points

0 comments 4 min read LW link

What could Alphafold 4 look like?

Abhishaike Mahajan 29 Apr 2025 15:45 UTC

8 points

0 comments 1 min read LW link

Sealed Computation: Towards Low-Friction Proof of Locality

Paul Bricman 29 Apr 2025 15:26 UTC

4 points

0 comments 10 min read LW link

(noemaresearch.com)

Dating Roundup #4: An App for That

Zvi 29 Apr 2025 13:10 UTC

18 points

5 comments 16 min read LW link

(thezvi.wordpress.com)

Talk on letters to AI (London)

ukc10014 29 Apr 2025 9:50 UTC

3 points

0 comments 1 min read LW link

Memory Decoding Journal Club: “Motor learning selectively strengthens cortical and striatal synapses of motor engram neurons”

Devin Ward 29 Apr 2025 2:26 UTC

1 point

0 comments 1 min read LW link

D&D.Sci Tax Day: Adventurers and Assessments Evaluation & Ruleset

aphyer 29 Apr 2025 2:00 UTC

28 points

10 comments 5 min read LW link

How to Build a Third Place on Focusmate

Parker Conley 28 Apr 2025 23:46 UTC

100 points

10 comments 5 min read LW link

(parconley.com)

Methods of defense against AGI manipulation

MarkelKori 28 Apr 2025 21:03 UTC

3 points

0 comments 2 min read LW link

China’s Petition System: It Looks Like Democracy — But It Isn’t

Hu Yichao 28 Apr 2025 20:56 UTC

0 points

4 comments 2 min read LW link