Social Dilemmas — public goods, free riders, and exploitation

James Stephen Brown

Mar 5, 2025, 11:31 PM

6 points

0 comments · 3 min read · LW link

(nonzerosum.games)

Introducing MASK: A Benchmark for Measuring Honesty in AI Systems

Richard Ren, Mantas Mazeika and Dan H

Mar 5, 2025, 10:56 PM

35 points

5 comments · 2 min read · LW link

(www.mask-benchmark.ai)

The Hardware-Software Framework: A New Perspective on Economic Growth with AI

Jakub Growiec

Mar 5, 2025, 7:59 PM

3 points

0 comments · 3 min read · LW link

NY State Has a New Frontier Model Bill (+quick takes)

henryj

Mar 5, 2025, 7:29 PM

9 points

0 comments · 1 min read · LW link

(www.henryjosephson.com)

The old memories tree

Yair Halberstadt

Mar 5, 2025, 7:03 PM

7 points

1 comment · 1 min read · LW link

Reply to Vitalik on d/acc

samuelshadrach

Mar 5, 2025, 6:55 PM

8 points

0 comments · 3 min read · LW link

(samuelshadrach.com)

A Bear Case: My Predictions Regarding AI Progress

Thane Ruthenis

Mar 5, 2025, 4:41 PM

362 points

157 comments · 9 min read · LW link

On the Rationality of Deterring ASI

Dan H

Mar 5, 2025, 4:11 PM

166 points

34 comments · 4 min read · LW link

(nationalsecurity.ai)

On OpenAI’s Safety and Alignment Philosophy

Zvi

Mar 5, 2025, 2:00 PM

58 points

5 comments · 17 min read · LW link

(thezvi.wordpress.com)

The Alignment Imperative: Act Now or Lose Everything

racinkc1

Mar 5, 2025, 5:49 AM

−14 points

0 comments · 1 min read · LW link

Contra Dance Pay and Inflation

jefftk

Mar 5, 2025, 2:40 AM

12 points

0 comments · 2 min read · LW link

(www.jefftk.com)

NYT Op-Ed The Government Knows A.G.I. Is Coming

worse

Mar 5, 2025, 1:53 AM

11 points

12 comments · 2 min read · LW link

(www.nytimes.com)

Could this be an unusually good time to Earn To Give?

TomGardiner

Mar 4, 2025, 9:51 PM

−1 points

0 comments · 3 min read · LW link

(forum.effectivealtruism.org)

What is the best / most proper definition of “Feeling the AGI” there is?

Annapurna

Mar 4, 2025, 8:13 PM

8 points

5 comments · 1 min read · LW link

Energy Markets Temporal Arbitrage with Batteries

NickyP

Mar 4, 2025, 5:37 PM

21 points

3 comments · 16 min read · LW link

Distillation of Meta’s Large Concept Models Paper

NickyP

Mar 4, 2025, 5:33 PM

19 points

3 comments · 4 min read · LW link

Top AI safety newsletters, books, podcasts, etc – new AISafety.com resource

Bryce Robertson and Søren Elverlin

Mar 4, 2025, 5:01 PM

32 points

2 comments · 1 min read · LW link

2028 Should Not Be AI Safety’s First Foray Into Politics

Jesse Richardson

Mar 4, 2025, 4:46 PM

5 points

0 comments · 2 min read · LW link

[Question] How Much Are LLMs Actually Boosting Real-World Programmer Productivity?

Thane Ruthenis

Mar 4, 2025, 4:23 PM

137 points

52 comments · 3 min read · LW link

Validating against a misalignment detector is very different to training against one

mattmacdermott

Mar 4, 2025, 3:41 PM

33 points

4 comments · 4 min read · LW link

For scheming, we should first focus on detection and then on prevention

Marius Hobbhahn

Mar 4, 2025, 3:22 PM

47 points

7 comments · 5 min read · LW link

Progress links and short notes, 2025-03-03

jasoncrawford

Mar 4, 2025, 3:20 PM

8 points

0 comments · 6 min read · LW link

(newsletter.rootsofprogress.org)

Formation Research: Organisation Overview

alamerton

Mar 4, 2025, 3:03 PM

5 points

0 comments · 11 min read · LW link

On Writing #1

Zvi

Mar 4, 2025, 1:30 PM

37 points

2 comments · 15 min read · LW link

(thezvi.wordpress.com)

The Semi-Rational Militar Firefighter

P. João

Mar 4, 2025, 12:23 PM

72 points

10 comments · 2 min read · LW link

Observations About LLM Inference Pricing

Aaron_Scher

Mar 4, 2025, 3:03 AM

28 points

2 comments · 9 min read · LW link

(techgov.intelligence.org)

[Question] How much should I worry about the Atlanta Fed’s GDP estimates?

Brendan Long

Mar 4, 2025, 2:03 AM

16 points

2 comments · 1 min read · LW link

[Question] shouldn’t we try to get media attention?

KvmanThinking

Mar 4, 2025, 1:39 AM

6 points

1 comment · 1 min read · LW link

The Milton Friedman Model of Policy Change

JohnofCharleston

Mar 4, 2025, 12:38 AM

136 points

17 comments · 4 min read · LW link

The Compliment Sandwich 🥪 aka: How to criticize a normie without making them upset.

keltan

Mar 3, 2025, 11:15 PM

13 points

10 comments · 1 min read · LW link

AI Safety at the Frontier: Paper Highlights, February ’25

gasteigerjo

Mar 3, 2025, 10:09 PM

7 points

0 comments · 7 min read · LW link

(aisafetyfrontier.substack.com)

What goals will AIs have? A list of hypotheses

Daniel Kokotajlo

Mar 3, 2025, 8:08 PM

87 points

19 comments · 18 min read · LW link

Takeaways From Our Recent Work on SAE Probing

Josh Engels, Subhash Kantamneni, Senthooran Rajamanoharan and Neel Nanda

Mar 3, 2025, 7:50 PM

30 points

0 comments · 5 min read · LW link

Why People Commit White Collar Fraud (Ozy linkpost)

sapphire

Mar 3, 2025, 7:33 PM

22 points

1 comment · 1 min read · LW link

(thingofthings.substack.com)

[Question] Ask Me Anything—Samuel

samuelshadrach

Mar 3, 2025, 7:24 PM

0 points

0 comments · 1 min read · LW link

Expanding HarmBench: Investigating Gaps & Extending Adversarial LLM Testing

racinkc1

Mar 3, 2025, 7:23 PM

1 point

0 comments · 1 min read · LW link

Could Advanced AI Accelerate the Pace of AI Progress? Interviews with AI Researchers

jleibowich, Nikola Jurkovic and Tom Davidson

Mar 3, 2025, 7:05 PM

43 points

1 comment · 1 min read · LW link

(papers.ssrn.com)

Middle School Choice

jefftk

Mar 3, 2025, 4:10 PM

27 points

10 comments · 4 min read · LW link

(www.jefftk.com)

On GPT-4.5

Zvi

Mar 3, 2025, 1:40 PM

44 points

12 comments · 22 min read · LW link

(thezvi.wordpress.com)

Coalescence—Determinism In Ways We Care About

vitaliya

Mar 3, 2025, 1:20 PM

12 points

0 comments · 11 min read · LW link

Methods for strong human germline engineering

TsviBT

Mar 3, 2025, 8:13 AM

149 points

28 comments · 108 min read · LW link

[Question] Examples of self-fulfilling prophecies in AI alignment?

Chris Lakin

Mar 3, 2025, 2:45 AM

22 points

6 comments · 1 min read · LW link

[Question] Request for Comments on AI-related Prediction Market Ideas

PeterMcCluskey

Mar 2, 2025, 8:52 PM

17 points

1 comment · 3 min read · LW link

Statistical Challenges with Making Super IQ babies

Jan Christian Refsgaard

Mar 2, 2025, 8:26 PM

154 points

26 comments · 9 min read · LW link

Cautions about LLMs in Human Cognitive Loops

Alice Blair

Mar 2, 2025, 7:53 PM

39 points

11 comments · 7 min read · LW link

Self-fulfilling misalignment data might be poisoning our AI models

TurnTrout

Mar 2, 2025, 7:51 PM

153 points

28 comments · 1 min read · LW link

(turntrout.com)

Spencer Greenberg hiring a personal/professional/research remote assistant for 5-10 hours per week

spencerg

Mar 2, 2025, 6:01 PM

13 points

0 comments · LW link

[Question] Will LLM agents become the first takeover-capable AGIs?

Seth Herd

Mar 2, 2025, 5:15 PM

36 points

10 comments · 1 min read · LW link

Not-yet-falsifiable beliefs?

Benjamin Hendricks

Mar 2, 2025, 2:11 PM

6 points

4 comments · 1 min read · LW link

Saving Zest

jefftk

Mar 2, 2025, 12:00 PM

24 points

1 comment · 1 min read · LW link

(www.jefftk.com)