Seeking advice on careers in AI Safety

nem

27 Apr 2025 23:59 UTC

8 points

2 comments, 1 min read, LW link

Thin Alignment Can’t Solve Thick Problems

Daan Henselmans

27 Apr 2025 22:42 UTC

11 points

2 comments, 9 min read, LW link

The Way You Go Depends A Good Deal On Where You Want To Get: FEP minimizes surprise about actions using preferences about the future as evidence

Christopher King

27 Apr 2025 21:55 UTC

10 points

5 comments, 5 min read, LW link

How people use LLMs

Elizabeth

27 Apr 2025 21:48 UTC

83 points

6 comments, 1 min read, LW link

(www.gleech.org)

Luna Lovegood and the Chamber of Secrets, Part 6

Kongo Landwalker and lsusr

27 Apr 2025 20:26 UTC

3 points

0 comments, 2 min read, LW link

Our Reality: A Simulation Run by a Paperclip Maximizer

James_Miller and avturchin

27 Apr 2025 16:17 UTC

34 points

70 comments, 5 min read, LW link

Questions for old LW members: how have discussions about AI changed compared to 10+ years ago?

Expertium

27 Apr 2025 16:11 UTC

11 points

12 comments, 1 min read, LW link

The case for multi-decade AI timelines [Linkpost]

Noosphere89

27 Apr 2025 15:31 UTC

58 points

22 comments, 1 min read, LW link

(epoch.ai)

My Research Process: Key Mindsets—Truth-Seeking, Prioritisation, Moving Fast

Neel Nanda

27 Apr 2025 14:38 UTC

50 points

0 comments, 11 min read, LW link

I doubt model collapse will happen

Hruss

27 Apr 2025 14:08 UTC

5 points

0 comments, 1 min read, LW link

Propaganda-Bot: A Sketch of a Possible RSI

TristanTrim

27 Apr 2025 12:15 UTC

6 points

0 comments, 3 min read, LW link

After Internet Dependency

Vorak

27 Apr 2025 8:18 UTC

14 points

2 comments, 1 min read, LW link

Emergence of superintelligence from AI hiveminds: how to make it human-friendly?

Mitchell_Porter

27 Apr 2025 4:51 UTC

11 points

0 comments, 2 min read, LW link

“The Urgency of Interpretability” (Dario Amodei)

RobertM

27 Apr 2025 4:31 UTC

31 points

23 comments, 3 min read, LW link

(www.darioamodei.com)

AI Self Portraits Aren’t Accurate

JustisMills

27 Apr 2025 3:27 UTC

59 points

10 comments, 5 min read, LW link

MiCARwave

jefftk

27 Apr 2025 2:30 UTC

13 points

0 comments, 1 min read, LW link

(www.jefftk.com)

Open Source LLM Pokémon Scaffold

Julian Bradshaw

27 Apr 2025 0:57 UTC

31 points

0 comments, 1 min read, LW link

(github.com)

What are important UI-shaped problems that Lightcone could tackle?

Raemon

27 Apr 2025 0:02 UTC

59 points

22 comments, 2 min read, LW link

Kodo and Din

Screwtape

26 Apr 2025 18:54 UTC

8 points

10 comments, 4 min read, LW link

We should try to automate AI safety work asap

Marius Hobbhahn

26 Apr 2025 16:35 UTC

116 points

10 comments, 15 min read, LW link

Research Taxonomy Generator and Visualizer

Myles H

26 Apr 2025 16:14 UTC

6 points

0 comments, 6 min read, LW link

AI Safety & Entrepreneurship v1.0

Chris_Leong

26 Apr 2025 14:37 UTC

16 points

0 comments, 2 min read, LW link

Reconsidering Money: The Case for Freigeld in the Digital Age and a Networked Future

henophilia

26 Apr 2025 12:54 UTC

−22 points

0 comments, 5 min read, LW link

(blog.hermesloom.org)

How I Think About My Research Process: Explore, Understand, Distill

Neel Nanda

26 Apr 2025 10:31 UTC

63 points

4 comments, 8 min read, LW link

Don’t you mean “the most conditionally forbidden technique?”

Knight Lee

26 Apr 2025 3:45 UTC

19 points

0 comments, 3 min read, LW link

Land with no aunties

thellimist

26 Apr 2025 1:20 UTC

6 points

0 comments, 1 min read, LW link

(kanyilmaz.me)

AI 2027 Thoughts

PeterMcCluskey

26 Apr 2025 0:00 UTC

30 points

2 comments, 6 min read, LW link

(bayesianinvestor.com)

Who’s Working On It? AI-Controlled Experiments

sarahconstantin

25 Apr 2025 21:40 UTC

19 points

0 comments, 1 min read, LW link

(sarahconstantin.substack.com)

[Linkpost] AI War seems unlikely to prevent AI Doom

thenoviceoof

25 Apr 2025 20:44 UTC

7 points

6 comments, 2 min read, LW link

(thenoviceoof.com)

Worries About AI Are Usually Complements Not Substitutes

Zvi

25 Apr 2025 20:00 UTC

45 points

3 comments, 4 min read, LW link

(thezvi.wordpress.com)

Why would AI companies use human-level AI to do alignment research?

MichaelDickens

25 Apr 2025 19:12 UTC

29 points

8 comments, 2 min read, LW link

How Democratic Is Effective Altruism — Really?

B Jacobs

25 Apr 2025 16:02 UTC

0 points

2 comments, 2 min read, LW link

(bobjacobs.substack.com)

Will Programmer Compensation Decouple from Productivity?

Gordon Seidoh Worley

25 Apr 2025 15:32 UTC

15 points

7 comments, 2 min read, LW link

(uncertainupdates.substack.com)

Zstd Window Size

jefftk

25 Apr 2025 14:40 UTC

12 points

1 comment, 2 min read, LW link

(www.jefftk.com)

List of petitions against OpenAI’s for-profit move

Remmelt

25 Apr 2025 10:03 UTC

5 points

1 comment, 1 min read, LW link

A review of “Why Did Environmentalism Become Partisan?”

David Scott Krueger

25 Apr 2025 5:12 UTC

24 points

0 comments, 4 min read, LW link

LLM Pareto Frontier But Live

winstonBosan

24 Apr 2025 21:22 UTC

8 points

0 comments, 1 min read, LW link

Modifying LLM Beliefs with Synthetic Document Finetuning

RowanWang, Johannes Treutlein, Avery, Ethan Perez, Fabien Roger and Sam Marks

24 Apr 2025 21:15 UTC

77 points

12 comments, 2 min read, LW link

(alignment.anthropic.com)

This prompt (sometimes) makes ChatGPT think about terrorist organisations

jakub_krys

24 Apr 2025 21:15 UTC

30 points

13 comments, 1 min read, LW link

Severe control over AI agents as a tool for mass-surveillance

Andrey Seryakov

24 Apr 2025 20:27 UTC

2 points

0 comments, 3 min read, LW link

Token and Taboo

Guive

24 Apr 2025 20:17 UTC

31 points

6 comments, 4 min read, LW link

(guive.substack.com)

Trouble at Miningtown: Prologue

Quinn

24 Apr 2025 19:09 UTC

19 points

0 comments, 4 min read, LW link

Training-time schemers vs behavioral schemers

Alex Mallen

24 Apr 2025 19:07 UTC

58 points

9 comments, 6 min read, LW link

Reward hacking is becoming more sophisticated and deliberate in frontier LLMs

Kei Nishimura-Gasparian

24 Apr 2025 16:03 UTC

97 points

7 comments, 1 min read, LW link

Finding an Error-Detection Feature in DeepSeek-R1

keith_wynroe

24 Apr 2025 16:03 UTC

23 points

0 comments, 7 min read, LW link

Anticipating AI: Keeping Up With What We Build

Alvin Ånestrand

24 Apr 2025 15:23 UTC

2 points

0 comments, 11 min read, LW link

(forecastingaifutures.substack.com)

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Matrice Jacobine

24 Apr 2025 14:11 UTC

12 points

4 comments, 1 min read, LW link

(limit-of-rlvr.github.io)

Academia as a happy place?

jow and pchvykov

24 Apr 2025 14:03 UTC

9 points

0 comments, 19 min read, LW link

“The Era of Experience” has an unsolved technical alignment problem

Steven Byrnes

24 Apr 2025 13:57 UTC

116 points

48 comments, 23 min read, LW link

AI #113: The o3 Era Begins

Zvi

24 Apr 2025 13:40 UTC

38 points

4 comments, 62 min read, LW link

(thezvi.wordpress.com)