[Question] Is AGI actually that likely to take off given the world energy consumption?

StanislavKrym 27 Mar 2025 23:13 UTC

2 points

2 comments 1 min read LW link

[Linkpost] The value of initiating a pursuit in temporal decision-making

Gunnar_Zarncke 27 Mar 2025 21:47 UTC

13 points

0 comments 2 min read LW link

Alignment through atomic agents

micseydel 27 Mar 2025 18:43 UTC

−1 points

0 comments 1 min read LW link

Machines of Stolen Grace

Riley Tavassoli 27 Mar 2025 18:15 UTC

2 points

0 comments 5 min read LW link

On the plausibility of a “messy” rogue AI committing human-like evil

Jacob Griffith 27 Mar 2025 18:06 UTC

8 points

0 comments 7 min read LW link

AI Moral Alignment: The Most Important Goal of Our Generation

Ronen Bar 27 Mar 2025 18:04 UTC

3 points

0 comments 8 min read LW link

(forum.effectivealtruism.org)

Tracing the Thoughts of a Large Language Model

Adam Jermyn 27 Mar 2025 17:20 UTC

308 points

23 comments 10 min read LW link

(www.anthropic.com)

Computational Superposition in a Toy Model of the U-AND Problem

Adam Newgas 27 Mar 2025 16:56 UTC

18 points

2 comments 11 min read LW link

Mistral Large 2 (123B) seems to exhibit alignment faking

Marc Carauleanu, Diogo de Lucena, Gunnar_Zarncke, Cameron Berg, Kvee, Mike Vaiana and Trent Hodgeson

27 Mar 2025 15:39 UTC

82 points

4 comments 13 min read LW link

AIS Netherlands is looking for a Founding Executive Director (EOI form)

gergogaspar, Jelle Donders, Natalia Matuszczyk and ENAIS

27 Mar 2025 15:30 UTC

15 points

0 comments 4 min read LW link

AI #109: Google Fails Marketing Forever

Zvi 27 Mar 2025 14:50 UTC

42 points

12 comments 35 min read LW link

(thezvi.wordpress.com)

What is scaffolding?

Vishakha and Algon

27 Mar 2025 9:06 UTC

10 points

0 comments 2 min read LW link

(aisafety.info)

Workflow vs interface vs implementation

Sniffnoy 27 Mar 2025 7:38 UTC

12 points

0 comments 1 min read LW link

Quick thoughts on the difficulty of widely conveying a non-stereotyped position

Sniffnoy 27 Mar 2025 7:30 UTC

12 points

0 comments 5 min read LW link

Doing principle-of-charity better

Sniffnoy 27 Mar 2025 5:19 UTC

24 points

1 comment 3 min read LW link

X as phenomenon vs as policy, Goodhart, and the AB problem

Sniffnoy 27 Mar 2025 4:32 UTC

15 points

0 comments 2 min read LW link

Consequentialism is for making decisions

Sniffnoy 27 Mar 2025 4:00 UTC

21 points

9 comments 1 min read LW link

Third-wave AI safety needs sociopolitical thinking

Richard_Ngo 27 Mar 2025 0:55 UTC

100 points

23 comments 26 min read LW link

Knowledge, Reasoning, and Superintelligence

owencb 26 Mar 2025 23:28 UTC

21 points

1 comment 7 min read LW link

(strangecities.substack.com)

Many Common Problems are NP-Hard, and Why that Matters for AI

Andrew Keenan Richardson 26 Mar 2025 21:51 UTC

5 points

9 comments 5 min read LW link

Fun With GPT-4o Image Generation

Zvi 26 Mar 2025 19:50 UTC

76 points

3 comments 15 min read LW link

(thezvi.wordpress.com)

I’m hiring a Research Assistant for a nonfiction book on AI!

garrison 26 Mar 2025 19:46 UTC

17 points

0 comments 1 min read LW link

(garrisonlovely.substack.com)

Automated Researchers Can Subtly Sandbag

gasteigerjo, Akbir Khan, Sam Bowman, Vlad Mikulik, Ethan Perez and Fabien Roger

26 Mar 2025 19:13 UTC

44 points

0 comments 4 min read LW link

(alignment.anthropic.com)

Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)

lewis smith, Senthooran Rajamanoharan, Arthur Conmy, CallumMcDougall, Tom Lieberum, János Kramár, Rohin Shah and Neel Nanda

26 Mar 2025 19:07 UTC

117 points

15 comments 29 min read LW link

(deepmindsafetyresearch.medium.com)

AI companies should be safety-testing the most capable versions of their models

sjadler 26 Mar 2025 19:03 UTC

17 points

6 comments 1 min read LW link

(stevenadler.substack.com)

Conceptual Rounding Errors

Jan_Kulveit 26 Mar 2025 19:00 UTC

156 points

15 comments 3 min read LW link

(boundedlyrational.substack.com)

Personal Agents: The First Step in Emergent AI Society

Andrey Seryakov 26 Mar 2025 18:55 UTC

3 points

0 comments 2 min read LW link

Will AI R&D Automation Cause a Software Intelligence Explosion?

Tom Davidson and Daniel_Eth

26 Mar 2025 18:12 UTC

19 points

3 comments 2 min read LW link

(www.forethought.org)

Why Does Unemployment Happen?

Nicholas D. 26 Mar 2025 18:02 UTC

−2 points

2 comments 1 min read LW link

(nicholasdecker.substack.com)

Finding Emergent Misalignment

Jan Betley 26 Mar 2025 17:33 UTC

39 points

0 comments 3 min read LW link

Center on Long-Term Risk: Summer Research Fellowship 2025 - Apply Now

Tristan Cook 26 Mar 2025 17:29 UTC

33 points

0 comments 1 min read LW link

(longtermrisk.org)

Eukaryote Skips Town—Why I’m leaving DC

eukaryote 26 Mar 2025 17:16 UTC

82 points

1 comment 6 min read LW link

(eukaryotewritesblog.com)

Apply to become a Futurekind AI Facilitator or Mentor (deadline: April 10)

superbeneficiary 26 Mar 2025 15:47 UTC

4 points

0 comments 1 min read LW link

(forum.effectivealtruism.org)

Language and My Frustration Continue in Our RSI

TristanTrim 26 Mar 2025 14:13 UTC

2 points

1 comment 7 min read LW link

[Question] Would it be effective to learn a language to improve cognition?

Hruss 26 Mar 2025 10:17 UTC

9 points

7 comments 1 min read LW link

New AI safety treaty paper out!

otto.barten 26 Mar 2025 9:29 UTC

19 points

2 comments 4 min read LW link

Map of all 40 copyright suits v. AI in U.S.

Remmelt 26 Mar 2025 7:57 UTC

40 points

3 comments 1 min read LW link

(chatgptiseatingtheworld.com)

Gemini 2.5 Pro released

Yair Halberstadt 26 Mar 2025 7:56 UTC

14 points

0 comments 1 min read LW link

(blog.google)

Probability Theory Fundamentals 102: Territory that Probability is in the Map of

Ape in the coat 26 Mar 2025 6:40 UTC

10 points

17 comments 9 min read LW link

Avoid the Counterargument Collapse

marknm 26 Mar 2025 3:19 UTC

34 points

4 comments 4 min read LW link

Luxemburg – ACX Meetups Everywhere Spring 2025

Roland 26 Mar 2025 0:12 UTC

1 point

0 comments 1 min read LW link

Madrid – ACX Meetups Everywhere Spring 2025

Pablo Villalobos 26 Mar 2025 0:11 UTC

2 points

1 comment 1 min read LW link

Groningen – ACX Meetups Everywhere Spring 2025

Herman van der Veer 26 Mar 2025 0:11 UTC

1 point

0 comments 1 min read LW link

Pittsburgh – ACX Meetups Everywhere Spring 2025

MrJones 26 Mar 2025 0:11 UTC

1 point

0 comments 1 min read LW link

Chico – ACX Meetups Everywhere Spring 2025

Ryan_Axtell 26 Mar 2025 0:11 UTC

1 point

0 comments 1 min read LW link

Munich – ACX Meetups Everywhere Spring 2025

Organizer 26 Mar 2025 0:11 UTC

1 point

0 comments 1 min read LW link

Buenos Aires – ACX Meetups Everywhere Spring 2025

eitan sprejer 26 Mar 2025 0:11 UTC

3 points

0 comments 1 min read LW link

Shanghai – ACX Meetups Everywhere Spring 2025

David Jiang 26 Mar 2025 0:11 UTC

3 points

2 comments 1 min read LW link

Oklahoma City – ACX Meetups Everywhere Spring 2025

Bean 26 Mar 2025 0:11 UTC

1 point

0 comments 1 min read LW link

Cape Town – ACX Meetups Everywhere Spring 2025

Leo_Hyams 26 Mar 2025 0:11 UTC

1 point

0 comments 1 min read LW link