[Question] Is CDT with precommitment enough?

martinkunev

25 May 2024 21:40 UTC

10 points

18 comments, 1 min read

Complex systems theory in human performance. New model for conceptualizing training, adaptation and long-term development

Matěj Nekoranec

25 May 2024 20:17 UTC

1 point

0 comments, 7 min read

Blindspot in Sport’s Data-Driven Age

Matěj Nekoranec

25 May 2024 20:17 UTC

2 points

0 comments, 7 min read

LMSR subsidy parameter is the price of information

Abhimanyu Pallavi Sudhir

25 May 2024 18:05 UTC

5 points

0 comments, 1 min read

Low Fertility is a Degrowth Paradise

Maxwell Tabarrok

25 May 2024 17:35 UTC

7 points

2 comments, 3 min read

(www.maximum-progress.com)

Episode: Austin vs Linch on OpenAI

Austin Chen

25 May 2024 16:15 UTC

20 points

25 comments, 44 min read

(manifund.substack.com)

Training-time domain authorization could be helpful for safety

domenicrosati, Jan Wehner and David Atanasov

25 May 2024 15:10 UTC

15 points

4 comments, 7 min read

Level up your spreadsheeting

angelinahli

25 May 2024 14:57 UTC

47 points

11 comments, 3 min read

(docs.google.com)

“Successful language model evals” by Jason Wei

Arjun Panickssery

25 May 2024 9:34 UTC

7 points

0 comments, 1 min read

(www.jasonwei.net)

[Question] What should the norms around AI voices be?

ChristianKl

25 May 2024 6:29 UTC

17 points

6 comments, 1 min read

Secret US natsec project with intel revealed

Nathan Helm-Burger

25 May 2024 4:22 UTC

27 points

1 comment, 1 min read

(www.politico.com)

Launch & Grow Your University Group: Apply now to OSP & FSP!

agucova

25 May 2024 1:03 UTC

3 points

0 comments, 2 min read

Computational Mechanics Hackathon (June 1 & 2)

Adam Shai

24 May 2024 22:18 UTC

34 points

5 comments, 1 min read

[Question] Request for comments/opinions/ideas on safety/ethics for use of tool AI in a large healthcare system.

bokov

24 May 2024 20:53 UTC

5 points

2 comments, 1 min read

NYU Code Debates Update/Postmortem

David Rein

24 May 2024 16:08 UTC

27 points

4 comments, 10 min read

AI companies aren’t really using external evaluators

Zach Stein-Perlman

24 May 2024 16:01 UTC

242 points

15 comments, 4 min read

The Schumer Report on AI (RTFB)

Zvi

24 May 2024 15:10 UTC

34 points

3 comments, 36 min read

(thezvi.wordpress.com)

minutes from a human-alignment meeting

bhauth

24 May 2024 5:01 UTC

67 points

4 comments, 2 min read

Talent Needs of Technical AI Safety Teams

yams, Carson Jones, deus_ex_maki and Ryan Kidd

24 May 2024 0:36 UTC

129 points

65 comments, 14 min read

How to Give Coming AGI’s the Best Chance of Figuring Out Ethics for Us

sweenesm

23 May 2024 19:44 UTC

1 point

2 comments, 10 min read

Mentorship in AGI Safety (MAGIS) call for mentors

Valentin2026 and Joe Rogero

23 May 2024 18:28 UTC

32 points

3 comments, 2 min read

Quick Thoughts on Scaling Monosemanticity

Joel Burget

23 May 2024 16:22 UTC

28 points

1 comment, 4 min read

(transformer-circuits.pub)

The case for stopping AI safety research

catubc

23 May 2024 15:55 UTC

53 points

38 comments, 1 min read

[Question] SAE sparse feature graph using only residual layers

Jaehyuk Lim

23 May 2024 13:32 UTC

0 points

3 comments, 1 min read

[Question] Are most people deeply confused about “love”, or am I missing a human universal?

SpectrumDT

23 May 2024 13:22 UTC

13 points

28 comments, 3 min read

Executive Dysfunction 101

DaystarEld

23 May 2024 12:43 UTC

35 points

1 comment, 3 min read

(daystareld.com)

AI #65: I Spy With My AI

Zvi

23 May 2024 12:40 UTC

28 points

7 comments, 43 min read

(thezvi.wordpress.com)

What mistakes has the AI safety movement made?

EuanMcLean

23 May 2024 11:19 UTC

65 points

29 comments, 12 min read

What should AI safety be trying to achieve?

EuanMcLean

23 May 2024 11:17 UTC

17 points

1 comment, 13 min read

What will the first human-level AI look like, and how might things go wrong?

EuanMcLean

23 May 2024 11:17 UTC

20 points

2 comments, 15 min read

Big Picture AI Safety: Introduction

EuanMcLean

23 May 2024 11:15 UTC

46 points

7 comments, 5 min read

Paper in Science: Managing extreme AI risks amid rapid progress

JanB

23 May 2024 8:40 UTC

50 points

2 comments, 1 min read

Power Law Policy

Ben Turtel

23 May 2024 5:28 UTC

4 points

7 comments, 6 min read

(bturtel.substack.com)

Why entropy means you might not have to worry as much about superintelligent AI

Ron J

23 May 2024 3:52 UTC

−26 points

1 comment, 2 min read

Quick Thoughts on Our First Sampling Run

jefftk

23 May 2024 0:20 UTC

29 points

3 comments, 2 min read

(www.jefftk.com)

AI Safety proposal—Influencing the superintelligence explosion

Morgan

22 May 2024 23:31 UTC

0 points

2 comments, 7 min read

Implementing Asimov’s Laws of Robotics—How I imagine alignment working.

Joshua Clancy

22 May 2024 23:15 UTC

2 points

0 comments, 11 min read

Higher-Order Forecasts

ozziegooen

22 May 2024 21:49 UTC

45 points

1 comment, 3 min read

A Positive Double Standard—Self-Help Principles Work For Individuals Not Populations

James Stephen Brown

22 May 2024 21:37 UTC

8 points

3 comments, 5 min read

A Bi-Modal Brain Model

Johannes C. Mayer

22 May 2024 20:10 UTC

12 points

3 comments, 2 min read

Offering service as a sensayer for simulationist-adjacent beliefs.

mako yass

22 May 2024 18:52 UTC

22 points

0 comments, 1 min read

Do Not Mess With Scarlett Johansson

Zvi

22 May 2024 15:10 UTC

65 points

7 comments, 16 min read

(thezvi.wordpress.com)

How Multiverse Theory dissolves Quantum inexplicability

mrdlm

22 May 2024 14:55 UTC

0 points

0 comments, 1 min read

[Question] Should we be concerned about eating too much soy?

ChristianKl

22 May 2024 12:53 UTC

18 points

3 comments, 1 min read

Procedural Executive Function, Part 3

DaystarEld

22 May 2024 11:58 UTC

25 points

4 comments, 23 min read

Cicadas, Anthropic, and the bilateral alignment problem

kromem

22 May 2024 11:09 UTC

28 points

6 comments, 5 min read

Announcing Human-aligned AI Summer School

Jan_Kulveit and Tomáš Gavenčiak

22 May 2024 8:55 UTC

51 points

0 comments, 1 min read

(humanaligned.ai)

“Which chains-of-thought was that faster than?”

Emrik

22 May 2024 8:21 UTC

37 points

4 comments, 4 min read

Each Llama3-8b text uses a different “random” subspace of the activation space

tailcalled

22 May 2024 7:31 UTC

3 points

4 comments, 7 min read

ARIA’s Safeguarded AI grant program is accepting applications for Technical Area 1.1 until May 28th

Brendon_Wong

22 May 2024 6:54 UTC

11 points

0 comments, 1 min read

(www.aria.org.uk)