Q&A on Proposed SB 1047

Zvi

May 2, 2024, 3:10 PM

74 points

8 comments · 44 min read · LW link

(thezvi.wordpress.com)

On Dwarkesh’s Podcast with OpenAI’s John Schulman

Zvi

May 21, 2024, 5:30 PM

73 points

4 comments · 20 min read · LW link

(thezvi.wordpress.com)

AXRP Episode 31 - Singular Learning Theory with Daniel Murfet

DanielFilan

May 7, 2024, 3:50 AM

72 points

4 comments · 71 min read · LW link

When Are Circular Definitions A Problem?

johnswentworth

May 28, 2024, 8:00 PM

68 points

15 comments · 3 min read · LW link

Introducing AI-Powered Audiobooks of Rational Fiction Classics

Askwho

May 4, 2024, 5:32 PM

67 points

14 comments · 1 min read · LW link

minutes from a human-alignment meeting

bhauth

May 24, 2024, 5:01 AM

67 points

4 comments · 2 min read · LW link

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Joar Skalse

May 17, 2024, 7:13 PM

67 points

10 comments · 2 min read · LW link

How to be an amateur polyglot

arisAlexis

May 8, 2024, 3:08 PM

66 points

16 comments · 7 min read · LW link

Do Not Mess With Scarlett Johansson

Zvi

May 22, 2024, 3:10 PM

65 points

7 comments · 16 min read · LW link

(thezvi.wordpress.com)

What mistakes has the AI safety movement made?

EuanMcLean

May 23, 2024, 11:19 AM

64 points

29 comments · 12 min read · LW link

DeepMind: Frontier Safety Framework

Zach Stein-Perlman

May 17, 2024, 5:30 PM

64 points

0 comments · 3 min read · LW link

(deepmind.google)

The Problem With the Word ‘Alignment’

peligrietzer and particlemania

May 21, 2024, 3:48 AM

63 points

8 comments · 6 min read · LW link

Now THIS is forecasting: understanding Epoch’s Direct Approach

Elliot Mckernon and Zershaaneh Qureshi

May 4, 2024, 12:06 PM

63 points

4 comments · 19 min read · LW link

Catastrophic Goodhart in RL with KL penalty

Thomas Kwa and Adrià Garriga-alonso

May 15, 2024, 12:58 AM

62 points

10 comments · 7 min read · LW link

A civilization ran by amateurs

Olli Järviniemi

May 30, 2024, 5:57 PM

61 points

8 comments · 6 min read · LW link

Thoughts on SB-1047

ryan_greenblatt

May 29, 2024, 11:26 PM

60 points

1 comment · 11 min read · LW link

How do open AI models affect incentive to race?

jessicata

May 7, 2024, 12:33 AM

60 points

13 comments · 3 min read · LW link

(unstablerontology.substack.com)

some thoughts on LessOnline

Raemon

May 8, 2024, 11:17 PM

58 points

5 comments · 5 min read · LW link

[Question] Shane Legg’s necessary properties for every AGI Safety plan

jacquesthibs

May 1, 2024, 5:15 PM

58 points

12 comments · 1 min read · LW link

Apply to ESPR & PAIR, Rationality and AI Camps for Ages 16-21

Anna Gajdova

May 3, 2024, 12:36 PM

58 points

5 comments · 1 min read · LW link

Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning

Dan Braun, Jordan Taylor, Nicholas Goldowsky-Dill and Lee Sharkey

May 17, 2024, 4:25 PM

57 points

20 comments · 4 min read · LW link

(arxiv.org)

Why Care About Natural Latents?

johnswentworth and David Lorell

May 9, 2024, 11:14 PM

56 points

3 comments · 5 min read · LW link

Questions are usually too cheap

Nathan Young

May 11, 2024, 1:00 PM

55 points

19 comments · 6 min read · LW link

(nathanpmyoung.substack.com)

Building intuition with spaced repetition systems

Jacob G-W

May 12, 2024, 3:49 PM

55 points

8 comments · 4 min read · LW link

(jacobgw.com)

OpenAI releases GPT-4o, natively interfacing with text, voice and vision

Martín Soto

May 13, 2024, 6:50 PM

54 points

23 comments · 1 min read · LW link

(openai.com)

“If we go extinct due to misaligned AI, at least nature will continue, right? … right?”

plex

May 18, 2024, 2:09 PM

54 points

23 comments · 2 min read · LW link

(aisafety.info)

The case for stopping AI safety research

catubc

May 23, 2024, 3:55 PM

53 points

38 comments · 1 min read · LW link

S-Risks: Fates Worse Than Extinction

aggliu and Writer

May 4, 2024, 3:30 PM

53 points

2 comments · 6 min read · LW link

(youtu.be)

Can we build a better Public Doublecrux?

Raemon

May 11, 2024, 7:21 PM

52 points

6 comments · 4 min read · LW link

shortest goddamn bayes guide ever

lemonhope

May 10, 2024, 7:06 AM

52 points

8 comments · 1 min read · LW link

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Gunnar_Zarncke

May 16, 2024, 1:09 PM

51 points

20 comments · 1 min read · LW link

(arxiv.org)

Applying refusal-vector ablation to a Llama 3 70B agent

Simon Lermen

May 11, 2024, 12:08 AM

51 points

14 comments · 7 min read · LW link

Observations on Teaching for Four Weeks

ClareChiaraVincent

May 6, 2024, 4:55 PM

51 points

14 comments · 3 min read · LW link

Why you should learn a musical instrument

cata

May 15, 2024, 8:36 PM

50 points

23 comments · 3 min read · LW link

Finding Backward Chaining Circuits in Transformers Trained on Tree Search

abhayesian, Jannik Brinkmann and Victor Levoso

May 28, 2024, 5:29 AM

50 points

1 comment · 9 min read · LW link

(arxiv.org)

Paper in Science: Managing extreme AI risks amid rapid progress

JanB

May 23, 2024, 8:40 AM

50 points

2 comments · 1 min read · LW link

Announcing Human-aligned AI Summer School

Jan_Kulveit and Tomáš Gavenčiak

May 22, 2024, 8:55 AM

50 points

0 comments · 1 min read · LW link

(humanaligned.ai)

The Dunning-Kruger of disproving Dunning-Kruger

kromem

May 16, 2024, 10:11 AM

50 points

0 comments · 5 min read · LW link

Anthropic announces interpretability advances. How much does this advance alignment?

Seth Herd

May 21, 2024, 10:30 PM

49 points

4 comments · 3 min read · LW link

(www.anthropic.com)

Designing for a single purpose

Itay Dreyfus

May 7, 2024, 2:11 PM

48 points

12 comments · 10 min read · LW link

(productidentity.co)

How to do conceptual research: Case study interview with Caspar Oesterheld

Chi Nguyen

May 14, 2024, 3:09 PM

48 points

5 comments · 9 min read · LW link

Rapid capability gain around supergenius level seems probable even without intelligence needing to improve intelligence

Towards_Keeperhood and Davanchama

May 6, 2024, 5:09 PM

48 points

17 comments · 4 min read · LW link

Mechanistic Interpretability Workshop Happening at ICML 2024!

Neel Nanda, LawrenceC and Fazl

May 3, 2024, 1:18 AM

48 points

6 comments · 1 min read · LW link

Some Experiments I’d Like Someone To Try With An Amnestic

johnswentworth

May 4, 2024, 10:04 PM

47 points

33 comments · 3 min read · LW link

Big Picture AI Safety: Introduction

EuanMcLean

May 23, 2024, 11:15 AM

46 points

7 comments · 5 min read · LW link

New intro textbook on AIXI

Alex_Altair

May 11, 2024, 6:18 PM

46 points

8 comments · 1 min read · LW link

Book review: Everything Is Predictable

PeterMcCluskey

May 27, 2024, 3:33 AM

46 points

1 comment · 2 min read · LW link

(bayesianinvestor.com)

Dating Roundup #3: Third Time’s the Charm

Zvi

May 8, 2024, 1:30 PM

45 points

28 comments · 39 min read · LW link

(thezvi.wordpress.com)

Monthly Roundup #18: May 2024

Zvi

May 13, 2024, 12:30 PM

45 points

10 comments · 48 min read · LW link

(thezvi.wordpress.com)

Higher-Order Forecasts

ozziegooen

May 22, 2024, 9:49 PM

45 points

1 comment · LW link
