Q&A on Proposed SB 1047

Zvi May 2, 2024, 3:10 PM
74 points
8 comments 44 min read LW link
(thezvi.wordpress.com)

On Dwarkesh’s Podcast with OpenAI’s John Schulman

Zvi May 21, 2024, 5:30 PM
73 points
4 comments 20 min read LW link
(thezvi.wordpress.com)

AXRP Episode 31 - Singular Learning Theory with Daniel Murfet

DanielFilan May 7, 2024, 3:50 AM
72 points
4 comments 71 min read LW link

When Are Circular Definitions A Problem?

johnswentworth May 28, 2024, 8:00 PM
68 points
15 comments 3 min read LW link

Introducing AI-Powered Audiobooks of Rational Fiction Classics

Askwho May 4, 2024, 5:32 PM
67 points
14 comments 1 min read LW link

minutes from a human-alignment meeting

bhauth May 24, 2024, 5:01 AM
67 points
4 comments 2 min read LW link

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Joar Skalse May 17, 2024, 7:13 PM
67 points
10 comments 2 min read LW link

How to be an amateur polyglot

arisAlexis May 8, 2024, 3:08 PM
66 points
16 comments 7 min read LW link

Do Not Mess With Scarlett Johansson

Zvi May 22, 2024, 3:10 PM
65 points
7 comments 16 min read LW link
(thezvi.wordpress.com)

What mistakes has the AI safety movement made?

EuanMcLean May 23, 2024, 11:19 AM
64 points
29 comments 12 min read LW link

DeepMind: Frontier Safety Framework

Zach Stein-Perlman May 17, 2024, 5:30 PM
64 points
0 comments 3 min read LW link
(deepmind.google)

The Problem With the Word ‘Alignment’

May 21, 2024, 3:48 AM
63 points
8 comments 6 min read LW link

Now THIS is forecasting: understanding Epoch’s Direct Approach

May 4, 2024, 12:06 PM
63 points
4 comments 19 min read LW link

Catastrophic Goodhart in RL with KL penalty

May 15, 2024, 12:58 AM
62 points
10 comments 7 min read LW link

A civilization ran by amateurs

Olli Järviniemi May 30, 2024, 5:57 PM
61 points
8 comments 6 min read LW link

Thoughts on SB-1047

ryan_greenblatt May 29, 2024, 11:26 PM
60 points
1 comment 11 min read LW link

How do open AI models affect incentive to race?

jessicata May 7, 2024, 12:33 AM
60 points
13 comments 3 min read LW link
(unstablerontology.substack.com)

some thoughts on LessOnline

Raemon May 8, 2024, 11:17 PM
58 points
5 comments 5 min read LW link

[Question] Shane Legg’s necessary properties for every AGI Safety plan

jacquesthibs May 1, 2024, 5:15 PM
58 points
12 comments 1 min read LW link

Apply to ESPR & PAIR, Rationality and AI Camps for Ages 16-21

Anna Gajdova May 3, 2024, 12:36 PM
58 points
5 comments 1 min read LW link

Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning

May 17, 2024, 4:25 PM
57 points
20 comments 4 min read LW link
(arxiv.org)

Why Care About Natural Latents?

May 9, 2024, 11:14 PM
56 points
3 comments 5 min read LW link

Questions are usually too cheap

Nathan Young May 11, 2024, 1:00 PM
55 points
19 comments 6 min read LW link
(nathanpmyoung.substack.com)

Building intuition with spaced repetition systems

Jacob G-W May 12, 2024, 3:49 PM
55 points
8 comments 4 min read LW link
(jacobgw.com)

OpenAI releases GPT-4o, natively interfacing with text, voice and vision

Martín Soto May 13, 2024, 6:50 PM
54 points
23 comments 1 min read LW link
(openai.com)

“If we go extinct due to misaligned AI, at least nature will continue, right? … right?”

plex May 18, 2024, 2:09 PM
54 points
23 comments 2 min read LW link
(aisafety.info)

The case for stopping AI safety research

catubc May 23, 2024, 3:55 PM
53 points
38 comments 1 min read LW link

S-Risks: Fates Worse Than Extinction

May 4, 2024, 3:30 PM
53 points
2 comments 6 min read LW link
(youtu.be)

Can we build a better Public Doublecrux?

Raemon May 11, 2024, 7:21 PM
52 points
6 comments 4 min read LW link

shortest goddamn bayes guide ever

lemonhope May 10, 2024, 7:06 AM
52 points
8 comments 1 min read LW link

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Gunnar_Zarncke May 16, 2024, 1:09 PM
51 points
20 comments 1 min read LW link
(arxiv.org)

Applying refusal-vector ablation to a Llama 3 70B agent

Simon Lermen May 11, 2024, 12:08 AM
51 points
14 comments 7 min read LW link

Observations on Teaching for Four Weeks

ClareChiaraVincent May 6, 2024, 4:55 PM
51 points
14 comments 3 min read LW link

Why you should learn a musical instrument

cata May 15, 2024, 8:36 PM
50 points
23 comments 3 min read LW link

Finding Backward Chaining Circuits in Transformers Trained on Tree Search

May 28, 2024, 5:29 AM
50 points
1 comment 9 min read LW link
(arxiv.org)

Paper in Science: Managing extreme AI risks amid rapid progress

JanB May 23, 2024, 8:40 AM
50 points
2 comments 1 min read LW link

Announcing Human-aligned AI Summer School

May 22, 2024, 8:55 AM
50 points
0 comments 1 min read LW link
(humanaligned.ai)

The Dunning-Kruger of disproving Dunning-Kruger

kromem May 16, 2024, 10:11 AM
50 points
0 comments 5 min read LW link

Anthropic announces interpretability advances. How much does this advance alignment?

Seth Herd May 21, 2024, 10:30 PM
49 points
4 comments 3 min read LW link
(www.anthropic.com)

Designing for a single purpose

Itay Dreyfus May 7, 2024, 2:11 PM
48 points
12 comments 10 min read LW link
(productidentity.co)

How to do conceptual research: Case study interview with Caspar Oesterheld

Chi Nguyen May 14, 2024, 3:09 PM
48 points
5 comments 9 min read LW link

Rapid capability gain around supergenius level seems probable even without intelligence needing to improve intelligence

May 6, 2024, 5:09 PM
48 points
17 comments 4 min read LW link

Mechanistic Interpretability Workshop Happening at ICML 2024!

May 3, 2024, 1:18 AM
48 points
6 comments 1 min read LW link

Some Experiments I’d Like Someone To Try With An Amnestic

johnswentworth May 4, 2024, 10:04 PM
47 points
33 comments 3 min read LW link

Big Picture AI Safety: Introduction

EuanMcLean May 23, 2024, 11:15 AM
46 points
7 comments 5 min read LW link

New intro textbook on AIXI

Alex_Altair May 11, 2024, 6:18 PM
46 points
8 comments 1 min read LW link

Book review: Everything Is Predictable

PeterMcCluskey May 27, 2024, 3:33 AM
46 points
1 comment 2 min read LW link
(bayesianinvestor.com)

Dating Roundup #3: Third Time’s the Charm

Zvi May 8, 2024, 1:30 PM
45 points
28 comments 39 min read LW link
(thezvi.wordpress.com)

Monthly Roundup #18: May 2024

Zvi May 13, 2024, 12:30 PM
45 points
10 comments 48 min read LW link
(thezvi.wordpress.com)

Higher-Order Forecasts

ozziegooen May 22, 2024, 9:49 PM
45 points
1 comment LW link