Ten Levels of AI Alignment Difficulty

Sammy Martin · Jul 3, 2023, 8:20 PM
138 points
24 comments · 12 min read · LW link · 1 review

Security, Cryptography AI Workshop in SF

Allison Duettmann · Jul 3, 2023, 7:01 PM
7 points
0 comments · 1 min read · LW link

[Question] What in your opinion is the biggest open problem in AI alignment?

tailcalled · Jul 3, 2023, 4:34 PM
39 points
35 comments · 1 min read · LW link

A Subtle Selection Effect in Overconfidence Studies

Kevin Dorst · Jul 3, 2023, 2:43 PM
24 points
0 comments · 6 min read · LW link
(kevindorst.substack.com)

Monthly Roundup #8: July 2023

Zvi · Jul 3, 2023, 1:20 PM
40 points
4 comments · 46 min read · LW link
(thezvi.wordpress.com)

Complex Signs Bad

Evenstar · Jul 3, 2023, 1:09 PM
5 points
2 comments · 3 min read · LW link

6/23

Celer · Jul 3, 2023, 6:30 AM
8 points
0 comments · 10 min read · LW link
(keller.substack.com)

Marginal charity

Pat Myron · Jul 3, 2023, 2:13 AM
3 points
1 comment · LW link

My Central Alignment Priority (2 July 2023)

Nicholas / Heather Kross · Jul 3, 2023, 1:46 AM
12 points
1 comment · 3 min read · LW link

My Alignment Timeline

Nicholas / Heather Kross · Jul 3, 2023, 1:04 AM
22 points
0 comments · 2 min read · LW link

Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)?

gwern · Jul 3, 2023, 12:48 AM
426 points
54 comments · 7 min read · LW link
(www.youtube.com)

Frames in context

Richard_Ngo · Jul 3, 2023, 12:38 AM
39 points
9 comments · 6 min read · LW link

Meta-rationality and frames

Richard_Ngo · Jul 3, 2023, 12:33 AM
64 points
2 comments · 5 min read · LW link

VC Theory Overview

Joar Skalse · Jul 2, 2023, 10:45 PM
12 points
2 comments · 11 min read · LW link

Sources of evidence in Alignment

Martín Soto · Jul 2, 2023, 8:38 PM
20 points
0 comments · 11 min read · LW link

Quantitative cruxes in Alignment

Martín Soto · Jul 2, 2023, 8:38 PM
19 points
0 comments · 23 min read · LW link

Going Crazy and Getting Better Again

Evenstar · Jul 2, 2023, 6:55 PM
139 points
13 comments · 7 min read · LW link · 1 review

Shall We Throw A Huge Party Before AGI Bids Us Adieu?

GeorgeMan · Jul 2, 2023, 5:56 PM
−1 points
6 comments · 1 min read · LW link

Why it’s so hard to talk about Consciousness

Rafael Harth · Jul 2, 2023, 3:56 PM
167 points
215 comments · 9 min read · LW link · 3 reviews

How Smart Are Humans?

Joar Skalse · Jul 2, 2023, 3:46 PM
10 points
19 comments · 2 min read · LW link

Through a panel, darkly: a case study in internet BS detection

jchan · Jul 2, 2023, 1:40 PM
22 points
7 comments · 3 min read · LW link

LLMs, Batches, and Emergent Episodic Memory

Lao Mein · Jul 2, 2023, 7:55 AM
5 points
4 comments · 1 min read · LW link

Negativity enhances positivity

Adam Zerner · Jul 2, 2023, 2:47 AM
12 points
7 comments · 2 min read · LW link

faster latent diffusion

bhauth · Jul 2, 2023, 1:30 AM
10 points
8 comments · 2 min read · LW link
(www.bhauth.com)

Using (Uninterpretable) LLMs to Generate Interpretable AI Code

Joar Skalse · Jul 2, 2023, 1:01 AM
13 points
12 comments · 3 min read · LW link

Grant applications and grand narratives

Elizabeth · Jul 2, 2023, 12:16 AM
191 points
22 comments · 6 min read · LW link

An Introduction, an Overview of my personal resources, and how one might make use of them

ProofBySonnet · Jul 1, 2023, 9:00 PM
4 points
6 comments · 3 min read · LW link

My “2.9 trauma limit”

Raemon · Jul 1, 2023, 7:32 PM
198 points
31 comments · 7 min read · LW link

Alpha

Erich_Grunewald · Jul 1, 2023, 4:05 PM
65 points
2 comments · 14 min read · LW link
(www.erichgrunewald.com)

Forum Karma: view stats and find highly-rated comments for any LW user

Max H · Jul 1, 2023, 3:36 PM
60 points
16 comments · 2 min read · LW link
(forumkarma.com)

[ASoT] GPT2 Steering & The Tuned Lens

Ulisse Mini · Jul 1, 2023, 2:12 PM
23 points
0 comments · 2 min read · LW link

[Linkpost] A shared linguistic space for transmitting our thoughts from brain to brain in natural conversations

Bogdan Ionut Cirstea · Jul 1, 2023, 1:57 PM
17 points
2 comments · 1 min read · LW link

Elements of Computational Philosophy, Vol. I: Truth

Jul 1, 2023, 11:44 AM
12 points
6 comments · 1 min read · LW link
(compphil.github.io)

Micro Habits that Improve One’s Day

silentbob · Jul 1, 2023, 10:53 AM
64 points
9 comments · 5 min read · LW link

Ateliers: But what is an Atelier?

Stephen Fowler · Jul 1, 2023, 5:57 AM
4 points
2 comments · 10 min read · LW link

Predicting: Quick Start

duck_master · Jul 1, 2023, 3:43 AM
9 points
3 comments · 14 min read · LW link

EA/LW/SSC Argentina Group!

daviddelauba · Jul 1, 2023, 2:47 AM
1 point
0 comments · 1 min read · LW link

Despedida a Pablo Stafforini (Farewell to Pablo Stafforini)

daviddelauba · Jul 1, 2023, 2:44 AM
1 point
0 comments · 1 min read · LW link

Horizontal and Vertical Integration

Jeffrey Heninger · Jul 1, 2023, 1:15 AM
17 points
1 comment · 2 min read · LW link

Inflection AI announces $1.3 billion of funding led by current investors, Microsoft, and NVIDIA

SandXbox · Jun 30, 2023, 9:32 PM
7 points
0 comments · 1 min read · LW link
(inflection.ai)

Introduction

Jun 30, 2023, 8:45 PM
8 points
0 comments · 2 min read · LW link

Inherently Interpretable Architectures

Jun 30, 2023, 8:43 PM
4 points
0 comments · 7 min read · LW link

Positive Attractors

Jun 30, 2023, 8:43 PM
6 points
0 comments · 13 min read · LW link

Agency from a causal perspective

Jun 30, 2023, 5:37 PM
40 points
5 comments · 6 min read · LW link

Little attention seems to be on discouraging hardware progress

RussellThor · Jun 30, 2023, 10:14 AM
5 points
3 comments · 1 min read · LW link

Introducing EffiSciences’ AI Safety Unit

Jun 30, 2023, 7:44 AM
68 points
0 comments · 12 min read · LW link

Contra Anton 🏴‍☠️ on Kolmogorov complexity and recursive self improvement

DaemonicSigil · Jun 30, 2023, 5:15 AM
25 points
12 comments · 2 min read · LW link

Foom Liability

PeterMcCluskey · Jun 30, 2023, 3:55 AM
22 points
10 comments · 6 min read · LW link
(bayesianinvestor.com)

I Think Eliezer Should Go on Glenn Beck

Lao Mein · Jun 30, 2023, 3:12 AM
29 points
21 comments · 1 min read · LW link

Bengio’s FAQ on Catastrophic AI Risks

Vaniver · Jun 29, 2023, 11:04 PM
39 points
0 comments · 1 min read · LW link
(yoshuabengio.org)