Pessimistic Shard Theory

Garrett Baker · Jan 25, 2023, 12:59 AM
72 points
13 comments · 3 min read · LW link

A general comment on discussions of genetic group differences

anonymous8101 · Jan 14, 2023, 2:11 AM
71 points
46 comments · 3 min read · LW link

“Status” can be corrosive; here’s how I handle it

Orpheus16 · Jan 24, 2023, 1:25 AM
71 points
8 comments · 6 min read · LW link

How we could stumble into AI catastrophe

HoldenKarnofsky · Jan 13, 2023, 4:20 PM
71 points
18 comments · 18 min read · LW link
(www.cold-takes.com)

Opportunity Cost Blackmail

adamShimi · Jan 2, 2023, 1:48 PM
70 points
11 comments · 2 min read · LW link
(epistemologicalvigilance.substack.com)

Some of my disagreements with List of Lethalities

TurnTrout · Jan 24, 2023, 12:25 AM
70 points
7 comments · 10 min read · LW link

Investing for a World Transformed by AI

PeterMcCluskey · Jan 1, 2023, 2:47 AM
70 points
24 comments · 6 min read · LW link · 1 review
(bayesianinvestor.com)

AGI safety field building projects I’d like to see

Severin T. Seehrich · Jan 19, 2023, 10:40 PM
68 points
28 comments · 9 min read · LW link

Infohazards vs Fork Hazards

jimrandomh · Jan 5, 2023, 9:45 AM
68 points
16 comments · 1 min read · LW link

Thoughts on hardware / compute requirements for AGI

Steven Byrnes · Jan 24, 2023, 2:03 PM
63 points
32 comments · 24 min read · LW link

Simulacra are Things

janus · Jan 8, 2023, 11:03 PM
63 points
7 comments · 2 min read · LW link

Dangers of deference

TsviBT · Jan 8, 2023, 2:36 PM
62 points
5 comments · 2 min read · LW link

Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind

DragonGod · Jan 13, 2023, 4:53 PM
62 points
12 comments · 1 min read · LW link
(arxiv.org)

Escape Velocity from Bullshit Jobs

Zvi · Jan 10, 2023, 2:30 PM
61 points
18 comments · 5 min read · LW link
(thezvi.wordpress.com)

My first year in AI alignment

Alex_Altair · Jan 2, 2023, 1:28 AM
61 points
10 comments · 7 min read · LW link

Announcing aisafety.training

JJ Hepburn · Jan 21, 2023, 1:01 AM
61 points
4 comments · 1 min read · LW link

Spooky action at a distance in the loss landscape

Jan 28, 2023, 12:22 AM
61 points
4 comments · 7 min read · LW link
(www.jessehoogland.com)

Movie Review: Megan

Zvi · Jan 23, 2023, 12:50 PM
60 points
19 comments · 24 min read · LW link
(thezvi.wordpress.com)

LW Filter Tags (Rationality/World Modeling now promoted in Latest Posts)

Jan 28, 2023, 10:14 PM
60 points
4 comments · 3 min read · LW link

Assigning Praise and Blame: Decoupling Epistemology and Decision Theory

Jan 27, 2023, 6:16 PM
59 points
5 comments · 3 min read · LW link

Conversational canyons

Henrik Karlsson · Jan 4, 2023, 6:55 PM
59 points
4 comments · 7 min read · LW link
(escapingflatland.substack.com)

Announcing Cavendish Labs

Jan 19, 2023, 8:15 PM
59 points
5 comments · 2 min read · LW link
(forum.effectivealtruism.org)

Consequentialists: One-Way Pattern Traps

David Udell · Jan 16, 2023, 8:48 PM
59 points
3 comments · 14 min read · LW link

[Linkpost] TIME article: DeepMind’s CEO Helped Take AI Mainstream. Now He’s Urging Caution

Orpheus16 · Jan 21, 2023, 4:51 PM
58 points
2 comments · 3 min read · LW link
(time.com)

Inverse Scaling Prize: Second Round Winners

Jan 24, 2023, 8:12 PM
58 points
17 comments · 15 min read · LW link

My Advice for Incoming SERI MATS Scholars

Johannes C. Mayer · Jan 3, 2023, 7:25 PM
58 points
6 comments · 4 min read · LW link

Linear Algebra Done Right, Axler

David Udell · Jan 2, 2023, 10:54 PM
57 points
6 comments · 9 min read · LW link

Evidence under Adversarial Conditions

PeterMcCluskey · Jan 9, 2023, 4:21 PM
57 points
1 comment · 3 min read · LW link
(bayesianinvestor.com)

Consider paying for literature or book reviews using bounties and dominant assurance contracts

Arjun Panickssery · Jan 15, 2023, 3:56 AM
57 points
7 comments · 2 min read · LW link

Gradient Filtering

Jan 18, 2023, 8:09 PM
56 points
16 comments · 13 min read · LW link

What’s going on with ‘crunch time’?

rosehadshar · Jan 20, 2023, 9:42 AM
54 points
6 comments · 4 min read · LW link

Reflections on Deception & Generality in Scalable Oversight (Another OpenAI Alignment Review)

Shoshannah Tekofsky · Jan 28, 2023, 5:26 AM
53 points
7 comments · 7 min read · LW link

Why you should learn sign language

Noah Topper · Jan 18, 2023, 5:03 PM
53 points
23 comments · 7 min read · LW link
(naivebayes.substack.com)

Why and How to Graduate Early [U.S.]

Tego · Jan 29, 2023, 1:28 AM
53 points
9 comments · 8 min read · LW link · 1 review

Paper: Superposition, Memorization, and Double Descent (Anthropic)

LawrenceC · Jan 5, 2023, 5:54 PM
53 points
11 comments · 1 min read · LW link
(transformer-circuits.pub)

Critique of some recent philosophy of LLMs’ minds

Roman Leventov · Jan 20, 2023, 12:53 PM
52 points
8 comments · 20 min read · LW link

Contra Common Knowledge

abramdemski · Jan 4, 2023, 10:50 PM
52 points
31 comments · 16 min read · LW link

How Likely is Losing a Google Account?

jefftk · Jan 30, 2023, 12:20 AM
52 points
12 comments · 3 min read · LW link
(www.jefftk.com)

Beware safety-washing

Lizka · Jan 13, 2023, 1:59 PM
51 points
2 comments · 4 min read · LW link

The Thingness of Things

TsviBT · Jan 1, 2023, 10:19 PM
51 points
35 comments · 10 min read · LW link

11 heuristics for choosing (alignment) research projects

Jan 27, 2023, 12:36 AM
50 points
5 comments · 1 min read · LW link

[Simulators seminar sequence] #1 Background & shared assumptions

Jan 2, 2023, 11:48 PM
50 points
4 comments · 3 min read · LW link

[Question] Would it be good or bad for the US military to get involved in AI risk?

Grant Demaree · Jan 1, 2023, 7:02 PM
50 points
12 comments · 1 min read · LW link

Trying to isolate objectives: approaches toward high-level interpretability

Jozdien · Jan 9, 2023, 6:33 PM
49 points
14 comments · 8 min read · LW link

Citability of Lesswrong and the Alignment Forum

Leon Lang · Jan 8, 2023, 10:12 PM
48 points
2 comments · 1 min read · LW link

Language models can generate superior text compared to their input

ChristianKl · Jan 17, 2023, 10:57 AM
48 points
28 comments · 1 min read · LW link

[Crosspost] ACX 2022 Prediction Contest Results

Jan 24, 2023, 6:56 AM
48 points
6 comments · 8 min read · LW link

[RFC] Possible ways to expand on “Discovering Latent Knowledge in Language Models Without Supervision”.

Jan 25, 2023, 7:03 PM
48 points
6 comments · 12 min read · LW link

How-to Transformer Mechanistic Interpretability—in 50 lines of code or less!

StefanHex · Jan 24, 2023, 6:45 PM
47 points
5 comments · 13 min read · LW link

[Question] What specific thing would you do with AI Alignment Research Assistant GPT?

quetzal_rainbow · Jan 8, 2023, 7:24 PM
47 points
9 comments · 1 min read · LW link