The Natural Abstraction Hypothesis: Implications and Evidence

CallumMcDougall 14 Dec 2021 23:14 UTC
37 points
8 comments 19 min read LW link

Robin Hanson’s “Humans are Early”

Raemon 14 Dec 2021 22:07 UTC
11 points
0 comments 2 min read LW link
(www.overcomingbias.com)

Ngo’s view on alignment difficulty

14 Dec 2021 21:34 UTC
63 points
7 comments 17 min read LW link

A proposed system for ideas jumpstart

Just Learning 14 Dec 2021 21:01 UTC
4 points
2 comments 3 min read LW link

Should we rely on the speed prior for safety?

Marc Carauleanu 14 Dec 2021 20:45 UTC
14 points
5 comments 5 min read LW link

ARC’s first technical report: Eliciting Latent Knowledge

14 Dec 2021 20:09 UTC
225 points
90 comments 1 min read LW link 3 reviews
(docs.google.com)

ARC is hiring!

14 Dec 2021 20:09 UTC
63 points
2 comments 1 min read LW link

Interlude: Agents as Automobiles

Daniel Kokotajlo 14 Dec 2021 18:49 UTC
26 points
6 comments 5 min read LW link

Zvi’s Thoughts on the Survival and Flourishing Fund (SFF)

Zvi 14 Dec 2021 14:30 UTC
186 points
65 comments 64 min read LW link 1 review
(thezvi.wordpress.com)

Consequentialism & corrigibility

Steven Byrnes 14 Dec 2021 13:23 UTC
66 points
27 comments 7 min read LW link

Decision Theory Breakdown—Personal Attempt at a Review

Jake Arft-Guatelli 14 Dec 2021 0:40 UTC
4 points
1 comment 8 min read LW link

Mystery Hunt 2022

Scott Garrabrant 13 Dec 2021 21:57 UTC
30 points
5 comments 1 min read LW link

Enabling More Feedback for AI Safety Researchers

frances_lorenz 13 Dec 2021 20:10 UTC
17 points
0 comments 3 min read LW link

Language Model Alignment Research Internships

Ethan Perez 13 Dec 2021 19:53 UTC
74 points
1 comment 1 min read LW link

Omicron Post #6

Zvi 13 Dec 2021 18:00 UTC
89 points
30 comments 8 min read LW link
(thezvi.wordpress.com)

Analysis of Bird Box (2018)

TekhneMakre 13 Dec 2021 17:30 UTC
11 points
3 comments 5 min read LW link

Solving Interpretability Week

Logan Riggs 13 Dec 2021 17:09 UTC
11 points
5 comments 1 min read LW link

Understanding and controlling auto-induced distributional shift

L Rudolf L 13 Dec 2021 14:59 UTC
32 points
4 comments 16 min read LW link

A fate worse than death?

RomanS 13 Dec 2021 11:05 UTC
−25 points
26 comments 2 min read LW link

What’s the backward-forward FLOP ratio for Neural Networks?

13 Dec 2021 8:54 UTC
19 points
12 comments 10 min read LW link

Summary of the Acausal Attack Issue for AIXI

Diffractor 13 Dec 2021 8:16 UTC
12 points
6 comments 4 min read LW link

Hard-Coding Neural Computation

MadHatter 13 Dec 2021 4:35 UTC
34 points
8 comments 27 min read LW link

[Question] Is “gears-level” just a synonym for “mechanistic”?

David Scott Krueger (formerly: capybaralet) 13 Dec 2021 4:11 UTC
48 points
29 comments 1 min read LW link

Baby Nicknames

jefftk 13 Dec 2021 2:20 UTC
11 points
0 comments 1 min read LW link
(www.jefftk.com)

[Question] Why do governments refer to existential risks primarily in terms of national security?

Evan_Gaensbauer 13 Dec 2021 1:05 UTC
3 points
3 comments 1 min read LW link

[Question] [Resolved] Who else prefers “AI alignment” to “AI safety?”

Evan_Gaensbauer 13 Dec 2021 0:35 UTC
5 points
8 comments 1 min read LW link

Working through D&D.Sci, problem 1

Pablo Repetto 12 Dec 2021 23:10 UTC
8 points
2 comments 1 min read LW link
(pabloernesto.github.io)

Teaser: Hard-coding Transformer Models

MadHatter 12 Dec 2021 22:04 UTC
74 points
19 comments 1 min read LW link

The Three Mutations of Dark Rationality

DarkRationalist 12 Dec 2021 22:01 UTC
−15 points
0 comments 2 min read LW link

Redwood’s Technique-Focused Epistemic Strategy

adamShimi 12 Dec 2021 16:36 UTC
48 points
1 comment 7 min read LW link

For and Against Lotteries in Elite University Admissions

Sam Enright 12 Dec 2021 13:41 UTC
10 points
2 comments 3 min read LW link

[Question] Nuclear war anthropics

smountjoy 12 Dec 2021 4:54 UTC
11 points
7 comments 1 min read LW link

Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment

Rob Bensinger 12 Dec 2021 2:08 UTC
70 points
35 comments 7 min read LW link

Magna Alta Doctrina

jacob_cannell 11 Dec 2021 21:54 UTC
58 points
7 comments 28 min read LW link

EA Dinner Covid Logistics

jefftk 11 Dec 2021 21:50 UTC
17 points
7 comments 2 min read LW link
(www.jefftk.com)

Transforming myopic optimization to ordinary optimization—Do we want to seek convergence for myopic optimization problems?

tailcalled 11 Dec 2021 20:38 UTC
12 points
1 comment 5 min read LW link

What on Earth is a Series I savings bond?

rossry 11 Dec 2021 12:18 UTC
11 points
7 comments 7 min read LW link

D&D.Sci GURPS Dec 2021: Hunters of Monsters

J Bostock 11 Dec 2021 12:13 UTC
20 points
18 comments 2 min read LW link

Anxiety and computer architecture

Adam Zerner 11 Dec 2021 10:37 UTC
13 points
8 comments 3 min read LW link

[Question] Reasons to act according to the free will paradigm?

Maciej Jałocha 11 Dec 2021 8:44 UTC
−3 points
5 comments 1 min read LW link

Extrinsic and Intrinsic Moral Frameworks

lsusr 11 Dec 2021 5:28 UTC
14 points
5 comments 2 min read LW link

Moore’s Law, AI, and the pace of progress

Veedrac 11 Dec 2021 3:02 UTC
125 points
38 comments 24 min read LW link

What role should evolutionary analogies play in understanding AI takeoff speeds?

anson.ho 11 Dec 2021 1:19 UTC
14 points
0 comments 42 min read LW link

[Question] Nonverbal thinkers: how do you experience your inner critic?

Phoenix Eliot 11 Dec 2021 0:40 UTC
9 points
2 comments 1 min read LW link

The Plan

johnswentworth 10 Dec 2021 23:41 UTC
254 points
78 comments 14 min read LW link 1 review

[Linkpost] Chinese government’s guidelines on AI

RomanS 10 Dec 2021 21:10 UTC
61 points
14 comments 1 min read LW link

Understanding Gradient Hacking

peterbarnett 10 Dec 2021 15:58 UTC
41 points
5 comments 30 min read LW link

There is essentially one best-validated theory of cognition.

abramdemski 10 Dec 2021 15:51 UTC
89 points
33 comments 3 min read LW link

The Promise and Peril of Finite Sets

davidad 10 Dec 2021 12:29 UTC
42 points
4 comments 6 min read LW link

Are big brains for processing sensory input?

lsusr 10 Dec 2021 7:08 UTC
43 points
20 comments 3 min read LW link