Lessons On How To Get Things Right On The First Try

Jun 19, 2023, 11:58 PM
252 points
57 comments · 10 min read · LW link · 1 review

Mode collapse in RL may be fueled by the update equation

Jun 19, 2023, 9:51 PM
53 points
10 comments · 8 min read · LW link

New reference standard on LLM Application security started by OWASP

QuantumForest · Jun 19, 2023, 8:54 PM
2 points
0 comments · 1 min read · LW link

Experiments in Evaluating Steering Vectors

Gytis Daujotas · Jun 19, 2023, 3:11 PM
34 points
4 comments · 4 min read · LW link

Provisionality

TsviBT · Jun 19, 2023, 11:49 AM
7 points
2 comments · 7 min read · LW link

[Question] When did you orient?

lemonhope · Jun 19, 2023, 7:22 AM
12 points
7 comments · 1 min read · LW link

Guide to rationalist interior decorating

mingyuan · Jun 19, 2023, 6:47 AM
327 points
53 comments · 12 min read · LW link · 4 reviews

A Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL)

MiguelDev · Jun 19, 2023, 2:32 AM
4 points
2 comments · 7 min read · LW link

resolving some neural network mysteries

bhauth · Jun 19, 2023, 12:09 AM
44 points
6 comments · 2 min read · LW link
(www.bhauth.com)

Why I am not an AI extinction cautionista

Shmi · Jun 18, 2023, 9:28 PM
22 points
40 comments · 2 min read · LW link

My impression of singular learning theory

Ege Erdil · Jun 18, 2023, 3:34 PM
47 points
30 comments · 2 min read · LW link

Berlin AI Alignment Open Meetup July 2023

GuyP · Jun 18, 2023, 2:13 PM
1 point
0 comments · 1 min read · LW link

Alaska Trip

jefftk · Jun 18, 2023, 1:40 PM
18 points
0 comments · 2 min read · LW link
(www.jefftk.com)

UK Foundation Model Task Force—Expression of Interest

ojorgensen · Jun 18, 2023, 9:43 AM
64 points
2 comments · 1 min read · LW link
(twitter.com)

Cryonics Career Survey (more jobs than you think)

Mati_Roy · Jun 18, 2023, 2:13 AM
41 points
1 comment · 2 min read · LW link

Solomonoff induction still works if the universe is uncomputable, and its usefulness doesn’t require knowing Occam’s razor

Christopher King · Jun 18, 2023, 1:52 AM
38 points
28 comments · 4 min read · LW link

DSLT 2. Why Neural Networks obey Occam’s Razor

Liam Carroll · Jun 18, 2023, 12:23 AM
24 points
14 comments · 17 min read · LW link

The foundations of knowledge.

archeon · Jun 18, 2023, 12:05 AM
−1 points
4 comments · 2 min read · LW link

A few more ants and grasshoppers

c.trout · Jun 17, 2023, 11:38 PM
16 points
3 comments · 4 min read · LW link

The “Loss Function of Reality” Is Not So Spiky and Unpredictable

Thoth Hermes · Jun 17, 2023, 9:43 PM
12 points
0 comments · 6 min read · LW link
(thothhermes.substack.com)

[Question] What is the foundation of me experiencing the present moment being right now and not at some other point in time?

MvB · Jun 17, 2023, 8:47 PM
20 points
19 comments · 1 min read · LW link

Adventist Health Study-2 supports pescetarianism more than veganism

Elizabeth · Jun 17, 2023, 8:10 PM
67 points
11 comments · 6 min read · LW link
(acesounderglass.com)

The environment as infrastructure

jasoncrawford · Jun 17, 2023, 6:42 PM
28 points
9 comments · 1 min read · LW link
(rootsofprogress.org)

A summary of current work in AI governance

constructive · Jun 17, 2023, 6:41 PM
44 points
1 comment · 11 min read · LW link
(forum.effectivealtruism.org)

[Linkpost] Rosetta Neurons: Mining the Common Units in a Model Zoo

Bogdan Ionut Cirstea · Jun 17, 2023, 4:38 PM
12 points
0 comments · 1 min read · LW link

Partial Simulation Extrapolation: A Proposal for Building Safer Simulators

lukemarks · Jun 17, 2023, 1:55 PM
16 points
0 comments · 10 min read · LW link

Alewife Train is Now Arriving

jefftk · Jun 17, 2023, 1:20 PM
21 points
4 comments · 1 min read · LW link
(www.jefftk.com)

[Question] What fraction of words written/read are AI-written?

Mati_Roy · Jun 17, 2023, 1:15 PM
8 points
6 comments · 1 min read · LW link

Are Bayesian methods guaranteed to overfit?

Ege Erdil · Jun 17, 2023, 12:52 PM
52 points
5 comments · 3 min read · LW link
(www.yulingyao.com)

The AI governance gaps in developing countries

ntran · Jun 17, 2023, 2:50 AM
20 points
1 comment · 14 min read · LW link

June and Mulberries

jefftk · Jun 17, 2023, 1:30 AM
13 points
2 comments · 1 min read · LW link
(www.jefftk.com)

Updating Drexler’s CAIS model

Matthew Barnett · Jun 16, 2023, 10:53 PM
47 points
32 comments · 4 min read · LW link

Avoiding metaphysics means giving bad philosophy a free pass

Aditya · Jun 16, 2023, 8:54 PM
5 points
9 comments · 4 min read · LW link

Criticism of Eliezer’s irrational moral beliefs

Jorterder · Jun 16, 2023, 8:47 PM
−17 points
21 comments · 1 min read · LW link

Cartography, blowing one’s mind, the illusion of separation and other general musings

Neil · Jun 16, 2023, 7:19 PM
0 points
4 comments · 2 min read · LW link

[Replication] Conjecture’s Sparse Coding in Small Transformers

Jun 16, 2023, 6:02 PM
52 points
0 comments · 5 min read · LW link

Longevity: Double Human Lifespan in the Next Decade?

Jannik Schg · Jun 16, 2023, 5:51 PM
1 point
0 comments · 1 min read · LW link

LLMs Sometimes Generate Purely Negatively-Reinforced Text

Fabien Roger · Jun 16, 2023, 4:31 PM
177 points
11 comments · 7 min read · LW link

Palantir’s AI models

ChristianKl · Jun 16, 2023, 4:20 PM
26 points
16 comments · 1 min read · LW link
(www.palantir.com)

[Linkpost] Faith and Fate: Limits of Transformers on Compositionality

Joe Kwon · Jun 16, 2023, 3:04 PM
19 points
4 comments · 1 min read · LW link
(arxiv.org)

The ones who endure

Richard_Ngo · Jun 16, 2023, 2:40 PM
65 points
16 comments · 5 min read · LW link
(www.thinkingcomplete.com)

Conjecture: A standing offer for public debates on AI

Andrea_Miotti · Jun 16, 2023, 2:33 PM
29 points
1 comment · 2 min read · LW link
(www.conjecture.dev)

Explaining “Taking features out of superposition with sparse autoencoders”

Robert_AIZI · Jun 16, 2023, 1:59 PM
10 points
0 comments · 8 min read · LW link
(aizi.substack.com)

[Question] How not to write the Cookbook of Doom?

brunoparga · Jun 16, 2023, 1:37 PM
17 points
5 comments · 1 min read · LW link

Scaffolded LLMs: Less Obvious Concerns

Stephen Fowler · Jun 16, 2023, 10:39 AM
34 points
15 comments · 14 min read · LW link

Motivation in AI

nickasaf · Jun 16, 2023, 9:50 AM
−1 points
1 comment · 2 min read · LW link

DSLT 0. Distilling Singular Learning Theory

Liam Carroll · Jun 16, 2023, 9:50 AM
80 points
7 comments · 5 min read · LW link

DSLT 1. The RLCT Measures the Effective Dimension of Neural Networks

Liam Carroll · Jun 16, 2023, 9:50 AM
54 points
10 comments · 13 min read · LW link

[Linkpost] Mapping Brains with Language Models: A Survey

Bogdan Ionut Cirstea · Jun 16, 2023, 9:49 AM
5 points
0 comments · 1 min read · LW link

Rational Animations is looking for an AI Safety scriptwriter, a lead community manager, and other roles.

Writer · Jun 16, 2023, 9:41 AM
74 points
1 comment · 3 min read · LW link