Lessons On How To Get Things Right On The First Try

19 Jun 2023 23:58 UTC
228 points
56 comments
10 min read
LW link

Mode collapse in RL may be fueled by the update equation

19 Jun 2023 21:51 UTC
49 points
10 comments
8 min read
LW link

New reference standard on LLM Application security started by OWASP

QuantumForest
19 Jun 2023 20:54 UTC
2 points
0 comments
1 min read
LW link

Experiments in Evaluating Steering Vectors

Gytis Daujotas
19 Jun 2023 15:11 UTC
32 points
3 comments
4 min read
LW link

Provisionality

TsviBT
19 Jun 2023 11:49 UTC
7 points
2 comments
7 min read
LW link

[Question] When did you orient?

lukehmiles
19 Jun 2023 7:22 UTC
11 points
7 comments
1 min read
LW link

Guide to rationalist interior decorating

mingyuan
19 Jun 2023 6:47 UTC
271 points
45 comments
12 min read
LW link

A Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL)

MiguelDev
19 Jun 2023 2:32 UTC
4 points
2 comments
7 min read
LW link

resolving some neural network mysteries

bhauth
19 Jun 2023 0:09 UTC
44 points
6 comments
2 min read
LW link
(www.bhauth.com)

Why I am not an AI extinction cautionista

shminux
18 Jun 2023 21:28 UTC
22 points
40 comments
2 min read
LW link

My impression of singular learning theory

Ege Erdil
18 Jun 2023 15:34 UTC
45 points
30 comments
2 min read
LW link

Berlin AI Alignment Open Meetup July 2023

GuyP
18 Jun 2023 14:13 UTC
1 point
0 comments
1 min read
LW link

Alaska Trip

jefftk
18 Jun 2023 13:40 UTC
18 points
0 comments
2 min read
LW link
(www.jefftk.com)

UK Foundation Model Task Force - Expression of Interest

ojorgensen
18 Jun 2023 9:43 UTC
64 points
2 comments
1 min read
LW link
(twitter.com)

Cryonics Career Survey (more jobs than you think)

Mati_Roy
18 Jun 2023 2:13 UTC
41 points
1 comment
2 min read
LW link

Solomonoff induction still works if the universe is uncomputable, and its usefulness doesn’t require knowing Occam’s razor

Christopher King
18 Jun 2023 1:52 UTC
38 points
28 comments
4 min read
LW link

DSLT 2. Why Neural Networks obey Occam’s Razor

Liam Carroll
18 Jun 2023 0:23 UTC
22 points
14 comments
17 min read
LW link

The foundations of knowledge.

archeon
18 Jun 2023 0:05 UTC
−1 points
4 comments
2 min read
LW link

A few more ants and grasshoppers

c.trout
17 Jun 2023 23:38 UTC
16 points
3 comments
4 min read
LW link

The “Loss Function of Reality” Is Not So Spiky and Unpredictable

Thoth Hermes
17 Jun 2023 21:43 UTC
12 points
0 comments
6 min read
LW link
(thothhermes.substack.com)

[Question] What is the foundation of me experiencing the present moment being right now and not at some other point in time?

MvB
17 Jun 2023 20:47 UTC
20 points
19 comments
1 min read
LW link

Adventist Health Study-2 supports pescetarianism more than veganism

Elizabeth
17 Jun 2023 20:10 UTC
67 points
11 comments
6 min read
LW link
(acesounderglass.com)

The environment as infrastructure

jasoncrawford
17 Jun 2023 18:42 UTC
28 points
9 comments
1 min read
LW link
(rootsofprogress.org)

A summary of current work in AI governance

constructive
17 Jun 2023 18:41 UTC
43 points
1 comment
11 min read
LW link
(forum.effectivealtruism.org)

[Linkpost] Rosetta Neurons: Mining the Common Units in a Model Zoo

Bogdan Ionut Cirstea
17 Jun 2023 16:38 UTC
12 points
0 comments
1 min read
LW link

Partial Simulation Extrapolation: A Proposal for Building Safer Simulators

lukemarks
17 Jun 2023 13:55 UTC
16 points
0 comments
10 min read
LW link

Alewife Train is Now Arriving

jefftk
17 Jun 2023 13:20 UTC
21 points
4 comments
1 min read
LW link
(www.jefftk.com)

[Question] What fraction of words written/read are AI-written?

Mati_Roy
17 Jun 2023 13:15 UTC
8 points
6 comments
1 min read
LW link

Are Bayesian methods guaranteed to overfit?

Ege Erdil
17 Jun 2023 12:52 UTC
52 points
5 comments
3 min read
LW link
(www.yulingyao.com)

The AI governance gaps in developing countries

nguyên
17 Jun 2023 2:50 UTC
20 points
1 comment
14 min read
LW link

June and Mulberries

jefftk
17 Jun 2023 1:30 UTC
13 points
2 comments
1 min read
LW link
(www.jefftk.com)

Updating Drexler’s CAIS model

Matthew Barnett
16 Jun 2023 22:53 UTC
46 points
32 comments
4 min read
LW link

Avoiding metaphysics means giving bad philosophy a free pass

Aditya
16 Jun 2023 20:54 UTC
6 points
9 comments
4 min read
LW link

Criticism of Eliezer’s irrational moral beliefs

Jorterder
16 Jun 2023 20:47 UTC
−17 points
21 comments
1 min read
LW link

Cartography, blowing one’s mind, the illusion of separation and other general musings

Neil
16 Jun 2023 19:19 UTC
0 points
4 comments
2 min read
LW link

[Replication] Conjecture’s Sparse Coding in Small Transformers

16 Jun 2023 18:02 UTC
52 points
0 comments
5 min read
LW link

Longevity: Double Human Lifespan in the Next Decade?

Jannik Schg
16 Jun 2023 17:51 UTC
1 point
0 comments
1 min read
LW link

LLMs Sometimes Generate Purely Negatively-Reinforced Text

Fabien Roger
16 Jun 2023 16:31 UTC
175 points
10 comments
7 min read
LW link

Palantir’s AI models

ChristianKl
16 Jun 2023 16:20 UTC
26 points
16 comments
1 min read
LW link
(www.palantir.com)

[Linkpost] Faith and Fate: Limits of Transformers on Compositionality

Joe Kwon
16 Jun 2023 15:04 UTC
19 points
4 comments
1 min read
LW link
(arxiv.org)

The ones who endure

Richard_Ngo
16 Jun 2023 14:40 UTC
58 points
15 comments
5 min read
LW link
(www.thinkingcomplete.com)

Conjecture: A standing offer for public debates on AI

Andrea_Miotti
16 Jun 2023 14:33 UTC
29 points
1 comment
2 min read
LW link
(www.conjecture.dev)

Explaining “Taking features out of superposition with sparse autoencoders”

Robert_AIZI
16 Jun 2023 13:59 UTC
9 points
0 comments
8 min read
LW link
(aizi.substack.com)

[Question] How not to write the Cookbook of Doom?

brunoparga
16 Jun 2023 13:37 UTC
17 points
5 comments
1 min read
LW link

Scaffolded LLMs: Less Obvious Concerns

Stephen Fowler
16 Jun 2023 10:39 UTC
30 points
13 comments
11 min read
LW link

Motivation in AI

nickasaf
16 Jun 2023 9:50 UTC
−1 points
1 comment
2 min read
LW link

DSLT 0. Distilling Singular Learning Theory

Liam Carroll
16 Jun 2023 9:50 UTC
76 points
6 comments
5 min read
LW link

DSLT 1. The RLCT Measures the Effective Dimension of Neural Networks

Liam Carroll
16 Jun 2023 9:50 UTC
47 points
8 comments
13 min read
LW link

[Linkpost] Mapping Brains with Language Models: A Survey

Bogdan Ionut Cirstea
16 Jun 2023 9:49 UTC
5 points
0 comments
1 min read
LW link

Rational Animations is looking for an AI Safety scriptwriter, a lead community manager, and other roles.

Writer
16 Jun 2023 9:41 UTC
74 points
1 comment
3 min read
LW link