Toward a taxonomy of cognitive benchmarks for agentic AGIs

Ben Smith
Jun 27, 2024, 11:50 PM
15 points
0 comments
5 min read
LW link

How Big a Deal are MatMul-Free Transformers?

JustisMills
Jun 27, 2024, 10:28 PM
19 points
6 comments
5 min read
LW link
(justismills.substack.com)

Secondary forces of debt

KatjaGrace
Jun 27, 2024, 9:10 PM
81 points
18 comments
2 min read
LW link
(worldspiritsockpuppet.com)

Distillation of ‘Do language models plan for future tokens’

TheManxLoiner
Jun 27, 2024, 8:57 PM
26 points
2 comments
6 min read
LW link

how birds sense magnetic fields

bhauth
Jun 27, 2024, 6:59 PM
51 points
4 comments
5 min read
LW link
(www.bhauth.com)

Representation Tuning

Christopher Ackerman
Jun 27, 2024, 5:44 PM
35 points
9 comments
13 min read
LW link

An issue with training schemers with supervised fine-tuning

Fabien Roger
Jun 27, 2024, 3:37 PM
49 points
12 comments
6 min read
LW link

AI #70: A Beautiful Sonnet

Zvi
Jun 27, 2024, 2:40 PM
38 points
0 comments
44 min read
LW link
(thezvi.wordpress.com)

Detecting Genetically Engineered Viruses With Metagenomic Sequencing

jefftk
Jun 27, 2024, 2:01 PM
87 points
10 comments
LW link
(naobservatory.org)

Cross Robin

jefftk
Jun 27, 2024, 3:10 AM
11 points
2 comments
1 min read
LW link
(www.jefftk.com)

Live Theory Part 0: Taking Intelligence Seriously

Sahil
Jun 26, 2024, 9:37 PM
101 points
3 comments
8 min read
LW link

Instrumental vs Terminal Desiderata

Max Harms
Jun 26, 2024, 8:57 PM
21 points
0 comments
3 min read
LW link

Imbue (Generally Intelligent) continue to make progress

Nathan Helm-Burger
Jun 26, 2024, 8:41 PM
18 points
0 comments
1 min read
LW link
(imbue.com)

Tracing the steps

matimissona
Jun 26, 2024, 7:22 PM
−8 points
2 comments
4 min read
LW link

Countering AI disinformation and deep fakes with digital signatures

Dave Lindbergh
Jun 26, 2024, 6:09 PM
13 points
5 comments
1 min read
LW link

Progress Conference 2024: Toward Abundant Futures

jasoncrawford
Jun 26, 2024, 3:39 PM
40 points
2 comments
1 min read
LW link
(rootsofprogress.org)

Schelling points in the AGI policy space

mesaoptimizer
Jun 26, 2024, 1:19 PM
52 points
2 comments
6 min read
LW link

Bad lessons learned from the debate

bayesyatina
Jun 26, 2024, 11:54 AM
8 points
5 comments
6 min read
LW link

Childhood and Education Roundup #6: College Edition

Zvi
Jun 26, 2024, 11:40 AM
28 points
8 comments
23 min read
LW link
(thezvi.wordpress.com)

New fast transformer inference ASIC - Sohu by Etched

lemonhope
Jun 26, 2024, 9:56 AM
8 points
9 comments
1 min read
LW link
(www.etched.com)

Empirical vs. Mathematical Joints of Nature

Jun 26, 2024, 1:55 AM
35 points
1 comment
5 min read
LW link

My Current Claims and Cruxes on LLM Forecasting & Epistemics

ozziegooen
Jun 26, 2024, 12:40 AM
11 points
0 comments
LW link

In favour of exploring nagging doubts about x-risk

owencb
Jun 25, 2024, 11:52 PM
105 points
2 comments
LW link

What is a Tool?

Jun 25, 2024, 11:40 PM
62 points
4 comments
6 min read
LW link

[Question] When do alignment researchers retire?

Jordan Taylor
Jun 25, 2024, 11:30 PM
4 points
2 comments
1 min read
LW link

Compute Governance Literature Review

sijarvis
Jun 25, 2024, 10:41 PM
11 points
0 comments
13 min read
LW link

Computational Complexity as an Intuition Pump for LLM Generality

aribrill
Jun 25, 2024, 8:25 PM
18 points
6 comments
3 min read
LW link

Failure Modes of Teaching AI Safety

Eleni Angelou
Jun 25, 2024, 7:07 PM
20 points
0 comments
1 min read
LW link

Kingfisher Summer Tour 2024

jefftk
Jun 25, 2024, 6:50 PM
9 points
0 comments
1 min read
LW link
(www.jefftk.com)

Incentive Learning vs Dead Sea Salt Experiment

Steven Byrnes
Jun 25, 2024, 5:49 PM
30 points
1 comment
28 min read
LW link

An Intuitive Explanation of Sparse Autoencoders for Mechanistic Interpretability of LLMs

Adam Karvonen
Jun 25, 2024, 3:57 PM
27 points
0 comments
9 min read
LW link
(adamkarvonen.github.io)

Formal verification, heuristic explanations and surprise accounting

Jacob_Hilton
Jun 25, 2024, 3:40 PM
156 points
11 comments
9 min read
LW link
(www.alignment.org)

Metastrategy get-started guide

Tahp
Jun 25, 2024, 3:04 PM
6 points
1 comment
8 min read
LW link

Labor Participation is an Alignment Risk

alex
Jun 25, 2024, 2:15 PM
−5 points
2 comments
17 min read
LW link

Monthly Roundup #19: June 2024

Zvi
Jun 25, 2024, 12:00 PM
28 points
9 comments
54 min read
LW link
(thezvi.wordpress.com)

Regularly meta-optimization

Crazy philosopher
Jun 25, 2024, 6:12 AM
−4 points
6 comments
1 min read
LW link

Memetics as an analogy and its implicit connotations

Rachel Shu
Jun 25, 2024, 5:13 AM
3 points
0 comments
3 min read
LW link

Mistakes people make when thinking about units

Isaac King
Jun 25, 2024, 3:39 AM
74 points
14 comments
7 min read
LW link

Higher-effort summer solstice: What if we used AI (i.e., Angel Island)?

Rachel Shu
Jun 25, 2024, 1:35 AM
46 points
9 comments
3 min read
LW link

I’m a bit skeptical of AlphaFold 3

Oleg Trott
Jun 25, 2024, 12:04 AM
87 points
14 comments
2 min read
LW link

Being hella lost as rationality practice

Rachel Shu
Jun 24, 2024, 11:50 PM
14 points
0 comments
2 min read
LW link

A Basic Economics-Style Model of AI Existential Risk

Rubi J. Hudson
Jun 24, 2024, 8:26 PM
24 points
3 comments
7 min read
LW link

The Minority Coalition

Richard_Ngo
Jun 24, 2024, 8:01 PM
103 points
9 comments
5 min read
LW link
(www.narrativeark.xyz)

Compact Proofs of Model Performance via Mechanistic Interpretability

24 Jun 2024 19:27 UTC
97 points
4 comments
8 min read
LW link
(arxiv.org)

Contrapositive Natural Abstraction - Project Intro

Elliot Callender
24 Jun 2024 18:37 UTC
4 points
5 comments
2 min read
LW link

Sparse Features Through Time

Rogan Inglis
24 Jun 2024 18:06 UTC
12 points
1 comment
1 min read
LW link
(roganinglis.io)

PSA: Consider alternatives to AUROC when reporting classifier metrics for alignment

rpglover64
24 Jun 2024 17:53 UTC
18 points
1 comment
3 min read
LW link

Paying Russians to not invade Ukraine

djColliderBias
24 Jun 2024 17:46 UTC
9 points
7 comments
3 min read
LW link

SAE feature geometry is outside the superposition hypothesis

jake_mendel
24 Jun 2024 16:07 UTC
228 points
17 comments
11 min read
LW link

So you want to work on technical AI safety

gw
24 Jun 2024 14:29 UTC
51 points
3 comments
14 min read
LW link