Broken Latents: Studying SAEs and Feature Co-occurrence in Toy Models

Dec 30, 2024, 10:50 PM
22 points
3 comments · 15 min read · LW link

Genetically edited mosquitoes haven’t scaled yet. Why?

alexey · Dec 30, 2024, 9:37 PM
24 points
0 comments · 1 min read · LW link
(eryney.substack.com)

Linkpost: Look at the Water

J Bostock · Dec 30, 2024, 7:49 PM
4 points
3 comments · 4 min read · LW link

The low Information Density of Eliezer Yudkowsky & LessWrong

Felix Olszewski · Dec 30, 2024, 7:43 PM
14 points
8 comments · 1 min read · LW link

o3, Oh My

Zvi · Dec 30, 2024, 2:10 PM
63 points
17 comments · 36 min read · LW link
(thezvi.wordpress.com)

World models I’m currently building

samuelshadrach · Dec 30, 2024, 8:26 AM
1 point
0 comments · 17 min read · LW link
(samuelshadrach.com)

Is “VNM-agent” one of several options, for what minds can grow up into?

AnnaSalamon · Dec 30, 2024, 6:36 AM
89 points
55 comments · 2 min read · LW link

Why I’m Moving from Mechanistic to Prosaic Interpretability

Daniel Tan · Dec 30, 2024, 6:35 AM
113 points
34 comments · 5 min read · LW link

When do experts think human-level AI will be created?

Dec 30, 2024, 6:20 AM
10 points
0 comments · 2 min read · LW link
(aisafety.info)

2025 Prediction Thread

habryka · Dec 30, 2024, 1:50 AM
77 points
21 comments · 1 min read · LW link

The Great OpenAI Debate: Should It Stay ‘Open’ or Go Private?

Satya · Dec 30, 2024, 1:14 AM
−1 points
0 comments · 3 min read · LW link

Learn to write well BEFORE you have something worth saying

eukaryote · Dec 29, 2024, 11:42 PM
67 points
18 comments · 3 min read · LW link
(eukaryotewritesblog.com)

Teaching Claude to Meditate

Gordon Seidoh Worley · Dec 29, 2024, 10:27 PM
−5 points
4 comments · 23 min read · LW link

Action: how do you REALLY go about doing?

DDthinker · Dec 29, 2024, 10:00 PM
−7 points
1 comment · 4 min read · LW link

Began a pay-on-results coaching experiment, made $40,300 since July

Chipmonk · Dec 29, 2024, 9:12 PM
43 points
15 comments · 1 min read · LW link
(chrislakin.blog)

Corrigibility should be an AI’s Only Goal

PeterMcCluskey · Dec 29, 2024, 8:25 PM
22 points
3 comments · 8 min read · LW link
(bayesianinvestor.com)

[Question] Could my work, “Beyond HaHa” benefit the LessWrong community?

P. João · Dec 29, 2024, 4:14 PM
9 points
7 comments · 1 min read · LW link

Book Summary: Zero to One

bilalchughtai · Dec 29, 2024, 4:13 PM
27 points
2 comments · 8 min read · LW link

Boston Solstice 2024 Retrospective

jefftk · Dec 29, 2024, 3:40 PM
15 points
0 comments · 4 min read · LW link
(www.jefftk.com)

Some arguments against a land value tax

Matthew Barnett · Dec 29, 2024, 3:17 PM
83 points
40 comments · 15 min read · LW link

Predictions of Near-Term Societal Changes Due to Artificial Intelligence

Annapurna · Dec 29, 2024, 2:53 PM
10 points
0 comments · 6 min read · LW link
(jorgevelez.substack.com)

Considerations on orca intelligence

Towards_Keeperhood · Dec 29, 2024, 2:35 PM
51 points
14 comments · 9 min read · LW link

AI Alignment, and where we stand.

afeller08 · Dec 29, 2024, 2:08 PM
−17 points
0 comments · 2 min read · LW link

The Legacy of Computer Science

Johannes C. Mayer · Dec 29, 2024, 1:15 PM
18 points
0 comments · 1 min read · LW link
(groups.csail.mit.edu)

Shallow review of technical AI safety, 2024

Dec 29, 2024, 12:01 PM
189 points
34 comments · 41 min read · LW link

Dishbrain and implications.

RussellThor · Dec 29, 2024, 10:42 AM
4 points
0 comments · 2 min read · LW link

Notes on Altruism

David Gross · Dec 29, 2024, 3:13 AM
17 points
2 comments · 35 min read · LW link

Rejecting Anthropomorphic Bias: Addressing Fears of AGI and Transformation

Gedankensprünge · Dec 29, 2024, 1:48 AM
−17 points
1 comment · 3 min read · LW link

What happens next?

Logan Zoellner · Dec 29, 2024, 1:41 AM
41 points
19 comments · 2 min read · LW link

The Misconception of AGI as an Existential Threat: A Reassessment

Gedankensprünge · Dec 29, 2024, 1:39 AM
−27 points
1 comment · 2 min read · LW link

Does Claude Prioritize Some Prompt Input Channels Over Others?

keltan · Dec 29, 2024, 1:21 AM
9 points
2 comments · 5 min read · LW link

Impact in AI Safety Now Requires Specific Strategic Insight

MiloSal · Dec 29, 2024, 12:40 AM
28 points
1 comment · 6 min read · LW link
(ameliorology.substack.com)

Morality Is Still Demanding

utilistrutil · Dec 29, 2024, 12:33 AM
−8 points
2 comments · LW link

Emergence and Amplification of Survival

jgraves01 · Dec 28, 2024, 11:52 PM
−1 points
0 comments · 3 min read · LW link

[Question] Has Someone Checked The Cold-Water-In-Left-Ear Thing?

Maloew · Dec 28, 2024, 8:15 PM
11 points
0 comments · 1 min read · LW link

By default, capital will matter more than ever after AGI

L Rudolf L · Dec 28, 2024, 5:52 PM
289 points
100 comments · 16 min read · LW link
(nosetgauge.substack.com)

AI Assistants Should Have a Direct Line to Their Developers

Jan_Kulveit · Dec 28, 2024, 5:01 PM
57 points
6 comments · 2 min read · LW link

No, the Polymarket price does not mean we can immediately conclude what the probability of a bird flu pandemic is. We also need to know the interest rate!

Christopher King · Dec 28, 2024, 4:05 PM
7 points
11 comments · 1 min read · LW link

The average rationalist IQ is about 122

Rockenots · Dec 28, 2024, 3:42 PM
20 points
23 comments · 1 min read · LW link

Why OpenAI’s Structure Must Evolve To Advance Our Mission

stuhlmueller · Dec 28, 2024, 4:24 AM
19 points
1 comment · 1 min read · LW link
(openai.com)

The Engineering Argument Fallacy: Why Technological Success Doesn’t Validate Physics

Wenitte Apiou · Dec 28, 2024, 12:49 AM
−16 points
5 comments · 2 min read · LW link

The Robot, the Puppet-master, and the Psychohistorian

WillPetillo · Dec 28, 2024, 12:12 AM
8 points
2 comments · 3 min read · LW link

Progress links and short notes, 2024-12-27: Clinical trial abundance, grid-scale fusion, permitting vs. compliance, crossword mania, and more

jasoncrawford · 27 Dec 2024 23:34 UTC
11 points
0 comments · 2 min read · LW link
(newsletter.rootsofprogress.org)

Greedy-Advantage-Aware RLHF

sej2020 · 27 Dec 2024 19:47 UTC
48 points
15 comments · 13 min read · LW link

Deconstructing arguments against AI art

DMMF · 27 Dec 2024 19:40 UTC
7 points
5 comments · 5 min read · LW link
(danfrank.ca)

From the Archives: a story

Richard_Ngo · 27 Dec 2024 16:36 UTC
20 points
1 comment · 16 min read · LW link
(www.narrativeark.xyz)

[Question] What’s the best metric for measuring quality of life?

ChristianKl · 27 Dec 2024 14:29 UTC
10 points
5 comments · 1 min read · LW link

Review: Planecrash

L Rudolf L · 27 Dec 2024 14:18 UTC
360 points
45 comments · 22 min read · LW link
(nosetgauge.substack.com)

Good Fortune and Many Worlds

Jonah Wilberg · 27 Dec 2024 13:21 UTC
4 points
0 comments · 5 min read · LW link

Letter from an Alien Mind

Shoshannah Tekofsky · 27 Dec 2024 13:20 UTC
23 points
7 comments · 3 min read · LW link
(open.substack.com)