WISDOMISM A Moral Theory for the Age of Information

Peter lawless Apr 19, 2024, 11:06 PM
2 points
0 comments · 9 min read · LW link

Inducing Unprompted Misalignment in LLMs

Apr 19, 2024, 8:00 PM
38 points
7 comments · 16 min read · LW link

Introspection

A* Apr 19, 2024, 7:10 PM
7 points
0 comments · 1 min read · LW link

[Full Post] Progress Update #1 from the GDM Mech Interp Team

Apr 19, 2024, 7:06 PM
79 points
10 comments · 8 min read · LW link

[Summary] Progress Update #1 from the GDM Mech Interp Team

Apr 19, 2024, 7:06 PM
72 points
0 comments · 3 min read · LW link

Daniel Dennett has died (1942-2024)

kave Apr 19, 2024, 4:17 PM
150 points
5 comments · 1 min read · LW link
(dailynous.com)

Events Booking New Callers?

jefftk Apr 19, 2024, 3:50 PM
9 points
0 comments · 1 min read · LW link
(www.jefftk.com)

[Question] What is the best way to talk about probabilities you expect to change with evidence/experiments?

Will_Pearson Apr 19, 2024, 3:35 PM
14 points
11 comments · 1 min read · LW link

CTMU insight: maybe consciousness *can* affect quantum outcomes?

zhukeepa Apr 19, 2024, 3:23 PM
13 points
11 comments · 5 min read · LW link

Demonstrate and evaluate risks from AI to society at the AI x Democracy research hackathon

Esben Kran Apr 19, 2024, 2:46 PM
5 points
0 comments · LW link
(www.apartresearch.com)

[Question] How to Model the Future of Open-Source LLMs?

Joel Burget Apr 19, 2024, 2:28 PM
25 points
9 comments · 1 min read · LW link

What’s up with all the non-Mormons? Weirdly specific universalities across LLMs

mwatkins Apr 19, 2024, 1:43 PM
40 points
13 comments · 27 min read · LW link

[Question] If digital goods in virtual worlds increase GDP, do we actually become richer?

No77e Apr 19, 2024, 10:06 AM
10 points
14 comments · 1 min read · LW link

Experiment on repeating choices

KatjaGrace Apr 19, 2024, 4:20 AM
56 points
1 comment · 3 min read · LW link
(worldspiritsockpuppet.com)

Effective Altruists and Rationalists Views & The case for using marketing to highlight AI risks.

gilch Apr 19, 2024, 4:16 AM
6 points
1 comment · 1 min read · LW link
(youtu.be)

Cohesion and business problems

Adam Zerner Apr 19, 2024, 12:45 AM
12 points
8 comments · 4 min read · LW link

The Thermodynamics of Death

Peter lawless Apr 19, 2024, 12:36 AM
6 points
0 comments · 10 min read · LW link

Backyard Office

jefftk Apr 19, 2024, 12:31 AM
13 points
0 comments · 1 min read · LW link
(www.jefftk.com)

hydrogen tube transport

bhauth Apr 18, 2024, 10:47 PM
34 points
12 comments · 5 min read · LW link
(www.bhauth.com)

LessOnline Festival Updates Thread

Ben Pace Apr 18, 2024, 9:55 PM
59 points
26 comments · 1 min read · LW link

A Review of In-Context Learning Hypotheses for Automated AI Alignment Research

alamerton Apr 18, 2024, 6:29 PM
25 points
4 comments · 16 min read · LW link

I’m open for projects (sort of)

cousin_it Apr 18, 2024, 6:05 PM
46 points
13 comments · 1 min read · LW link

Blessed information, garbage information, cursed information

tailcalled Apr 18, 2024, 4:56 PM
23 points
8 comments · 3 min read · LW link

[Fiction] A Confession

Arjun Panickssery Apr 18, 2024, 4:28 PM
38 points
2 comments · 5 min read · LW link
(arjunpanickssery.substack.com)

Discriminating Behaviorally Identical Classifiers: a model problem for applying interpretability to scalable oversight

Sam Marks Apr 18, 2024, 4:17 PM
113 points
10 comments · 12 min read · LW link

Cooperation is optimal, with weaker agents too - tldr

Ryo Apr 18, 2024, 3:03 PM
12 points
22 comments · 4 min read · LW link
(medium.com)

How to coordinate despite our biases? - tldr

Ryo Apr 18, 2024, 3:03 PM
3 points
2 comments · 3 min read · LW link
(medium.com)

Knowledge Base 7: Long-tail knowledge and collective intelligence

iwis Apr 18, 2024, 2:21 PM
−6 points
0 comments · 1 min read · LW link

AI #60: Oh the Humanity

Zvi Apr 18, 2024, 2:10 PM
44 points
7 comments · 62 min read · LW link
(thezvi.wordpress.com)

UDT1.01: Logical Inductors and Implicit Beliefs (5/10)

Diffractor Apr 18, 2024, 8:39 AM
34 points
2 comments · 19 min read · LW link

An examination of GPT-2’s boring yet effective glitch

MiguelDev Apr 18, 2024, 5:26 AM
5 points
3 comments · 3 min read · LW link

[Question] What if Ethics is Provably Self-Contradictory?

Yitz Apr 18, 2024, 5:12 AM
3 points
7 comments · 2 min read · LW link

The Mom Test: Summary and Thoughts

Adam Zerner Apr 18, 2024, 3:34 AM
48 points
3 comments · 10 min read · LW link

Express interest in an “FHI of the West”

habryka Apr 18, 2024, 3:32 AM
268 points
41 comments · 3 min read · LW link

Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer

Apr 18, 2024, 12:27 AM
185 points
21 comments · 7 min read · LW link

AXRP Episode 28 - Suing Labs for AI Risk with Gabriel Weil

DanielFilan Apr 17, 2024, 9:42 PM
12 points
0 comments · 65 min read · LW link

LLM Evaluators Recognize and Favor Their Own Generations

Apr 17, 2024, 9:09 PM
45 points
1 comment · 3 min read · LW link
(tiny.cc)

SFS: Foundations of Forecasting

MAD2 Apr 17, 2024, 5:46 PM
3 points
0 comments · 1 min read · LW link

An ethical framework to supersede Utilitarianism

metalcrow Apr 17, 2024, 5:18 PM
1 point
4 comments · 4 min read · LW link

Moving on from community living

Vika Apr 17, 2024, 5:02 PM
63 points
7 comments · 3 min read · LW link
(vkrakovna.wordpress.com)

Staged release

Zach Stein-Perlman Apr 17, 2024, 4:00 PM
11 points
4 comments · 2 min read · LW link

[Question] Discomfort Stacking

Lewis O’Brien Apr 17, 2024, 2:49 PM
5 points
12 comments · 1 min read · LW link

FHI (Future of Humanity Institute) has shut down (2005–2024)

gwern Apr 17, 2024, 1:54 PM
176 points
22 comments · 1 min read · LW link
(www.futureofhumanityinstitute.org)

Childhood and Education Roundup #5

Zvi Apr 17, 2024, 1:00 PM
37 points
3 comments · 25 min read · LW link
(thezvi.wordpress.com)

Should we maximize the Geometric Expectation of Utility?

A.H. Apr 17, 2024, 10:37 AM
5 points
17 comments · 9 min read · LW link

Claude 3 Opus can operate as a Turing machine

Gunnar_Zarncke Apr 17, 2024, 8:41 AM
36 points
2 comments · 1 min read · LW link
(twitter.com)

When is a mind me?

Rob Bensinger Apr 17, 2024, 5:56 AM
144 points
130 comments · 15 min read · LW link

Mid-conditional love

KatjaGrace Apr 17, 2024, 4:00 AM
76 points
21 comments · 2 min read · LW link
(worldspiritsockpuppet.com)

Spending Update 2024

jefftk Apr 17, 2024, 2:30 AM
20 points
2 comments · 3 min read · LW link
(www.jefftk.com)

Anti MMAcevedo Protocol

Logan Zoellner Apr 16, 2024, 10:32 PM
1 point
1 comment · 8 min read · LW link