Musings on Cargo Cult Consciousness

Gareth Davidson 25 Jan 2024 23:00 UTC
−13 points
11 comments 17 min read LW link

RAND report finds no effect of current LLMs on viability of bioterrorism attacks

StellaAthena 25 Jan 2024 19:17 UTC
94 points
14 comments 1 min read LW link
(www.rand.org)

[Question] Bayesian Reflection Principles and Ignorance of the Future

crickets 25 Jan 2024 19:00 UTC
5 points
3 comments 1 min read LW link

“Does your paradigm beget new, good, paradigms?”

Raemon 25 Jan 2024 18:23 UTC
38 points
5 comments 2 min read LW link

AI #48: The Talk of Davos

Zvi 25 Jan 2024 16:20 UTC
38 points
9 comments 36 min read LW link
(thezvi.wordpress.com)

Importing a Python File by Name

jefftk 25 Jan 2024 16:00 UTC
12 points
7 comments 1 min read LW link
(www.jefftk.com)

[Repost] The Copenhagen Interpretation of Ethics

mesaoptimizer 25 Jan 2024 15:20 UTC
70 points
3 comments 5 min read LW link
(web.archive.org)

Nash Bargaining between Subagents doesn’t solve the Shutdown Problem

A.H. 25 Jan 2024 10:47 UTC
22 points
1 comment 6 min read LW link

Status-oriented spending

Adam Zerner 25 Jan 2024 6:46 UTC
14 points
19 comments 4 min read LW link

Protecting agent boundaries

Chipmonk 25 Jan 2024 4:13 UTC
10 points
6 comments 2 min read LW link

[Question] What subjects are unexpectedly high-utility?

FinalFormal2 25 Jan 2024 4:00 UTC
9 points
18 comments 1 min read LW link

[Question] Is a random box of gas predictable after 20 seconds?

24 Jan 2024 23:00 UTC
37 points
35 comments 1 min read LW link

[Question] Will quantum randomness affect the 2028 election?

24 Jan 2024 22:54 UTC
63 points
48 comments 1 min read LW link

AISN #30: Investments in Compute and Military AI Plus, Japan and Singapore’s National AI Safety Institutes

24 Jan 2024 19:38 UTC
27 points
1 comment 6 min read LW link
(newsletter.safe.ai)

Krueger Lab AI Safety Internship 2024

Joey Bream 24 Jan 2024 19:17 UTC
3 points
0 comments 1 min read LW link

Agents that act for reasons: a thought experiment

Michele Campolo 24 Jan 2024 16:47 UTC
3 points
0 comments 3 min read LW link

Impact Assessment of AI Safety Camp (Arb Research)

Samuel Holton 24 Jan 2024 16:19 UTC
11 points
0 comments 11 min read LW link
(forum.effectivealtruism.org)

The case for ensuring that powerful AIs are controlled

24 Jan 2024 16:11 UTC
246 points
66 comments 28 min read LW link

LLMs can strategically deceive while doing gain-of-function research

Igor Ivanov 24 Jan 2024 15:45 UTC
32 points
4 comments 11 min read LW link

Monthly Roundup #14: January 2024

Zvi 24 Jan 2024 12:50 UTC
38 points
22 comments 44 min read LW link
(thezvi.wordpress.com)

This might be the last AI Safety Camp

24 Jan 2024 9:33 UTC
181 points
33 comments 1 min read LW link

Global LessWrong/AC10 Meetup on VRChat

24 Jan 2024 5:44 UTC
15 points
2 comments 1 min read LW link

Humans aren’t fleeb.

Charlie Steiner 24 Jan 2024 5:31 UTC
34 points
5 comments 2 min read LW link

A Paradigm Shift in Sustainability

Jose Miguel Cruz y Celis 23 Jan 2024 23:34 UTC
5 points
0 comments 18 min read LW link

From Finite Factors to Bayes Nets

J Bostock 23 Jan 2024 20:03 UTC
38 points
7 comments 8 min read LW link

Institutional economics through the lens of scale-free regulative development, morphogenesis, and cognitive science

Roman Leventov 23 Jan 2024 19:42 UTC
8 points
0 comments 14 min read LW link

Making a Secular Solstice Songbook

jefftk 23 Jan 2024 19:40 UTC
38 points
6 comments 1 min read LW link
(www.jefftk.com)

Simple Appreciations

Jonathan Moregård 23 Jan 2024 16:23 UTC
17 points
11 comments 4 min read LW link
(open.substack.com)

[Question] What environmental cues had you not seen them would have ended in disaster?

koratkar 23 Jan 2024 14:59 UTC
11 points
1 comment 1 min read LW link

Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature)

Kaj_Sotala 23 Jan 2024 14:05 UTC
45 points
2 comments 2 min read LW link
(www.nature.com)

“Safety as a Scientific Pursuit” (2024)

technicalities 23 Jan 2024 12:40 UTC
14 points
3 comments 2 min read LW link
(banburismus.substack.com)

Brainstorming: Slow Takeoff

David Piepgrass 23 Jan 2024 6:58 UTC
2 points
0 comments 51 min read LW link

Reframing Acausal Trolling as Acausal Patronage

StrivingForLegibility 23 Jan 2024 3:04 UTC
14 points
0 comments 2 min read LW link

Orthogonality or the “Human Worth Hypothesis”?

Jeffs 23 Jan 2024 0:57 UTC
21 points
31 comments 3 min read LW link

the subreddit size threshold

bhauth 23 Jan 2024 0:38 UTC
32 points
3 comments 4 min read LW link
(www.bhauth.com)

Starting in mechanistic interpretability

Jakub Smékal 22 Jan 2024 23:40 UTC
1 point
0 comments 3 min read LW link
(jakubsmekal.com)

We need a Science of Evals

22 Jan 2024 20:30 UTC
66 points
13 comments 9 min read LW link

Announcing the SoS Research Collective for independent researchers (and academics thinking independently)

rogersbacon 22 Jan 2024 20:13 UTC
15 points
0 comments 8 min read LW link
(www.theseedsofscience.pub)

A Brief Assessment of OpenAI’s Preparedness Framework & Some Suggestions for Improvement

simeon_c 22 Jan 2024 20:08 UTC
14 points
0 comments 6 min read LW link
(uploads-ssl.webflow.com)

D&D.Sci(-fi): Colonizing the SuperHyperSphere [Evaluation and Ruleset]

abstractapplic 22 Jan 2024 19:20 UTC
38 points
7 comments 3 min read LW link

‘ petertodd’’s last stand: The final days of open GPT-3 research

mwatkins 22 Jan 2024 18:47 UTC
108 points
16 comments 45 min read LW link

InterLab – a toolkit for experiments with multi-agent interactions

22 Jan 2024 18:23 UTC
69 points
0 comments 8 min read LW link
(acsresearch.org)

San Fernando Valley Rationalist Meetup

Thomas Broadley 22 Jan 2024 16:49 UTC
3 points
1 comment 1 min read LW link

Who Organizes Dances?

jefftk 22 Jan 2024 14:30 UTC
12 points
0 comments 1 min read LW link
(www.jefftk.com)

Values Darwinism

pchvykov 22 Jan 2024 10:44 UTC
11 points
13 comments 3 min read LW link

[Question] The akrasia doom loop and executive function disorders: a question

TeaTieAndHat 22 Jan 2024 7:01 UTC
16 points
7 comments 2 min read LW link

Predicting AGI by the Turing Test

Yuxi_Liu 22 Jan 2024 4:22 UTC
21 points
2 comments 10 min read LW link
(yuxi-liu-wired.github.io)

Incorporating Justice Theory into Decision Theory

StrivingForLegibility 21 Jan 2024 19:17 UTC
13 points
20 comments 5 min read LW link

Deliberate Dysentery: Q&A about Human Challenge Trials

Niko_McCarty 21 Jan 2024 19:05 UTC
16 points
1 comment 18 min read LW link
(www.asimov.press)

When Does Altruism Strengthen Altruism?

jefftk 21 Jan 2024 18:50 UTC
44 points
2 comments 3 min read LW link
(www.jefftk.com)