Everett branches, inter-light cone trade and other alien matters: Appendix to “An ECL explainer”

24 Feb 2024 23:09 UTC
17 points
0 comments 11 min read LW link

Cooperating with aliens and AGIs: An ECL explainer

24 Feb 2024 22:58 UTC
57 points
8 comments 20 min read LW link

Choosing My Quest (Part 2 of “The Sense Of Physical Necessity”)

LoganStrohl 24 Feb 2024 21:31 UTC
40 points
7 comments 12 min read LW link

Rationality Research Report: Towards 10x OODA Looping?

Raemon 24 Feb 2024 21:06 UTC
117 points
26 comments 15 min read LW link

Exercise: Planmaking, Surprise Anticipation, and “Baba is You”

Raemon 24 Feb 2024 20:33 UTC
69 points
31 comments 6 min read LW link

In search of God.

Spiritus Dei 24 Feb 2024 18:59 UTC
−19 points
3 comments 7 min read LW link

Impossibility of Anthropocentric-Alignment

False Name 24 Feb 2024 18:31 UTC
−8 points
2 comments 39 min read LW link

The Inner Alignment Problem

Jakub Halmeš 24 Feb 2024 17:55 UTC
1 point
1 comment 3 min read LW link
(jakubhalmes.substack.com)

We Need Major, But Not Radical, FDA Reform

Maxwell Tabarrok 24 Feb 2024 16:54 UTC
42 points
12 comments 7 min read LW link
(www.maximum-progress.com)

After Overmorrow: Scattered Musings on the Immediate Post-AGI World

Yuli_Ban 24 Feb 2024 15:49 UTC
−3 points
0 comments 26 min read LW link

[Question] CDT vs. EDT on Deterrence

Terence Coelho 24 Feb 2024 15:41 UTC
1 point
9 comments 1 min read LW link

Balancing Games

jefftk 24 Feb 2024 14:40 UTC
62 points
18 comments 1 min read LW link
(www.jefftk.com)

How well do truth probes generalise?

mishajw 24 Feb 2024 14:12 UTC
96 points
11 comments 9 min read LW link

Rawls’s Veil of Ignorance Doesn’t Make Any Sense

Arjun Panickssery 24 Feb 2024 13:18 UTC
9 points
9 comments 1 min read LW link

[Question] Can someone explain to me what went wrong with ChatGPT?

Valentin Baltadzhiev 24 Feb 2024 11:50 UTC
9 points
1 comment 1 min read LW link

The Sense Of Physical Necessity: A Naturalism Demo (Introduction)

LoganStrohl 24 Feb 2024 2:56 UTC
59 points
1 comment 6 min read LW link

Instrumental deception and manipulation in LLMs—a case study

Olli Järviniemi 24 Feb 2024 2:07 UTC
39 points
13 comments 12 min read LW link

A starting point for making sense of task structure (in machine learning)

24 Feb 2024 1:51 UTC
51 points
2 comments 12 min read LW link

Why you, personally, should want a larger human population

jasoncrawford 23 Feb 2024 19:48 UTC
32 points
32 comments 5 min read LW link
(rootsofprogress.org)

Deliberative Cognitive Algorithms as Scaffolding

Cole Wyeth 23 Feb 2024 17:15 UTC
20 points
4 comments 3 min read LW link

The Shutdown Problem: Incomplete Preferences as a Solution

EJT 23 Feb 2024 16:01 UTC
54 points
33 comments 42 min read LW link

In set theory, everything is a set

Jacob G-W 23 Feb 2024 14:35 UTC
11 points
9 comments 2 min read LW link

The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms

Bill Benzon 23 Feb 2024 12:19 UTC
4 points
0 comments 10 min read LW link

Deep and obvious points in the gap between your thoughts and your pictures of thought

KatjaGrace 23 Feb 2024 7:30 UTC
42 points
6 comments 1 min read LW link
(worldspiritsockpuppet.com)

Parasocial relationship logic

KatjaGrace 23 Feb 2024 7:30 UTC
20 points
1 comment 1 min read LW link
(worldspiritsockpuppet.com)

Shaming with and without naming

KatjaGrace 23 Feb 2024 7:30 UTC
17 points
5 comments 2 min read LW link
(worldspiritsockpuppet.com)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.

Chi Nguyen 23 Feb 2024 6:10 UTC
53 points
18 comments 2 min read LW link

[Question] Does increasing the power of a multimodal LLM get you an agentic AI?

yanni kyriacos 23 Feb 2024 4:14 UTC
3 points
3 comments 1 min read LW link

Popular conceptions of “boundaries” don’t make sense

Chris Lakin 23 Feb 2024 1:09 UTC
26 points
3 comments 1 min read LW link 1 review
(chrislakin.blog)

Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”

Ricki Heicklen 22 Feb 2024 23:56 UTC
188 points
5 comments 4 min read LW link
(bayesshammai.substack.com)

AI #52: Oops

Zvi 22 Feb 2024 21:50 UTC
50 points
9 comments 29 min read LW link
(thezvi.wordpress.com)

Embed your second brain in your first brain

dkl9 22 Feb 2024 21:46 UTC
10 points
3 comments 1 min read LW link
(dkl9.net)

The Gemini Incident

Zvi 22 Feb 2024 21:00 UTC
80 points
19 comments 18 min read LW link
(thezvi.wordpress.com)

Some Thoughts On Using Auctions For Land Valuation

harsimony 22 Feb 2024 19:54 UTC
0 points
9 comments 9 min read LW link
(progressandpoverty.substack.com)

The Binding of Isaac & Transparent Newcomb’s Problem

suvjectibity 22 Feb 2024 18:56 UTC
−10 points
0 comments 10 min read LW link

Language Models Don’t Learn the Physical Manifestation of Language

22 Feb 2024 18:52 UTC
39 points
23 comments 1 min read LW link
(arxiv.org)

Sora What

Zvi 22 Feb 2024 18:10 UTC
47 points
3 comments 9 min read LW link
(thezvi.wordpress.com)

Do sparse autoencoders find “true features”?

Demian Till 22 Feb 2024 18:06 UTC
75 points
33 comments 11 min read LW link

Everything Wrong with Roko’s Claims about an Engineered Pandemic

WitheringWeights 22 Feb 2024 15:59 UTC
97 points
10 comments 16 min read LW link

The One and a Half Gemini

Zvi 22 Feb 2024 13:10 UTC
73 points
4 comments 8 min read LW link
(thezvi.wordpress.com)

[Question] How do I make predictions about the future to make sense of what to do with my life?

Raj Thimmiah 22 Feb 2024 11:22 UTC
8 points
1 comment 1 min read LW link

How are voluntary commitments on vulnerability reporting going?

Adam Jones 22 Feb 2024 8:43 UTC
23 points
1 comment 1 min read LW link
(adamjones.me)

Notes on Internal Objectives in Toy Models of Agents

Paul Colognese 22 Feb 2024 8:02 UTC
16 points
0 comments 8 min read LW link

The Byronic Hero Always Loses

Cole Wyeth 22 Feb 2024 1:31 UTC
32 points
4 comments 2 min read LW link

Job Listing: Managing Editor / Writer

Gretta Duleba 21 Feb 2024 23:41 UTC
43 points
2 comments 1 min read LW link

The Pareto Best and the Curse of Doom

Screwtape 21 Feb 2024 23:10 UTC
123 points
21 comments 9 min read LW link

AISN #31: A New AI Policy Bill in California Plus, Precedents for AI Governance and The EU AI Office

Dan H 21 Feb 2024 21:58 UTC
17 points
0 comments 6 min read LW link
(newsletter.safe.ai)

Analogies between scaling labs and misaligned superintelligent AI

scasper 21 Feb 2024 19:29 UTC
77 points
5 comments 4 min read LW link

Extinction Risks from AI: Invisible to Science?

21 Feb 2024 18:07 UTC
24 points
7 comments 1 min read LW link
(arxiv.org)

Extinction-level Goodhart’s Law as a Property of the Environment

21 Feb 2024 17:56 UTC
23 points
0 comments 10 min read LW link