Everett branches, inter-light cone trade and other alien matters: Appendix to “An ECL explainer”

24 Feb 2024 23:09 UTC
17 points
0 comments · 1 min read · LW link

Cooperating with aliens and AGIs: An ECL explainer

24 Feb 2024 22:58 UTC
53 points
8 comments · 1 min read · LW link

Choosing My Quest (Part 2 of “The Sense Of Physical Necessity”)

LoganStrohl · 24 Feb 2024 21:31 UTC
40 points
7 comments · 12 min read · LW link

Rationality Research Report: Towards 10x OODA Looping?

Raemon · 24 Feb 2024 21:06 UTC
112 points
21 comments · 15 min read · LW link

Let’s ask some of the largest LLMs for tips and ideas on how to take over the world

Super AGI · 24 Feb 2024 20:35 UTC
1 point
0 comments · 7 min read · LW link

Exercise: Planmaking, Surprise Anticipation, and “Baba is You”

Raemon · 24 Feb 2024 20:33 UTC
41 points
15 comments · 6 min read · LW link

In search of God.

Spiritus Dei · 24 Feb 2024 18:59 UTC
−19 points
3 comments · 7 min read · LW link

Impossibility of Anthropocentric-Alignment

False Name · 24 Feb 2024 18:31 UTC
−8 points
2 comments · 39 min read · LW link

The Inner Alignment Problem

Jakub Halmeš · 24 Feb 2024 17:55 UTC
1 point
1 comment · 3 min read · LW link
(jakubhalmes.substack.com)

We Need Major, But Not Radical, FDA Reform

Maxwell Tabarrok · 24 Feb 2024 16:54 UTC
42 points
12 comments · 7 min read · LW link
(www.maximum-progress.com)

After Overmorrow: Scattered Musings on the Immediate Post-AGI World

Yuli_Ban · 24 Feb 2024 15:49 UTC
−3 points
0 comments · 26 min read · LW link

[Question] CDT vs. EDT on Deterrence

notfnofn · 24 Feb 2024 15:41 UTC
1 point
9 comments · 1 min read · LW link

Balancing Games

jefftk · 24 Feb 2024 14:40 UTC
61 points
18 comments · 1 min read · LW link
(www.jefftk.com)

How well do truth probes generalise?

mishajw · 24 Feb 2024 14:12 UTC
87 points
11 comments · 9 min read · LW link

Rawls’s Veil of Ignorance Doesn’t Make Any Sense

Arjun Panickssery · 24 Feb 2024 13:18 UTC
9 points
9 comments · 1 min read · LW link

[Question] Can someone explain to me what went wrong with ChatGPT?

Valentin Baltadzhiev · 24 Feb 2024 11:50 UTC
9 points
1 comment · 1 min read · LW link

The Sense Of Physical Necessity: A Naturalism Demo (Introduction)

LoganStrohl · 24 Feb 2024 2:56 UTC
59 points
1 comment · 6 min read · LW link

Instrumental deception and manipulation in LLMs—a case study

Olli Järviniemi · 24 Feb 2024 2:07 UTC
39 points
13 comments · 12 min read · LW link

A starting point for making sense of task structure (in machine learning)

24 Feb 2024 1:51 UTC
38 points
2 comments · 12 min read · LW link

Why you, personally, should want a larger human population

jasoncrawford · 23 Feb 2024 19:48 UTC
32 points
32 comments · 5 min read · LW link
(rootsofprogress.org)

Deliberative Cognitive Algorithms as Scaffolding

Cole Wyeth · 23 Feb 2024 17:15 UTC
18 points
4 comments · 3 min read · LW link

The Shutdown Problem: Incomplete Preferences as a Solution

EJT · 23 Feb 2024 16:01 UTC
50 points
21 comments · 41 min read · LW link

In set theory, everything is a set

Jacob G-W · 23 Feb 2024 14:35 UTC
11 points
9 comments · 2 min read · LW link

The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms

Bill Benzon · 23 Feb 2024 12:19 UTC
4 points
0 comments · 10 min read · LW link

Deep and obvious points in the gap between your thoughts and your pictures of thought

KatjaGrace · 23 Feb 2024 7:30 UTC
42 points
6 comments · 1 min read · LW link
(worldspiritsockpuppet.com)

Parasocial relationship logic

KatjaGrace · 23 Feb 2024 7:30 UTC
20 points
1 comment · 1 min read · LW link
(worldspiritsockpuppet.com)

Shaming with and without naming

KatjaGrace · 23 Feb 2024 7:30 UTC
15 points
5 comments · 2 min read · LW link
(worldspiritsockpuppet.com)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.

Chi Nguyen · 23 Feb 2024 6:10 UTC
54 points
18 comments · 1 min read · LW link

[Question] Does increasing the power of a multimodal LLM get you an agentic AI?

yanni kyriacos · 23 Feb 2024 4:14 UTC
3 points
3 comments · 1 min read · LW link

The natural boundaries between people

Chipmonk · 23 Feb 2024 1:09 UTC
20 points
2 comments · 8 min read · LW link
(chipmonk.substack.com)

Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”

Ricki Heicklen · 22 Feb 2024 23:56 UTC
184 points
5 comments · 4 min read · LW link
(bayesshammai.substack.com)

AI #52: Oops

Zvi · 22 Feb 2024 21:50 UTC
51 points
9 comments · 29 min read · LW link
(thezvi.wordpress.com)

Embed your second brain in your first brain

dkl9 · 22 Feb 2024 21:46 UTC
10 points
3 comments · 1 min read · LW link
(dkl9.net)

The Gemini Incident

Zvi · 22 Feb 2024 21:00 UTC
80 points
19 comments · 18 min read · LW link
(thezvi.wordpress.com)

Some Thoughts On Using Auctions For Land Valuation

harsimony · 22 Feb 2024 19:54 UTC
0 points
9 comments · 9 min read · LW link
(progressandpoverty.substack.com)

The Binding of Isaac & Transparent Newcomb’s Problem

suvjectibity · 22 Feb 2024 18:56 UTC
−11 points
0 comments · 10 min read · LW link

Research Post: Tasks That Language Models Don’t Learn

22 Feb 2024 18:52 UTC
39 points
23 comments · 2 min read · LW link
(arxiv.org)

Sora What

Zvi · 22 Feb 2024 18:10 UTC
46 points
3 comments · 9 min read · LW link
(thezvi.wordpress.com)

Do sparse autoencoders find “true features”?

Demian Till · 22 Feb 2024 18:06 UTC
70 points
33 comments · 11 min read · LW link

Everything Wrong with Roko’s Claims about an Engineered Pandemic

EZ97 · 22 Feb 2024 15:59 UTC
90 points
10 comments · 16 min read · LW link

The One and a Half Gemini

Zvi · 22 Feb 2024 13:10 UTC
73 points
4 comments · 8 min read · LW link
(thezvi.wordpress.com)

[Question] How do I make predictions about the future to make sense of what to do with my life?

Raj Thimmiah · 22 Feb 2024 11:22 UTC
8 points
1 comment · 1 min read · LW link

How are voluntary commitments on vulnerability reporting going?

Adam Jones · 22 Feb 2024 8:43 UTC
23 points
1 comment · 1 min read · LW link
(adamjones.me)

Notes on Internal Objectives in Toy Models of Agents

Paul Colognese · 22 Feb 2024 8:02 UTC
16 points
0 comments · 8 min read · LW link

The Byronic Hero Always Loses

Cole Wyeth · 22 Feb 2024 1:31 UTC
31 points
4 comments · 2 min read · LW link

Job Listing: Managing Editor / Writer

Gretta Duleba · 21 Feb 2024 23:41 UTC
43 points
2 comments · 1 min read · LW link

The Pareto Best and the Curse of Doom

Screwtape · 21 Feb 2024 23:10 UTC
108 points
22 comments · 9 min read · LW link

AISN #31: A New AI Policy Bill in California Plus, Precedents for AI Governance and The EU AI Office

21 Feb 2024 21:58 UTC
9 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

Analogies between scaling labs and misaligned superintelligent AI

scasper · 21 Feb 2024 19:29 UTC
72 points
4 comments · 4 min read · LW link

Extinction Risks from AI: Invisible to Science?

21 Feb 2024 18:07 UTC
24 points
7 comments · 1 min read · LW link
(arxiv.org)