The Sense Of Physical Necessity: A Naturalism Demo (Introduction)

LoganStrohl 24 Feb 2024 2:56 UTC
31 points
0 comments 6 min read LW link

Instrumental deception and manipulation in LLMs - a case study

Olli Järviniemi 24 Feb 2024 2:07 UTC
23 points
0 comments 12 min read LW link

A starting point for making sense of task structure (in machine learning)

24 Feb 2024 1:51 UTC
18 points
0 comments 12 min read LW link

Why you, personally, should want a larger human population

jasoncrawford 23 Feb 2024 19:48 UTC
20 points
7 comments 5 min read LW link
(rootsofprogress.org)

Deliberative Cognitive Algorithms as Scaffolding

Cole Wyeth 23 Feb 2024 17:15 UTC
16 points
4 comments 3 min read LW link

The Shutdown Problem: Incomplete Preferences as a Solution

EJT 23 Feb 2024 16:01 UTC
40 points
1 comment 41 min read LW link

In set theory, everything is a set

g-w1 23 Feb 2024 14:35 UTC
11 points
9 comments 2 min read LW link

The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms

Bill Benzon 23 Feb 2024 12:19 UTC
4 points
0 comments 10 min read LW link

Deep and obvious points in the gap between your thoughts and your pictures of thought

KatjaGrace 23 Feb 2024 7:30 UTC
28 points
5 comments 1 min read LW link
(worldspiritsockpuppet.com)

Parasocial relationship logic

KatjaGrace 23 Feb 2024 7:30 UTC
20 points
1 comment 1 min read LW link
(worldspiritsockpuppet.com)

Shaming with and without naming

KatjaGrace 23 Feb 2024 7:30 UTC
13 points
5 comments 2 min read LW link
(worldspiritsockpuppet.com)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.

Chi Nguyen 23 Feb 2024 6:10 UTC
34 points
12 comments 1 min read LW link

[Question] Does increasing the power of a multimodal LLM get you an agentic AI?

yanni 23 Feb 2024 4:14 UTC
3 points
3 comments 1 min read LW link

The natural boundaries between people

Chipmonk 23 Feb 2024 1:09 UTC
17 points
2 comments 8 min read LW link
(chipmonk.substack.com)

Embed your second brain in your first brain

dkl9 22 Feb 2024 21:46 UTC
10 points
3 comments 1 min read LW link
(dkl9.net)

Some Thoughts On Using Auctions For Land Valuation

harsimony 22 Feb 2024 19:54 UTC
0 points
8 comments 9 min read LW link
(progressandpoverty.substack.com)

The Binding of Isaac & Transparent Newcomb’s Problem

suvjectibity 22 Feb 2024 18:56 UTC
−11 points
0 comments 10 min read LW link

Research Post: Tasks That Language Models Don’t Learn

22 Feb 2024 18:52 UTC
36 points
21 comments 2 min read LW link
(arxiv.org)

Do sparse autoencoders find “true features”?

Demian Till 22 Feb 2024 18:06 UTC
64 points
18 comments 11 min read LW link

Everything Wrong with Roko’s Claims about an Engineered Pandemic

EZ97 22 Feb 2024 15:59 UTC
70 points
4 comments 16 min read LW link