Why you, personally, should want a larger human population

jasoncrawford 23 Feb 2024 19:48 UTC
32 points
32 comments 5 min read LW link
(rootsofprogress.org)

Deliberative Cognitive Algorithms as Scaffolding

Cole Wyeth 23 Feb 2024 17:15 UTC
18 points
4 comments 3 min read LW link

The Shutdown Problem: Incomplete Preferences as a Solution

EJT 23 Feb 2024 16:01 UTC
50 points
22 comments 41 min read LW link

In set theory, everything is a set

Jacob G-W 23 Feb 2024 14:35 UTC
11 points
9 comments 2 min read LW link

The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms

Bill Benzon 23 Feb 2024 12:19 UTC
4 points
0 comments 10 min read LW link

Deep and obvious points in the gap between your thoughts and your pictures of thought

KatjaGrace 23 Feb 2024 7:30 UTC
42 points
6 comments 1 min read LW link
(worldspiritsockpuppet.com)

Parasocial relationship logic

KatjaGrace 23 Feb 2024 7:30 UTC
20 points
1 comment 1 min read LW link
(worldspiritsockpuppet.com)

Shaming with and without naming

KatjaGrace 23 Feb 2024 7:30 UTC
15 points
5 comments 2 min read LW link
(worldspiritsockpuppet.com)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.

Chi Nguyen 23 Feb 2024 6:10 UTC
54 points
18 comments 1 min read LW link

[Question] Does increasing the power of a multimodal LLM get you an agentic AI?

yanni kyriacos 23 Feb 2024 4:14 UTC
3 points
3 comments 1 min read LW link

The natural boundaries between people

Chipmonk 23 Feb 2024 1:09 UTC
20 points
2 comments 8 min read LW link
(chipmonk.substack.com)

Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”

Ricki Heicklen 22 Feb 2024 23:56 UTC
184 points
5 comments 4 min read LW link
(bayesshammai.substack.com)

AI #52: Oops

Zvi 22 Feb 2024 21:50 UTC
51 points
9 comments 29 min read LW link
(thezvi.wordpress.com)

Embed your second brain in your first brain

dkl9 22 Feb 2024 21:46 UTC
10 points
3 comments 1 min read LW link
(dkl9.net)

The Gemini Incident

Zvi 22 Feb 2024 21:00 UTC
80 points
19 comments 18 min read LW link
(thezvi.wordpress.com)

Some Thoughts On Using Auctions For Land Valuation

harsimony 22 Feb 2024 19:54 UTC
0 points
9 comments 9 min read LW link
(progressandpoverty.substack.com)

The Binding of Isaac & Transparent Newcomb’s Problem

suvjectibity 22 Feb 2024 18:56 UTC
−11 points
0 comments 10 min read LW link

Research Post: Tasks That Language Models Don’t Learn

22 Feb 2024 18:52 UTC
39 points
23 comments 2 min read LW link
(arxiv.org)

Sora What

Zvi 22 Feb 2024 18:10 UTC
46 points
3 comments 9 min read LW link
(thezvi.wordpress.com)

Do sparse autoencoders find “true features”?

Demian Till 22 Feb 2024 18:06 UTC
70 points
33 comments 11 min read LW link

Everything Wrong with Roko’s Claims about an Engineered Pandemic

EZ97 22 Feb 2024 15:59 UTC
90 points
10 comments 16 min read LW link

The One and a Half Gemini

Zvi 22 Feb 2024 13:10 UTC
73 points
4 comments 8 min read LW link
(thezvi.wordpress.com)

[Question] How do I make predictions about the future to make sense of what to do with my life?

Raj Thimmiah 22 Feb 2024 11:22 UTC
8 points
1 comment 1 min read LW link

How are voluntary commitments on vulnerability reporting going?

Adam Jones 22 Feb 2024 8:43 UTC
23 points
1 comment 1 min read LW link
(adamjones.me)

Notes on Internal Objectives in Toy Models of Agents

Paul Colognese 22 Feb 2024 8:02 UTC
16 points
0 comments 8 min read LW link

The Byronic Hero Always Loses

Cole Wyeth 22 Feb 2024 1:31 UTC
31 points
4 comments 2 min read LW link

Job Listing: Managing Editor / Writer

Gretta Duleba 21 Feb 2024 23:41 UTC
43 points
2 comments 1 min read LW link

The Pareto Best and the Curse of Doom

Screwtape 21 Feb 2024 23:10 UTC
110 points
22 comments 9 min read LW link

AISN #31: A New AI Policy Bill in California Plus, Precedents for AI Governance and The EU AI Office

21 Feb 2024 21:58 UTC
17 points
0 comments 6 min read LW link
(newsletter.safe.ai)

Analogies between scaling labs and misaligned superintelligent AI

scasper 21 Feb 2024 19:29 UTC
74 points
5 comments 4 min read LW link

Extinction Risks from AI: Invisible to Science?

21 Feb 2024 18:07 UTC
24 points
7 comments 1 min read LW link
(arxiv.org)

Extinction-level Goodhart’s Law as a Property of the Environment

21 Feb 2024 17:56 UTC
23 points
0 comments 10 min read LW link

Dynamics Crucial to AI Risk Seem to Make for Complicated Models

21 Feb 2024 17:54 UTC
18 points
0 comments 9 min read LW link

Which Model Properties are Necessary for Evaluating an Argument?

21 Feb 2024 17:52 UTC
17 points
2 comments 7 min read LW link

Weak vs Quantitative Extinction-level Goodhart’s Law

21 Feb 2024 17:38 UTC
17 points
1 comment 2 min read LW link

Dual Wielding Kindle Scribes

mesaoptimizer 21 Feb 2024 17:17 UTC
50 points
18 comments 6 min read LW link

A Tale of Two Restaurant Types

Zvi 21 Feb 2024 13:50 UTC
15 points
0 comments 6 min read LW link
(thezvi.wordpress.com)

Less Wrong automated systems are inadvertently Censoring me

Roko 21 Feb 2024 12:57 UTC
8 points
52 comments 1 min read LW link

[Question] What is the research speed multiplier of the most advanced current LLMs?

wunan 21 Feb 2024 12:39 UTC
6 points
2 comments 1 min read LW link

Jailbreaking GPT-4 with the tool API

mishajw 21 Feb 2024 11:16 UTC
20 points
2 comments 4 min read LW link

Gut Renovating Another Bathroom

jefftk 21 Feb 2024 3:00 UTC
22 points
0 comments 2 min read LW link
(www.jefftk.com)

Thoughts for and against an ASI figuring out ethics for itself

sweenesm 20 Feb 2024 23:40 UTC
6 points
10 comments 3 min read LW link

AI #51: Altman’s Ambition

Zvi 20 Feb 2024 19:50 UTC
83 points
5 comments 38 min read LW link
(thezvi.wordpress.com)

The Third Gemini

Zvi 20 Feb 2024 19:50 UTC
30 points
2 comments 9 min read LW link
(thezvi.wordpress.com)

Why does generalization work?

Martín Soto 20 Feb 2024 17:51 UTC
43 points
16 comments 4 min read LW link

Representations of Abstract Relations in Infancy

Bruce W. Lee 20 Feb 2024 17:40 UTC
2 points
0 comments 3 min read LW link
(direct.mit.edu)

ChatGPT refuses to accept a challenge where it would get shot between the eyes [game theory]

Bill Benzon 20 Feb 2024 16:55 UTC
4 points
6 comments 4 min read LW link

Inducing human-like biases in moral reasoning LMs

20 Feb 2024 16:28 UTC
19 points
3 comments 14 min read LW link

Monthly Roundup #15: February 2024

Zvi 20 Feb 2024 13:10 UTC
22 points
7 comments 32 min read LW link
(thezvi.wordpress.com)

Selections From “The Trouble With Being Born”

Arjun Panickssery 20 Feb 2024 10:07 UTC
23 points
2 comments 2 min read LW link
(arjunpanickssery.substack.com)