SIA Is Just Being a Bayesian About the Fact That One Exists

omnizoid, 14 Nov 2023 22:55 UTC
2 points
5 comments, 4 min read, LW link

AI Alignment [progress] this Week (11/12/2023)

Logan Zoellner, 14 Nov 2023 22:21 UTC
6 points
0 comments, 2 min read, LW link
(midwitalignment.substack.com)

[Question] When did Eliezer Yudkowsky change his mind about neural networks?

[deactivated], 14 Nov 2023 21:24 UTC
31 points
15 comments, 1 min read, LW link

Betting on what is un-falsifiable and un-verifiable

Abhimanyu Pallavi Sudhir, 14 Nov 2023 21:11 UTC
13 points
0 comments, 14 min read, LW link

Facebook is Paying Me to Post

jefftk, 14 Nov 2023 19:10 UTC
26 points
5 comments, 1 min read, LW link
(www.jefftk.com)

Feelings, Nothing More than Feelings, About AI

PaulBecon, 14 Nov 2023 18:50 UTC
−3 points
0 comments, 3 min read, LW link

Kids or No kids

Kids or no kids, 14 Nov 2023 18:37 UTC
91 points
10 comments, 13 min read, LW link

Raemon’s Deliberate (“Purposeful?”) Practice Club

14 Nov 2023 18:24 UTC
61 points
11 comments, 22 min read, LW link

More metal less ore

Logan Kieller, 14 Nov 2023 16:59 UTC
8 points
3 comments, 2 min read, LW link
(logankieller.substack.com)

A framing for interpretability

Nina Rimsky, 14 Nov 2023 16:14 UTC
69 points
5 comments, 4 min read, LW link
(ninarimsky.substack.com)

Monthly Roundup #12: November 2023

Zvi, 14 Nov 2023 15:20 UTC
34 points
5 comments, 33 min read, LW link
(thezvi.wordpress.com)

Do you want a first-principled preparedness guide to prepare yourself and loved ones for potential catastrophes?

Ulrik Horn, 14 Nov 2023 12:13 UTC
15 points
5 comments, 15 min read, LW link

[Question] Is there Work on Embedded Agency in Cellular Automata Toy Models?

Johannes C. Mayer, 14 Nov 2023 9:08 UTC
9 points
0 comments, 1 min read, LW link

[Question] Would this be Progress in Solving Embedded Agency?

Johannes C. Mayer, 14 Nov 2023 9:08 UTC
9 points
2 comments, 2 min read, LW link

Is Interpretability All We Need?

RogerDearnaley, 14 Nov 2023 5:31 UTC
1 point
1 comment, 1 min read, LW link

What is wisdom?

TsviBT, 14 Nov 2023 2:13 UTC
32 points
3 comments, 13 min read, LW link

Festival Stats 2023

jefftk, 14 Nov 2023 1:20 UTC
9 points
0 comments, 1 min read, LW link
(www.jefftk.com)

Out of the Box

jesseduffield, 13 Nov 2023 23:43 UTC
5 points
1 comment, 7 min read, LW link

Loudly Give Up, Don’t Quietly Fade

Screwtape, 13 Nov 2023 23:30 UTC
138 points
11 comments, 6 min read, LW link

Great Empathy and Great Response Ability

positivesum, 13 Nov 2023 23:04 UTC
16 points
0 comments, 3 min read, LW link
(tryingtruly.substack.com)

Theories of Change for AI Auditing

13 Nov 2023 19:33 UTC
53 points
0 comments, 18 min read, LW link
(www.apolloresearch.ai)

They are made of repeating patterns

quetzal_rainbow, 13 Nov 2023 18:17 UTC
49 points
4 comments, 2 min read, LW link

How to Upload a Mind (In Three Not-So-Easy Steps)

13 Nov 2023 18:13 UTC
26 points
0 comments, 7 min read, LW link
(youtu.be)

Non-myopia stories

lberglund, 13 Nov 2023 17:52 UTC
28 points
10 comments, 7 min read, LW link

It’s OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood

Mikhail Samin, 13 Nov 2023 16:51 UTC
2 points
17 comments, 1 min read, LW link

Suggestions for chess puzzles

Zane, 13 Nov 2023 15:39 UTC
13 points
1 comment, 1 min read, LW link

Why small phenomenons are relevant to morality

Ryo, 13 Nov 2023 15:25 UTC
1 point
0 comments, 3 min read, LW link

Optionality approach to ethics

Ryo, 13 Nov 2023 15:23 UTC
7 points
2 comments, 3 min read, LW link

Redirecting one’s own taxes as an effective altruism method

David Gross, 13 Nov 2023 15:17 UTC
1 point
34 comments, 16 min read, LW link

AISC Project: Benchmarks for Stable Reflectivity

jacquesthibs, 13 Nov 2023 14:51 UTC
17 points
0 comments, 8 min read, LW link

AISC Project: Modelling Trajectories of Language Models

NickyP, 13 Nov 2023 14:33 UTC
26 points
0 comments, 12 min read, LW link

Bostrom Goes Unheard

Zvi, 13 Nov 2023 14:11 UTC
81 points
9 comments, 18 min read, LW link

November hangout in Warsaw

ntoxeg, 13 Nov 2023 13:20 UTC
1 point
1 comment, 1 min read, LW link

The Science Algorithm AISC Project

Johannes C. Mayer, 13 Nov 2023 12:52 UTC
12 points
0 comments, 1 min read, LW link
(docs.google.com)

You can just spontaneously call people you haven’t met in years

lc, 13 Nov 2023 5:21 UTC
154 points
19 comments, 1 min read, LW link

Zvi’s Manifold Markets House Rules

Zvi, 13 Nov 2023 0:28 UTC
39 points
4 comments, 3 min read, LW link

[Question] What’s your best utilitarian model for risking your best kidneys?

Ilio, 12 Nov 2023 23:01 UTC
−3 points
4 comments, 1 min read, LW link

Helpful examples to get a sense of modern automated manipulation

trevor, 12 Nov 2023 20:49 UTC
33 points
3 comments, 9 min read, LW link

The Snuggle/Date/Slap Protocol

MadHatter, 12 Nov 2023 20:44 UTC
−21 points
4 comments, 2 min read, LW link

Two children’s stories

Optimization Process, 12 Nov 2023 20:29 UTC
11 points
1 comment, 7 min read, LW link

The Fundamental Theorem for measurable factor spaces

Matthias G. Mayer, 12 Nov 2023 19:25 UTC
38 points
2 comments, 2 min read, LW link

How accurate are standard Dark Triad personality scales?

jamesbill, 12 Nov 2023 8:21 UTC
0 points
2 comments, 2 min read, LW link

[Question] What ML gears do you like?

Ulisse Mini, 11 Nov 2023 19:10 UTC
25 points
4 comments, 1 min read, LW link

Smart Ses­sions—Fi­nally a (kinda) win­dow-cen­tric ses­sion manager

Eli Tyre, 11 Nov 2023 18:54 UTC
13 points
3 comments, 5 min read, LW link

AISC project: SatisfIA – AI that satisfies without overdoing it

Jobst Heitzig, 11 Nov 2023 18:22 UTC
11 points
0 comments, 1 min read, LW link
(docs.google.com)

Control Symmetry: why we might want to start investigating asymmetric alignment interventions

domenicrosati, 11 Nov 2023 17:27 UTC
23 points
1 comment, 2 min read, LW link

Game Theory without Argmax [Part 2]

Cleo Nardo, 11 Nov 2023 16:02 UTC
31 points
14 comments, 13 min read, LW link

Game Theory without Argmax [Part 1]

Cleo Nardo, 11 Nov 2023 15:59 UTC
53 points
16 comments, 19 min read, LW link

It’s OK to be biased towards humans

dr_s, 11 Nov 2023 11:59 UTC
55 points
69 comments, 6 min read, LW link

The Top AI Safety Bets for 2023: GiveWiki’s Latest Recommendations

Dawn Drescher, 11 Nov 2023 9:04 UTC
2 points
2 comments, 1 min read, LW link