Scott Aaronson on “Reform AI Alignment”

Shmi · 20 Nov 2022 22:20 UTC
39 points
17 comments · 1 min read · LW link
(scottaaronson.blog)

On Morality, Ethics, and all that Jazz

Delen Heisman · 20 Nov 2022 20:00 UTC
4 points
4 comments · 2 min read · LW link
(delen.substack.com)

Limits to the Controllability of AGI

20 Nov 2022 19:18 UTC
10 points
2 comments · 9 min read · LW link

Career Scouting: Dentistry

koratkar · 20 Nov 2022 15:55 UTC
69 points
5 comments · 5 min read · LW link
(careerscouting.substack.com)

Decision Theory but also Ghosts

eva_ · 20 Nov 2022 13:24 UTC
20 points
26 comments · 10 min read · LW link

ARC paper: Formalizing the presumption of independence

Erik Jenner · 20 Nov 2022 1:22 UTC
97 points
2 comments · 2 min read · LW link
(arxiv.org)

Update to Mysteries of mode collapse: text-davinci-002 not RLHF

janus · 19 Nov 2022 23:51 UTC
71 points
8 comments · 2 min read · LW link

Make the Drought Evaporate!

AnthonyRepetto · 19 Nov 2022 23:41 UTC
32 points
25 comments · 3 min read · LW link

Elastic Productivity Tools

Simon Berens · 19 Nov 2022 21:59 UTC
76 points
8 comments · 2 min read · LW link
(simonberens.me)

A Short Dialogue on the Meaning of Reward Functions

19 Nov 2022 21:04 UTC
45 points
0 comments · 3 min read · LW link

By Default, GPTs Think In Plain Sight

Fabien Roger · 19 Nov 2022 19:15 UTC
89 points
36 comments · 9 min read · LW link

Review: Bayesian Statistics the Fun Way by Will Kurt

matto · 19 Nov 2022 18:52 UTC
4 points
2 comments · 2 min read · LW link

[Question] How does acausal trade work in a deterministic multiverse?

sisyphus · 19 Nov 2022 1:50 UTC
2 points
13 comments · 1 min read · LW link

Choosing the right dish

Adam Zerner · 19 Nov 2022 1:38 UTC
38 points
7 comments · 8 min read · LW link

Reflective Consequentialism

Adam Zerner · 18 Nov 2022 23:56 UTC
21 points
14 comments · 4 min read · LW link

Value Created vs. Value Extracted

Sable · 18 Nov 2022 21:34 UTC
8 points
6 comments · 6 min read · LW link
(affablyevil.substack.com)

The Disastrously Confident And Inaccurate AI

Sharat Jacob Jacob · 18 Nov 2022 19:06 UTC
13 points
0 comments · 13 min read · LW link

How AI Fails Us: A non-technical view of the Alignment Problem

testingthewaters · 18 Nov 2022 19:02 UTC
7 points
1 comment · 2 min read · LW link
(ethics.harvard.edu)

[Question] Is there any policy for a fair treatment of AIs whose friendliness is in doubt?

nahoj · 18 Nov 2022 19:01 UTC
16 points
10 comments · 1 min read · LW link

Distillation of “How Likely Is Deceptive Alignment?”

NickGabs · 18 Nov 2022 16:31 UTC
24 points
4 comments · 10 min read · LW link

Contra Chords

jefftk · 18 Nov 2022 16:20 UTC
12 points
1 comment · 7 min read · LW link
(www.jefftk.com)

[Question] Updates on scaling laws for foundation models from ‘Transcending Scaling Laws with 0.1% Extra Compute’

Nick_Greig · 18 Nov 2022 12:46 UTC
15 points
2 comments · 1 min read · LW link

Halifax, NS – Monthly Rationalist, EA, and ACX Meetup

Ideopunk · 18 Nov 2022 11:45 UTC
10 points
0 comments · 1 min read · LW link

Introducing The Logical Foundation, an EA-Aligned Nonprofit with a Plan to End Poverty With Guaranteed Income

Michael Simm · 18 Nov 2022 8:13 UTC
9 points
23 comments · 24 min read · LW link

My Deontology Says Narrow-Mindedness is Always Wrong

LVSN · 18 Nov 2022 6:11 UTC
6 points
2 comments · 1 min read · LW link

AI Ethics != AI Safety

Dentin · 18 Nov 2022 3:02 UTC
2 points
0 comments · 1 min read · LW link

Don’t design agents which exploit adversarial inputs

18 Nov 2022 1:48 UTC
72 points
64 comments · 12 min read · LW link

Engineering Monosemanticity in Toy Models

18 Nov 2022 1:43 UTC
75 points
7 comments · 3 min read · LW link
(arxiv.org)

AGIs may value intrinsic rewards more than extrinsic ones

catubc · 17 Nov 2022 21:49 UTC
8 points
6 comments · 4 min read · LW link

LLMs may capture key components of human agency

catubc · 17 Nov 2022 20:14 UTC
27 points
0 comments · 4 min read · LW link

Mastodon Replies as Comments

jefftk · 17 Nov 2022 20:10 UTC
20 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Announcing the Progress Forum

jasoncrawford · 17 Nov 2022 19:26 UTC
83 points
9 comments · 1 min read · LW link

[Question] What kind of bias is this?

Daniel Samuel · 17 Nov 2022 18:44 UTC
3 points
2 comments · 1 min read · LW link

AI Forecasting Research Ideas

Jsevillamol · 17 Nov 2022 17:37 UTC
21 points
2 comments · 1 min read · LW link
(docs.google.com)

Results from the interpretability hackathon

17 Nov 2022 14:51 UTC
81 points
0 comments · 6 min read · LW link
(alignmentjam.com)

Covid 11/17/22: Slow Recovery

Zvi · 17 Nov 2022 14:50 UTC
33 points
3 comments · 4 min read · LW link
(thezvi.wordpress.com)

Sadly, FTX

Zvi · 17 Nov 2022 14:30 UTC
133 points
18 comments · 47 min read · LW link
(thezvi.wordpress.com)

Deontology and virtue ethics as “effective theories” of consequentialist ethics

Jan_Kulveit · 17 Nov 2022 14:11 UTC
68 points
9 comments · 10 min read · LW link · 1 review

The Ground Truth Problem (Or, Why Evaluating Interpretability Methods Is Hard)

Jessica Rumbelow · 17 Nov 2022 11:06 UTC
27 points
2 comments · 2 min read · LW link

[Question] [Personal Question] Can anyone help me navigate this potentially painful interpersonal dynamic rationally?

SlainLadyMondegreen · 17 Nov 2022 8:53 UTC
9 points
3 comments · 4 min read · LW link

Massive Scaling Should be Frowned Upon

harsimony · 17 Nov 2022 8:43 UTC
5 points
6 comments · 5 min read · LW link

[Question] Why are profitable companies laying off staff?

Yair Halberstadt · 17 Nov 2022 6:19 UTC
15 points
10 comments · 1 min read · LW link

[Question] [retracted] Discussion: Was SBF a naive utilitarian, or a sociopath?

Nicholas Kross · 17 Nov 2022 2:52 UTC
0 points
4 comments · 1 min read · LW link

Kelsey Piper’s recent interview of SBF

agucova · 16 Nov 2022 20:30 UTC
51 points
29 comments · 2 min read · LW link
(www.vox.com)

The Echo Principle

Jonathan Moregård · 16 Nov 2022 20:09 UTC
4 points
0 comments · 3 min read · LW link
(honestliving.substack.com)

[Question] Is there some reason LLMs haven’t seen broader use?

tailcalled · 16 Nov 2022 20:04 UTC
25 points
27 comments · 1 min read · LW link

When should we be surprised that an invention took “so long”?

jasoncrawford · 16 Nov 2022 20:04 UTC
32 points
11 comments · 4 min read · LW link
(rootsofprogress.org)

Questions about Value Lock-in, Paternalism, and Empowerment

Sam F. Brown · 16 Nov 2022 15:33 UTC
13 points
2 comments · 12 min read · LW link
(sambrown.eu)

If Professional Investors Missed This...

jefftk · 16 Nov 2022 15:00 UTC
37 points
18 comments · 3 min read · LW link
(www.jefftk.com)

Disagreement with bio anchors that lead to shorter timelines

Marius Hobbhahn · 16 Nov 2022 14:40 UTC
75 points
17 comments · 7 min read · LW link · 1 review