Myopia

Last edit: 2 Oct 2020 23:31 UTC by Ben Pace

Myopia means short-sightedness, particularly with respect to planning: neglecting long-term consequences in favor of short-term ones. The extreme case, in which only immediate rewards are considered, is of particular interest. A myopic agent can be thought of as one that considers only how best to answer the single question it is given, rather than any long-term consequences of its answer. Such an agent might have a number of desirable safety properties, such as a lack of instrumental incentives.
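In reinforcement-learning terms, the extreme case corresponds to a discount factor of zero: future rewards carry no weight at all. A minimal sketch of that contrast, with invented toy actions and rewards (not drawn from any of the posts below):

```python
def action_value(now, later, gamma):
    """Value of an action: immediate reward plus discounted future reward."""
    return now + gamma * later

def choose(actions, gamma):
    """Pick the action with the highest discounted value."""
    return max(actions, key=lambda a: action_value(a["now"], a["later"], gamma))

# Two hypothetical actions: "grab" pays off now, "invest" pays off later.
actions = [
    {"name": "grab",   "now": 1.0, "later": 0.0},
    {"name": "invest", "now": 0.0, "later": 10.0},
]

myopic_choice     = choose(actions, gamma=0.0)["name"]  # future ignored entirely
farsighted_choice = choose(actions, gamma=0.9)["name"]  # future weighed in

print(myopic_choice, farsighted_choice)  # grab invest
```

With gamma = 0 the agent takes the small immediate payoff; with any substantial discount factor it plans for the larger delayed one. The safety-relevant hope is that the gamma = 0 agent also lacks incentives to influence the future at all.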

Partial Agency

abramdemski · 27 Sep 2019 22:04 UTC
58 points
18 comments · 9 min read · LW link

The Credit Assignment Problem

abramdemski · 8 Nov 2019 2:50 UTC
85 points
40 comments · 17 min read · LW link · 1 review

Towards a mechanistic understanding of corrigibility

evhub · 22 Aug 2019 23:20 UTC
44 points
26 comments · 6 min read · LW link

Open Problems with Myopia

10 Mar 2021 18:38 UTC
57 points
16 comments · 8 min read · LW link

Steering Behaviour: Testing for (Non-)Myopia in Language Models

5 Dec 2022 20:28 UTC
35 points
13 comments · 10 min read · LW link

Defining Myopia

abramdemski · 19 Oct 2019 21:32 UTC
32 points
18 comments · 8 min read · LW link

LCDT, A Myopic Decision Theory

3 Aug 2021 22:41 UTC
50 points
51 comments · 15 min read · LW link

Arguments against myopic training

Richard_Ngo · 9 Jul 2020 16:07 UTC
56 points
39 comments · 12 min read · LW link

An overview of 11 proposals for building safe advanced AI

evhub · 29 May 2020 20:38 UTC
194 points
36 comments · 38 min read · LW link · 2 reviews

Bayesian Evolving-to-Extinction

abramdemski · 14 Feb 2020 23:55 UTC
38 points
13 comments · 5 min read · LW link

Random Thoughts on Predict-O-Matic

abramdemski · 17 Oct 2019 23:39 UTC
31 points
3 comments · 9 min read · LW link

The Parable of Predict-O-Matic

abramdemski · 15 Oct 2019 0:49 UTC
289 points
42 comments · 14 min read · LW link · 2 reviews

Self-Fulfilling Prophecies Aren’t Always About Self-Awareness

John_Maxwell · 18 Nov 2019 23:11 UTC
14 points
7 comments · 4 min read · LW link

The Dualist Predict-O-Matic ($100 prize)

John_Maxwell · 17 Oct 2019 6:45 UTC
16 points
35 comments · 5 min read · LW link

Why GPT wants to mesa-optimize & how we might change this

John_Maxwell · 19 Sep 2020 13:48 UTC
55 points
32 comments · 9 min read · LW link

2019 Review Rewrite: Seeking Power is Often Robustly Instrumental in MDPs

TurnTrout · 23 Dec 2020 17:16 UTC
35 points
0 comments · 4 min read · LW link
(www.lesswrong.com)

Seeking Power is Often Convergently Instrumental in MDPs

5 Dec 2019 2:33 UTC
153 points
38 comments · 16 min read · LW link · 2 reviews
(arxiv.org)

Understanding and controlling auto-induced distributional shift

LRudL · 13 Dec 2021 14:59 UTC
26 points
3 comments · 16 min read · LW link

Evan Hubinger on Homogeneity in Takeoff Speeds, Learned Optimization and Interpretability

Michaël Trazzi · 8 Jun 2021 19:20 UTC
28 points
0 comments · 55 min read · LW link

Fighting Akrasia: Incentivising Action

G Gordon Worley III · 29 Apr 2009 13:48 UTC
11 points
58 comments · 2 min read · LW link

Graphical World Models, Counterfactuals, and Machine Learning Agents

Koen.Holtman · 17 Feb 2021 11:07 UTC
6 points
2 comments · 10 min read · LW link

Transforming myopic optimization to ordinary optimization: Do we want to seek convergence for myopic optimization problems?

tailcalled · 11 Dec 2021 20:38 UTC
12 points
1 comment · 5 min read · LW link

How complex are myopic imitators?

Vivek Hebbar · 8 Feb 2022 12:00 UTC
23 points
1 comment · 15 min read · LW link

AI safety via market making

evhub · 26 Jun 2020 23:07 UTC
55 points
45 comments · 11 min read · LW link

Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios

Evan R. Murphy · 12 May 2022 20:01 UTC
45 points
0 comments · 59 min read · LW link

Acceptability Verification: A Research Agenda

12 Jul 2022 20:11 UTC
43 points
0 comments · 1 min read · LW link
(docs.google.com)

Laziness in AI

Richard Henage · 2 Sep 2022 17:04 UTC
11 points
5 comments · 1 min read · LW link

Generative, Episodic Objectives for Safe AI

Michael Glass · 5 Oct 2022 23:18 UTC
11 points
3 comments · 8 min read · LW link

Limiting an AGI’s Context Temporally

EulersApprentice · 17 Feb 2019 3:29 UTC
5 points
11 comments · 1 min read · LW link

Simulators

janus · 2 Sep 2022 12:45 UTC
459 points
102 comments · 44 min read · LW link
(generative.ink)