
Myopia

Last edit: 2 Oct 2020 23:31 UTC by Ben Pace

Myopia means short-sightedness, particularly with respect to planning: neglecting long-term consequences in favor of the short term. The extreme case, in which only immediate rewards are considered, is of particular interest. We can think of a myopic agent as one that considers only how best to answer the single question it is given, rather than any sort of long-term consequences. Such an agent might have a number of desirable safety properties, such as a lack of instrumental incentives.
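As a rough illustration (not drawn from any of the posts below), the myopic/non-myopic distinction can be phrased in terms of the discount factor: a fully myopic agent optimizes only its immediate reward, which corresponds to setting the discount γ to zero in the usual discounted return. The short Python sketch below is hypothetical; the actions and reward sequences are invented purely to show how the two objectives can favor different choices.

```python
# Minimal sketch (hypothetical example): a fully myopic agent picks the action
# with the best immediate reward, while a non-myopic agent values the whole
# discounted future. The toy actions and rewards here are made up for illustration.

def discounted_return(rewards, gamma):
    """Sum of gamma^t * r_t over a sequence of future rewards."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

# Each action leads to a fixed sequence of rewards over the next few steps.
action_rewards = {
    "answer_now":  [1.0, 0.0, 0.0],  # good immediately, nothing later
    "build_power": [0.2, 2.0, 2.0],  # poor now, pays off later
}

def best_action(gamma):
    return max(action_rewards,
               key=lambda a: discounted_return(action_rewards[a], gamma))

print(best_action(gamma=0.0))   # myopic agent -> "answer_now"
print(best_action(gamma=0.9))   # long-horizon agent -> "build_power"
```

With γ = 0 the comparison favors the immediately rewarding action; with γ = 0.9 it favors the action that pays off later, which is exactly the kind of long-horizon incentive that myopia is meant to rule out.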

Partial Agency

abramdemski · 27 Sep 2019 22:04 UTC
55 points
18 comments · 9 min read · LW link

The Credit Assignment Problem

abramdemski · 8 Nov 2019 2:50 UTC
77 points
40 comments · 17 min read · LW link · 1 review

Towards a mechanistic understanding of corrigibility

evhub · 22 Aug 2019 23:20 UTC
44 points
26 comments · 6 min read · LW link

Open Problems with Myopia

10 Mar 2021 18:38 UTC
49 points
16 comments · 8 min read · LW link

Defining Myopia

abramdemski · 19 Oct 2019 21:32 UTC
31 points
18 comments · 8 min read · LW link

LCDT, A Myopic Decision Theory

3 Aug 2021 22:41 UTC
49 points
50 comments · 15 min read · LW link

Arguments against myopic training

Richard_Ngo · 9 Jul 2020 16:07 UTC
55 points
39 comments · 12 min read · LW link

An overview of 11 proposals for building safe advanced AI

evhub · 29 May 2020 20:38 UTC
184 points
34 comments · 38 min read · LW link · 2 reviews

Bayesian Evolving-to-Extinction

abramdemski · 14 Feb 2020 23:55 UTC
38 points
13 comments · 5 min read · LW link

Random Thoughts on Predict-O-Matic

abramdemski · 17 Oct 2019 23:39 UTC
29 points
3 comments · 9 min read · LW link

The Parable of Predict-O-Matic

abramdemski · 15 Oct 2019 0:49 UTC
282 points
41 comments · 14 min read · LW link · 2 reviews

Self-Fulfilling Prophecies Aren’t Always About Self-Awareness

John_Maxwell · 18 Nov 2019 23:11 UTC
14 points
7 comments · 4 min read · LW link

The Dualist Predict-O-Matic ($100 prize)

John_Maxwell · 17 Oct 2019 6:45 UTC
16 points
35 comments · 5 min read · LW link

Why GPT wants to mesa-optimize & how we might change this

John_Maxwell · 19 Sep 2020 13:48 UTC
55 points
32 comments · 9 min read · LW link

2019 Review Rewrite: Seeking Power is Often Robustly Instrumental in MDPs

TurnTrout · 23 Dec 2020 17:16 UTC
35 points
0 comments · 4 min read · LW link
(www.lesswrong.com)

Seeking Power is Often Convergently Instrumental in MDPs

5 Dec 2019 2:33 UTC
146 points
35 comments · 16 min read · LW link · 2 reviews
(arxiv.org)

Understanding and controlling auto-induced distributional shift

LRudL · 13 Dec 2021 14:59 UTC
24 points
2 comments · 16 min read · LW link

Evan Hubinger on Homogeneity in Takeoff Speeds, Learned Optimization and Interpretability

Michaël Trazzi · 8 Jun 2021 19:20 UTC
28 points
0 comments · 55 min read · LW link

Fighting Akrasia: Incentivising Action

G Gordon Worley III · 29 Apr 2009 13:48 UTC
11 points
58 comments · 2 min read · LW link

Graphical World Models, Counterfactuals, and Machine Learning Agents

Koen.Holtman · 17 Feb 2021 11:07 UTC
6 points
2 comments · 10 min read · LW link

Transforming myopic optimization to ordinary optimization—Do we want to seek convergence for myopic optimization problems?

tailcalled · 11 Dec 2021 20:38 UTC
12 points
1 comment · 5 min read · LW link

How complex are myopic imitators?

Vivek Hebbar · 8 Feb 2022 12:00 UTC
23 points
1 comment · 15 min read · LW link

AI safety via market making

evhub · 26 Jun 2020 23:07 UTC
52 points
45 comments · 11 min read · LW link

Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios

Evan R. Murphy · 12 May 2022 20:01 UTC
42 points
0 comments · 59 min read · LW link