Outline of Possible Sources of Values

I don’t know what my values are. I don’t even know how to find out what my values are. But do I know something about how I (or an FAI) may be able to find out what my values are? Perhaps… and I’ve organized my answer to this question in the form of an “Outline of Possible Sources of Values”. I hope it also serves as a summary of the major open problems in this area.

  1. External

    1. god(s)

    2. other humans

    3. other agents

  2. Behavioral

    1. actual (historical/observed) behavior

    2. counterfactual (simulated/predicted) behavior

  3. Subconscious Cognition

    1. model-based decision making (see the toy code sketch after this outline for a contrast with the model-free case below)

      1. ontology

      2. heuristics for extrapolating/updating model

      3. (partial) utility function

    2. model-free decision making

      1. identity-based (adopt a social role like “environmentalist” or “academic” and emulate an appropriate role model, actual or idealized)

      2. habits

      3. reinforcement-based

  4. Conscious Cognition

    1. decision making using explicit verbal and/or quantitative reasoning

      1. consequentialist (similar to model-based above, but using explicit reasoning)

      2. deontological

      3. virtue ethical

      4. identity-based

    2. reasoning about terminal goals/values/preferences/moral principles

      1. responses (changes in state) to moral arguments (possibly context dependent)

      2. distributions of autonomously generated moral arguments (possibly context dependent)

      3. logical structure (if any) of moral reasoning

    3. object-level intuitions/judgments

      1. about what one should do in particular ethical situations

      2. about the desirabilities of particular outcomes

      3. about moral principles

    4. meta-level intuitions/judgments

      1. about the nature of morality

      2. about the complexity of values

      3. about what the valid sources of values are

      4. about what constitutes correct moral reasoning

      5. about how to explicitly/formally/effectively represent values (utility function, multiple utility functions, deontological rules, or something else) (if utility function(s), for what decision theory and ontology?)

      6. about how to extract/translate/combine sources of values into a representation of values

        1. how to solve ontological crises

        2. how to deal with native utility function or revealed preferences being partial

        3. how to translate non-consequentialist sources of values into utility function(s)

        4. how to deal with moral principles being vague and incomplete

        5. how to deal with conflicts between different sources of values

        6. how to deal with lack of certainty in one’s intuitions/judgments

      7. whose intuition/judgment ought to be applied? (may be different for each of the above)

        1. the subject’s (at what point in time? current intuitions, eventual judgments, or something in between?)

        2. the FAI designers’

        3. the FAI’s own philosophical conclusions
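
To make the distinction between model-based and model-free decision making (items 3.1 and 3.2) a bit more concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption: a toy ontology of three states, a hand-written world model, a (partial) utility function, and a standard Q-learning update for the reinforcement-based case. The model-based chooser predicts each action’s outcome with the model and evaluates it with the utility function; the model-free chooser consults only action values cached from past rewards.

```python
# Toy sketch only: the states, actions, rewards, and learning constants below
# are illustrative assumptions, not anything proposed in the outline itself.
import random

STATES = ["home", "work", "gym"]           # a tiny "ontology" of world states
ACTIONS = ["stay", "commute", "exercise"]

def transition(state, action):
    """A hand-written world model: which state each action leads to."""
    if action == "commute":
        return "work" if state == "home" else "home"
    if action == "exercise":
        return "gym"
    return state

def utility(state):
    """A (partial) utility function over states in this toy ontology."""
    return {"home": 0.0, "work": 1.0, "gym": 0.5}[state]

def model_based_choice(state):
    """Model-based: predict each action's outcome with the model, pick the best."""
    return max(ACTIONS, key=lambda a: utility(transition(state, a)))

def model_free_choice(state, q, epsilon=0.1):
    """Model-free (reinforcement-based): act on cached action values, no model used."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)      # occasional exploration
    return max(ACTIONS, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning step: action values are adjusted from experienced reward alone."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])

if __name__ == "__main__":
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = "home"
    for _ in range(200):                   # let the model-free learner gather experience
        action = model_free_choice(state, q)
        next_state = transition(state, action)
        q_update(q, state, action, utility(next_state), next_state)
        state = next_state
    print(model_based_choice("home"))              # chosen by explicit lookahead
    print(model_free_choice("home", q, epsilon=0.0))  # chosen from cached values
```

The point is only the structural contrast: the first chooser relies on an ontology, a world model, and a (partial) utility function (roughly 3.1.1–3.1.3), while the second relies only on a reward signal and cached action values (roughly 3.2.3).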

Using this outline, we can concisely characterize what many metaethical theories and FAI proposals claim or suggest, and how they differ from each other. For example, Nyan_Sandwich’s “morality is awesome” thesis can be interpreted as the claim that the most important source of values is our intuitions about the desirability (awesomeness) of particular outcomes.

As another example, Aaron Swartz argued against “reflective equilibrium”, by which he meant the claim that the valid sources of values are our object-level moral intuitions, and that correct moral reasoning consists of working back and forth between these intuitions until they reach coherence. His own position was that intuitions about moral principles are the only valid source of values, and that we should discount our intuitions about particular ethical situations.

A final example is Paul Christiano’s “Indirect Normativity” proposal for FAI (n.b., “Indirect Normativity” was originally coined by Nick Bostrom to refer to an entire class of designs where the AI’s values are defined “indirectly”), where an important source of values is the distribution of moral arguments the subject is likely to generate in a particular simulated environment, together with the subject’s responses to those arguments. Also, just about every meta-level question is left for the (simulated) subject to answer, except for the decision theory and ontology of the utility function that their values must finally be encoded in, which is fixed by the FAI designer.

I think the outline includes most of the ideas brought up in past LW discussions, or in moral philosophies that I’m familiar with. Please let me know if I left out anything important.