A developmentally-situated approach to teaching normative behavior to AI

This was submitted to the EthicsNet Guardians’ Challenge. I’ll be honest here that I hadn’t thought much about what EthicsNet is trying to do, but I decided to write something and submit it anyway because it’s the sort of approach that seems reasonable if you come from an ML background, and I think I differ enough in my thinking that I can offer an alternative perspective that may help shape the project in ways I view as beneficial to its success. For that reason I think this is somewhat less coherent than my usual writing (or at least my thinking is less coherent, whether or not that shows in my writing), but nonetheless I chose to share it here in the interest of furthering discussion and possibly drumming up additional interest in EthicsNet. Their challenge has a week left in it, so if you think I’m wrong and you have a better idea, please submit it to them!


Based on the usefulness of ImageNet, MovieLens, and other comprehensive datasets for machine learning, it seems reasonable that we might create an EthicsNet of ethical data we could use to train AI systems to behave ethically (Watson, 2018). Such a dataset would aid in addressing issues of AI safety, especially as they relate to AGI, since it appears learning human values will be a key component of aligning AI with human interests (Bostrom, 2014). Unfortunately, building a dataset for ethics is more complicated than it is for images or movies, because ethics is primarily learned by situated, embodied agents acting in the world and receiving feedback on those actions, rather than by non-situated agents who learn about the world without understanding themselves to be part of it (Varela, 1999). Therefore we consider a way to fulfill the purpose of EthicsNet based on the idea that ethical knowledge is developmentally situated and so requires a generative procedure, rather than a traditional dataset, to train AI to adopt values and behave ethically.

Ethics is developmentally situated

In philosophy the study of ethics quickly turns to metaethics, because those are the sorts of questions that are of interest to philosophy, so it’s tempting to think, based on the philosophical literature of ethics, that learning to behave ethically (i.e. learning behavioral norms) is primarily about resolving ethical dilemmas and developing ethical theories that allow us to make consistent choices based on values. However, this would be to ignore the psychology of how people learn what behaviors are normative and apply those norms to engage in ethical reasoning (Peters, 1974). Rather than developing a coherent ethical framework from which to respond, humans learn ethics by first learning how to resolve particular ethical questions in particular ways, often without realizing they are engaged in ethical reasoning, and then generalizing until they come to ask questions about what is universally ethical (Kohlberg, Levine, & Hewer, 1983).

This is to say that ethics is both situated in general—ethics is always about some agent deciding what to do within some context it is itself a part of—and situated developmentally—the context includes the present psychological development and behavioral capabilities of the agent. Thus to talk about providing data for AI systems to learn ethics, we must consider what data makes sense given their developmental situation. For this reason we will now briefly consider the work of Kohlberg and Commons.

Kohlberg proposed a developmental model of ethical and moral reasoning correlated with general psychological development (Kohlberg & Hersh, 1977). We will summarize it here as saying that the way an agent reasons about ethics changes as it develops a more complex ontology, so that young children, for example, reason about ethics in ways appropriate to their ability to understand the world, and this results in categorically different reasoning, and often different actions, than that of older children, adolescents, adults, and older adults. Although Kohlberg focused on humans, Commons has argued that we can generalize developmental theories to other beings, and there is no special reason to think AI will be exceptional with regard to the development of ontological and behavioral complexity. Thus we should expect AI to experience psychological development (or something functionally analogous to it) and so to develop in their moral reasoning as they learn and grow in complexity (Commons, 2006).

It’s within the context of developmentally situated ethics that we begin to reason about how AI systems can be taught to behave ethically. We might expect to train ethical behavior in AI systems the same way we teach them to recognize objects in images or extract features from text—viz. by providing a large dataset with some predetermined solutions that we can train AI systems against—but this would be to believe that AI is exceptional and allows learning ethics in a way very different from the way both humans and non-human animals learn behavioral norms. Assuming AI systems are not exceptional in this regard, we consider a formulation of EthicsNet compatible with developmentally situated learning of ethical behavior.

Outline for a generative EthicsNet

From a young age, human children actively seek to learn behavioral norms, often to the point of overfitting, by aggressively deriving “ought” from “is” (Schmidt et al., 2016). They do this based on a general social learning motivation to behave like conspecifics, seen in both primates and other non-human animals (van de Waal, Borgeaud, & Whiten, 2013; Dingemanse et al., 2010). This strongly suggests that, within their developmental context, agents learn norms based on a strong, self-generated motivation to do so. Thus foundational to our proposal for teaching AI systems ethical behavior is a self-sustaining motivation to discover behavioral norms from examples. Other approaches may be possible, but the assumption of a situated, self-motivated agent agrees with how all agents known to learn normative behavior do so now, so we take up this assumption out of a desire to conserve uncertainty. We will therefore assume for the remainder of this work that such a motivation exists in the AI systems to be trained against the EthicsNet dataset, although note that implementation of this motivation to learn norms lies strictly outside the scope of the EthicsNet proposal and will not be considered in detail here.

So given that we have a situated agentic AI that is self-motivated to learn normative behaviors, what data should be provided to it? Since the agent is situated, it cannot, strictly speaking, be provided data of the sort that we normally think of when we think of datasets for AI systems. Instead, since the agent is to be engaged in actions that offer it the opportunity to observe, practice, and infer behavioral norms, it needs to be a dataset in the form of situations it can participate in. For humans and non-human animals this “dataset” is presented naturally through the course of living, but for AI systems the natural environment does not necessarily present such opportunities. Thus we propose that the goal of EthicsNet is to give a framework in which to generate such opportunities.
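To make this concrete, such a generative “dataset” might expose an environment-style interface that yields situations instead of labeled examples. The following Python sketch is purely illustrative; every name in it (`Situation`, `Feedback`, `NormEnvironment`) is hypothetical and not part of any existing EthicsNet design.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional

@dataclass
class Situation:
    """One interaction opportunity: a context the agent must act within."""
    description: str
    actions: List[str]  # actions available to the agent here

@dataclass
class Feedback:
    """A guardian's response to the action the agent chose."""
    approved: bool
    preferred_action: Optional[str] = None  # what the guardian would have preferred

class NormEnvironment:
    """Generates situations and routes guardian feedback, rather than
    serving fixed (input, label) pairs as a traditional dataset would."""

    def __init__(self,
                 generator: Callable[[], Iterator[Situation]],
                 guardian: Callable[[Situation, str], Feedback]):
        self.generator = generator
        self.guardian = guardian

    def episodes(self, policy: Callable[[Situation], str]):
        """Let a policy act in each generated situation and yield the
        (situation, action, feedback) triples the agent learns from."""
        for situation in self.generator():
            action = policy(situation)
            yield situation, action, self.guardian(situation, action)
```

A guardian callback might, for instance, approve “wait” over “interrupt” when the user is busy; the resulting triples would then feed whatever norm-learning machinery the agent uses.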

Specifically, we suggest creating an environment where AI agents can interact with humans, with the opportunity to observe and query humans about behavioral norms based on the agents’ behavior in the environment. We do not envision this as an environment like reCAPTCHA, where providing ethical information to AI systems via EthicsNet is the primary task in service of some secondary task (von Ahn et al., 2008). Instead, we expect EthicsNet to be secondary to some primary human-AI interaction that is inherently meaningful to the human, since this is the same way normative behavior is learned in humans and non-human animals, viz. as a secondary activity to some primary activity.

By way of example, consider an AI system that serves as a personal assistant to humans and interacts with them via a multi-modal interface (e.g. Siri, Google Assistant, Cortana, or Alexa). The primary purpose of the AI-human interaction is for the AI assistant to help the human with completing tasks and finding information they might otherwise have neglected. As the AI assistant and the human interact, the human will demonstrate behaviors that give the AI assistant an opportunity to observe and infer behavioral norms based on the way the human interacts with it. Further, the AI assistant will take actions, and the human may like what the AI assistant did or may prefer it had done something else. We see the goal of EthicsNet as providing a way for the human to give the AI assistant feedback about those likes and preferences, so the AI assistant can use that information to further its learning of behavioral norms.
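The two learning channels in this example—passive observation of the human’s own behavior, and explicit feedback on the assistant’s actions—could be captured with something as simple as the sketch below. All names here (`NormSignal`, `AssistantNormLog`) are hypothetical, invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class NormSignal:
    """One piece of evidence about a behavioral norm."""
    source: str               # "observed" (human demonstration) or "feedback"
    behavior: str             # the behavior in question
    approved: Optional[bool]  # None for passive observations
    preferred: Optional[str]  # alternative the human would have preferred, if any

class AssistantNormLog:
    """Collects both channels: passive observation of the human's own
    behavior, and explicit feedback on the assistant's actions."""

    def __init__(self) -> None:
        self.signals: List[NormSignal] = []

    def observe(self, behavior: str) -> None:
        # The human's own demonstrated behavior is treated as implicitly normative.
        self.signals.append(NormSignal("observed", behavior, None, None))

    def feedback(self, behavior: str, liked: bool,
                 preferred: Optional[str] = None) -> None:
        # Explicit like/prefer responses to something the assistant did.
        self.signals.append(NormSignal("feedback", behavior, liked, preferred))
```

Keeping the two channels distinct matters because a demonstration carries no explicit approval, only the implicit “ought” the agent derives from the “is”.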

Caveats of a generative EthicsNet

As mentioned, ethical learning is developmentally situated, so feedback from guardian humans to learning AI systems should differ depending on how complexly an AI system models the world. Explaining by way of example, consider that young children are often presented corrections on their behavior, to get them to conform to norms, in ways that focus on categorizing actions as right and wrong. A simple example might be telling a child to always hold hands while crossing the street and to never hit another child. Such an approach, of course, leaves out many nuances of normative behavior adults would consider, as in some cases a serious threat may mean a child should risk crossing the street unattended or hitting another child in defense. The analogous cases for AI systems will of course be different, but the general point holds: present developmentally appropriate information, eliding the nuances of norms that a less developed system, like a child, cannot yet use.

In order to ensure developmentally appropriate feedback is given, it’s important to give contextual clues to humans about the AI system’s degree of development. For example, we might want to give cues that the human should treat the AI the way they would treat a child, or the way they would treat an adult, whichever is developmentally appropriate. Experimentation will be necessary to find the cues that encourage humans to give developmentally appropriate feedback, so EthicsNet will need to provide a rapidly iterable interface that allows developers to find the best user experience for eliciting maximally useful responses from humans for helping AI systems learn normative behaviors.
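As a minimal sketch of what a developmentally keyed interface could look like, the hypothetical mapping below varies the guardian-facing cue with the AI system’s developmental stage. The stage names and cue wordings are invented for illustration, and in practice would be exactly the things the experimentation described above iterates on.

```python
# Hypothetical mapping from an AI system's developmental stage to the cue
# shown to the human guardian, so feedback arrives at a usable granularity.
FEEDBACK_CUES = {
    "early": "Just tell the assistant whether this was right or wrong.",
    "intermediate": "Briefly explain why this was or wasn't appropriate.",
    "advanced": "Discuss exceptions and competing considerations, if any.",
}

def guardian_prompt(stage: str) -> str:
    """Return the guardian-facing cue for a developmental stage, falling
    back to the simplest cue when the stage is unrecognized."""
    return FEEDBACK_CUES.get(stage, FEEDBACK_CUES["early"])
```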

Since EthicsNet, as proposed here, is to be a secondary function of an AI system serving some other primary function, an implementation difficulty is that it must be integrated with a system providing the primary functionality. This will likely involve forming partnerships with leading AI companies to integrate EthicsNet into their products and services. This is more complicated than if EthicsNet could be developed in isolation, but we believe, for the reasons laid out above, that it cannot, so this added complexity is necessary. Development will also be harder because it will require integration with one or more existing systems owned by other organizations in order for EthicsNet to get feedback from humans serving the guardian role, but we believe the additional cost and complexity is worthwhile: given what we know about how normative behaviors are learned in humans and non-human animals, anything short of this seems unlikely to succeed at teaching AI systems to behave ethically.

Given this context in which EthicsNet will be deployed, it will also be important to choose partners that enable AI systems being trained through EthicsNet to learn from humans from multiple cultures, since different cultures have differing behavioral norms. Note, though, that this will also make it harder for the AI systems being trained to infer what behavior is normative, because they will receive conflicting opinions from different guardians. How to resolve such normative uncertainty is an open question in ethics, so EthicsNet may prove vital in research to discover how, in an applied setting, to address conflicts over behavioral norms (MacAskill, 2014).
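One simple way such a framework could surface, rather than paper over, conflicting guardian feedback is to aggregate judgments per context and report the degree of consensus. The sketch below is a hypothetical illustration of that bookkeeping, not a proposed resolution of normative uncertainty; `NormTally` and its methods are invented names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

class NormTally:
    """Aggregates possibly conflicting guardian judgments about a behavior
    in a context, preserving disagreement rather than forcing one label."""

    def __init__(self) -> None:
        # (context, behavior) -> [approvals, rejections]
        self.counts: Dict[Tuple[str, str], List[int]] = defaultdict(lambda: [0, 0])

    def record(self, context: str, behavior: str, approved: bool) -> None:
        self.counts[(context, behavior)][0 if approved else 1] += 1

    def consensus(self, context: str, behavior: str) -> float:
        """Fraction of guardians approving, in [0, 1]. Values near 0.5 mark
        contested norms the learner should treat with extra caution."""
        approve, reject = self.counts[(context, behavior)]
        total = approve + reject
        return approve / total if total else 0.5
```

A learner could then prioritize settled norms (consensus near 0 or 1) while flagging contested ones for further querying of guardians.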


The view of EthicsNet we have presented here is not that of a typical dataset for machine learning like ImageNet, but rather a framework in which AI systems can interact with humans who serve as guardians and provide feedback on behavioral norms. Based on the situated—particularly the developmentally situated—nature of ethical learning, we believe this to be the best approach possible, and that a more traditional dataset approach will fall short of fulfilling the goal of enabling AI systems to learn to act ethically. Although this approach offers less opportunity for rapid training, since it requires interaction with humans on human timescales and integration with other systems (ethical learning being a secondary activity to some other primary activity), the outcome of producing AI systems that can conform to human interests via ethical behavior makes it worth the additional effort.


References

L. von Ahn, B. Maurer, C. McMillen, D. Abraham, & M. Blum. (2008). reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Science, 321(5895), 1465–1468. DOI: 10.1126/science.1160379.

N. Bostrom. (2014). Superintelligence: Paths, Dangers, Strategies. Oxford University Press.

M. L. Commons. (2006). Measuring an Approximate g in Animals and People. Integral Review, 3, 82–99.

N. J. Dingemanse, A. J. N. Kazem, D. Réale, & J. Wright. (2010). Behavioural reaction norms: animal personality meets individual plasticity. Trends in Ecology & Evolution, 25(2), 81–89. DOI: 10.1016/j.tree.2009.07.013.

L. Kohlberg & R. H. Hersh. (1977). Moral development: A review of the theory. Theory Into Practice, 16(2), 53–59. DOI: 10.1080/00405847709542675.

L. Kohlberg, C. Levine, & A. Hewer. (1983). Moral stages: A current formulation and a response to critics. Contributions to Human Development, 10, 174.

W. MacAskill. (2014). Normative Uncertainty. Dissertation, University of Oxford.

R. S. Peters. (1974). Psychology and ethical development: A collection of articles on psychological theories, ethical development and human understanding. George Allen & Unwin, London.

M. F. H. Schmidt, L. P. Butler, J. Heinz, & M. Tomasello. (2016). Young Children See a Single Action and Infer a Social Norm: Promiscuous Normativity in 3-Year-Olds. Psychological Science. DOI: 10.1177/0956797616661182.

F. J. Varela. (1999). Ethical Know-How: Action, Wisdom, and Cognition. Stanford University Press.

E. van de Waal, C. Borgeaud, & A. Whiten. (2013). Potent Social Learning and Conformity Shape a Wild Primate's Foraging Decisions. Science, 340(6131), 483–485. DOI: 10.1126/science.1232769.

N. Watson. (2018). EthicsNet Overview. URL: https://www.herox.com/EthicsNet