A developmentally situated approach to teaching normative behavior to AI

This was submitted to the EthicsNet Guardians' Challenge. I'll be honest: I hadn't thought much about what EthicsNet is trying to do, but I decided to write something and submit it anyway, because it's the sort of approach that seems reasonable if you come from an ML background, and I think I differ enough in my thinking to offer an alternative perspective that may help shape the project in ways I view as beneficial to its success. For that reason this is somewhat less coherent than my usual writing (or at least my thinking is less coherent, whether or not that shows in my writing), but I chose to share it here in the interest of furthering discussion and possibly drumming up additional interest in EthicsNet. Their challenge has a week left, so if you think I'm wrong and you have a better idea, please submit it to them!


Based on the usefulness of ImageNet, MovieLens, and other comprehensive datasets for machine learning, it seems reasonable that we might create an EthicsNet of ethical data we could use to train AI systems to behave ethically (Watson, 2018). Such a dataset would aid in addressing issues of AI safety, especially as they relate to AGI, since it appears that learning human values will be a key component of aligning AI with human interests (Bostrom, 2014). Unfortunately, building a dataset for ethics is more complicated than building one for images or movies, because ethics is primarily learned by situated, embodied agents acting in the world and receiving feedback on those actions, rather than by non-situated agents who learn about the world without understanding themselves to be part of it (Varela, 1999). We therefore consider a way to fulfill the purpose of EthicsNet based on the idea that ethical knowledge is developmentally situated, and so requires a generative procedure rather than a traditional dataset to train AI to adopt values and behave ethically.

Ethics is developmentally situated

In philosophy the study of ethics quickly turns to metaethics, because those are the sorts of questions that interest philosophers, so it's tempting to think, based on the philosophical literature, that learning to behave ethically (i.e. learning behavioral norms) is primarily about resolving ethical dilemmas and developing ethical theories that allow us to make consistent choices based on values. This, however, would be to ignore the psychology of how people actually learn what behaviors are normative and apply those norms to engage in ethical reasoning (Peters, 1974). Rather than developing a coherent ethical framework from which to respond, humans learn ethics by first learning how to resolve particular ethical questions in particular ways, often without realizing they are engaged in ethical reasoning, and then generalizing until they come to ask questions about what is universally ethical (Kohlberg, Levine, & Hewer, 1983).

This is to say that ethics is both situated in general (ethics is always about some agent deciding what to do within some context it is itself a part of) and situated developmentally (the context includes the agent's present psychological development and behavioral capabilities). Thus, to talk about providing data for AI systems to learn ethics, we must consider what data makes sense given their developmental situation. For this reason we now briefly consider the work of Kohlberg and Commons.

Kohlberg proposed a developmental model of ethical and moral reasoning correlated with general psychological development (Kohlberg & Hersh, 1977). We summarize it here as saying that the way an agent reasons about ethics changes as it develops a more complex ontology: young children, for example, reason about ethics in ways appropriate to their ability to understand the world, and this results in categorically different reasoning, and often different actions, than that of older children, adolescents, adults, and older adults. Although Kohlberg focused on humans, Commons has argued that developmental theories can be generalized to other beings, and there is no special reason to think AI will be exceptional with regard to the development of ontological and behavioral complexity. We should therefore expect AI to experience psychological development (or something functionally analogous to it) and thus to develop in their moral reasoning as they learn and grow in complexity (Commons, 2006).

It's within the context of developmentally situated ethics that we begin to reason about how AI systems can be taught to behave ethically. We might expect to train ethical behavior in AI systems the same way we teach them to recognize objects in images or extract features from text, viz. by providing a large dataset with predetermined solutions to train against, but this would be to believe that AI is exceptional and can learn ethics in a way very different from the way both humans and non-human animals learn behavioral norms. Assuming AI systems are not exceptional in this regard, we consider a formulation of EthicsNet compatible with developmentally situated learning of ethical behavior.

Outline for a generative EthicsNet

From a young age, human children actively seek to learn behavioral norms, often to the point of overfitting, by aggressively deriving "ought" from "is" (Schmidt et al., 2016). They do this based on a general social-learning motivation to behave like conspecifics, seen in both primates and other non-human animals (van de Waal, Borgeaud, & Whiten, 2013; Dingemanse et al., 2010). This strongly suggests that, within their developmental context, agents learn norms based on a strong, self-generated motivation to do so. Foundational to our proposal for teaching AI systems ethical behavior, then, is a self-sustaining motivation to discover behavioral norms from examples. Other approaches may be possible, but the assumption of a situated, self-motivated agent agrees with how all agents known to learn normative behavior do so now, so we adopt it out of a desire to conserve uncertainty. We will therefore assume for the remainder of this work that such a motivation exists in the AI systems to be trained against the EthicsNet dataset, though note that implementing this motivation lies strictly outside the scope of the EthicsNet proposal and will not be considered in detail here.

So given that we have a situated, agentic AI that is self-motivated to learn normative behaviors, what data should be provided to it? Since the agent is situated it cannot, strictly speaking, be provided data of the sort we normally think of when we think of datasets for AI systems. Instead, since the agent is to be engaged in actions that offer it the opportunity to observe, practice, and infer behavioral norms, the dataset needs to take the form of situations it can participate in. For humans and non-human animals this "dataset" is presented naturally through the course of living, but for AI systems the natural environment does not necessarily present such opportunities. Thus we propose that the goal of EthicsNet is to give a framework in which to generate such opportunities.
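To make the contrast concrete, here is a minimal sketch of what a "dataset of situations" might look like compared with a static labeled dataset. Everything here is hypothetical: the names `Situation`, `Feedback`, `generate_situations`, and `learning_loop` are illustrative placeholders, not part of any actual EthicsNet design.

```python
from dataclasses import dataclass
from typing import Iterator, Optional

@dataclass
class Situation:
    """One opportunity for the agent to act and receive normative feedback."""
    context: str         # description of the state the agent finds itself in
    actions: list        # actions available to the agent in this situation

@dataclass
class Feedback:
    approved: bool                     # did the guardian endorse the action?
    correction: Optional[str] = None   # optional preferred alternative

def generate_situations(primary_task_events) -> Iterator[Situation]:
    """Derive situations from events in the primary human-AI interaction,
    rather than sampling rows from a fixed, pre-labeled dataset."""
    for event in primary_task_events:
        yield Situation(context=event, actions=["proceed", "ask guardian first"])

def learning_loop(events, policy, guardian):
    """Situated learning: the agent acts in each situation, then a human
    guardian reacts, and the (action, feedback) pairs are what it learns from."""
    log = []
    for situation in generate_situations(events):
        action = policy(situation)
        log.append((action, guardian(situation, action)))
    return log
```

The point of the sketch is that the "data" only comes into existence through the agent's own participation: there is no feedback until the agent has acted.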

Specifically, we suggest creating an environment where AI agents can interact with humans, with the opportunity to observe and query humans about behavioral norms based on the agents' behavior in the environment. We do not envision this as an environment like reCAPTCHA, where providing ethical information to AI systems via EthicsNet would be the primary task in service of some secondary task (von Ahn et al., 2008). Instead, we expect EthicsNet to be secondary to some primary human-AI interaction that is inherently meaningful to the human, since this is the same way normative behavior is learned in humans and non-human animals, viz. as a secondary activity to some primary activity.

By way of example, consider an AI system that serves as a personal assistant, interacting with humans via a multi-modal interface (e.g. Siri, Google Assistant, Cortana, or Alexa). The primary purpose of the AI-human interaction is for the AI assistant to help the human complete tasks and find information they might otherwise have neglected. As the AI assistant and the human interact, the human will demonstrate behaviors that give the AI assistant an opportunity to observe and infer behavioral norms from the way the human interacts with it. Further, the AI assistant will take actions, and the human may like what the AI assistant did or may prefer it had done something else. We see the goal of EthicsNet as providing a way for the human to give the AI assistant feedback about those likes and preferences, so the AI assistant can use that information to further its learning of behavioral norms.
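The two channels described above, passive observation of the human's behavior (the children's "ought from is" inference) and explicit feedback on the assistant's own actions, could be combined in something as simple as a running tally. The following is a hypothetical sketch; `NormModel` and its methods are invented for illustration, not a proposed API.

```python
from collections import defaultdict
from typing import Optional

class NormModel:
    """Toy model: track guardian reactions per (situation type, action) and
    treat the most-endorsed action as the currently inferred norm."""

    def __init__(self):
        # counts[situation_type][action] -> [approvals, disapprovals]
        self.counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def observe_human(self, situation_type: str, action: str) -> None:
        """Observed human behavior is weak evidence the action is normative."""
        self.counts[situation_type][action][0] += 1

    def record_feedback(self, situation_type: str, action: str, liked: bool) -> None:
        """Explicit guardian feedback on the assistant's own action."""
        self.counts[situation_type][action][0 if liked else 1] += 1

    def inferred_norm(self, situation_type: str) -> Optional[str]:
        """Best current guess at the normative action, if any evidence exists."""
        actions = self.counts.get(situation_type)
        if not actions:
            return None
        return max(actions, key=lambda a: actions[a][0] - actions[a][1])
```

A real system would need far richer situation representations and generalization across situations, but the sketch shows how observation and correction can feed one shared norm estimate.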

Caveats of a generative EthicsNet

As mentioned, ethical learning is developmentally situated, which means that feedback from guardian humans to learning AI systems should differ depending on how complexly an AI system models the world. Explaining by way of example, consider that young children are often given corrections on their behavior, to get them to conform to norms, in ways that focus on categorizing actions as right and wrong. A simple example might be telling a child to always hold hands while crossing the street and to never hit another child. Such an approach, of course, leaves out many nuances of normative behavior that adults would consider: in some cases a serious threat may mean a child should risk crossing the street unattended, or hit another child in self-defense. The analogous cases for AI systems will of course be different, but the general point holds: feedback should be developmentally appropriate, eliding nuances the learner is not yet ready to consider.

In order to ensure developmentally appropriate feedback is given, it's important to give humans contextual clues about the AI system's degree of development. For example, we might want to cue the human to treat the AI the way they would treat a child, or the way they would treat an adult, whichever is developmentally appropriate. Experimentation will be necessary to find the cues that encourage humans to give developmentally appropriate feedback, so EthicsNet will need to provide a rapidly iterable interface that allows developers to find the best user experience for eliciting maximally useful responses from humans for helping AI systems learn normative behaviors.
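One way such an interface might gate its cues is sketched below. The stage names loosely echo Kohlberg's levels; the function, its return shape, and all identifiers are hypothetical illustrations of the idea, not a concrete design.

```python
from enum import Enum

class Stage(Enum):
    """Rough developmental stages, loosely after Kohlberg's levels."""
    PRECONVENTIONAL = 1   # early: crude world model
    CONVENTIONAL = 2      # intermediate: models social expectations
    POSTCONVENTIONAL = 3  # late: reasons about principles behind norms

def feedback_prompt(stage: Stage) -> dict:
    """Choose what the guardian UI should ask for, given the agent's stage."""
    if stage is Stage.PRECONVENTIONAL:
        # Like correcting a young child: simple right/wrong labels only.
        return {"cue": "child", "options": ["right", "wrong"]}
    if stage is Stage.CONVENTIONAL:
        # Allow conditional judgments and ask for the conditions.
        return {"cue": "adolescent",
                "options": ["right", "wrong", "depends"],
                "free_text": "When would this be acceptable?"}
    # Postconventional: elicit the principle behind the judgment.
    return {"cue": "adult",
            "options": ["right", "wrong", "depends"],
            "free_text": "What principle makes this right or wrong?"}
```

The "cue" field stands in for whatever interface affordances (avatar, tone, framing) turn out, through the experimentation mentioned above, to reliably put guardians in the right frame of mind.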

Since EthicsNet, as proposed here, is to be a secondary function to an AI system serving some other primary function, one implementation difficulty is that it must be integrated with a system providing that primary functionality. This will likely involve forming partnerships with leading AI companies to integrate EthicsNet into their products and services, so that EthicsNet can get feedback from humans serving the guardian role to AI systems. This is more complicated than if EthicsNet could be developed in isolation, but we believe, for the reasons laid out above, that it cannot. The additional cost and complexity are worthwhile, since anything short of this seems unlikely to succeed at teaching AI systems to behave ethically, given what we know about how normative behaviors are learned in humans and non-human animals.

Given this context in which EthicsNet will be deployed, it will also be important to choose partners that enable the AI systems being trained through EthicsNet to learn from humans from multiple cultures, since different cultures have differing behavioral norms. Note, though, that this will also make it harder for the AI systems being trained to infer what behavior is normative, because they will receive conflicting opinions from different guardians. How to resolve such normative uncertainty is an open question in ethics, so EthicsNet may prove vital in research to discover how, in an applied setting, to address conflicts over behavioral norms (MacAskill, 2014).
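Since resolving normative uncertainty is an open question, a system in this position might at least avoid collapsing disagreement prematurely: keep per-culture tallies and surface contested situations explicitly, leaving the resolution strategy pluggable. The sketch below is hypothetical; `CulturalNorms` and its methods are invented for illustration.

```python
from collections import Counter, defaultdict

class CulturalNorms:
    """Toy model: record guardians' normative judgments keyed by culture,
    and expose disagreement between cultures rather than erasing it."""

    def __init__(self):
        # votes[situation][culture] -> Counter of actions judged normative
        self.votes = defaultdict(lambda: defaultdict(Counter))

    def record(self, situation: str, culture: str, action: str) -> None:
        """One guardian from `culture` judged `action` normative here."""
        self.votes[situation][culture][action] += 1

    def norm_by_culture(self, situation: str) -> dict:
        """Most-endorsed action per culture for this situation."""
        return {c: counter.most_common(1)[0][0]
                for c, counter in self.votes[situation].items()}

    def is_contested(self, situation: str) -> bool:
        """True when cultures disagree about what is normative."""
        return len(set(self.norm_by_culture(situation).values())) > 1
```

Flagging contested situations rather than averaging over them is one way an applied system could hand the genuinely hard cases to whatever approach to normative uncertainty research eventually settles on.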


The view of EthicsNet we have presented here is not that of a typical machine learning dataset like ImageNet, but rather of a framework in which AI systems can interact with humans who serve as guardians and provide feedback on behavioral norms. Based on the situated, and particularly the developmentally situated, nature of ethical learning, we believe this to be the best approach possible, and that a more traditional dataset approach will fall short of fulfilling the goal of enabling AI systems to learn to act ethically. Although this approach offers less opportunity for rapid training, since it requires interaction with humans on human timescales and integration with other systems (ethical learning being a secondary activity to some other primary activity), the outcome of producing AI systems that can conform to human interests via ethical behavior makes it worth the additional effort.


L. von Ahn, B. Maurer, C. McMillen, D. Abraham, M. Blum. (2008). reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Science, 321(5895), 1465–1468. DOI: 10.1126/science.1160379.

N. Bostrom. (2014). Superintelligence: Paths, Dangers, Strategies. Oxford University Press.

M. L. Commons. (2006). Measuring an Approximate g in Animals and People. Integral Review, 3, 82–99.

N. J. Dingemanse, A. J. N. Kazem, D. Réale, J. Wright. (2010). Behavioural reaction norms: animal personality meets individual plasticity. Trends in Ecology & Evolution, 25(2), 81–89. DOI: 10.1016/j.tree.2009.07.013.

L. Kohlberg & R. H. Hersh. (1977). Moral development: A review of the theory. Theory Into Practice, 16(2), 53–59. DOI: 10.1080/00405847709542675.

L. Kohlberg, C. Levine, & A. Hewer. (1983). Moral stages: A current formulation and a response to critics. Contributions to Human Development, 10, 174.

W. MacAskill. (2014). Normative Uncertainty. Dissertation, University of Oxford.

R. S. Peters. (1974). Psychology and ethical development: A collection of articles on psychological theories, ethical development and human understanding. George Allen & Unwin, London.

M. F. H. Schmidt, L. P. Butler, J. Heinz, M. Tomasello. (2016). Young Children See a Single Action and Infer a Social Norm: Promiscuous Normativity in 3-Year-Olds. Psychological Science. DOI: 10.1177/0956797616661182.

F. J. Varela. (1999). Ethical know-how: Action, wisdom, and cognition. Stanford University Press.

E. van de Waal, C. Borgeaud, A. Whiten. (2013). Potent Social Learning and Conformity Shape a Wild Primate's Foraging Decisions. Science, 340(6131), 483–485. DOI: 10.1126/science.1232769.

N. Watson. (2018). EthicsNet Overview. URL: https://www.herox.com/EthicsNet
