Friendly AI ideas needed: how would you ban porn?

To construct a friendly AI, you need to be able to make vague concepts crystal clear, cutting reality at the joints when those joints are obscure and fractal, and then build a system that implements that cut.

There are lots of suggestions on how to do this, and a lot of work in the area. But having been over the same turf again and again, it’s possible we’ve got a bit stuck in a rut. So to generate new suggestions, I’m proposing that we look at a vaguely analogous but distinctly different question: how would you ban porn?

Suppose you’re put in charge of some government and/or legal system, and you need to ban pornography and see that the ban is implemented. Pornography is the problem, not eroticism. So a lonely lower-class guy wanking off to “Fuck Slaves of the Caribbean XIV” in a Pussycat Theatre is completely off-limits. But a middle-class couple experiencing a delicious frisson when they see a nude version of “Pirates of Penzance” at the Met is perfectly fine, commendable even.

The distinction between the two cases is certainly not easy to spell out, and many are reduced to saying the equivalent of “I know it when I see it” when defining pornography. In terms of AI, this is equivalent to “value loading”: refining the AI’s values through interactions with human decision makers, who answer questions about edge cases and examples and serve as “learned judges” for the AI’s concepts. But suppose that approach were not available to you. What methods would you implement to distinguish between pornography and eroticism, and ban one but not the other? Could you make the distinction sufficiently clear that a scriptwriter would know exactly what they need to cut or add to a movie in order to move it from one category to the other? What if the nude “Pirates of Penzance” was at a Pussycat Theatre and “Fuck Slaves of the Caribbean XIV” was at the Met?

To get maximal creativity, it’s best to ignore the ultimate aim of the exercise (to find inspirations for methods that could be adapted to AI) and just focus on the problem itself. Is it even possible to get a reasonable solution to this question, a question much simpler than designing a FAI?