Against utility functions

I think we should stop talking about utility functions.

In the context of ethics for humans, anyway. In practice I find utility functions to be, at best, an occasionally useful metaphor for discussions about ethics and, at worst, an idea that some people take too seriously, to the point that it actively makes them worse at reasoning about ethics. To the extent that we care about helping people reason better about ethics, it seems like we ought to be able to do better than this.
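
To pin down what I mean by the term (this formalization isn't load-bearing for the argument, but it's the standard von Neumann–Morgenstern notion being invoked whenever people talk this way): a utility function assigns a real number to each outcome, and a rational agent is modeled as choosing the action that maximizes expected utility,

$$a^* \in \arg\max_{a \in A} \; \mathbb{E}\big[\,U(o) \mid a\,\big], \qquad U : O \to \mathbb{R}.$$

The VNM theorem only guarantees that such a $U$ exists for agents whose preferences satisfy axioms like completeness, transitivity, and independence, and the experimental evidence suggests humans routinely violate these. That gap is a lot of what makes the metaphor slippery.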

The funny part is that the failure mode I worry most about is already documented in the Sequences: it's fake utility functions. The soft failure is people who think they know what their utility function is and say bizarre things about what it implies that they, or perhaps all people, ought to do. The hard failure is people who think they know what their utility function is and then do bizarre things. I hope the hard failure is not very common.

It seems worth reflecting on the fact that the point of the foundational LW material discussing utility functions was to make people better at reasoning about AI behavior and not about human behavior.