Principles in AI alignment

Wiki. Last edit: 16 Feb 2017 18:54 UTC by Eliezer Yudkowsky

A ‘principle’ of AI alignment is a property we want, in a broad sense, of the AI as a whole, and which has informed narrower design proposals for particular parts or aspects of the AI.

For example:

Please be guarded about declaring things to be ‘principles’ unless they have already informed more than one specific design proposal and more than one person thinks they are a good idea. You could call them ‘proposed principles’ and post them under your own domain if you personally think they are a good idea. There are a lot of possible ‘broad design wishes’, or things that people think are ‘broad design wishes’, and the principles that have actually already informed specific design proposals would otherwise get lost in the crowd.
