I sometimes aim for aligning AI with an abstract notion of human values, defined in a way that coincides with my own preferred understanding of the term.
Or sometimes, one meta-level farther up: an abstract notion of human values according to humans’ own aggregated understanding, bootstrapped in a dynamic whose starting point was close to my own best understanding of the terms.
I sometimes aim for aligning AI with an abstract notion of human values, defined in a way that coincides with my own preferred understanding of the term.
Or sometimes, one meta-level farther up: an abstract notion of human values according to humans’ own aggregated understanding, bootstrapped in a dynamic whose starting point was close to my own best understanding of the terms.