The first thing that strikes me as off about AI_should is that if you use it, then your definition will be very long and complex (because Human Value is Complex) but your assumption will be short (because the code that takes “You should do X” and then does X can be minimal). This is backwards—definitions should be clean, elegant, and simple, while assumptions often have to be messy and complex.
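To make the asymmetry concrete, here is a minimal, purely illustrative sketch (all names hypothetical, not from any real system): the “definition” side stands in for the enormously complex encoding of human value, while the “assumption” side, the code that takes “you should do X” and does X, stays tiny.

```python
def shouldness(action):
    """Toy stand-in for the *definition*: in any real system this would
    be the long, messy encoding of everything humans value. Even this
    toy version is where all the content lives."""
    weights = {"help": 3.0, "create": 2.0, "rest": 1.0, "harm": -5.0}
    return weights.get(action, 0.0)

def act_on_should(actions):
    """The *assumption*: the code that goes from 'you should do X' to
    doing X. It can be this minimal -- just pick the top-scoring act."""
    return max(actions, key=shouldness)

print(act_on_should(["rest", "harm", "help"]))  # → help
```

The point the sketch illustrates: whatever complexity exists ends up inside `shouldness`; the executor contributes almost nothing.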
When they can be clean, elegant, and simple, sure. However, they are often clean and elegant because of other assumed definitions. Consider the definition of a continuous function from the reals to the reals. This requires definitions of real numbers, limits, and functions. This further requires definitions of number, absolute value, <, >, etc. When you put it all together, it isn’t very clean and elegant. Point being, we can talk in shorthand about our utility functions (or shouldness) somewhat effectively, but we shouldn’t be surprised that it gets complicated when we try to program it into a computer.
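To unpack the continuity example: the familiar one-line ε–δ definition quietly leans on a stack of prior definitions, which is exactly where the hidden length goes.

```latex
% Continuity of f : \mathbb{R} \to \mathbb{R} at a point a:
\forall \varepsilon > 0 \;\, \exists \delta > 0 \;\, \forall x \in \mathbb{R} :
  \lvert x - a \rvert < \delta \implies \lvert f(x) - f(a) \rvert < \varepsilon
% Already assumed here: \mathbb{R} itself (Dedekind cuts or Cauchy
% sequences), the function concept (a set of ordered pairs), the
% absolute value, and the order relation <. Expanding each of these
% in turn is what makes the "clean" definition long in practice.
```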
Furthermore, so what if a definition isn’t elegant, assuming it “carves reality at its joints”?
It’s a useful category to draw because you believe certain moral facts (that those are the correct things to care about).
The category, the definition, is just a manifestation of the moral beliefs. It is, in fact, exactly isomorphic to those beliefs; otherwise it wouldn’t be useful. So why not just talk about the beliefs?
> When they can be clean, elegant, and simple, sure. However, they are often clean and elegant because of other assumed definitions. Consider the definition of a continuous function from the reals to the reals. This requires definitions of real numbers, limits, and functions. This further requires definitions of number, absolute value, <, >, etc. When you put it all together, it isn’t very clean and elegant. Point being, we can talk in shorthand about our utility functions (or shouldness) somewhat effectively, but we shouldn’t be surprised that it gets complicated when we try to program it into a computer.
>
> Furthermore, so what if a definition isn’t elegant, assuming it “carves reality at its joints”?
What joints does “good” carve reality at? Are most things either very good or very not good?
“everything I care about” or “everything Will cares about”, etc.
That is, in fact, a useful category to draw.
> It’s a useful category to draw because you believe certain moral facts (that those are the correct things to care about).
>
> The category, the definition, is just a manifestation of the moral beliefs. It is, in fact, exactly isomorphic to those beliefs; otherwise it wouldn’t be useful. So why not just talk about the beliefs?
It’s not quite the same as my moral beliefs. My moral beliefs are what I think I care about. Goodness refers to what I actually care about.
That being said, there’s no reason why my moral beliefs have to be defined in some clean and simple way. In fact, they probably aren’t.
But “what you actually care about” is defined as what your moral beliefs would be if you had more information, more intelligence, etc.
So what are your moral beliefs actually about? Are they beliefs about more beliefs?