I think we shouldn’t try to emulate rational agents at all, in the sense that we shouldn’t pretend to have rationality-style preferences and supergoals; as a matter of fact we don’t have them.
Up to here we seem to agree; we just use different terminology. I just don’t want to conflate rational preferences with human preferences, because the two systems behave very differently.
Just as an example, in signalling theories of behaviour, you may consciously believe that your preferences are very different from what your behaviour is actually optimizing for when no one is looking. A rational agent wouldn’t normally have separate conscious/unconscious minds unless only the conscious part was subject to outside inspection. In this example, it makes sense to update signalling-preferences sometimes, because they’re not your actual acting-preferences.
But if you consciously intend to act out your (conscious) preferences, and also intend to keep changing them in not-always-foreseeable ways, then that isn’t rationality. When there could be confusion due to context (such as on LW most of the time), I’d prefer not to use the term “preferences” about humans, or at least to make clear what is meant.
Please see my reply to Nesov above, too.