Note that knowing != doing, so in principle there is a gap between a world model which includes lots of information about what the user is feeling (what you call cognitive empathy), and acting on that information in prosocial/beneficial ways.
Similarly, one can attend to another's emotions either to mislead them or to comfort them.
There is a bit of tricky framing/training work in making a model that “knows” what a user is feeling, having that representation at a low enough layer that the activation is useful, and actually acting on it in a beneficial way.
Steering might help taxonomize here.
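To make the steering suggestion concrete, here is a minimal sketch of contrastive activation steering, assuming a GPT-2-style HuggingFace model; the contrast prompts, layer index, and scale are illustrative choices, not anything established in this thread.

```python
# Minimal sketch of contrastive activation steering (assumes a GPT-2-style
# HuggingFace model; the prompt pair, layer, and scale are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
LAYER = 6  # hypothetical layer where the "user emotion" feature is readable

def last_token_resid(prompt: str) -> torch.Tensor:
    """Residual-stream activation at LAYER for the final token of `prompt`."""
    captured = {}
    def hook(_module, _inputs, output):
        captured["h"] = output[0] if isinstance(output, tuple) else output
    handle = model.transformer.h[LAYER].register_forward_hook(hook)
    with torch.no_grad():
        model(**tok(prompt, return_tensors="pt"))
    handle.remove()
    return captured["h"][0, -1, :]

# Contrastive pair: same situation, with vs. without the user's feeling made explicit.
steer = last_token_resid("User (sounding distressed): I lost my job today.") \
      - last_token_resid("User: I lost my job today.")

def generate_steered(prompt: str, alpha: float = 4.0) -> str:
    """Generate while adding the 'notice the user's emotion' direction at LAYER."""
    def hook(_module, _inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * steer
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    handle = model.transformer.h[LAYER].register_forward_hook(hook)
    ids = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=40)
    handle.remove()
    return tok.decode(ids[0], skip_special_tokens=True)
```

Sweeping alpha and swapping in different contrast pairs (emotion felt-but-unstated vs. explicitly stated, comforting vs. manipulative framings) is one crude way to start separating “represents the user's state” from “acts on it”.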
Definitely, it’s an interesting tension, and one that seems to get resolved in different directions. My expectation is that world-model-based (cognitive) empathy is the bigger risk, since it’s the most important ingredient for dark empathy, while affective empathy is more likely to create unintentionally toxic patterns and raises a bigger ethical red flag with regard to the autonomy of AIs in general.
I am wondering if we might end up needing the cliché of the “emotion core”, where we subordinate a more fluid decision system to one that is well tuned for empathetic processing. I made a simulacrum of this a while back with a de Bono Thinking Hats technique, and the results tended to be better formed than without it. However, in terms of creating a stable psychology, there need to be enough hooks for the emotional component to color choices positively without overtaking them.
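A rough sketch of the general shape, as a two-pass prompt pipeline in the spirit of the Thinking Hats; the `complete` helper and prompt wording are hypothetical stand-ins, not the actual simulacrum described above.

```python
# Rough sketch of an "emotion core" as a two-pass prompt pipeline, in the
# spirit of de Bono's Thinking Hats. `complete` is a hypothetical stand-in
# for whatever text-completion call is available.
from typing import Callable

def emotion_core_reply(user_msg: str, complete: Callable[[str], str]) -> str:
    # Pass 1 ("red hat"): read the user's emotional state only; no advice yet.
    emotional_read = complete(
        "Red hat: in one or two sentences, say what the user is likely feeling "
        f"and why. Do not give advice.\nUser: {user_msg}"
    )
    # Pass 2: the more fluid decision system drafts the actual reply, with the
    # emotional read as context that colors the choice without dictating it.
    return complete(
        "Reply to the user. Take the emotional read below into account, but "
        "keep the reply focused on what actually helps.\n"
        f"Emotional read: {emotional_read}\nUser: {user_msg}\nReply:"
    )
```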
Steering as a taxonomy is an interesting idea; it harkens back to the perspectives from Diaspora, which is a structure I find natively appealing. But in that world a lot of perspectives were “restricted” because they were either self-terminating or destabilizing to the user.
This new realm of AI neuro-sociology is going to be an entrancing nightmare.