Think of it as vaguely like I-am-juggling versus you-are-juggling.
Here, I can see how they would overlap to a reasonable degree, but I don’t think this carries over easily to emotions. Emotions at least feel like this weird, distinct thing, such that any statement along the lines of “I’m happy” does them an injustice. Therefore I can’t see it carrying over to “She’s happy”; their intersection wouldn’t be robust enough to avoid falsely triggering on actually unrelated things. That is, “She’s happy” ≈ “I’m happy” ≉ experiencing happiness.
Facial cues (as one example; it makes sense that there would be other things, like higher-pitched voices when enjoying oneself) eliminate this problem because, instead of something introspective being the link, a more objective state of mind, like “He’s sad”, will be the learned link.
This might sound like I’m being unnecessarily picky, but imo these associations need to be very exact, or else humans would be reward-hacking all day: it’s reasonable to assume that the activations when thinking “She’s happy” are very similar to those when trying to convince oneself internally that “She’s happy”, even while ‘knowing’ the truth. But if both resulted in big feelings of internal happiness, we would have a lot more psychopaths.
Regarding micro-expressions specifically, it’s definitely not a hill I want to die on; it kind of just popped into my mind as I was writing about facial cues. By micro I really mean ‘micro micro’, e.g. smiles that aren’t perfectly symmetrical for a quarter of a second, something I at least can’t really pick up on. Still, what would their evolutionary advantage be if they didn’t at least have some kind of subconscious effect on conspecifics? But yeah, if you can’t consciously pick up on them, linking the two is pointless or even bad.
I read the linked post roughly, but since I’ve read neither so far, I probably can’t relate too well to it. It seems reasonable (or honestly, obvious), though, that it’s a mix rather than either of those extreme statements.
This isn’t as much a question as it is just sharing some thoughts I had, but I would love to hear your thoughts :) Let’s imagine we are our own brain’s optimizer. We just received a bad signal; we feel pain. Let’s say we realized someone else is soon going to feel pain, so we feel pain. What could the optimizer do now? Well, there are only two things it can do:
1. Try to disconnect “she feels pain” from the concept of pain that then triggered pain in yourself
2. Try to disconnect your previous thoughts from arriving at “she feels pain”
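As a toy sketch of these two options (entirely my own framing, with made-up numbers): model the chain observation → “she feels pain” → felt pain as two scalar weights, and let the optimizer do gradient descent on the felt pain. Weakening `w1` is option (1), weakening `w2` is option (2):

```python
# Toy sketch (my framing, not from the post): the chain
#   observation --w2--> "she feels pain" --w1--> pain in yourself
# as two scalar weights, with the "optimizer" doing gradient
# descent on the felt pain.
obs = 1.0          # you saw the cue
w2 = 1.0           # previous thoughts -> "she feels pain"
w1 = 1.0           # "she feels pain" -> pain in yourself
lr = 0.1

for _ in range(20):
    concept = w2 * obs
    pain = w1 * concept          # the bad signal
    grad_w1 = concept            # d(pain)/d(w1)
    grad_w2 = w1 * obs           # d(pain)/d(w2)
    w1 -= lr * grad_w1           # option (1): weaken concept -> pain
    w2 -= lr * grad_w2           # option (2): weaken thoughts -> concept

print(round(w1, 3), round(w2, 3))  # both links have decayed
```

Left alone, the optimizer happily takes both routes at once; the interesting question is what stops it.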
You speak a lot to (1), explaining the symbol-grounding mechanism that continuously grounds it in the ground truth, so the optimizer’s attempt to move “she feels pain” away from its previous position in feature space won’t work, at least as long as we continuously receive such ground-truth input. This sheds light on a very immoral but very interesting experiment: have an individual go without such input for a long period, e.g. not seeing any human face for multiple months, be it in person, in pictures, or on their phone. There, this theory should predict that such a move in feature space could happen and would succeed; to be dramatic, you become a psychopath.
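The isolation prediction can be put as a minimal toy dynamic (my own sketch, all constants invented): treat the “she feels pain” → felt-pain link as one weight that the optimizer pushes down each step to avoid pain, while ground-truth input (actually seeing pained faces) pulls it back toward its grounded value. Cut off the grounding, and the link decays:

```python
# Toy sketch (invented dynamics): one scalar link from the
# concept "she feels pain" to pain felt in yourself.
def run(steps, grounded):
    w, lr = 1.0, 0.1
    for _ in range(steps):
        w -= lr * w                 # optimizer: weaken the painful link
        if grounded:                # faces etc. re-anchor the concept
            w += 0.5 * (1.0 - w)
    return w

with_faces = run(100, grounded=True)   # link survives, hovers near 1
isolated = run(100, grounded=False)    # isolation: link decays toward 0
print(with_faces, isolated)
```

Nothing hinges on the exact constants; the point is just that any persistent pull toward the ground truth dominates the optimizer’s drift, and removing it lets the drift win.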
You don’t speak much to (2), though. One option here, for example, would be to unlearn the concept of “future”: babies gradually learn it in the first place, so it’s reasonable to assume that you could unlearn it again. Luckily, this doesn’t seem to happen, so there must be some opposing force, something that promises reward if this concept persists.
Specifically, this concept must offer you insight into your actions such that your expected future reward rises. This is obvious in this case: without the concept “future”, you can hardly make any intelligent decisions at all. But it also carries over to much more specific and even human-invented associations/knowledge:
Let’s say you work in cyber-security, and the reason you think this person will feel pain is that those cyber-security skills let you make an association the normal person wouldn’t. The optimizer could try to unlearn these skills, but those skills actually lead to higher expected reward, else you wouldn’t be pursuing them: be it the nice house you can afford, the social status you enjoy because of them, or simply the joy you get from exercising them.
In other words, anything you learned, you learned because you assumed it would result in a higher expected reward, and anything you act out (after learning), you do because it results in a higher expected reward. Forgetting these concepts will require, at minimum, a reward matching theirs.
This doesn’t imply it should be impossible, though. Let’s say you learned something that you hate, say chiseling stone. You did this because the market paid insane wages, since only a few could do the job, so the reward you saw attached to those wages was immense and you pushed through the boring education of becoming an expert in chiseling stone. And once you got there, you realize you weren’t the only one with the idea: wages drop faster than the average pump-and-dump crypto coin. In fact, the profession you practiced before, which you intrinsically enjoy, now even pays better.
As I’m writing this, I realize there is no good story for why chiseling stone would give you a better glimpse into someone’s future pain, but let’s just take it for granted. Then the reward attached to the knowledge of chiseling stone is pretty much zero, maybe even negative, because whenever you recall it, you recall all the effort that didn’t pay off.
Yet I have never heard of something along these lines happening. It would be quite a great mechanism for the free market, though; the wages would jump right back up. Let’s hope our individual in question doesn’t once again try to learn to chisel stone, completely forgetting this tale of unreciprocated effort.
You could maybe argue something like: precisely the things that fall into this category are things we gave up on, i.e. their occurrence in our day-to-day life is incredibly rare. Therefore, with a normal learning rate, we simply wouldn’t iterate over them often enough to forget them meaningfully.
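That argument can be put in toy numbers (all of them invented): if an association only decays when it is actually iterated over, then a skill you gave up on, and hence almost never recall, barely moves even over years:

```python
LR = 0.05  # per-recall decay; an assumed, made-up rate

def strength_after(years, recalls_per_year):
    s = 1.0
    for _ in range(years * recalls_per_year):
        s *= (1 - LR)   # forgetting only happens on recall
    return s

daily = strength_after(5, 365)   # something iterated constantly
rare = strength_after(5, 2)      # the abandoned skill
print(daily, rare)
```

Under these toy numbers the abandoned skill still retains roughly 60% of its strength after five years, which would be enough to explain why the stonemason’s knowledge never meaningfully fades.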
Lastly, just for completeness: naturally, ‘disconnecting your previous thoughts from arriving at “she feels pain”’ also entails your previous actions. It’s a very special occurrence to know somebody will feel pain in the future unless you had a hand in it yourself. Naturally, those earlier decisions will be optimized on as well, hopefully leading you to make better decisions in the future.