Hi, I am a Physicist, an Effective Altruist, and an AI Safety researcher.
Linda Linsefors
I’ve updated toward thinking it’s at least possible that narrowing and widening eyes have a practical + cultural explanation, rather than being something encoded in the steering system.
I tried the eye-widening experiment again and it still has no effect for me. But I believe that it works for Steve. Narrowing my eyes doesn’t work for me (I have good vision), and it also doesn’t work for Phil (who is nearsighted), but it works for some people, which is enough.
If some people squint when they are suspicious, this facial expression can spread through culture.
I notice that something that seemed ridiculous to me was not ridiculous, and I have updated accordingly. Thanks!
I knew the optics stuff, I just forgot. Thanks for reminding me.
And now I’m embarrassed about it (but admitting this makes me less embarrassed). There is a story in my brain that I want to set higher standards for myself to remember relevant facts in these situations (e.g. before commenting on LW). Being embarrassed is a reminder that I fell short, but writing it down is a memory aid (I’m not likely to read it, but writing it still makes it more likely that I remember), so now I’m less likely to fall short the next time, so the emotion of being embarrassed has done its job.
Trying to remember my state of mind when I wrote that (yesterday), I think my objection was to the stronger statement that this explains all or most facial expressions.
Today, after reading your and Raemon’s responses, I find it a bit more plausible that more expressions have immediate functional explanations. But definitely not all.
I’m trying to think of a way to give a confidence interval, but I don’t even know what it would mean for 20% of facial expressions to have immediate functional explanations. What is meant by “20%”? What’s the measure?
I mostly wrote this comment because I felt compelled to, because of the familiar “someone’s wrong on the internet” reaction. Except the thing that is wrong is a quote from a throwaway comment, so maybe not that important.
But reflecting a bit more generally, I don’t think the attempted steelman
OK fine, the anti-Ekman position is false, but that’s just because we all have structurally-similar faces, and certain ways of contorting one’s face tend to be useful for corresponding purposes (that are not arbitrary social conventions). For example, maybe the thing we might call a “disgust facial expression” is just objectively the best way to eject stuff from the mouth and nose that shouldn’t be there—a useful activity for any human. So it’s no surprise that we find that kind of expression recurring across cultures!
is at all plausible either
…If you’re uncertain whether a person directly in front of you could harm you, you might narrow your eyes to see the person’s face better. If danger is potentially lurking around the next corner, your eyes might widen to improve your peripheral vision…
I tried narrowing my eyes. This does not help improve my vision. It just causes my eyelashes to get slightly in the way, making my vision slightly worse.
Widening my eyes does not seem to improve my peripheral vision.
Also, if you know anything about how eyes work, it’s pretty obvious why this wouldn’t work.
For example, Ekman says “surprise” versus “fear” typically involve awfully similar facial expressions
I think this is factually incorrect.
If I simulate these feelings in myself, I make very different faces. Or more correctly, many different faces, depending on what type of surprise and what type of fear. But these are not very overlapping.
Still others of these “brain-like AGI ingredients” seem mostly or totally absent from today’s most popular ML algorithms (e.g. ability to form “thoughts” [e.g. “I’m going to the store”] that blend together immediate actions, short-term predictions, long-term predictions, and flexible hierarchical plans, inside a generative world-model that supports causal and counterfactual and metacognitive reasoning).
I think that chain-of-thought planning in an agentic LLM-driven model might qualify as this. Would you agree?
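To make concrete what I have in mind, here is a minimal sketch (the llm() helper and the prompt wording are hypothetical stand-ins I made up for illustration, not any real framework’s API):

```python
# A toy sketch of "chain-of-thought planning in an agentic LLM-driven model".
# The llm() function and all prompts are hypothetical, invented for illustration.

def llm(prompt: str) -> str:
    """Stand-in for a call to some language model (assumed, not implemented)."""
    raise NotImplementedError

def agent_step(goal: str, world_state: str) -> str:
    # One generated "thought" blends a hierarchical plan (long-term),
    # a prediction (short-term), and an immediate action.
    thought = llm(
        f"Goal: {goal}\n"
        f"Current state: {world_state}\n"
        "Write a hierarchical plan, predict what happens next, "
        "then pick ONE immediate action."
    )
    # A second pass adds the counterfactual/metacognitive ingredient:
    # the model critiques its own thought before committing to an action.
    critique = llm(f"Thought: {thought}\nWould a different action work better?")
    return llm(f"Thought: {thought}\nCritique: {critique}\nFinal action:")
```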
Maybe it’s a populist medium, more than a leftist medium?
I had the same question about the arguments in the post.
If Claude somehow starts down a trajectory of always talking about how good it is, how is this self-reinforcing? If it has a tendency to always talk like that, this tendency should be both upweighted and downweighted, because it will sometimes succeed and sometimes fail.
Maybe the reward signals aren’t balanced? I.e. overall it gets more positive than negative reward?
Or maybe it’s more likely to talk about its motivation when it succeeds at staying on task?
Or possibly this story about self-reinforcement (“gradient hacking”) is just wrong, and the explanation of Claude 3’s character is something else.
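To spell out the balance argument with numbers (a toy model of my own, not a claim about Anthropic’s actual training setup):

```python
# Toy model of the argument above (my own simplification): under a
# REINFORCE-style update, a behavior that shows up in both rewarded and
# punished trajectories only gets a net upweight if the rewards are
# unbalanced, or if the behavior correlates with success.
import random

random.seed(0)
lr = 0.1  # learning rate

def episode_update(p_success: float, r_pos: float, r_neg: float) -> float:
    """Update on the behavior's logit for one episode in which the
    behavior occurred (update taken as proportional to reward)."""
    reward = r_pos if random.random() < p_success else r_neg
    return lr * reward

def mean_update(p_success, r_pos, r_neg, n=10_000):
    return sum(episode_update(p_success, r_pos, r_neg) for _ in range(n)) / n

print("balanced:  ", mean_update(0.5, +1.0, -1.0))  # ~0: up/down cancel
print("unbalanced:", mean_update(0.5, +1.0, -0.2))  # > 0: net positive reward
print("correlated:", mean_update(0.9, +1.0, -1.0))  # > 0: mostly occurs on success
```

The toy model just restates the point: the behavior only gets a net upweight under one of the first two hypotheses.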
I expect Good to have some chance of generalising safely when the AI gets too smart, while Obedience has approximately no chance of doing so. I don’t have a technical argument for this, just a strong intuition.
What are “canary strings”?
I remember hearing that Amanda Askell had more influence over Claude 3 Opus’s alignment training specifically, and used her philosophy powers to make it more deeply aligned. Is this wrong?
Matthew Cobb’s book The Idea of the Brain notes that the brain has historically been analogized to a hydraulic system, or to a telegraph network, or to a telephone exchange; today it’s often analogized to a supercomputer; and in the future, who knows. His suggested takeaway is: neuroscientists have never known how to think about the brain, and are grasping at straws.
But that’s the wrong takeaway. The brain is a machine that runs an algorithm. Many people throughout history have grasped that idea, at least intuitively. And they’ve tried to explain that idea by analogizing the brain to other machines that can run algorithms, of which there are many: clockwork, hydraulics, telephone exchanges, silicon chips, and more. All the analogies through the ages are pointing to a single, consistent, profound truth.

I’ve seen a similar claim, or possibly the same claim. The claim was that humans just compare the brain to whatever is the newest cool tech, which is clearly not true. Once the airplane was the newest, coolest tech, and no one said the brain was an airplane, and the same goes for lots of other tech.
As you say, there is a clear trend in what tech we use as a metaphor for the brain.
I feel like if I try to defend my open-mindedness, I lose. It just opens up more attack surfaces to someone who is hostile and doesn’t argue in good faith.
I think it’s much better to call out why calling someone closed-minded for not listening is just invalid in general, not just this time in particular. And I do believe it is.
If someone isn’t listening to you, them being too closed-minded is faaaaaar down the list of likely explanations. It’s much less likely than:
Your arguments are bad
Disinterest in the topic
Other things they’d rather do right now
“I guess I am a bit closed on this particular topic—let’s discuss something else”.
I wish I could say something like that and be ok. But to me it feels too humiliating. And it’s also often factually wrong, i.e. I’d be open to good arguments.
Bulverism is a good term, thanks!
I don’t want to use this suggestion, not because it is escalatory, but because it’s a question, which invites them to have more opinions.
What I want is a way out that has the feeling of standing up for myself, rather than the feeling of humiliation and defeat.
If someone starts to accuse me of not being open-minded to their opinion, it’s usually because I think their opinion is dumb. I’d rather not hurt their feelings if I can avoid it, but I’m also not going to worry too much about being polite to someone after they’ve pulled this particular rhetorical move.
Usually the way out is to just leave. But the last time this happened was at a small meetup, and the only way out was to leave the event, which I did. I’m not happy about this and would like better options.
Every now and then I meet someone who tries to argue that if I don’t agree with them, it’s because I’m not open-minded enough. Is there a term for this?
Epistemically I’m not convinced by this type of argument, but socially it feels like I’m being shamed, and I hate it.
I also find it hard to call out this type of behaviour when it happens, even when I can tell exactly what is going on. I think if I had a name for this behaviour it would be easier? Not sure though.
Edit to add:
I’ve now had some more time to figure out what I want and don’t want out of this thread. The early responses helped with this, so thanks!

What I’m most interested in is a name for this behaviour. Naming it helps in at least two ways. It makes it easier to call out in the moment (as mentioned above), but it also makes it easier for me to handle internally. I can be like “ah, it’s this thing again” in my head, rather than being overwhelmed.
What I’m not interested in is any advice/suggestions that continue the conversation. After a person has pulled one of these moves on me, I am both angry at them and do not trust them to cooperate in any form of good-faith conversation.
If you have some ideas for how I can end the conversation in a way that does not feel utterly humiliating to me, please tell me. Anything phrased like a question is out. I do not want to hear what they have to say, and asking questions that you don’t want answers to is wrong and bad.
Is this something you’re still doing?
(Just asking in general to keep track of what resources exist.)
I remember reading a claim that the steering subsystem notices how much of the white is visible in other people’s eyes.
The text probably didn’t use the term “steering subsystem” (unless I got this from one of your posts), but that’s how I remember interpreting it.
Cats have eye-based facial expressions too. Squinting (half-closed eyes) means relaxed and trusting. If a cat slow-blinks at you (closes their eyes all the way), that means they like you.