Bit of a tangent, but if you ever run across someone for whom this doesn’t seem to work, check the hypothesis that they don’t parse praise as a positive reinforcer. I don’t know how common this is, but I actually have to make a conscious effort to keep it from acting as a mild punishment in most cases when it’s applied to me. (Ditto M&Ms in the given context, I expect. Attention Bad.)
You are correct that there are many kinds of reinforcers, and it’s important to make sure that the one you choose to use is something the receiver will desire.
“In other studies, animals and people given a choice between performing a task for either of two reinforcers often show strong preferences (Parsons & Reid, 1990; Simmons, 1924). Identifying preferred reinforcers can improve the effectiveness of a reinforcement procedure in applied settings (Mace et al., 1997).”
-Learning and Behavior, p149
Furthermore, at least one person I know (er, myself) picks up on any sort of test-like or game-like or we’re-judging-you-so-you-better-not-screw-up-like context and starts acting in extremely confusing/uninformative/atypical/misleading ways so as not to be seen as the kind of person who is easily manipulable (there are probably other motivations involved too). Any incentive structure I’m put under thus has to somehow take this into account, even e.g. the LessWrong karma system. Explicitly manipulative, socially mediated praise/M&Ms would strike my brain as outright evil and would stand some chance of being inverted entirely. That said, I don’t get the impression this sort of defense mechanism is very common.
Excellent insight. Downvoted.
So you are saying that, to change your mode of behavior, all one has to do is create a judging context? That would actually make you very easy to manipulate.
I experience this as common, but I suspect it’s because of a small number of exceptionally vocal “manipulation is evil!” types in my life, rather than a larger number of typically vocal ones.
Yes, the situation is usually not so easy that behavior is just a result of inputs, like this:
output := f(input)
People have minds, and a mind is an environment, different for different people. The real equation would be more like this:
[mind1, output] := f([mind0, input])
For example, many people like the attention of others, but some people may have been trained (for example by previous abuse) to expect that attention from others is usually followed by pain. For them, positive reinforcement by giving them attention wouldn’t work, because the important thing is not the attention per se, but what it means to them.
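To make the second form concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `Mind` class, the `attention_value` field, the `respond` function and the numeric updates are hypothetical, not a model taken from any study. It only shows the shape of the idea, namely that the same input gives different outputs for different minds, and that each experience also changes the mind.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Mind:
    # Hypothetical learned value of attention: positive means attention is
    # welcome, negative means attention has come to predict something bad.
    attention_value: float


def respond(mind: Mind, stimulus: str, followed_by_pain: bool = False):
    """Hypothetical f: maps (mind0, input) to (mind1, output)."""
    if stimulus != "attention":
        return mind, "no change"
    # The output depends on what attention currently means to this mind,
    # not on the attention itself.
    if mind.attention_value > 0:
        output = "behavior reinforced"
    else:
        output = "behavior suppressed"
    # The experience also updates the mind (made-up step sizes).
    shift = -1.0 if followed_by_pain else 0.25
    return replace(mind, attention_value=mind.attention_value + shift), output


# Same input, two different minds, two different outputs.
_, typical_output = respond(Mind(attention_value=1.0), "attention")
_, wary_output = respond(Mind(attention_value=-1.0), "attention")
print(typical_output)
print(wary_output)
```

Calling `respond` repeatedly and feeding each returned `Mind` into the next call shows the history dependence: a mind that starts out mildly positive about attention can end up treating it as a punisher after attention has been followed by pain.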
On a meta level, for some people even the idea of “learning” or “improving” or “changing” may already be associated with pain, so they will resist any such process if they notice it. A human mind can be messed up rather easily.
This becomes painfully common (and obvious to any observant third party who knows these concepts) in subjects that students “are just not made for”, such as the large numbers of students who “just don’t get” maths. They’ve been trained in so many ways to associate actual learning (especially the actions taken when attempting to learn a concept) with negativity that it becomes obviously so much more rewarding to just guess the teacher’s password, and so they are positively reinforced into doing everything they can to avoid mental modeling and seek password-guessing through aggregation and correlation of symbol-data. In most cases I’ve observed, they become experts at the skill of subconsciously forming “truth-tables” of teachers’ passwords through brute-force trial-and-error tactics. What’s more, this tactic, which they’ve been trained to use, have learned so well, and associate so strongly with positive feedback, often feeds on itself in a vicious circle through several possible routes, which makes getting out (or, for that matter, even realizing that it’s there and you need to get out of it) so much more difficult than if that behavior had been blocked immediately when it first appeared.
When I realized that, I started to feel sad for every student I see showing signs of spending hours upon hours studying and memorizing and headsmashing against the same math problems “until they finally understand them”, when in truth they haven’t really gained anything worthwhile (IMO) from the experience.
I’d have to say that it shouldn’t be that common. Most people want to be praised.
Most people want to be sincerely praised. Someone who reads this post and applies it poorly is going to be saying praise while their body language says something else entirely. Or acting out of character for themselves, leading the reinforcee to suspect that the praise is insincere. Or they may go around praising seemingly everything, causing the reinforcee to interpret the praise as meaningless noise.
There are lots of ways for using praise as reinforcement to go wrong, and if someone is in one of those environments for long enough they will end up being conditioned to interpret praise as neutral or negative.
I suspect it is common enough that when you observe that praising someone doesn’t reinforce their behavior or makes them uncomfortable, you should consider that they might have an unusual aversion to praise.
And also, that you might just be really bad at it. ;-)
This was my problem for quite a while: believing that I ought to praise people, while alieving that there wasn’t anything to praise and that they didn’t deserve it, due to all their obvious imperfections.
This, as you can imagine, produced sub-optimal results. ;-)
Yep. It’s not a situation you’re likely to come across often, but when you do, it’s worth having the alternate theory available to check.
I would expect the rate at which people run counter to “usual” reinforcers to be far less than the rate at which people claim to run counter to “usual” reinforcers. Humans are not naturally very good at reflection of certain sorts.
Having said that I imagine reinforcing techniques would work brilliantly on me whether I knew I was being trained or not. I’d rather know I was being trained (and therefore know what I was being trained to do) and I would therefore wish to reinforce the trainer’s openness about the process by having it work better when s/he is open. But even when I didn’t realize it, I have every reason to believe that I am not some sort of exception to human neurobiology.
Yes, which is why I said ‘someone for whom this doesn’t seem to work’, not ‘someone who claims that this doesn’t work on them’ - though of course in the latter case it’s at least polite to humor them.
I also didn’t say that reinforcing techniques don’t work on me—I’ve never run into anyone for whom that was even remotely plausible, in fact. Just, you have to use things that don’t squick me out as positive reinforcers, and overt praise and rewards aren’t in that category.