AdeleneDawner comments on The Power of Reinforcement

AdeleneDawner 21 Jun 2012 5:03 UTC
62 points
Bit of a tangent, but if you ever run across someone for whom this doesn’t seem to work, check the hypothesis that they don’t parse praise as a positive reinforcer. I don’t know how common this is, but I actually have to make a conscious effort to keep it from acting as a mild punishment in most cases when it’s applied to me. (Ditto M&Ms in the given context, I expect. Attention Bad.)
- daenerys 21 Jun 2012 6:37 UTC
  16 points
  You are correct that there are many kinds of reinforcers, and it’s important to make sure that the one you choose to use is something the receiver will desire.
  
  “In other studies, animals and people given a choice between performing a task for either of two reinforcers often show strong preferences (Parsons & Reid, 1990; Simmons, 1924). Identifying preferred reinforcers can improve the effectiveness of a reinforcement procedure in applied settings (Mace et al., 1997).”
  
  -Learning and Behavior, p149
- Will_Newsome 21 Jun 2012 22:17 UTC
  15 points
  Furthermore at least one person I know (er, myself) picks up on any sort of test-like or game-like or we’re-judging-you-so-you-better-not-screw-up-like context and starts acting in extremely confusing/uninformative/atypical/misleading ways so as not to be seen as the kind of person who is easily manipulable (there are probably other motivations involved too). Any incentive structure I’m put under thus has to somehow take this into account, even e.g. the LessWrong karma system. Explicitly manipulative socially mediated praise/M&Ms would strike my brain as outright evil and would stand some chance of being inverted entirely. That said I don’t get the impression this sort of defense mechanism is very common.
  - Oligopsony 27 Jun 2012 20:37 UTC
    44 points
    Excellent insight. Downvoted.
  - Lethalmud 26 Jun 2013 14:45 UTC
    8 points
    So you are saying that, to change your mode of behavior, all one has to do is create a judging context? That would actually make you very easy to manipulate.
  - TheOtherDave 21 Jun 2012 22:21 UTC
    3 points
    
    > Explicitly manipulative socially mediated praise/M&Ms would strike my brain as outright evil and would stand some chance of being inverted entirely. That said I don’t get the impression this sort of defense mechanism is very common.
    
    I experience this as common, but I suspect it’s because of a small number of exceptionally vocal “manipulation is evil!” types in my life, rather than a larger number of typically vocal ones.
- Viliam_Bur 21 Jun 2012 9:56 UTC
  4 points
  Yes, the situation is usually not so easy that behavior is just a result of inputs, like this:
  
  output := f(input)
  
  People have minds, and a mind is an environment, different for different people. The real equation would be more like this:
  
  [mind1, output] := f([mind0, input])
  
  For example, many people like the attention of others, but some people may have been trained (for example, by previous abuse) to expect that attention from others is usually followed by pain. For them, positive reinforcement by giving them attention wouldn’t work, because the important thing is not the attention per se, but what it means to them.
  
  On a meta level, for someone even the idea of “learning” or “improving” or “changing” may be already associated with pain, so they will resist any such process if they notice it. A human mind can be messed up rather easily.
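
  The second equation above can be read as a state-transition function. Here is a minimal Python sketch, assuming a one-number "mind" that stores how a person has learned to value attention. The names `Mind` and `attention_value` and the update rule inside `f` are hypothetical illustrations, not a model proposed in this thread:

  ```python
  from dataclasses import dataclass


  @dataclass(frozen=True)
  class Mind:
      # Hypothetical state: positive means attention feels rewarding,
      # negative means attention has come to predict pain.
      attention_value: float


  def f(mind: Mind, stimulus: str) -> tuple[Mind, float]:
      """Return (next mind, change in how likely the behavior is to repeat)."""
      if stimulus != "attention":
          # This toy model only knows about attention; other inputs do nothing.
          return mind, 0.0
      effect = mind.attention_value
      # Assumed update rule: each experience nudges the existing association
      # a little further in the direction it already points.
      next_mind = Mind(attention_value=mind.attention_value + 0.1 * effect)
      return next_mind, effect


  typical = Mind(attention_value=1.0)
  wary = Mind(attention_value=-1.0)

  typical_next, typical_effect = f(typical, "attention")
  wary_next, wary_effect = f(wary, "attention")
  print(typical_effect, wary_effect)
  ```

  The same input produces opposite effects because the state differs, which is the reason for writing f over [mind0, input] rather than over input alone.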
  - DaFranker 7 Jul 2012 4:16 UTC
    1 point
    
    > [...] On a meta level, for someone even the idea of “learning” or “improving” or “changing” may be already associated with pain, so they will resist any such process if they notice it. A human mind can be messed up rather easily.
    
    This becomes painfully common (and obvious to any observant third party who knows these concepts) for subjects that students “are just not made for”, as with the large numbers of students who “just don’t get” maths. They’ve been trained in so many ways to associate actual learning (especially the actions taken when attempting to learn a concept) with negativity that it becomes obviously so much more rewarding to just guess the teacher’s password, and so they are positively reinforced into doing everything they can to avoid mental modeling and seek password-guessing through aggregation and correlation of symbol-data. In most cases I’ve observed, they become experts at the skill of subconsciously forming “truth-tables” of teachers’ passwords through brute-force trial-and-error tactics. What’s more, this tactic, which they’ve been trained to do and learned so well and associate so much with positive feedback, often feeds into a vicious circle in several possible ways, which makes getting out (or, for that matter, even realizing that it’s there and you need to get out of it) so much more difficult than if that behavior had been blocked immediately when it first appeared.
    
    When I realized that, I started to feel sad for every student I see showing signs of spending hours upon hours studying and memorizing and headsmashing against the same math problems “until they finally understand them”, when in truth they haven’t really gained anything worthwhile (IMO) from the experience.
- [deleted] 21 Jun 2012 5:20 UTC
  1 point
  I’d have to say that it shouldn’t be that common. Most people want to be praised.
  - erratio 21 Jun 2012 13:21 UTC
    28 points
    Most people want to be sincerely praised. Someone who reads this post and applies it poorly is going to be saying praise while their body language says something else entirely. Or acting out of character for themselves, leading the reinforcee to suspect that the praise is insincere. Or they may go around praising seemingly everything, causing the reinforcee to interpret the praise as meaningless noise.
    
    There are lots of ways for using praise as reinforcement to go wrong, and if someone is in one of those environments for long enough they will end up being conditioned to interpret praise as neutral or negative.
  - JGWeissman 21 Jun 2012 5:35 UTC
    6 points
    I suspect it is common enough that when you observe that praising someone doesn’t reinforce their behavior or makes them uncomfortable, you should consider that they might have an unusual aversion to praise.
    - pjeby 21 Jun 2012 16:41 UTC
      14 points
      
      > I suspect it is common enough that when you observe that praising someone doesn’t reinforce their behavior or makes them uncomfortable, you should consider that they might have an unusual aversion to praise.
      
      And also, that you might just be really bad at it. ;-)
      
      This was my problem for quite a while: believing that I ought to praise people, while alieving that there wasn’t anything to praise and that they didn’t deserve it, due to all their obvious imperfections.
      
      This, as you can imagine, produced sub-optimal results. ;-)
    - AdeleneDawner 21 Jun 2012 6:27 UTC
      2 points
      Yep. It’s not a situation you’re likely to come across often, but when you do, it’s worth having the alternate theory available to check.
- mwengler 3 Jul 2012 21:45 UTC
  0 points
  I would expect the rate at which people run counter to “usual” reinforcers to be far less than the rate at which people claim to run counter to “usual” reinforcers. Humans are not naturally very good at reflection of certain sorts.
  
  Having said that, I imagine reinforcing techniques would work brilliantly on me whether I knew I was being trained or not. I’d rather know I was being trained (and therefore know what I was being trained to do), and I would therefore wish to reinforce the trainer’s openness about the process by having it work better when s/he is open. But even if I didn’t realize it was happening, I have every reason to believe that I am not some sort of exception to human neurobiology.
  - AdeleneDawner 12 Jul 2012 3:20 UTC
    2 points
    
    > I would expect the rate at which people run counter to “usual” reinforcers to be far less than the rate at which people claim to run counter to “usual” reinforcers.
    
    Yes, which is why I said ‘someone for whom this doesn’t seem to work’, not ‘someone who claims that this doesn’t work on them’ - though of course in the latter case it’s at least polite to humor them.
    
    I also didn’t say that reinforcing techniques don’t work on me—I’ve never run into anyone for whom that was even remotely plausible, in fact. Just, you have to use things that don’t squick me out as positive reinforcers, and overt praise and rewards aren’t in that category.