From the picture accompanying the article (so numbers may be slightly off): on the final exam, men’s average score minus women’s average score was 11.3 ± 4.6% in the control group and 2.4 ± 3.8% in the experimental group. The difference in gap was thus 8.9 ± 6.0%, so about 1.5 standard deviations from no difference.
Women’s score in the experimental group minus the control group was 5.9 ± 5.2%. Respectable, but only a bit above 1 standard deviation.
Men’s score in the experimental group minus the control group was −3.3 ± 3.8%. Focusing on their own values, rather than values other people hold, made men worse at this test by an amount comparable (in standard deviations, not absolute terms) to how much it made women better. The standard deviations narrowed for both groups: for the women, this was reported as the worst women doing better, and for the men, it seems reasonable to assume the best men did worse.
So, what the heck is going on here? The most likely explanation seems to be a statistical fluke: the experimental group happened to contain worse men and better women. These results don’t seem terribly statistically significant (to get my numbers, I added together four normals with the figure’s error bars as standard deviations; it would be better to check the statistical analysis in the paper itself), so that possibility is rather strong.
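The error propagation behind those numbers can be sketched in a few lines (a back-of-the-envelope check, assuming the figure’s error bars are independent one-sigma errors on the group means; the paper’s actual analysis may differ):

```python
import math

# Gender gap (men's mean minus women's mean) on the final exam,
# read off the article's figure, in percentage points.
control_gap, control_err = 11.3, 4.6
experimental_gap, experimental_err = 2.4, 3.8

# Difference in the gap between groups; independent normal
# errors add in quadrature.
gap_diff = control_gap - experimental_gap
gap_diff_err = math.sqrt(control_err**2 + experimental_err**2)

print(f"gap difference: {gap_diff:.1f} +- {gap_diff_err:.1f}%")    # 8.9 +- 6.0%
print(f"stdevs from no difference: {gap_diff / gap_diff_err:.2f}") # 1.49
```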
An alternative is that most of these “gap-closing” mechanisms actually impede the superior group and actually help the inferior group. The control group’s male score minus the experimental group’s female score is 5.6 ± 4%, almost 1.5 stdevs from no difference (control male minus control female was almost 2.5 stdevs from no difference).
Two ways to get half of each: it might have been a statistical fluke that the experimental men did worse, while the value affirmation genuinely improved female performance. Or, the value affirmation might have made everyone do worse, and the women by some fluke did better anyway (this is least likely, since it would require the experimental women to fluctuate about 2 stdevs upward).
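To put rough numbers on “least likely”, here are one-sided normal tail probabilities at these fluctuation sizes (a sanity check only; these z-scores come from the figure-derived error bars, not from the paper’s own analysis):

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# ~1.5 stdevs (the gap difference) vs ~2 stdevs (women fluking upward).
print(f"P(Z > 1.5) = {upper_tail(1.5):.3f}")  # 0.067
print(f"P(Z > 2.0) = {upper_tail(2.0):.3f}")  # 0.023
```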
An alternative is that most of these “gap-closing” mechanisms actually impede the superior group and actually help the inferior group.
That was my first thought. If a physics teacher made me waste 15 minutes on such a stupid, non-physics-related exercise, I’d likely do very badly in the class (more likely, walk out and drop the class immediately).
Everybody wasted 15 minutes. The question was just what they focused on (and neither option was physics-related).
I think I might be missing your point—I already thought that was the case.
That would explain a possible difference between an experimental group that spent a 15-minute exercise on something other than physics and a control group that did just physics: the best students might leave the experimental group, bringing down its mean and standard deviation. But since only the focus of the exercise differed between the two groups, I don’t see how the impulse to leave classes that waste your time could manifest as a difference between them. Even if such an effect is measurable in outcomes, it would not be noticed in this experiment.
Ah, missed that detail, thanks.
Here I had just assumed one of the groups would have been taught some physics during that 15 minutes. I guess we’ll just have to keep wondering how much better teaching physics is at making people learn physics than not teaching it.