That is all quite fascinating, in a “fancy that!” fashion, but whenever I see correlational data reported I wonder about the magnitude of the effect, and a measure of that magnitude in terms of bits of information. The first result they report is that if there were no influence between name and state of residence, the proportion of coincidences would be 0.1664, while the observed level is 0.1986. How large an influence does this represent?
I am not quite sure what the correct calculation to make is—perhaps someone more versed in these matters can say—but when I calculate the Kullback-Leibler divergence between two binary distributions, one with p=0.1664 and the other with p=0.1986, I get about 0.005 bits. When I estimate the mutual information between name and state, making various assumptions about the data I’d need for a precise calculation, I get a similar figure.
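For anyone who wants to check the figure, here is a minimal sketch of the binary KL divergence calculation. The only inputs are the two proportions quoted above; the function and variable names are mine, and it assumes the quantity of interest is the divergence between two Bernoulli distributions, measured in bits. Since KL divergence is asymmetric, both directions are computed.

```python
import math


def bernoulli_kl_bits(p: float, q: float) -> float:
    """KL divergence D(Bernoulli(p) || Bernoulli(q)) in bits, for 0 < p, q < 1."""
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))


chance = 0.1664    # proportion of coincidences expected with no influence
observed = 0.1986  # proportion of coincidences actually reported

# The divergence depends on which distribution is taken as the reference.
kl_observed_vs_chance = bernoulli_kl_bits(observed, chance)
kl_chance_vs_observed = bernoulli_kl_bits(chance, observed)

print(f"D(observed || chance) = {kl_observed_vs_chance:.4f} bits")
print(f"D(chance || observed) = {kl_chance_vs_observed:.4f} bits")
```

Both directions come out at roughly 0.005 bits, so the choice of reference distribution does not change the conclusion.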
In short, if you want to predict someone’s name from their state, or vice versa, the result is completely useless. Of course, making such a prediction was not the authors’ purpose. But then, what was? What can you do with less than a hundredth of a bit?
How justifiable is it to report the finding in these words (quotes from the paper):
“people are attracted to places that resemble their own names.”
and
“these findings challenge traditional assumptions about how people make major life decisions”
I have just found where Andrew Gelman has blogged about this (search his blog for “Pelham”). I don’t have time to read what he says at the moment, but his headlines indicate he doesn’t rate it.
More specifically, such a small effect does not require a widespread bias; if just a tiny number of people have a stronger (even conscious) bias, it could explain the data.
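A rough sketch of that arithmetic, using the two proportions quoted earlier in the thread. It assumes a deliberately simple two-group mixture that is not from the paper: a fraction of people match name to state with some fixed probability (`match_prob_biased`, a hypothetical parameter), and everyone else matches at the chance rate.

```python
def biased_fraction(observed: float, chance: float, match_prob_biased: float = 1.0) -> float:
    """Fraction f of biased people needed to produce the observed match rate.

    Solves observed = f * match_prob_biased + (1 - f) * chance for f.
    Assumes everyone outside the biased group matches at the chance rate.
    """
    return (observed - chance) / (match_prob_biased - chance)


chance = 0.1664    # match rate expected with no influence
observed = 0.1986  # match rate actually reported

# Extreme case: the biased group always picks a matching state.
f_always = biased_fraction(observed, chance, match_prob_biased=1.0)
# Milder case: the biased group matches half the time.
f_half = biased_fraction(observed, chance, match_prob_biased=0.5)

print(f"always-matching minority: {f_always:.1%}")
print(f"half-matching minority:   {f_half:.1%}")
```

Under these assumptions, a minority of roughly 4% who always match, or roughly 10% who match half the time, would be enough to account for the whole effect.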
Ah. I figured he must have done it at some point since the only copy of the PDF file I could find was on Andrew’s site with the name “stuff-for-blog”, but Google searches for “Gelman Pelham” and “Gelman name letter” didn’t turn anything up. If I’d known I would have just linked him. I hope he’s not upset that I’m “copying” him.
I did not first read this study on Gelman’s blog. Actually, there is a story behind where I first read it. It was in a college psychology class. I was quite nervous throughout class that day, because I was going to ask the professor after class to write me a letter of recommendation for a postgrad program in Scotland I wanted to get into. We spent an hour or so going over this paper and implicit egoism, and then after class I asked the professor to help me get into the program, and she started cracking up.
...see, my real name is Scott, and it was a program in Scotland, and we’d just finished studying the name letter effect...the next day she told the entire class about it, and I was suitably embarrassed, and the name letter effect has stuck in my memory ever since.
Blog posts by Andrew Gelman:
Why it’s not so weird that so many dentists are named Dennis: a story of conditional probability
How many people choose careers based on their names?
Is there a reason NOT to link to the posts directly, rather than having readers repeat the search?