Among mathematicians, of course, one ought to be able to go from one interpretation to the other as necessary (I think it boils down to switching from an algebraic interpretation of your random variables to a geometric one).
If non-mathematicians were exposed to the angle version… that might help a lot of people understand how correlations interact. But, ultimately, the result is still a black box: in practice, I expect people to memorize “0.99 very good, 0.7 okay, 0.05 bad” which just turns into “8 degrees good, 45 degrees okay, 87 degrees bad”.
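For anyone who wants to check the conversion, here is a minimal sketch, assuming the angle version is simply the arccosine of the correlation coefficient. The function name `correlation_to_degrees` is made up for illustration.

```python
import math

def correlation_to_degrees(r):
    """Angle (in degrees) whose cosine is the correlation coefficient r."""
    return math.degrees(math.acos(r))

# The three rule-of-thumb values quoted above.
for r in (0.99, 0.7, 0.05):
    print(f"r = {r}: {correlation_to_degrees(r):.1f} degrees")
```

The results land close to the rounded figures quoted above (0.7 comes out a little over 45 degrees).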
In some cases, a (positive) correlation coefficient can be thought of as a probability, which is another point in its favor.
I haven’t come across that. Could you amplify it?
Edit: I started out giving a somewhat confusing example so I’m just going to edit this comment to say something less confusing.
Let X be any random variable, and define Y as follows: with probability R, Y=X, and otherwise Y is independently drawn from the same distribution as X. Then X and Y are identically distributed and have correlation R.
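A quick simulation can make this concrete. The following is only a sketch under assumptions of my own: X is taken to be exponential with rate 1 (any distribution with finite variance should do), and the helper names `correlated_pair`, `pearson` and `estimate_correlation` are hypothetical.

```python
import random

def correlated_pair(r, draw, rng):
    """Draw X, then set Y = X with probability r, else an independent fresh draw."""
    x = draw(rng)
    y = x if rng.random() < r else draw(rng)
    return x, y

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5

def estimate_correlation(r, n=100_000, seed=0):
    """Estimate corr(X, Y) for the construction above by simulation."""
    rng = random.Random(seed)
    # Assumption: X ~ Exponential(1); swap in any other sampler here.
    draw = lambda g: g.expovariate(1.0)
    xs, ys = zip(*(correlated_pair(r, draw, rng) for _ in range(n)))
    return pearson(xs, ys)

if __name__ == "__main__":
    for r in (0.1, 0.5, 0.9):
        print(r, estimate_correlation(r))
```

With a large enough sample, the estimate should sit close to R for each value tried.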
What definition of correlation are you using?
This one. As far as I know, it’s the only kind of correlation that the cosine interpretation is valid for.
If you want to verify my claim, it will help to assume that your distribution has mean 0 and variance 1, in which case the correlation between X and Y is just E[XY]. But correlation is invariant under shifting and scaling, so this is fully general.
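Spelled out, the verification might go like this. The notation is mine, not standard: B is the indicator of the event that Y is set equal to X, so P(B = 1) = R, and X' is the independent fresh draw, with B, X and X' mutually independent. Assume E[X] = 0 and Var(X) = 1.

$$
\begin{aligned}
Y &= B\,X + (1 - B)\,X', \\
E[XY] &= E[B]\,E[X^2] + E[1 - B]\,E[X]\,E[X'] \\
      &= R \cdot 1 + (1 - R) \cdot 0 \cdot 0 = R.
\end{aligned}
$$

Since Y has the same distribution as X, it also has mean 0 and variance 1, so the correlation is E[XY] = R.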
Edit: I suppose I was imprecise when referring to the correlation between bit strings; what I mean there is simply the correlation between any pair of corresponding bits. This sort of confusion appears to be standard.
Yes, I was wondering what you meant by the correlation of a string. Also, the definition doesn’t apply to binary variables unless you somehow interpret bits as numbers, but as you say, it doesn’t matter how you do so. There’s a reason I asked what definition you were using, not the original post.
As for “This sort of confusion appears to be standard”: which confusion? The confusion between a random variable and a sequence of iid draws from it? And what do you mean by standard?
By “this sort of confusion” I meant the ambiguity between the correlation of two bit strings and the correlations between the individual bits. By “standard” I mean that I didn’t make this up; I’ve seen other people do this.
Anyway, perhaps it’s better to focus on the more general (and less notation-abusing) example I gave.