# [deleted] comments on Stupid Questions August 2015

• 11 Aug 2015 20:13 UTC
1 point

Thank you! (In part, for such faith in my abilities:) Have to go hunt myself a programmer for dinner...)

It seems that if measurer 1 gives measurer 2 the same subsample of the middle latent variables (photos of fields of vision; scoring them gives you the data points), and x1 is compared with x2, they should see the least difference between them, which is (largely?) sample-independent. If, however, measurer 1 and measurer 2 each draw their subsamples independently, the difference between x1 and x2 should be larger due to chance, right? So if we look at the difference in differences between x1 and x2, and it is greater for some middle latent variables (ways of staining) than for others, can we use it as a measure of 'the overall variability of the measuring method'? Say, if we have ten measurers and four measuring methods...

(I’m asking you this because it is relatively simple to do in practice, not because I think this would be the most efficient way.)

• You can estimate the bias of each measurer much more efficiently if you have them measure the same sample, yes, analogous to a crossover: now the differences are due less to the wide diversity of the sampled population and more to the particular measurer.

(To put it a little more mathily: when each measurer measures different samples, the measurements will be spread very widely, because it’s Var(measurer-bias) + Var(population); but if we have the measurers measure the same sample, then Var(population) drops out and now there’s just Var(measurer-bias). If I measure a sample and get 2.9 and you measure it as well and get 3.1, then probably the sample is really ~3.0, my bias is −0.1, and your bias is +0.1. If I measure one sample and get 2.9 and you measure a different sample and get 3.1, then my bias and your bias are… ???)
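The variance argument can be checked with a quick simulation. This is a minimal sketch: the measurer biases, population spread, noise level, and sample counts are all made-up illustrative numbers, not anything from the thread.

```python
import random

random.seed(0)

# Assumed setup: two measurers with fixed (unknown-to-us) biases,
# samples drawn from a wide population. All numbers are illustrative.
true_bias = {"measurer_1": -0.1, "measurer_2": +0.1}

def draw_sample():
    return random.gauss(3.0, 1.0)  # population mean 3.0, wide spread

def measure(measurer, sample_value):
    return sample_value + true_bias[measurer] + random.gauss(0, 0.02)

# Design A: each measurer gets a *different* sample.
# The difference mixes Var(population) into the comparison.
diffs_different = [
    measure("measurer_2", draw_sample()) - measure("measurer_1", draw_sample())
    for _ in range(10_000)
]

# Design B: both measurers measure the *same* sample.
# Var(population) cancels; only the bias gap plus a little noise remains.
diffs_same = []
for _ in range(10_000):
    s = draw_sample()
    diffs_same.append(measure("measurer_2", s) - measure("measurer_1", s))

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(var(diffs_different))  # large: includes 2 * Var(population)
print(var(diffs_same))       # tiny: just measurement noise
print(sum(diffs_same) / len(diffs_same))  # recovers the bias gap of ~0.2
```

With the same-sample design, the mean difference directly estimates the gap between the two measurers' biases, which is the crossover point being made above.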

For example, the classic example for MLMs: you have n classrooms’ test scores, and you want to figure out the teachers’ effects. It’s hard to tell because the classrooms’ average scores will differ a lot on their own. This is analogous to your original description: each measurer gets their own batch of samples. But what if you had a crossed design, with one classroom’s test scores after it’s taught by each teacher? Then much of the difference in the average scores will be due to the particular effect of each teacher, and that will be much easier to estimate.
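A toy version of that nested-vs-crossed comparison; all effect sizes here (classroom baselines, the teacher effect, noise levels) are invented for illustration.

```python
import random

random.seed(1)

# Assumed effects, purely illustrative: teacher_B adds 5 points.
teacher_effect = {"teacher_A": 0.0, "teacher_B": 5.0}

def classroom_baseline():
    return random.gauss(70, 10)  # classrooms differ a lot on their own

def score(baseline, teacher):
    return baseline + teacher_effect[teacher] + random.gauss(0, 2)

n = 200

# Nested design: each teacher teaches their own classrooms.
nested_A = [score(classroom_baseline(), "teacher_A") for _ in range(n)]
nested_B = [score(classroom_baseline(), "teacher_B") for _ in range(n)]
nested_estimate = sum(nested_B) / n - sum(nested_A) / n

# Crossed design: the *same* classroom is scored under each teacher,
# so the classroom baseline cancels in the within-classroom difference.
crossed_diffs = []
for _ in range(n):
    base = classroom_baseline()
    crossed_diffs.append(score(base, "teacher_B") - score(base, "teacher_A"))
crossed_estimate = sum(crossed_diffs) / n

print(nested_estimate)   # noisy: classroom variation leaks in
print(crossed_estimate)  # much tighter around the true effect of 5.0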

> So if we look at the difference in differences between x1 and x2, and it is greater for some middle latent variables (ways of staining) than for others, can we use it as a measure of ‘the overall variability of the measuring method’? Say, if we have ten measurers and four measuring methods...
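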

I guess. From a factor-analysis perspective, you just want to pick the one with the highest loading on X, I think.

• Huh. Your answer was even more useful to me than I expected. My ‘secret agenda’ is to put forth another mountant medium, which might have advantages over the one in use, but I will have to show that they do not differ in preparation quality. I think I am going to do a 2-by-2 crossover.

So, thank you! Analogies for the win!

• The problem is that whichever one I find the most desirable, other people will continue using the methods they are good at. And I will have to somehow compare x(A)1, x(B)32 and x(C)3...

And this is a relatively straightforward situation; things are often much less clear in environmental science, already at the methodology level.

• > The problem is that whichever one I find the most desirable, other people will continue using the methods they are good at. And I will have to somehow compare x(A)1, x(B)32 and x(C)3...

I don’t really understand the problem. Yes, maybe you can’t control them and get everyone onto the same method page. But I’ve already explained how you deal with that, given you the relevant keywords to search for, like ‘measurement error’, and also given you example R code implementing several approaches.

They all take the basic approach of treating it as data/measurements which load on a latent variable for each method, and each method loads on the latent variable which is what you actually want; then you can infer whatever you need to. The first level of latent variables helps you estimate the biases of each category, some of which may be smaller than others, and then you collectively use them to estimate the final latent variable. Now you have a principled way to unify all your data from disparate methods which measure, in similar but not identical ways, the variable you care about. If someone else comes up with a new method, it can be incorporated like the rest.
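A crude sketch of that unification idea. To be clear: this is not the R code referred to above, and it stands in for a full latent-variable model with a simple calibration-based correction, where the cross-method mean plays the role of the latent value (so only *relative* biases are identified without an external anchor). All method names and numbers are made up.

```python
import random

random.seed(2)

# Assumed (bias, noise_sd) per method; illustrative values only.
methods = {"method_A": (+0.5, 0.1), "method_B": (-0.3, 0.3), "method_C": (-0.2, 0.2)}

def measure(method, true_value):
    bias, sd = methods[method]
    return true_value + bias + random.gauss(0, sd)

# Calibration: all methods measure the same 500 samples.
samples = [random.gauss(3.0, 1.0) for _ in range(500)]
readings = {m: [measure(m, s) for s in samples] for m in methods}

# Per-sample cross-method mean stands in for the latent true value.
latent = [sum(readings[m][i] for m in methods) / len(methods)
          for i in range(len(samples))]

# Each method's average offset from the latent value estimates its
# relative bias (absolute bias would need an anchor or a constraint).
rel_bias = {
    m: sum(readings[m][i] - latent[i] for i in range(len(samples))) / len(samples)
    for m in methods
}

# A new sample measured by every method can now be bias-corrected and pooled.
new_truth = 4.2
pooled = sum(measure(m, new_truth) - rel_bias[m] for m in methods) / len(methods)

print(rel_bias)  # close to the relative biases among the three methods
print(pooled)    # close to the true value of the new sample
```

A new method would slot in the same way: run it on the shared calibration samples, estimate its offset, then correct and pool its data with the rest.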

• Right, sorry, melting brain. (Also, I had just thought that the assumed 10% difference between two measurers has not, in fact, been established rigorously, and that derailed the still-solid brain...)

• ...Okay, I started Cross-over Trials by Jones and Kenward, and immediately got another stupid question (yay, me): if we do a two-period, two-treatment design, with subject group 1 crossing over from A to B and subject group 2 crossing from B to A, and we note the effects for A and B, how many controls do we need to run? As in, surely we would need an sg 3 which receives no treatment; sg 4, 5 and 6 which receive only treatment A (in the first half; in the second half; for the full duration of the experiment); and sg 7, 8 and 9 which receive only B?...