Though I’ve only read the abstract so far, this seems to confirm my belief that one should generate a large number of hypotheses when one wants a more rigorous answer to a question. I’ve started doing this in my PhD research, mostly by compiling others’ hypotheses, but also by generating my own. I’ve been struck by how few researchers actually do this. The researchers who do consider multiple hypotheses (e.g., in my field one notable example is Rolf Reitz) earn greater respect from me.

Also, hypothesis generation is definitely non-trivial in real scientific domains. Both generating entirely new hypotheses and steelmanning existing ones are hard. It doesn’t matter that your scientific method will converge to the right hypothesis when it’s in your considered set, if most considered sets don’t contain the “correct” hypothesis...

Very interesting paper. I will be reading this closely. Thanks for posting this link.

This is a real elephant in the room. It’s been mentioned a few times here, but it remains a major impediment to “Bayes is the only epistemology you need” and other cherished notions.

http://nostalgebraist.tumblr.com/post/161645122124/bayes-a-kinda-sorta-masterpost

The problem of ignored hypotheses with known relations

The biggest problem is with the “hypotheses and logical relations” setup.

The setup is deceptively easy to use in toy problems where you can actually list all of the possible hypotheses. The classic example is a single roll of a fair six-sided die. There is a finite list of distinct hypotheses one could have about the outcome, and they are all generated by conjunction/disjunction of the six “smallest” hypotheses, which assert that the die will land on one specific face. Using the set formalism, we can write these as

{1}, {2}, {3}, {4}, {5}, {6}

Any other hypothesis you can have is just a set with some of these numbers in it. “2 or 5” is {2, 5}. “Less than 3” is just {1, 2}, and is equivalent to “1 or 2.” “Odd number” is {1, 3, 5}.

Since we know the specific faces are mutually exclusive and exhaustive, and we know their probabilities (all 1/6), it’s easy to compute the probability of any other hypothesis: just count the number of elements. {2, 5} has probability 2/6, and so forth. Conditional probabilities are easy too: conditioning on “odd number” means the possible faces are {1, 3, 5}, so now {2, 5} has conditional probability 1/3, because only one of the three possibilities is in there.

Because we were building sets out of individual members here, we automatically obeyed the logical consistency rules, like not assigning “A or B” a smaller probability than “A.” We assigned probability 2/6 to “2 or 5” and probability 1/6 to “2,” but we didn’t do that by thinking “hmm, gotta make sure we follow the consistency rules.” We could compute the probabilities exactly from first principles, and of course they followed the rules.
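The set arithmetic above can be sketched in a few lines. This is a minimal illustration of the die example; `prob` and `cond_prob` are hypothetical helper names, not anything from the paper:

```python
from fractions import Fraction

FACES = frozenset(range(1, 7))  # the six atomic outcomes of one fair roll

def prob(hypothesis):
    # P(hypothesis) = |hypothesis| / 6, since the faces are equiprobable,
    # mutually exclusive, and exhaustive.
    return Fraction(len(hypothesis), len(FACES))

def cond_prob(hypothesis, given):
    # Conditioning shrinks the outcome space to `given` and counts again.
    return Fraction(len(hypothesis & given), len(given))

two_or_five = {2, 5}
odd = {1, 3, 5}

assert prob(two_or_five) == Fraction(2, 6)
assert cond_prob(two_or_five, odd) == Fraction(1, 3)
# Consistency comes for free: a disjunction is never less probable
# than one of its disjuncts.
assert prob({2}) <= prob(two_or_five)
```

Because every hypothesis is literally a set of atomic outcomes, the consistency rules are a theorem of the counting, not a constraint you have to remember to enforce.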

In most real-world cases of interest, though, we are not building up hypotheses from atomic outcomes in this exact way. Doing that is equivalent to stating exact necessary and sufficient conditions in terms of the finest-grained events we can possibly imagine; to do it for a hypothesis like “Trump will be re-elected in 2020,” we’d have to write down all the possible worlds where Trump wins, and the ones where he doesn’t, in terms of subatomic physics.

Instead, what we have in the real world is usually a vast multitude of conceivable hypotheses, very few of which we have actively considered (or will ever consider), and – here’s the kicker – many of these unconsidered hypotheses have logical relations to the hypotheses under consideration which we’d know about if we considered them.

Thanks for pointing out that post by nostalgebraist. I had not seen it before and it definitely is of interest to me. I’m interested in hearing anything else along these lines, particularly information about solving this problem.

Reminds me of “Burn-in, bias, and the rationality of anchoring”, Lieder et al 2012.

Seems interesting, but almost all of it flew over my head. Still, even just the table of the biases might be of use in self-critique; like, am I unpacking the hypothesis A into several unlikely smaller hypotheses (A1, A2...) and so underestimating the probability of A because I see it as a conjunction of events of small probability?.. Am I, on the other hand, thinking of A as a disjunction, and so overestimating its probability?..
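A toy calculation makes the stakes of that packing question concrete. The numbers here are purely illustrative assumptions: suppose A is unpacked into two sub-hypotheses A1 and A2, each judged at probability 0.5 and treated as independent:

```python
p_a1, p_a2 = 0.5, 0.5  # illustrative sub-hypothesis probabilities

# Reading A as the conjunction "A1 and A2" pulls the estimate down...
p_conj = p_a1 * p_a2                    # 0.25

# ...while reading A as the disjunction "A1 or A2" pushes it up.
p_disj = p_a1 + p_a2 - p_a1 * p_a2      # 0.75

# Identical sub-judgments, packed differently: a threefold spread in P(A).
print(p_conj, p_disj)  # 0.25 0.75
```

Same sub-judgments, opposite errors, depending only on whether the chunk of thought gets read as an “and” or an “or”.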

It is hard (at least, for me) to even judge whether I consider something a disjunction or a conjunction—I usually have my ‘hypotheses’ as chunks of thought, of variable length, just running their course from the spark of noticing (something observed or imagined) to the end (often a variation of ‘who cares’).

(And if shower-thoughts are so helpful in thinking up alternative explanations and ‘just random stuff’… then I hate having had long hair for most of my life. It meant head colds in the cold season after every second bath! Hooray for short, easily dryable hair!

OTOH, when we had to digest, in college, chunks that were noticeably too large—like, cramming for a test on Arthropoda, or for the final test on virology, or even a seminar on photosynthesis—sometime halfway to the end I began having this feeling, ‘stop, this is too elaborate, rewind, this isn’t beautiful anymore, quit’. And whoa, it turned out that Arthropoda were even more “monstrous” than I could take in (= even more variable and specialized), viruses were “Just Doing It” with no regard for the “everyday” ranges of any nameable constraints, and the lovely, elegant photosynthesis seemed a total mess when coupled to the rest of the plant.

I mean, first, they give you something that obviously wouldn’t work as a human plot, then they show you that yes, it works, and then they tell you ‘oh btw, you graduated now, go do your thinking for yourself, there’s a good chap, and don’t forget to penalize complex hypotheses! Because conjunctions!’ Life is unfaaaair.) I mean, thanks for the link, it’s pretty scary :)

Which is to say, I think there is a big problem in teaching people (biologists, at least) to think up hypotheses. In that, the reasoner knows the final product must be of some non-trivial complexity, in the colloquial sense of the word, but that he isn’t really allowed to put forward anything too complicated, because it’s too easy to imagine a just-so story. (So now I will imagine myself a just-so story, because duh.)

One way to tinker with research is to think of your tools, whether material or mathematical, as part of the picture instead of the brush and paints that nobody displays at the exhibition. For example, people who compare methods for staining plants’ roots for fungi see that the results are different and wonder what that tells about the physiology of the root as a whole; the relationship of the dye and the cytoplasm may depend on pH, and there comes a point when one can swear the difference in pH must be the property of the cytoplasm and not of the procedure of staining. From this, one may assume that the acidity of the cells changes depending on the development of the mycelium inside the root (and the age of the root, and other things); but why, exactly?… And then, comparing modifications of the methods and knowing at least the chemical structure of the dye, one may build hypotheses about the underlying physiological processes.

It’s hard to view, for example, scanning electron microscopy (or something comparably multi-stage) as an “alien artefact in the world” rather than “a great big thing that by the good will of its operator lets me see the surface of very small things”, but I think that such an approach might be fruitful for generating hypotheses, at least in some not-too-applied cases. There must be fascinating examples of mathematical tools so toyed with. I mean, almost (?) all that I have read about math that was interesting to a layman was written from this angle. But in school and college, they go with the “great big things” approach.
(hash tag ramblingasalways)

Wow. Does the result mean that any algorithm generating hypotheses via sampling will have the same biases?

No. They discuss some of the other possibilities. For example, you don’t get the autocorrelation from other sampling methods like importance sampling or ABC sampling, which draw points independently of each other. They also mention particle sampling, but I’m not sure how particle sampling vs. classic MCMC would yield any observable psychological differences.

It doesn’t necessarily mean exactly the same biases, since you could deliberately implement the opposite bias in some of those cases. But it suggests that such algorithms will consistently exhibit a broadly similar set of biases.
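The autocorrelation contrast can be seen in a toy comparison (a minimal sketch under my own assumptions, not the paper’s model): a random-walk Metropolis chain targeting a standard normal produces strongly correlated successive samples, while i.i.d. draws from the same distribution do not.

```python
import math
import random
import statistics

random.seed(0)

def lag1_autocorr(xs):
    # Sample lag-1 autocorrelation of a sequence.
    m = statistics.fmean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den

def metropolis_normal(n, step=0.5):
    # Random-walk Metropolis targeting an unnormalized N(0, 1) density.
    # Successive states are nearby by construction, hence autocorrelated.
    x, chain = 0.0, []
    for _ in range(n):
        prop = x + random.uniform(-step, step)
        if random.random() < min(1.0, math.exp((x * x - prop * prop) / 2)):
            x = prop
        chain.append(x)
    return chain

mcmc = metropolis_normal(20_000)
iid = [random.gauss(0, 1) for _ in range(20_000)]

# The chain's neighbours are strongly correlated; independent draws are not.
print(lag1_autocorr(mcmc) > 0.5)      # True
print(abs(lag1_autocorr(iid)) < 0.1)  # True
```

Importance sampling and ABC score independently drawn points, so their lag-1 statistic stays near zero by construction; that is why the autocorrelation-driven predictions are specific to MCMC-like samplers.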
