Gah—didn’t see this in time! I’ll be at the next one. Is there a mailing list? bdeutsch at illinois dot edu.
Hi all. I’m a scientist (postdoc) working on optical inverse problems. I got to LW through the quantum sequence, but my interest lies in probability theory and how it can change the way science is typically done. By comparison, cognitive bias and decision theory are fairly new to me. I look forward to learning what the community has to teach me about these subjects.
In general, I’m startled at the degree to which my colleagues are ignorant of the concepts covered in the sequences and beyond, and I’m here to learn how to be a better ambassador of rationality and probability. Expect my comments to focus on reconciling unfamiliar ideas about bias and heuristics with familiar ideas about optimal problem solving with limited information.
I’ll also be interested to interact with other overt atheists. In physics, I’m pretty well buffered from theistic arguments, but theism is still one of the most obvious and unavoidable reminders of a non-rational society (that and Jersey Shore?). In particular, I’m expecting a son, and I would love to hear some input about non-theistic and rationalist parenting from those with experience.
By the way, I wonder if someone can clear something up for me about "making beliefs pay rent." Eliezer draws a sharp distinction between falsifiable and non-falsifiable beliefs (though he states these concepts differently), and describes stand-alone webs of beliefs that support only one another.
But the correlation between predicted experience and actual experience is never perfect: there’s always uncertainty. In some cases, there’s rather a lot of uncertainty. Conversely, it’s extremely difficult to make a statement in English that does not contain ANY information regarding predicted or retrodicted experience. In that light, it doesn’t seem useful to draw such a sharp division between two idealized kinds of beliefs. Would Eliezer assign value to a belief based on its probability of predicting experience?
How would you quantify that? Could we define some kind of correlation function between the map and the territory?
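One standard way to quantify "how well a belief predicts experience" is the logarithmic score, which rewards a belief according to the probability it assigned to what actually happened. A minimal sketch (the numbers are my own toy example, not anything from the post):

```python
import math

def log_score(predicted_prob, outcome_happened):
    """Logarithmic score: log of the probability the belief
    assigned to the outcome that actually occurred."""
    p = predicted_prob if outcome_happened else 1.0 - predicted_prob
    return math.log(p)

# A belief that assigned 0.9 to an event that occurs scores better
# (closer to zero) than one that assigned 0.5.
print(log_score(0.9, True))  # about -0.105
print(log_score(0.5, True))  # about -0.693
```

Averaged over many predictions, this gives one candidate "correlation function" between map and territory: it is maximized when stated probabilities match observed frequencies.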
Thanks Tim.
In the post I’m referring to, EY evaluates a belief in the laws of kinematics based on predicting how long a bowling ball will take to hit the ground when tossed off a building, and then presumably testing it. In this case, our belief clearly “pays rent” in anticipated experience. But what if we know that we can’t measure the fall time accurately? What if we can only measure it to within an uncertainty of 80% or so? Then our belief isn’t strictly falsifiable, but we can gather some evidence for or against it. In that case, would we say it pays some rent?
My argument is that nearly every belief pays some rent, and no belief pays all the rent. Almost everything couples in some weak way to anticipated experience, and nothing couples perfectly.
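As a toy illustration of "paying some rent" (all numbers are hypothetical): suppose kinematics predicts a 3.0 s fall, a rival belief predicts 4.0 s, and our timer's error is about 80% of the fall time. A single noisy measurement still shifts the posterior, just not by much:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density, used as the likelihood of a noisy timing."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical numbers: kinematics predicts a 3.0 s fall; a rival
# belief predicts 4.0 s. The timer's error is large (sigma = 2.4 s,
# roughly 80% of the predicted value).
t_kin, t_alt, sigma = 3.0, 4.0, 2.4
measurement = 3.2  # one noisy observation

prior = 0.5
like_kin = gaussian_pdf(measurement, t_kin, sigma)
like_alt = gaussian_pdf(measurement, t_alt, sigma)
posterior = prior * like_kin / (prior * like_kin + prior * like_alt)
print(posterior)  # slightly above 0.5: weak but nonzero rent
```

With sharper instruments (smaller sigma) the same measurement moves the posterior much further, which is the sense in which beliefs couple to experience by degrees rather than all-or-nothing.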
Occam’s Razor is non-Bayesian? Correct me if I’m wrong, but I thought it falls naturally out of Bayesian model comparison, from the normalization factors, or “Occam factors.” As I remember, the argument is something like: given two models with independent parameters {A} and {A,B}, P(AB model) ∝ P(A and B are correct) and P(A model) ∝ P(A is correct). Since P(A and B) ≤ P(A), it follows that P(AB model) ≤ P(A model).
Even if the argument is wrong, I think the result ends up being that more plausible models tend to have fewer independent parameters.
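A minimal sketch of that argument, using a toy coin example of my own: compare a zero-parameter model (the coin is fair) against a one-parameter model (unknown bias with a uniform prior). For unremarkable data, the marginal likelihood favors the simpler model, because the extra parameter spreads prior mass over sequences that never happened:

```python
from math import factorial

# Toy example: a specific sequence of 10 flips containing 5 heads.
# Model A: coin is fair (no free parameters).
# Model AB: coin has an unknown bias p, uniform prior on [0, 1].
heads, flips = 5, 10

# Evidence for the specific sequence under each model:
evidence_A = 0.5 ** flips
# Integral of p^h (1-p)^(N-h) dp over [0,1] = h! (N-h)! / (N+1)!
evidence_AB = factorial(heads) * factorial(flips - heads) / factorial(flips + 1)

print(evidence_A / evidence_AB)  # > 1: the simpler model is favored
```

The ratio here is about 2.7 in favor of the fair-coin model, even though the biased-coin model contains it as a special case; that penalty on free parameters is the Occam factor.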
Crystal clear. Sorry to distract from the point.
Unless I misunderstand, this story is a parable. EY is communicating with a handwaving example that the effectiveness of a code doesn’t depend on the alphabet used. In the code used to describe the plate phenomenon, “magic” and “heat conduction” are interchangeable symbols which formally carry zero information, since the coder doesn’t use them to discriminate among cases.
I’m sincerely confused as to why comments center on the motivations of the students and the professor. Isn’t that irrelevant? Or did EY mean for the discussion to go this way? Does it matter?
I agree with you, a year and a half late. In fact, the idea can be extended to EY’s concept of “floating beliefs”: webs of code words that are defined only with respect to one another, and not with respect to evidence. It should be noted that if at any time a member of the web is correlated in some way with evidence, then so is the entire web.
In that sense, it doesn’t seem like wasted effort to maintain webs of “passwords,” as long as we’re responsible about updating our best guesses about reality based on only those beliefs that are evidence-related. In the long term, given enough memory capacity, it should speed our understanding.
My mother’s husband professes to believe that our actions have no influence on the way in which we die, but that “if you’re meant to die in a plane crash and avoid flying, then a plane will end up crashing into you!” for example.
After explaining how I would expect that belief to constrain experience (like how it would affect plane crash statistics), as well as showing that he himself was demonstrating his unbelief every time he went to see a doctor, he told me that you “just can’t apply numbers to this,” and “Well, you shouldn’t tempt fate.”
My question to the LW community is this: How do you avoid kicking people in the nuts all of the time?
I jest, but the sense of the question is serious. I really do want to teach the people I’m close to how to get started on rationality, and I recognize that I’m not perfect at it either. Is there a serious conversation somewhere on LW about being an aspiring rationalist living in an irrational world? Best practices, coping mechanisms, which battles to pick, etc?
Is this what CFAR is trying to do?
I would be interested to hear what other members of the community think about this. I accidentally found Bayes after being trained as a physicist, which is not entirely unlike traditional rationality. But I want to teach my brother, who doesn’t have any science or rationality background. Has anyone had success with starting at Bayes and going from there?
I think this is the kind of causal loop he has in mind. But a key feature of the hypothesis is that you can’t predict what’s meant to happen. In that case, he’s equally good at predicting any outcome, so it’s a perfectly uninformative hypothesis.
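The point can be made in a few lines of Python: if a hypothesis assigns the same likelihood to every outcome as its negation does, Bayes' rule leaves the prior untouched (toy numbers of my own):

```python
def update(prior, likelihood_h, likelihood_not_h):
    """Posterior P(H | outcome) given the likelihood of the outcome
    under the hypothesis and under its negation."""
    num = prior * likelihood_h
    return num / (num + (1 - prior) * likelihood_not_h)

# A "fate" hypothesis that predicts any outcome equally well assigns
# the same likelihood as its negation, so nothing ever updates it.
prior = 0.3
print(update(prior, 0.5, 0.5))  # 0.3: posterior equals prior
```

Being "equally good at predicting any outcome" just means the likelihood ratio is always 1, which is exactly what it means for a hypothesis to be uninformative.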
There are a couple things I still don’t understand about this.
Suppose I have a bent coin, and I believe that P(heads) = 0.6. Does that belief pay rent? Is it a “floating belief”? It is not, in principle, falsifiable. It’s not a question of measurement accuracy in this case (unless you’re a frequentist, I guess). But I can gather some evidence for or against it, so it’s not uninformative either. It would be useful to have something between grounded and floating beliefs to describe a case like this.
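For what it's worth, here is how I would cash out "some evidence for or against it" (the flip sequence below is made up): compare the belief P(heads) = 0.6 against a rival P(heads) = 0.5 and track the odds ratio flip by flip:

```python
def odds_update(odds, p1, p2, heads):
    """Multiply the odds between hypotheses p1 and p2 by the
    likelihood ratio of a single flip."""
    like1 = p1 if heads else 1 - p1
    like2 = p2 if heads else 1 - p2
    return odds * like1 / like2

odds = 1.0  # start at even odds between p = 0.6 and p = 0.5
flips = [True, True, False, True, False, True, True, False, True, True]  # 7 heads
for h in flips:
    odds = odds_update(odds, 0.6, 0.5, h)
print(odds)  # about 1.83: some evidence for p = 0.6, no certainty
```

No finite number of flips ever drives the odds to 0 or infinity, which is the sense in which the belief is unfalsifiable yet still evidence-responsive.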
Second, when LWers talk about beliefs, or “the map,” are they referring to a model of what we expect to observe, or how things actually happen? This would dictate how we deal with measurement uncertainties. In the first case, they must be included in the map, trivially. In the second case, the map still has an uncertainty associated with it that results from back-propagation of measurement uncertainty in the updating process. But then it might make sense to talk only about grounded or floating beliefs, and to attribute the fuzzy stuff in between to our inability to observe without uncertainty.
Your distinction makes sense—I’m just not sure how to apply it.
That’s very helpful, thanks. I’m trying to shove everything I read here into my current understanding of probability and estimation. Maybe I should just read more first.
Maybe this is covered in another post, but I’m having trouble cramming this into my brain, and I want to make sure I get this straight:
Consider a thingspace. We can divide the thingspace into any number of partially-overlapping sets that don’t necessarily span the space. Each set is assigned a word, and the words are not unique.
Our job is to compress mental concepts in a lossy way into short messages to send between people, and we do so by referring to the words. Inferences drawn from the message have associated uncertainties that depend on the characteristics we believe members of the sets to have, word redundancy, etc.
In principle, we can draw whichever boundaries we like in thingspace (and, I suppose, they don’t need to be hard boundaries). But EY is saying that it’s wise to draw the boundaries in a way that “feels” right, which presumably means that the members have certain things in common. Then when we make inferences, the pdfs are sharply peaked (since we required that for set membership), and the calculation is simpler to do.
He also says that it’s possible to make a “mistake” in defining the sets. Does this result from the failure to be consistent in our definitions, a failure to assign uncertainties correctly, or a failure to define the sets in a wise way?
You can’t remember whether or not bleggs exist in real life.
That thought occurred to me too, and then I decided that EY was using “entropy” to mean “the state to which everything naturally tends.” Still, I think it’s possible to usefully extend the metaphor.
There are more possible cultish microstates than non-cultish ones, because there are fewer logically consistent explanations for a phenomenon than logically inconsistent ones. In each non-cultish group, rational argument and counter-argument should naturally push the group toward an explanation that describes observed reality. By contrast, cultish groups can fill up the rest of concept-space.
Is this the same as Jaynes’ method for construction of a prior using transformation invariance on acquisition of new evidence?
Does conservation of expected evidence always uniquely determine a probability distribution? If so, it should eliminate a bunch of extraneous methods of construction of priors. For example, you would immediately know if an application of MaxEnt was justified.
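For reference, here is the identity itself in a toy example of my own: the prior equals the expectation of the posterior over the possible observations, so any anticipated net shift in belief is inconsistent:

```python
# Conservation of expected evidence, with made-up numbers.
prior = 0.4
p_e_given_h = 0.9      # P(evidence | hypothesis)
p_e_given_not = 0.2    # P(evidence | ~hypothesis)

p_e = prior * p_e_given_h + (1 - prior) * p_e_given_not
post_if_e = prior * p_e_given_h / p_e
post_if_not_e = prior * (1 - p_e_given_h) / (1 - p_e)

# Average the two possible posteriors, weighted by how likely each
# observation is; the result recovers the prior exactly.
expected_posterior = p_e * post_if_e + (1 - p_e) * post_if_not_e
print(expected_posterior)  # 0.4, equal to the prior
```

Note this is a constraint the prior and likelihoods must jointly satisfy, not obviously one that pins down a unique distribution by itself, which is why I am asking.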
I’m going to be a postdoc there. I’m in.