Same person as nostalgebraist2point0, but now I have my account back.
Elsewhere:
Incidentally, Eliezer, I don’t think you’re right about the example at the beginning of the post. The two frequentist tests are asking distinct questions of the data, and there is not necessarily any inconsistency when we ask two different questions of the same data and get two different answers.
Suppose A and B are tossing coins. A and B both get the same string of results—a whole bunch of heads (let’s say 9999) followed by a single tail. But A got this by just deciding to flip a coin 10000 times, while B got it by flipping a coin until the first tail came up. Now suppose they each ask the question “what is the probability that, when doing what I did, one will come up with at most the number of tails I actually saw?”
In A’s case the answer is of course very small; most strings of 10000 flips have many more than one tail. In B’s case the answer is of course 1; B’s method ensures that exactly one tail is seen, no matter what happens. The data was the same, but the questions were different, because of the “when doing what I did” clause (since A and B did different things). Frequentist tests are often like this—they involve some sort of reasoning about hypothetical repetitions of the procedure, and if the procedure differs, the question differs.
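(A quick sketch of the arithmetic, in case anyone wants to check it; the numbers are just the ones from the example, and I work in logs because the raw probabilities underflow ordinary floats. Note the last line, which is the "likelihoods are the same" point that comes up below.)

```python
from math import comb, log10

n = 10_000
# A's question: in n fixed flips, what is the chance of at most 1 tail?
log10_p_A = log10(comb(n, 0) + comb(n, 1)) - n * log10(2)   # about -3006.3
# B's question: flipping until the first tail, what is the chance of at most 1 tail?
p_B = 1.0   # by construction, B always sees exactly one tail
# The likelihood of the exact observed string (9999 heads then a tail) is
# (1/2)**10000 under either procedure:
log10_likelihood = -n * log10(2)                             # about -3010.3
print(log10_p_A, p_B, log10_likelihood)
```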
If we wanted to restate this in Bayesian terms, we’d have to do so by taking into account that the interpreter knows what the method is, not just what the data is, and the distributions used by a Bayesian interpreter should take this into account. For instance, one would be a pretty dumb Bayesian if one’s prior for B’s method didn’t say you’d get one tail with probability one. The observation that’s causing us to update isn’t “string of data,” it’s “string of data produced by a given physical process,” where the process is different in the two cases.
(I apologize if this has all been mentioned before—I didn’t carefully read all the comments above.)
Sure, the likelihoods are the same in both cases, since A and B’s probability distributions assign the same probability to any sequence that is in both of their supports. But the distributions are still different, and various functionals of them are still different—e.g., the number of tails, the moments (if we convert heads and tails to numbers), etc.
If you’re a Bayesian, you think any hypothesis worth considering can predict a whole probability distribution, so there’s no reason to worry about these functionals when you can just look at the probability of your whole data set given the hypothesis. If (as in actual scientific practice, at present) you often predict functionals but not the whole distribution, then the difference in the functionals matters. (I admit that the coin example is too basic here, because in any theory about a real coin, we really would have a whole distribution.)
My point is just that there are differences between the two cases. Bayesians don’t think these differences could possibly matter to the sort of hypotheses they are interested in testing, but that doesn’t mean that in principle there can be no reason to differentiate between the two.
I’m not sure if I’m understanding you correctly, but the reason why climate forecasts and meteorological forecasts have different temporal ranges of validity is not that the climate models are coarser; it’s that they’re asking different questions.
Climate is (roughly speaking) the attractor on which the weather chaotically meanders on short (e.g. weekly) timescales. On much longer timescales (1–100+ years), this attractor itself shifts. Weather forecasts want to determine the future state of the system itself, which is impossible in principle after ~14 days because the system is chaotic. Climate forecasts want to track the slow shifts of the attractor. To do this, they run ensembles with slightly different initial conditions and observe the statistics of the ensemble at some future date, which is taken (via an ergodic assumption) to reflect the attractor at that date. None of the ensemble members are useful as “weather predictions” for 2050 or whatever, but their overall statistics are (it is argued) reliable predictions about the attractor on which the weather will be constrained to move in 2050 (i.e. “the climate in 2050”).
It’s analogous to the way we can precisely characterize the attractor in the Lorenz system, even if we can’t predict the future of any given trajectory in that system because it’s chaotic. (For a more precise analogy, imagine a version of the Lorenz system in which the attractor slowly changes over long time scales)
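(To make that analogy concrete, here is a minimal sketch, emphatically not a climate model, of running a small ensemble of the Lorenz ’63 system from nearly identical initial conditions. The endpoints of the individual members scatter all over the attractor, but each member’s time-averaged statistics come out much closer to one another, which is the ensemble-vs-trajectory distinction I’m describing.)

```python
import numpy as np

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_trajectory(state, dt=0.01, steps=5000):
    # plain fixed-step RK4 integration of the Lorenz system
    out = np.empty((steps, 3))
    for i in range(steps):
        k1 = lorenz(state)
        k2 = lorenz(state + 0.5 * dt * k1)
        k3 = lorenz(state + 0.5 * dt * k2)
        k4 = lorenz(state + dt * k3)
        state = state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        out[i] = state
    return out

rng = np.random.default_rng(0)
# An "ensemble": many runs from nearly identical initial conditions.
ensemble = [rk4_trajectory(np.array([1.0, 1.0, 1.0]) + 1e-6 * rng.normal(size=3))
            for _ in range(20)]

# Pointwise prediction fails: by the end, the members disagree wildly...
print("z at the final step, per member:",
      np.round([traj[-1, 2] for traj in ensemble], 1))
# ...but the time-averaged statistics of each member are much closer together.
print("time-mean of z, per member:",
      np.round([traj[:, 2].mean() for traj in ensemble], 1))
```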
A simple way to explain the difference is that you have no idea what the weather will be in any particular place on June 19, 2016, but you can be pretty sure that in the Northern Hemisphere it will be summer in June 2016. This has nothing to do with differences in numerical model properties (you aren’t running a numerical model in your head), it’s just a consequence of the fact that climate and weather are two different things.
Apologies if you know all this. It just wasn’t clear to me if you did from your comment, and I thought I might spell it out since it might be valuable to someone reading the thread.
As the author of the post you linked in the first paragraph, I may be able to provide some useful context, at least for that particular post.
Arguments for and against Strong Bayesianism have been a pet obsession of mine for a long time, and I’ve written a whole bunch about them over the years. (Not because I thought it was especially important to do so, just because I found it fun.) The result is that there are a bunch of (mostly) anti-Bayes arguments scattered throughout several years of posts on my tumblr. For quite a while, I’d had “put a bunch of that stuff in a single place” on my to-do list, and I wrote that post just to check that off my to-do list. Almost none of the material in there is new, and nothing in there would surprise anyone who had been keeping up with the Bayes-related posts on my tumblr. Writing the post was housekeeping, not nailing 95 theses on a church door.
As you might expect, I disagree with a number of the more specific/technical claims you’ve made in this post, but I am with you in feeling like these arguments are retreading old ground, and I’m at the point where writing more words on the internet about Bayes has mostly stopped being fun.
It’s also worth noting that my relation to the rationalist community is not very goal-directed. I like talking to rationalists, I do it all the time on tumblr and discord and sometimes in meatspace, and I find all the big topics (including AGI stuff) fun to talk about. I am not interested in pushing the rationalist community in one direction or another; if I argue about Bayes or AGI, it’s in order to have fun and/or because I value knowledge and insight (etc.) in general, not because I am worried that rationalists are “wasting time” on those things when they could be doing some other great thing I want them to do. Stuff like “what does it even mean to be a non-Bayesian rationalist?” is mostly orthogonal to my interests, since to me “rationalists” just means “a certain group of people whose members I often enjoy talking to.”
I am glad that this topic is being discussed. But IMO, this post contains too much about external factors that might have impeded the Craft-and-Community project, and not enough on what project work was done, why that work didn’t succeed, and/or why it wasn’t enough.
There have been a number of rationalist-branded organizations that tried to spread, develop, or apply LW-rationality. The main examples I have in mind are MetaMed and CFAR. My ideal postmortem would include a lot of words about why MetaMed failed, whether CFAR failed, whether the results in either case are surprising in hindsight, etc. This post doesn’t mention MetaMed, and only mentions CFAR briefly. And while I share your negative assessment of CFAR, you don’t talk in any detail about how you came to this assessment or what lessons might be learned from it.
While it may be true that cultural, economic, etc. factors indirectly caused the Craft-and-Community project to fail, there is an intermediate causal layer, the actual things people did (or avoided doing) to further the project. If your boss tells you to build a rocket by next month, and next month there is no rocket, and your boss asks why, you can say things like “we were thinking about the problem in the wrong way” or “our team has a dysfunctional dynamic,” and these may well be true facts, but it is also important to address what you actually did, and why it didn’t produce the desired rocket.
I wrote more about this on tumblr, but my own suspicion is that the biggest challenges to face in building a world-improving community are organizational, and the Craft-and-Community project failed because LW-rationality was almost entirely about improving individual judgment. Hence, we shouldn’t have expected to get results back then, and we shouldn’t expect results now, at least not until someone does promising work collecting/systemizing/testing management advice, or something in that general area.
“Most startups fail” is important to keep in mind here, and I now realize my previous comment implied more focus on why MetaMed specifically failed than is warranted. I still stand by the “build a rocket” analogy, but it should be applied broadly, and not just to the highest-profile projects (and not just to projects that actually materialized enough to fail).
Is there some sort of list of Craft-and-Community-relevant projects that have been attempted since 2009, besides just the meetup groups? If not, should there be one?
This is somewhat closer to what I was asking for, but still mostly about group dynamics rather than engineering (or rather, the analogue of “engineering” on the other side of the rocket analogy). But I take your point that it’s hard to talk about engineering if the team culture was so bad that no one ever tried any engineering.
I do think that it would be very helpful for you to give more specifics, even if they’re specifics like “this person/organization is doing something that is stupid and irrelevant for these reasons.” (If the engineers spent all their time working on their crackpot perpetual motion machine, describe that.)
Basically, I’m asking you to name names (of people/orgs), and give many more specifics about the names you have named (CFAR). This would require you to be less diplomatic than you have been so far, and may antagonize some people. But look, you’re trying to get people to move to a different city (in most cases, a different country) to be part of your new project. You’ve mostly motivated that project by saying, in broad terms, that currently existing rationalists are doing most everything wrong. Moving to a different country is already a risky move, and the EV goes down sharply once some currently-existing-rationalist considers that the founder may well fundamentally disagree with their goals and assumptions. The only way to make this look positive-EV to very many individuals would be to show much more explicitly which sorts of currently-existing-rationalists you disapprove of, so that individuals are able to say “okay, that’s not me.”
See e.g. this bit from one of your other comments:
This depends entirely on how you measure it. If I was to throw all other goals under the bus for the sake of proving you wrong, I’m pretty sure I could find enough women to nod along to a watered down version. If instead we’re going for rationalist Rationalists then a lot of the fandom people wouldn’t make the cut and I suspect if we managed to outdo tech, we would be beating The Bay.
Consider an individual woman trying to decide whether to move to Manchester from Berkeley. As it stands, you have a not explicitly stated theory of what makes the good kind of rationalist, such that many/most female Berkeley rats do not qualify. Without further information, the woman will conclude that she probably does not qualify, in which case she’s definitely not going to move. The only way to fix this is to articulate the theory directly so individuals can check themselves against it.
This seems like a very good perspective to me.
It made me think about the way that classic biases are often explained by constructing money pumps. A money pump is taken to be a clear, knock-down demonstration of irrationality, since “clearly” no one would want to lose arbitrarily large amounts of money. But in fact any money pump could be rational if the agent just enjoyed making the choices involved. If I greatly enjoyed anchoring on numbers presented to me, I might well pay a lot of extra money to get anchored; this would be like buying a kind of enjoyable product. Likewise someone might just get a kick out of making choices in intransitive loops, or hyperbolic discounting, or whatever. (In the reverse direction, if you didn’t know I enjoyed some consumer good, you might think I was getting “money pumped” by paying for it again and again.)
So there is a missing step here, and to supply the step we need psychology. The reason these biases are biases and not values is “those aren’t the sort of things we care about,” but to formalize that, we need an account of “the sort of things we care about” which, as you say, can’t be solved for from policy data alone.
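(To make the money-pump mechanism concrete, here is a toy version with made-up numbers: an agent with cyclic preferences who will pay a small fee for any trade up. Whether this counts as irrationality or as paying for a product it enjoys is exactly the question that can’t be settled from the policy alone.)

```python
# Toy money pump (hypothetical agent, made-up fee): cyclic preferences
# A > B > C > A, and the agent pays `fee` for any swap to something it prefers.
prefers = {("A", "B"), ("B", "C"), ("C", "A")}   # (x, y) means "x is preferred to y"
fee = 0.01

holding, paid = "B", 0.0
offers = ["A", "C", "B"] * 5   # the bookie just cycles through the three items
for offered in offers:
    if (offered, holding) in prefers:   # agent accepts any trade up, paying the fee
        holding, paid = offered, paid + fee
print(holding, paid)   # ends up holding "B" again, having paid 0.15
```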
Thanks—this is informative and I think it will be useful for anyone trying to decide what to make of your project.
I have disagreements about the “individual woman” example but I’m not sure it’s worth hashing it out, since it gets into some thorny stuff about persuasion/rhetoric that I’m sure we both have strong opinions on.
Regarding MIRI, I want to note that although the organization has certainly become more competently managed, the more recent OpenPhil review included some very interesting and pointed criticism of the technical work, which I’m not sure enough people saw, as it was hidden in a supplemental PDF. Clearly this is not the place to hash out those technical issues, but they are worth noting, since the reviewer objections were more “these results do not move you toward your stated goal in this paper” than “your stated goal is pointless or quixotic,” so if true they are identifying a rationality failure.
I just wrote a long post on my tumblr about this sequence, which I am cross-posting here as a comment on the final post. (N.B. my tone is harsher and less conversational than it would have been if I had thought of it as a comment while writing.)
I finally got around to reading these posts. I wasn’t impressed with them.
The basic gist is something like:
“There are well-established game-theoretic reasons why social systems (governments, academia, society as a whole, etc.) may not find, or not implement, good ideas even when they are easy to find/implement and the expected benefits are great. Therefore, it is sometimes warranted to believe you’ve come up with a good, workable idea which ‘experts’ or ‘society’ have not found/implemented yet. You should think about the game-theoretic reasons why this might or might not be possible, on a case-by-case basis; generalized maxims about ‘how much you should trust the experts’ and the like are counterproductive.”
I agree with this, although it also seems fairly obvious to me. It’s possible that Yudkowsky is really pinpointing a trend (toward an extreme “modest epistemology”) that sounds obviously wrong once it’s pinned down, but is nonetheless pervasive; if so, I guess it’s good to argue against it, although I haven’t encountered it myself.
But the biggest reason I was not impressed is that Yudkowsky mostly ignores an issue which strikes me as crucial. He makes a case that, given some hypothetically good idea, there are reasons why experts/society might not find and implement it. But as individuals, what we see are not ideas known to be good.
What we see are ideas that look good, according to the models and arguments we have right now. There is some cost (in time, money, etc.) associated with testing each of these ideas. Even if there are many untried good ideas, it might still be the case that these are a vanishingly small fraction of ideas that look good before they are tested. In that case, the expected value of “being an experimenter” (i.e. testing lots of good-looking ideas) could easily be negative, even though there are many truly good, untested ideas.
To me, this seems like the big determining factor for whether individuals can expect to regularly find and exploit low-hanging fruit.
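(A toy calculation with made-up numbers, just to show how much that base rate dominates the answer:)

```python
# Toy EV of testing a good-looking idea (all numbers are made up).
test_cost = 600.0          # cost of actually running the experiment
payoff_if_good = 10_000.0  # value captured if the idea really is good

for p_actually_good in (0.5, 0.1, 0.02):   # fraction of good-looking ideas that are good
    ev = p_actually_good * payoff_if_good - test_cost
    print(f"P(good | looks good) = {p_actually_good}: EV of testing = {ev:+.0f}")
# The same inadequacy analysis is consistent with all three rows; the sign of
# the decision is set by a base rate that the analysis doesn't supply.
```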
The closest Yudkowsky comes to addressing this topic is in sections 4-5 of the post “Living in an Inadequate World.” There, he’s talking about the idea that even if many things are suboptimal, you should still expect a low base rate of exploitable suboptimalities in any arbitrarily/randomly chosen area. He analogizes this to finding exploits in computer code:
Computer security professionals don’t attack systems by picking one particular function and saying, “Now I shall find a way to exploit these exact 20 lines of code!” Most lines of code in a system don’t provide exploits no matter how hard you look at them. In a large enough system, there are rare lines of code that are exceptions to this general rule, and sometimes you can be the first to find them. But if we think about a random section of code, the base rate of exploitability is extremely low—except in really, really bad code that nobody looked at from a security standpoint in the first place.
Thinking that you’ve searched a large system and found one new exploit is one thing. Thinking that you can exploit arbitrary lines of code is quite another.
This isn’t really the same issue I’m talking about – in the terms of this analogy, my question is “when you think you have found an exploit, but you can’t costlessly test it, how confident should you be that there is really an exploit?”
But he goes on to say something that seems relevant to my concern, namely that most of the time you think you have found an exploit, you won’t be able to usefully act on it:
Similarly, you do not generate a good startup idea by taking some random activity, and then talking yourself into believing you can do it better than existing companies. Even where the current way of doing things seems bad, and even when you really do know a better way, 99 times out of 100 you will not be able to make money by knowing better. If somebody else makes money on a solution to that particular problem, they’ll do it using rare resources or skills that you don’t have—including the skill of being super-charismatic and getting tons of venture capital to do it.
To believe you have a good startup idea is to say, “Unlike the typical 99 cases, in this particular anomalous and unusual case, I think I can make a profit by knowing a better way.”
The anomaly doesn’t have to be some super-unusual skill possessed by you alone in all the world. That would be a question that always returned “No,” a blind set of goggles. Having an unusually good idea might work well enough to be worth trying, if you think you can standardly solve the other standard startup problems. I’m merely emphasizing that to find a rare startup idea that is exploitable in dollars, you will have to scan and keep scanning, not pursue the first “X is broken and maybe I can fix it!” thought that pops into your head.
To win, choose winnable battles; await the rare anomalous case of, “Oh wait, that could work.”
The problem with this is that many people already include “pick your battles” as part of their procedure for determining whether an idea seems good. People are more confident in their new ideas in areas where they have comparative advantages, and in areas where existing work is especially bad, and in areas where they know they can handle the implementation details (“the other standard startup problems,” in EY’s example).
Let’s grant that all of that is already part of the calculus that results in people singling out certain ideas as “looking good” – which seems clearly true, although doubtlessly many people could do better in this respect. We still have no idea what fraction of good-looking ideas are actually good.
Or rather, I have some ideas on the topic, and I’m sure Yudkowsky does too, but he does not provide any arguments to sway anyone who is pessimistic on this issue. Since optimism vs. pessimism on this issue strikes me as the one big question about low-hanging fruit, this leaves me feeling that the topic of low-hanging fruit has not really been addressed.
Yudkowsky mentions some examples of his own attempts to act upon good-seeming ideas. To his credit, he mentions a failure (his ketogenic meal replacement drink recipe) as well as a success (stringing up 130 light bulbs around the house to treat his wife’s Seasonal Affective Disorder). Neither of these were costless experiments. He specifically mentions the monetary cost of testing the light bulb hypothesis:
The systematic competence of human civilization with respect to treating mood disorders wasn’t so apparent to me that I considered it a better use of resources to quietly drop the issue than to just lay down the ~$600 needed to test my suspicion.
His wife has very bad SAD, and the only other treatment that worked for her cost a lot more than this. Given that the hypothesis worked, it was clearly a great investment. But not all hypotheses work. So before I do the test, how am I to know whether it’s worth $600? What if the cost is greater than that, or the expected benefit less? What does the right decision-making process look like, quantitatively?
Yudkowsky’s answer is that you can tell when good ideas in an area are likely to have been overlooked by analyzing the “adequacy” of the social structures that generate, test, and implement ideas. But this is only one part of the puzzle. At best, it tells us P(society hasn’t done it yet | it’s good). But what we need is P(it’s good | society hasn’t done it yet). And to get to one from the other, we need the prior probability of “it’s good,” as a function of the domain, my own abilities, and so forth. How can we know this? What if there are domains where society is inadequate yet good ideas are truly rare, and domains where society is fairly adequate but good ideas are so plentiful as to dominate the calculation?
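(Spelled out, the inversion I mean is just Bayes’ rule, and the hard-to-get ingredient is the prior appearing in the numerator:)

```latex
P(\text{good} \mid \text{not done yet})
  = \frac{P(\text{not done yet} \mid \text{good})\, P(\text{good})}
         {P(\text{not done yet} \mid \text{good})\, P(\text{good})
          + P(\text{not done yet} \mid \text{not good})\, P(\text{not good})}
```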
In an earlier conversation about low-hanging fruit, tumblr user @argumate brought up the possibility that low-hanging fruit are basically impossible to find beforehand, but that society finds them by funding many different attempts and collecting on the rare successes. That is, every individual attempt to pluck fruit is EV-negative given risk aversion, but a portfolio of such attempts (such as a venture capitalist’s portfolio) can be net-positive given risk aversion, because with many attempts the probability of one big success that pays for the rest (a “unicorn”) goes up. It seems to me like this is plausible.
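(Here is a toy version of that picture, with made-up numbers: a long-shot bet with positive raw EV that lowers expected log-wealth for any one risk-averse person funding it alone, while a pool of a hundred such bets raises it.)

```python
# Toy "unicorn portfolio" calculation (all numbers made up).
# Each venture costs 1, succeeds with probability 0.01, and pays 150 if it
# succeeds, so raw EV per venture is +0.5. Agents start with wealth 2 and
# are risk-averse with log utility.
from math import comb, log

p, cost, payoff, wealth, n = 0.01, 1.0, 150.0, 2.0, 100

# Going it alone: fund one venture yourself.
solo = (1 - p) * log(wealth - cost) + p * log(wealth - cost + payoff)

# Pooling: 100 people each put `cost` into a fund that backs 100 ventures
# and splits the proceeds equally.
pooled = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k)
    * log(wealth - cost + k * payoff / n)
    for k in range(n + 1)
)

print(f"expected log-wealth, do nothing : {log(wealth):.3f}")
print(f"expected log-wealth, solo bet   : {solo:.3f}")
print(f"expected log-wealth, pooled bets: {pooled:.3f}")
# With these numbers: doing nothing ~0.693, the solo bet ~0.050 (worse than
# doing nothing), the pooled portfolio ~0.736 (better), even though every
# venture funded has the same positive raw EV in both cases.
```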
Let me end on a positive note, though. Even if the previous paragraph is accurate, it is a good thing for society if more individuals engage in experimentation (although it is a net negative for each of those individuals). Because of this, the individual’s choice to experiment can still be justified on other terms – as a sort of altruistic expenditure, say, or as a way of kindling hope in the face of personal maladies like SAD (in which case it is like a more prosocial version of gambling).
Certainly there is something emotionally and aesthetically appealing about a resurgence of citizen science – about ordinary people looking at the broken, p-hacked, perversely incentivized edifice of Big Science and saying “empiricism is important, dammit, and if The Experts won’t do it, we will.” (There is precedent for this, and not just as a rich man’s game – there is a great chapter in The Intellectual Life of the British Working Classes about widespread citizen science efforts in the 19th C working class.) I am pessimistic about whether my experiments, or yours, will bear fruit often enough to make the individual cost-benefit analysis work out, but that does not mean they should not be done. Indeed, perhaps they should.
I think the arguments here apply much better to the AGI alignment case than to the case of HPMOR. The structure of the post suggests (? not sure) that HPMOR is meant to be the “easier” case, the one in which the reader will assent to the arguments more readily, but it didn’t work that way on me.
In both cases, we have some sort of metric for what it would mean to succeed, and (perhaps competing) inside- and outside-view arguments for how highly we should expect to score on that metric. (More precisely, what probabilities we should assign to achieving different scores.) In both cases, this post tends to dismiss facts which involve social status as irrelevant to the outside view.
But what if our success metric depends on some facts which involve social status? Then we definitely shouldn’t ignore these facts, (even) in the inside view. And this is the situation we are in with HPMOR, at least, if perhaps less so with AGI alignment.
There are some success metrics for HPMOR mentioned in this post which can be evaluated largely without reference to status stuff (like “has it conveyed the experience of being rational to many people?”). But when specific successes—known to have been achieved in the actual world—come up, many of them are clearly related to status. If you want to know whether your fic will become one of the most reviewed HP fanfics on a fanfiction site, then it matters how it will be received by the sorts of people who review HP fanfics on those sites—including their status hierarchies. (Of course, this will be less important if we expect most of the review-posters to be people who don’t read HP fanfic normally and have found out about the story through another channel, but its importance is always nonzero, and very much so for some hypothetical scenarios.)
TBH, I don’t understand why so much of this post focuses on pure popularity metrics for HPMOR, ones that don’t capture whether it is having the intended effect on readers. (Even something like “many readers consider it the best book they’ve ever read” does not tell you much without specifying more about the readership; consider that if you were optimizing for this metric, you would have an incentive to select for readers who have read as few books as possible.)
I guess the idea may be that it is possible to surprise someone like Pat by hitting a measurable indicator of high status (because Pat thinks that’s too much of a status leap relative to the starting position), where Pat would be less surprised by HPMOR hitting idiosyncratic goals that are not common in HP fanfiction (and thus are not high status to him). But this pattern of surprise levels seems obviously correct to me! If you are trying to predict an indicator of status in a community, you should use information about the status system in that community in your inside view. (And likewise, if the indicator is unrelated to status, you may be able to ignore status information.)
In short, this post condemns using status-related facts for forecasting, even when they are relevant (because we are forecasting other status-related facts). I don’t mean the next statement as Bulverism, but as a hopefully useful hypothesis: it seems possible that the concept of status regulation has encouraged this confusion, by creating a pattern to match to (“argument involving status and the existing state of a field, to the effect that I shouldn’t expect to be capable of something”), even when some arguments matching that pattern are good arguments.
I agree. When I think about the “mathematician mindset” I think largely about the overwhelming interest in the presence or absence, in some space of interest, of “pathological” entities like the Weierstrass function. The truth or falsehood of “for all / there exists” statements tends to turn on these pathologies or their absence.
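(For concreteness, the canonical example is the Weierstrass function, continuous everywhere but differentiable nowhere; these are Weierstrass’s original sufficient conditions:)

```latex
W(x) = \sum_{n=0}^{\infty} a^n \cos(b^n \pi x),
\qquad 0 < a < 1,\quad b \text{ an odd integer},\quad ab > 1 + \tfrac{3\pi}{2}.
```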
How does this relate to optimization? Optimization can make pathological entities more relevant, if
(1) they happen to be optimal solutions, or
(2) an algorithm that ignores them will be, for that reason, insecure / exploitable.
But this is not a general argument about optimization, it’s a contingent claim that is only true for some problems of interest, and in a way that depends on the details of those problems.
And one can make a separate argument that, when conditions like 1-2 do not hold, a focus on pathological cases is unhelpful: if a statement “fails in practice but works in theory” (say by holding except on a set of sufficiently small measure as to always be dominated by other contributions to a decision problem, or only for decisions that would be ruled out anyway for some other reason, or over the finite range relevant for some calculation but not in the long or short limit), optimization will exploit its “effective truth” whether or not you have noticed it. And statements about “effective truth” tend to be mathematically pretty uninteresting; try getting an audience of mathematicians to care about a derivation showing that rocket engineers can afford to ignore gravitational waves, for example.
I disagree that this answers my criticisms. In particular, my section 7 argues that it is infeasible in practice to even write down most real belief / decision problems in the form that the Bayesian laws require, so “were the laws followed?” is generally not even a well-defined question.
To be a bit more precise, the framework with a complete hypothesis space is a bad model for the problems of interest. As I detailed in section 7, that framework assumes that our knowledge of hypotheses and the logical relations between hypotheses are specified “at the same time,” i.e. when we know about a hypothesis we also know all its logical relations to all other hypotheses, and when we know (implicitly) about a logical relation we also have access (explicitly) to the hypotheses it relates. Not only is this false in many practical cases, I don’t even know of any formalism that would allow us to call it “approximately true,” or “true enough for the optimality theorems to carry over.”
(N.B. as it happens, I don’t think logical inductors fix this problem. But the very existence of logical induction as a research area shows that this is a problem. Either we care about the consequences of lacking logical omniscience, or we don’t—and apparently we do.)
It’s sort of like quoting an optimality result given access to some oracle, when talking about a problem without access to that oracle. If the preconditions of a theorem are not met by the definition of a given decision problem, “meet those preconditions” cannot be part of a strategy for that problem. “Solve a different problem so you can use my theorem” is not a solution to the problem as stated.
Importantly, this is not just an issue of “we can’t do perfect Bayes in practice, but if we were able, it’d be better.” Obtaining the kind of knowledge representation assumed by the Bayesian laws has computational / resource costs, and in any real decision problem, we want to minimize these. If we’re handed the “right” knowledge representation by a genie, fine, but if we are talking about choosing to generate it, that in itself is a decision with costs.
As a side point, I am also skeptical of some of the optimality results.
I don’t disagree with any of this. But if I understand correctly, you’re only arguing against a very strong claim—something like “Bayes-related results cannot possibly have general relevance for real decisions, even via ‘indirect’ paths that don’t rely on viewing the real decisions in a Bayesian way.”
I don’t endorse that claim, and would find it very hard to argue for. I can imagine virtually any mathematical result playing some useful role in some hypothetical framework for real decisions (although I would be more surprised in some cases than others), and I can’t see why Bayesian stuff should be less promising in that regard than any arbitrarily chosen piece of math. But “Bayes might be relevant, just like p-adic analysis might be relevant!” seems like damning with faint praise, given the more “direct” ambitions of Bayes as advocated by Jaynes and others.
Is there a specific “indirect” path for the relevance of Bayes that you have in mind here?
If I understand your objection correctly, it’s one I tried to answer already in my post.
In short: Bayesianism is normative for problems you can actually state in its formalism. This can be used as an argument for at least trying to state problems in its formalism, and I do think this is often a good idea; many of the examples in Jaynes’ book show the value of doing this. But when the information you have actually does not fit the requirements of the formalism, you can only use it if you get more information (costly, sometimes impossible) or forget some of what you know to make the rest fit. I don’t think Bayes normatively tells you to do those kinds of things, or at least that would require a type of argument different from the usual Dutch Books etc.
Using the word “brain” there was probably a mistake. This is only about brains insofar as it’s about the knowledge actually available to you in some situation, and the same idea applies to the knowledge available to some robot you are building, or some agent in a hypothetical decision problem (so long as it is a problem with the same property, of not fitting well into the formalism without extra work or forgetting).
You assume a creature that can’t see all logical consequences of hypotheses [...] Then you make it realize new facts about logical consequences of hypotheses
This is not quite what is going on in section 7b. The agent isn’t learning any new logical information. For instance, in jadagul’s “US in 2100” example, all of the logical facts involved are things the agent already knows. “‘California is a US state in 2100’ implies ‘The US exists in 2100’” is not a new fact; it’s something we already knew before running through the exercise.
My argument in 7b is not really about updating—it’s about whether probabilities can adequately capture the agent’s knowledge, even at a single time.
This is in a context (typical of real decisions) where:
- the agent knows a huge number of logical facts, because it can correctly interpret hypotheses written in a logically transparent way, like “A and B,” and because it knows lots of things about subsets in the world (like US / California)
- but the agent doesn’t have the time/memory to write down a “map” of every hypothesis connected by these facts (like a sigma-algebra). For example, you can read an arbitrary string of hypotheses “A and B and C and …” and know that this implies “A”, “A and C”, etc., but you don’t have in your mind a giant table containing every such construction.
So the agent can’t assign credences/probabilities simultaneously to every hypothesis on that map. Instead, they have some sort of “credence generator” that can take in a hypothesis and output how plausible it seems, using heuristics. In their raw form, these outputs may not be real numbers (they will have an order, but may not have e.g. a metric).
If we want to use Bayes here, we need to turn these raw credences into probabilities. But remember, the agent knows a lot of logical facts, and via the probability axioms, these all translate to facts relating probabilities to one another. There may not be any mapping from raw credence-generator-output to probabilities that preserves all of these facts, and so the agent’s probabilities will not be consistent.
To be more concrete about the “credence generator”: I find that when I am asked to produce subjective probabilities, I am translating them from internal representations like
- Event A feels “very likely”
- Event B, which is not logically entailed by A or vice versa, feels “pretty likely”
- Event (A and B) feels “pretty likely”
If we demand that these map one-to-one to probabilities in any natural way, this is inconsistent. But I don’t think it’s inconsistent in itself; it just reflects that my heuristics have limited resolution. There isn’t a conjunction fallacy here because I’m not treating these representations as probabilities—but if I decide to do so, then I will have a conjunction fallacy! If I notice this happening, I can “plug the leak” by changing the probabilities, but I will expect to keep seeing new leaks, since I know so many logical facts, and thus there are so many consequences of the probability axioms that can fail to hold. And because I expect this to happen going forward, I am skeptical now that my reported probabilities reflect my actual beliefs—not even approximately, since I expect to keep deriving very wrong things like an event being impossible instead of likely.
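(A toy illustration of the “no consistent mapping” point, using a variant setup where a single known implication already rules out every order-preserving assignment of numbers to the levels:)

```python
# Toy check (made-up levels and events): one known logical fact is enough to
# make the raw credence levels unmappable to probabilities.
levels = {"pretty likely": 0.7, "very likely": 0.9}   # any monotone numbers behave the same

raw_credence = {"A": "very likely", "B": "pretty likely"}   # how the events feel
known_implications = [("A", "B")]                           # I know that A implies B

p = {event: levels[level] for event, level in raw_credence.items()}
for antecedent, consequent in known_implications:
    # the probability axioms require P(antecedent) <= P(consequent) here
    print(antecedent, "=>", consequent, "respected:", p[antecedent] <= p[consequent])
# Prints False: as long as "very likely" maps above "pretty likely", no choice
# of numbers makes these reports consistent with what I know.
```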
None of this is meant to disapprove of using probability estimates to, say, make more grounded estimates of cost/benefit in real-world decisions. I do find that useful, but I think it is useful for a non-Bayesian reason: even if you don’t demand a universal mapping from raw credences, you can get a lot of value out of saying things like “this decision isn’t worth it unless you think P(A) > 97%”, and then doing a one-time mapping of that back onto a raw credence, and this has a lot of pragmatic value even if you know the mappings will break down if you push them too hard.
Two comments:
1. You seem to be suggesting that the standard Bayesian framework handles logical uncertainty as a special case. (Here we are not exactly “uncertain” about sentences, but we have to update on their truth from some prior that did not account for it, which amounts to the same thing.) If this were true, the research on handling logical uncertainty through new criteria and constructions would be superfluous. I haven’t actually seen a proposal like this laid out in detail, but I think they’ve been proposed and found wanting, so I’ll be skeptical at least until I’m shown the details of such a proposal.
(In particular, this would need to involve some notion of conditional probabilities like P(A | A ⇒ B), and perhaps priors like P(A ⇒ B), which are not a part of any treatment of Bayes I’ve seen.)
2. Even if this sort of thing does work in principle, it doesn’t seem to help in the practical case at hand. We’re now told to update on “noticing” A ⇒ B by using objects like P(A | A ⇒ B), but these too have to be guessed using heuristics (we don’t have a map of them either), so it inherits the same problem it was introduced to solve.
Ah, yeah, you’re right that it’s possible to do this. I’m used to thinking in the Kolmogorov picture, and keep forgetting that in the Jaynesian propositional logic picture you can treat material conditionals as contingent facts. In fact, I went through the process of realizing this in a similar argument about the same post a while ago, and then forgot about it in the meantime!
That said, I am not sure what this procedure has to recommend it, besides that it is possible and (technically) Bayesian. The starting prior, with independence, does not really reflect our state of knowledge at any time, even at the time before we have “noticed” the implication(s). For, if we actually write down that prior, we have an entry in every cell of the truth table, and if we inspect each of those cells and think “do I really believe this?”, we cannot answer the question without asking whether we know facts such as A ⇒ B—at which point we notice the implication!
It seems more accurate to say that, before we consider the connection of A to B, those cells are “not even filled in.” The independence prior is not somehow logically agnostic; it assigns a specific probability to the conditional, just as our posterior does, except that in the prior that probability is, wrongly, not one.
Okay, one might say, but can’t this still be a good enough place to start, allowing us to converge eventually? I’m actually unsure about this, because (see below) the logical updates tend to push the probabilities of the “ends” of a logical chain further towards 0 and 1; at any finite time the distribution obeys Cromwell’s Rule, but whether it converges to the truth might depend on the way in which we take the limit over logical and empirical updates (supposing we do arbitrarily many of each type as time goes on).
I got curious about this and wrote some code to do these updates with arbitrary numbers of variables and arbitrary conditionals. What I found is that as we consider longer chains A ⇒ B ⇒ C ⇒ …, the propositions at one end get pushed to 1 or 0, and we don’t need very long chains for this to get extreme. With all starting probabilities set to 0.7 and three variables 0 ⇒ 1 ⇒ 2, the probability of variable 2 is 0.95; with five variables the probability of the last one is 0.99 (see the plot below). With ten variables, the last one reaches 0.99988. We can easily come up with long chains in the California example or similar, and following this procedure would lead us to absurdly extreme confidence in such examples.
I’ve also given a second plot below, where all the starting probabilities are 0.5. This shows that the growing confidence does not rely on an initial hunch one way or the other; simply updating on the logical relationships from initial neutrality (plus independences) pushes us to high confidence about the ends of the chain.
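(For reference, here is a minimal reconstruction of the calculation; I am assuming the procedure described above, i.e. independent priors over the Boolean variables followed by conditioning on the conjunction of the material implications in the chain. With p = 0.7 it reproduces the 0.95 / 0.99 / 0.99988 figures.)

```python
from itertools import product

def chain_posterior(n, p):
    """Start with n independent Bernoulli(p) variables, condition on the
    material implications X0 => X1 => ... => X(n-1), and return the
    posterior probability of the last variable."""
    total = 0.0
    last_true = 0.0
    for assignment in product([0, 1], repeat=n):
        # prior probability of this truth assignment under independence
        prior = 1.0
        for x in assignment:
            prior *= p if x else (1 - p)
        # keep only assignments consistent with every implication in the chain
        if all(not (a and not b) for a, b in zip(assignment, assignment[1:])):
            total += prior
            if assignment[-1]:
                last_true += prior
    return last_true / total

for n in (3, 5, 10):
    print(n, round(chain_posterior(n, 0.7), 5))
# With p = 0.7 this prints about 0.953, 0.992, and 0.99988 for n = 3, 5, 10.
```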
To quote Abram Demski in “All Mathematicians are Trollable”:
The main concern is not so much whether GLS-coherent mathematicians are trollable as whether they are trolling themselves. Vulnerability to an external agent is somewhat concerning, but the existence of misleading proof-orderings brings up the question: are there principles we need to follow when deciding what proofs to look at next, to avoid misleading ourselves?
My concern is not with the dangers of an actual adversary, it’s with the wild oscillations and extreme confidences that can arise even when logical facts arrive in a “fair” way, so long as it is still possible to get unlucky and experience a “clump” of successive observations that push P(A) way up or down.
We should expect such clumps sometimes unless the observation order is somehow specially chosen to discourage them, say via the kind of “principles” Demski wonders about.
One can also prevent observation order from mattering by doing what the Eisenstat prior does: adopt an observation model that does not treat logical observations as coming from some fixed underlying reality (so that learning “B or ~A” rules out some ways A could have been true), but as consistency-constrained samples from a fixed distribution. This works as far as it goes, but is hard to reconcile with common intuitions about how e.g. P=NP is unlikely because so many “ways it could have been true” have failed (Scott Aaronson has a post about this somewhere, arguing against Lubos Motl who seems to think like the Eisenstat prior), and more generally with any kind of mathematical intuition — or with the simple fact that the implications of axioms are fixed in advance and not determined dynamically as we observe them. Moreover, I don’t know of any way to (approximately) apply this model in real-world decisions, although maybe someone will come up with one.
This is all to say that I don’t think there is (yet) any standard Bayesian answer to the problem of self-trollability. It’s a serious problem and one at the very edge of current understanding, with only some partial stabs at solutions available.
“Bayesianism’s coherence and uniqueness proofs cut both ways. Just as any calculation that obeys Cox’s coherency axioms (or any of the many reformulations and generalizations) must map onto probabilities, so too, anything that is not Bayesian must fail one of the coherency tests. This, in turn, opens you to punishments like Dutch-booking (accepting combinations of bets that are sure losses, or rejecting combinations of bets that are sure gains).”
I’ve never understood why I should be concerned about dynamic Dutch books (which are the justification for conditionalization, i.e., the Bayesian update). I can understand how static Dutch books are relevant to finding out the truth: I don’t want my description of the truth to be inconsistent. But a dynamic Dutch book (in the gambling context) is a way that someone can exploit the combination of my belief at time (t) and my belief at time (t+1) to get something out of me, which doesn’t seem like it should carry over to the context of trying to find out the truth. When I want to find the truth, I simply want to have the best possible belief in the present—at time (t+1) -- so why should “money” I’ve “lost” at time (t) be relevant?
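(For concreteness, here is my understanding of the standard diachronic construction, with made-up numbers; this is the thing I am asking why I should care about.)

```python
# The textbook diachronic Dutch book (Lewis/Teller style), with made-up numbers.
# The agent's prior has P(B) = 0.5 and P(A|B) = 0.5, but her update rule says:
# upon learning B, set credence in A to q = 0.4 (rather than conditionalizing).
b, r, q = 0.5, 0.5, 0.4   # P(B), P(A|B), planned post-B credence in A (q < r)

def net_payoff(A, B):
    """Agent's total winnings across the bookie's bets in a given world."""
    total = 0.0
    # At time t the agent buys, for r*b, a bet paying 1 if A-and-B
    # (fair by her lights, since P(A and B) = r*b).
    total += (1.0 if (A and B) else 0.0) - r * b
    # At time t she sells, for q*b, a bet paying q if B (fair, since q*P(B) = q*b).
    total += q * b - (q if B else 0.0)
    # At time t+1, if B has been learned, she sells for q a bet paying 1 if A
    # (fair by her new credence q).
    if B:
        total += q - (1.0 if A else 0.0)
    return total

for A in (True, False):
    for B in (True, False):
        print(f"A={A}, B={B}: agent nets {net_payoff(A, B):+.3f}")
# Every line prints -0.050, i.e. -(r - q)*b: a guaranteed loss, whatever happens,
# even though each bet was fair by the agent's credences at the time it was made.
```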
Perhaps I simply want to avoid getting screwed in life by falling into the equivalents of Dutch books in real, non-gambling-related situations. But if that’s the argument, it should depend on how frequently such situations actually crop up—the mere existence of a Dutch book shouldn’t matter if life is never going to make me take it. Why should my entire notion of rationality be based on avoiding one particular—perhaps rare—type of misfortune? On the other hand, if the argument is that falling for dynamic Dutch books constitutes “irrationality” in some direct intuitive sense (the same way that falling for static Dutch books does), then I’m not getting it.