This will probably be my last reply to Mitchell on consciousness or ontology.
It is now very highly probable that my differences with Mitchell in this thread stem from differences over epistemology. Specifically, Mitchell considers it epistemologically satisfactory to adhere to his current position until provided with strong evidence or strong argument against it. The best summary of that position in a few sentences is probably the following passage written by Mitchell less than 36 hours ago to be found in the parent of this comment:
A priori, I regard it as outlandish that color is an arrangement of things in space, or any other such composite property from the physical theory in question. They appear to be radically dissimilar things, as if one were to say that yesterday was the number 2. So when someone says that color is such a property, if they wish to convince me, they must overcome this skepticism and somehow explain how this can be so.
My position is that what Mitchell considers outlandish is a perfectly normal and perfectly satisfactory hypothesis. None of Mitchell’s indictments of the hypothesis strike me as actual handicaps in a proper contest among hypotheses (i.e., in a proper epistemological process). If you want a summary in a few sentences of my position (which is the standard position round here) on how hypotheses should be judged, see the first paragraph of something I wrote less than 48 hours ago. I would gladly elaborate on it and how it applies to Mitchell’s concerns if anyone is interested. Alternatively, the interested reader could just wait for the top-level submission promised by jimrandomh in a sister to this comment.
The only way I can see to impose a quantitative framework upon this disagreement is to construct a Bayesian belief network encompassing all the key propositions in both your argument and my argument, and then we try to find where your probabilistic dependencies are different to mine. But I wonder if that’s even necessary.
Here’s my reasoning:
Colors exist.
Colors would not exist in a universe consisting solely of colorless particles in motion through colorless space. (In a nutshell: you can’t get color from noncolor.)
Therefore, this is not such a universe.
Your reasoning is something like:
We explained everything else in terms of colorless particles, etc, so far.
Therefore we’ll do it this time too.
To come around to your view, I have to deny my second premise. I see three ways you can try to make me do that. First, you can show me a specific way to get color from noncolor—but no-one has shown me that. Second, you can use historical analogy to argue that my intuition is wrong. But in my most recent comment to RobinZ I explained why consciousness is different. Third, you can appeal to consensus: everyone else here thinks we can get color from noncolor somehow. But that consensus can be explained psychologically, culturally and historically.
Like I said, we can set about the laborious task of formalizing all this. But do we need to?
I am going to ignore the parts of Mitchell’s comment where Mitchell repeats points I already responded to, which leaves us with one point:
The only way I can see to impose a quantitative framework upon this disagreement is to construct a Bayesian belief network encompassing all the key propositions in both your argument and my argument, and then we try to find where your probabilistic dependencies are different to mine.
I do not know what you mean by a Bayesian belief network about an argument. I humbly suggest that when you wrote that sentence, you were confused about how a quantitative framework (as you call it) or a formal epistemological treatment (as I would call it) would go. Please allow me to give the minimum amount of exposition necessary for present purposes. Although it is a clear improvement over what Mitchell wrote, there might be mistakes in the following exposition because I came to “technical epistemology” after the age of 40 and life circumstances have prevented me from giving it the study it deserves. If Eliezer, Anna Salamon or Steve Rayhawk takes exception to anything I say below, then believe them, not what I say below.
In the gospel according to Jaynes, Pearl, Solomonoff, etc, there is one Bayesian belief network out of all the possible Bayesian belief networks that is the accurate model of reality. If we knew which one it was, we would be able to use it to answer any question about any cause-and-effect relationship that is in principle answerable—or so it seems to me according to my untutored understanding. Parenthetically, in Causality Pearl opines that systems of equations similar to the structural equation models pioneered by the econometricians are probably a better representation than Bayesian belief networks. Hutter and Schmidhuber I think use Turing machines instead of Bayesian belief networks. Needless to say, if you have a formal model of reality in one representation (Bayesian belief network, say), it is a fairly easy mathematical exercise to put it into a different representation.
So, there is one “objectively true” model of reality, but I do not know which one it is. Consequently, what I have as my model of reality is a distribution over models, or, to be precise, a distribution over candidates for the One True model. (I think I used the word “hypothesis” in my previous comment, but right now I prefer “candidate model”.) By “distribution” I mean a mapping from candidates to real numbers in the interval between 0 and 1. I will refer to these real numbers as probabilities. There are infinitely many candidates, and the only way to get the probabilities of the candidates to sum to 1 is if the probabilities of arithmetically longer candidates are geometrically smaller. This is the formal version of Occam’s Razor. Why do the probabilities need to sum to 1? Well, the short answer is the Kolmogorov axioms say so. Who made the Kolmogorov axioms God? Cox’s theorem did.
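To make the “geometrically smaller” point concrete, here is a toy sketch. It assumes candidates are encoded as nonempty binary strings and gives each string of length n the weight 4^-n; both the encoding and the names `length_prior` and `mass_up_to` are illustrative assumptions, not Solomonoff's actual construction (which uses prefix-free programs).

```python
from fractions import Fraction


def length_prior(candidate: str) -> Fraction:
    """Toy prior weight of one binary-string candidate: 4 ** -len(candidate)."""
    return Fraction(1, 4 ** len(candidate))


def mass_up_to(max_len: int) -> Fraction:
    """Total prior mass on all candidates of length 1..max_len."""
    # There are 2**n binary strings of length n, each weighted 4**-n,
    # so length n contributes 2**-n in total.
    return sum(Fraction(2 ** n, 4 ** n) for n in range(1, max_len + 1))


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, float(mass_up_to(n)))
```

Under this assumed weighting the mass on lengths 1 through N is 1 - 2^-N, so the total approaches 1 as longer candidates are included. A weighting that shrank only arithmetically with length could not be normalized this way, because the number of candidates grows geometrically.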
Since physics is the study of fundamental reality, when I say “our civilization’s standard physical model” I refer to our civilization’s standard model of fundamental reality. The word “fundamental” is in there to indicate that “Fairbanks is the capital of Alaska” is not in the model. Our civilization’s model of fundamental reality remains _in_formal. To produce a formal model would require more than one generation of scientific effort in my humble estimate. In other words, to get it done would entail some community of scientists working on it till they became experts at the work, which I would think would take at least ten years. Then that first generation of scientists would have to train a second generation. But maybe I am wrong and it would take only one generation of scientific research to produce a formal model sufficiently useful that researchers wielding the model could compete with professional physicists trained the conventional way. (It would be a very cool achievement and parenthetically a potent way to remove human cognitive biases from scientific research it seems to me according to my untutored understanding of formal epistemology).
Even though I do not have a formal model of fundamental reality, my knowledge of formal epistemology, which I have attempted to summarize briefly above, is still useful because (probabilistically speaking, that is, excluding the freak case where all the air molecules go to one half of the room) any process that produces a true model of reality must approximate the process by which evidence updates a distribution on candidate models of reality outlined briefly above. (In particular, the process of natural selection that produced the scientists that produced our civilization’s physical models must approximate the process outlined above.)
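The updating process referred to here can be sketched in a few lines. This is only an illustration with two made-up candidate models of a coin; the model names, the likelihood values, and the function `bayes_update` are assumptions introduced for the example.

```python
def bayes_update(prior: dict, likelihood: dict) -> dict:
    """Return the posterior over candidate models after one observation.

    prior[m] is the current probability of model m; likelihood[m] is the
    probability that model m assigned to the observation just made.
    """
    unnormalized = {m: prior[m] * likelihood[m] for m in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("every candidate assigned probability 0 to the evidence")
    return {m: w / total for m, w in unnormalized.items()}


# Hypothetical candidates: each says how likely "heads" is.
posterior = {"fair_coin": 0.5, "biased_coin": 0.5}
heads_likelihood = {"fair_coin": 0.5, "biased_coin": 0.9}

# Observe three heads in a row, updating after each one.
for _ in range(3):
    posterior = bayes_update(posterior, heads_likelihood)

if __name__ == "__main__":
    print(posterior)
```

Each observation shifts probability toward the candidates that predicted it better, and the normalization step keeps the distribution summing to 1.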
Well, that should be enough to correct Mitchell’s false or confusing or easily-misinterpreted statements (quoted above) about formal epistemology—which is the only end for which I have any patience left in this top-level submission by Mitchell.
In the gospel according to Jaynes, Pearl, Solomonoff, etc, there is one Bayesian belief network out of all the possible Bayesian belief networks, which is the accurate model of reality. If we knew which one it was, we would be able to use it to answer any question about any cause-and-effect relationship that is in principle answerable—or so it seems to me according to my untutored understanding.
If I have The True Belief Network then I don’t need to predict cause-and-effect relationships. I just know the full state of the timeless universe. I mean to ask, why is a belief network constrained to representing physical laws and not physical state? After all, my current network has a bit of both...
If I have The True Belief Network then I don’t need to predict cause-and-effect relationships. I just know the full state of the timeless universe. I mean to ask, why is a belief network constrained to representing physical laws and not physical state?
I did not say it is constrained to represent physical laws, wedrifid.
Could it be that you believe that my use of “cause-and-effect relationship” implies that constraint? If so, I’m not conceding the implication.
I’m not asking you to concede anything. I’m trying to explore your meaning. What would you (or, for that matter, the experts you cite) say is the One True Model? I can imagine various types of mathematical abstractions but am not sure which kind you are referring to.
The casual reader might be saying to himself, “There goes Hollerith with another long comment about what he is calling formal epistemology. Why doesn’t he have the manners to refrain from injecting a long thread on an unrelated topic into Mitchell’s article on monads, consciousness, etc?”
Well, two replies to that. First, I say formal epistemology is not unrelated. Mitchell has been writing for many years around these parts on how consciousness presents a problem for standard physics. He has even solicited donations to support him in researching the matter further, saying that it is dangerous to have a singularity without having done that research. So, one of the ways formal epistemology enters naturally into this comment section is that I humbly submit that anyone engaged in a project such as Mitchell's should have as part of his technical background an education in formal epistemology. It leads to crisper thinking, and given how many resources Mitchell is devoting to the project, his dedicating some of those resources to learning formal epistemology is probably a good use of his time (and, oh, by the way, I’m not going to pay any more attention to his writings on consciousness, ontology, etc, till he does).
Second, now that it has become plain that my mention of formal epistemology might lead to a long thread of conversation, I will indeed move the conversation to this place. It might move back to Less Wrong in the form of a top-level article by me with a title something like Why most people here should probably learn technical epistemology, a.k.a., the math of rationality. This prospective article would cover no ground that Eliezer has not already covered, but when it is important to publicize some point, then it often wins to have more than one voice making that point.
It would be possible for a person to maintain that only the natural numbers exist, and that there is nothing else. They could point to all the things which can be described using natural numbers; and if you insisted that some particular thing was not actually a natural number, but merely had a relationship to the numbers, they would keep returning their focus to the numerical part of the description of everything, and handwave away every other aspect as not really real, or as itself just being another number.
In the discussion of whether color can be reduced to the motions of particles in space, I feel myself to be in a comparable situation. The discussion of color as such repeatedly turns into a discussion of particles in the light source or particles in the brain…
Perhaps someone out there has conducted the subjective experiment of attending to actual color for a moment, and asking themselves afresh whether this thing could “really” be just particles in motion. The first thing to ask yourself is whether this alleged identity derives any impetus at all from the intrinsic nature of particles in motion. If somehow you knew nothing of color experiences or of neuropsychology, would you have any reason to think, in contemplating any assortment of particles circulating in space, that “color” or “the experience of color” was there? I think not. The motivation for the identity comes entirely from the belief that the world in general has already been explained by a physics of this form, and so color (and everything else about consciousness) must, somehow, also reduce to particles in motion. There is nothing in the intrinsic nature of color or the intrinsic nature of particles in motion to make you think that it is even possible for one to be the other.
That is the sort of argument that you have to resort to with someone who thinks that color is particles, or that everything is a number. You have to draw their attention to their actual experience, and make them question from the very beginning whether what they are saying makes sense. But Richard, I have no idea how to do that within these epistemic formalisms you promote, which seem to mostly be good for arriving at the simplest possible causal structure for hidden causes, and say nothing about how to correctly think about appearances as such, or how to ensure that you are placing a thing in the right ontological category.
But can this formal epistemology be the whole of epistemology? What is your formal epistemic basis for thinking that something exists, or that you have experienced blueness, or that 1+1=2?
It might move back to Less Wrong in the form of a top-level article by me with a title something like Why most people here should probably learn technical epistemology, a.k.a., the math of rationality. This prospective article would cover no ground that Eliezer has not already covered, but when it is important to publicize some point, then it often wins to have more than one voice making that point.
I do not know what you mean by a Bayesian belief network about an argument.
What I called a BBN (it may be a generalization of the standard concept) is a belief system schema constructed to be capable of representing your reasoning and my reasoning. Nodes are propositions and arrows are inferential steps. The schema must contain a node for every proposition that I use and every proposition that you use, and similarly must contain an arrow for every inferential step appearing in the argument of either person. Once we have that diagram, our two arguments may then be represented as each flowing through a portion of it. We arrive at opposite truth values for a common terminal proposition, so the arguments are in contradiction. To resolve the contradiction or at least identify its cause, we move upstream and try to identify where initial conditions differ.
This process will most likely require one to state opinions regarding certain implicit premises, used by the other person, which did not even play a role in one’s own argument, as well as to express differing opinions about the arrows, i.e., about the implications of one proposition for the truth of another. One of us may regard the truth of B as independent of the truth of A, whereas the other would say that if A is true, then B is definitely false—or probably false. It is merely a formal process meant to guarantee that the sources of disagreement are mutually understood, something which should happen anyway if the disagreement has developed in a lucid and orderly fashion.
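As a rough sketch of the bookkeeping described above: give each party a credence for every shared proposition and inferential step, then list the items where the credences diverge. The proposition labels follow the argument as stated earlier in the thread, but every number below is a placeholder invented for illustration, and `find_disagreements` is a hypothetical helper, not part of any standard belief-network library.

```python
def find_disagreements(a: dict, b: dict, tolerance: float = 0.1) -> list:
    """Return the shared items whose credences differ by more than tolerance."""
    shared = a.keys() & b.keys()
    return sorted(k for k in shared if abs(a[k] - b[k]) > tolerance)


# Nodes are propositions; the "=>" entry stands for an inferential step (arrow).
# All credences are made-up placeholders.
party_a = {
    "colors exist": 0.99,
    "no color from noncolor": 0.95,
    "premises => universe is not colorless particles": 0.99,
}
party_b = {
    "colors exist": 0.99,
    "no color from noncolor": 0.05,
    "premises => universe is not colorless particles": 0.99,
}

if __name__ == "__main__":
    print(find_disagreements(party_a, party_b))
```

With these placeholder numbers the only item flagged is the second premise, which matches where the thread itself locates the disagreement. A fuller treatment would use conditional probabilities on the arrows rather than a single number per item.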
I’m concerned that you want to dive down an explanatory hole with no bottom. Suppose we say: okay, the particles are colored; or space is colored; or otherwise, beneath the objects we perceive to be colored, there are colors. Won’t you want those colors explained too?
When I emphasize that the building blocks of physical ontology have no color, I’m not saying that everything would be solved if only they did. But if you start out without color, and your only way to make bigger things is through spatial and causal aggregation, color will not appear by itself—that is the message.
Certain properties can be described counterfactually—does that help? For instance, “fragility” can be a property of an object that never in fact breaks, as long as it would have been disposed to break under a greater proportion of conditions than many comparable objects. An object is a certain color if it is disposed to reflect light of one wavelength as light of a certain wavelength, which may be different from the original. An object can have this property even if it spends its lifetime in the dark.
The physical property to which you refer is deemed color only because it can induce an experience of color. And the experience of color can occur without that specific external stimulus. So the true nature of color must be sought within the brain.
The physical property to which you refer is deemed color only because it can induce an experience of color. And the experience of color can occur without that specific external stimulus. So the true nature of color must be sought within the brain.
Here is the confusion underlying this whole mess. There are three types of things which color can apply to: objects, light, and experiences. These are related causally: blue objects cause blue light which causes blue experiences; and evidentially: a blue experience is evidence that there was blue light, which is evidence that there was a blue object. However, color as it applies to objects, light, and experiences are three separate entities with different reductions. We use them interchangeably because the causal and evidential relationships allow them to substitute for each other in almost all contexts.
If you start with one of blue objects, light, or experience clearly defined, then you can use that definition plus the causal/evidential relationships to define the other two. The natural way to define them is to define all three only in relation to each other, i.e., refer only to the entire structure, and depend on the ability to compare the color of reference objects/light/experiences to keep the definition stable. Fortunately, some discoveries from physics have enabled a simple physical description of blue light. Blue light is any light made predominantly of photons with a wavelength close to 470nm. Based off that definition, a blue object is one that reflects or produces blue light, and a blue experience is one involving some particular set of neurons which I identify by their causal relationship to blue light. But stimulating these neurons without using light still makes a blue experience, and I could in principle identify those neurons some other way. For example, if I were to discover that protein X is found only in blue-experience neurons, then I could define a blue experience as an experience involving neurons containing protein X, and then define blue light and objects based on that.
There are some other strange entities which can have color because of causal relations, too. For example, the number 255 (#0000FF) is blue because it causes blue photons to be produced when written in the right part of a CSS file.
There is no confusion. I am disputing the adequacy of the reductions proposed for blue experience, specifically.
Let us suppose you have a definition of the physical correlate of blue experience which no longer refers to external causes, but just to intrinsic properties, such as the presence of protein X. That is an arrangement of atoms in space. My question remains, very simply, where is the blueness in this picture? I don’t see it.
How about you explain what meanings you would find problematic so I can be sure you’re not just trying to time-suck me as a cover for your own insufficient effort in developing a coherent position, which, as others who have argued with you here would agree, is a likely hypothesis?
ETA: In any other circumstance, if someone had given the response in this comment, even and especially if it were me giving the comment, I would accuse the person of being evasive because of an inconsistent position.
So, just to prove that that’s not the case here, I’ll show that I can define “chess” non-arbitrarily but in a way that’s applicable to Deep Blue and the broader issues of reductionism. However, I’d also need Mitchell to generate his answer independently of knowing mine.
So, I’m up for having us both submit our answers to some trusted third party, who then reveals the answers. I would be responding to “what is chess [and how does it relate to Deep Blue]?”, while Mitchell would be responding to the question, “What are the problematic definitions of chess when trying to apply them to Deep Blue [and what are the implications for the alleged irreducibility of phenomenal blue]?”
Silas, the answer to your original question really does depend on the definition. You presumably brought it up with some intent, so why don’t you go ahead and make the point you intended to make?
ETA: I can add right now that no particular definition of chess will be problematic, in the sense of leaving me at a loss for an answer.
My point was a response to your point by means of reductio: if you can’t see the blueness in atoms moving around, can you see the chess in atoms moving around in a chess computer? It’s the wrong question—chess is identified by a large-scale pattern of behavior. It does not require there to be a low-level ontologically-fundamental chess-thing in the picture.
chess is identified by a large-scale pattern of behavior.
A counter-question I thought of asking: is there tennis in a brick wall? You can’t get a whole game out of it, but you can get return-of-serve, rally practice, and volley practice. A brick wall has some of the capabilities you’d want in an ‘artificial tennis player’, just as Deep Blue has some of the capabilities of an ‘artificial chess player’. The brick wall achieves this by inducing a particular systematic transformation of the game-state (it inverts one component of the tennis ball’s momentum), just as Deep Blue does. Is there a sense in which there is chess in Deep Blue but not tennis in the brick wall?
Is there a sense in which there is chess in Deep Blue but not tennis in the brick wall?
Yes. Playing against Deep Blue tells you something about the rules that define chess, while playing against a wall tells you nothing about the rules that define tennis.
If you knew nothing about chess, then by playing against Deep Blue, you can update your probability distribution about what sorts of behaviors (legal moves) count as chess. If you knew nothing about tennis, you’re not going to learn its rules by playing against a brick wall.
See my last comment about when a computation is about something.
So does that resolve the hot-shot zinger that just popped into your head? Are we ready to go back to how blue can arise from “atoms moving around”?
To learn about chess by experimenting with Deep Blue, you must already know that it is a game-playing device, and something about how to engage it appropriately, such that your interaction with it will be an instance of the game. If you don’t know that, it is just a complicated finite-state object which responds to its boundary conditions in a certain way. And conversely, if you know that a brick wall allows some aspects of tennis game-play to be reproduced, and you know the appropriate form of interaction, then you will be able to infer a little about tennis. Not much, but something.
However, this is a side issue, compared to your avowed subjectivism about computation. You say:
the existence of a computation in a process is observer-dependent
I appreciate that your personal theory of consciousness is a work in progress (and you may want to examine Giulio Tononi’s theory, which I discovered simply by a combined search on “consciousness” and “mutual information”), so you may not have an answer to this question yet, just an intended answer, but—is the existence of an observer going to be observer-dependent as well?
If you don’t know that, it is just a complicated finite-state object which responds to its boundary conditions in a certain way.
So, if you don’t know that it is playing chess but decide ‘hey, I want to maximise the amount of control I have over these little pieces here’ and then learn to play chess anyway, you are really learning ‘Zombie Chess’ and not actually Chess.
Depending on how you define ‘maximum amount of control’ you may find yourself playing for something other than checkmate, since the game ends for both of you in a mate. For example, if we define ‘amount of control of the board’ by the number of moves open to you, divided by the number of moves open to your opponent—or perhaps the sum of that quantity over all positions throughout the game—then you will be playing for a drawn position in which you have as much freedom to move as possible and your opponent has as little freedom to move as possible. This also assumes that you don’t control the pieces by directly manipulating the screen image, and that you don’t intervene in Deep Blue’s computational processes.
The game that you learnt while interacting with Deep Blue would depend on the utility function you brought to the experience, and on the range of interactions you permitted yourself. Of course there is a relationship between the game of chess and the state transition diagram for Deep Blue, but you cannot infer the former from the latter alone.
Of course there is a relationship between the game of chess and the state transition diagram for Deep Blue, but you cannot infer the former from the latter alone.
You’re right, you can’t. Now, assume I do in fact infer and adopt a utility function that so happens to be that of chess. This is not an unrealistic assumption: the guy has a crown and the game ends. In that case, is ‘Chess’ in the room, even though there’s just this silicon-powered thing and me who has decided to fiddle with it? Were I to grant that you can’t make Blue out of non-Blue I would assume I also couldn’t make Chess out of Deep Blue.
Were I to grant that you can’t make Blue out of non-Blue I would assume I also couldn’t make Chess out of Deep Blue.
It’s a bit different because (from my perspective) the issue here is intentionality rather than qualia. You can’t turn something blue just by calling it blue. But you can make something part of a game by using it in the game. It has to be the right sort of thing to play the intended role, so its intrinsic properties do matter, but they only provide a necessary and not a sufficient condition. The other necessary condition is that it is being interpreted as playing the role, and so here we get back to the role of consciousness. If a copy of Deep Blue popped into being like a Boltzmann Brain and started playing itself in the intergalactic void, that really would be an instance of “zombie chess”.
We will have to return to definitions then. Can you have a game without players? Can you have a player without intentions? It is like arguing whether the Face on Mars is really a face. It is not the product of intention, but it does indeed look like a face. Is looking like a face enough for it to be a face? Deep Blue “plays chess” if you define chess as occurring whenever there is a conformance to certain appearances. But if chess requires the presence of a mind possessing certain minimal concepts and intentions, then Deep Blue in itself does not play chess.
Given the assumption that the computer is optimizing something, and given the awareness of the possibility of a game, you can infer essentially the whole of chess from the program. Chess consists of three things: the board and pieces, the movement rules, and the winning criterion. Observing the game, you will find that the computer steers the chessboard into different final regions depending on whether it moves the black pieces or the white pieces, and this will tell you the criterion the computer uses to optimize its position. And these will tell you that checkmate favors the party moving last and draws are preferred to being checkmated but not to checkmating.
One obvious difference is that nowhere in the brick wall is a representation of tennis. Chess computers have models of chessboards, recognize legal and illegal moves, and have some judgement of how good or poor a position is for either side.
By virtue of what property do these representations have the content that you attribute to them?
I assume you can see where I’m going here—this is an old question in philosophy of computation: what is it that makes a physical computation about anything in particular.
By virtue of what property do these representations have the content that you attribute to them?
By what virtue is a chess game a chess game and not two people playing with statues? The rules by which the chess computer operates parallel the rules by which chess operates—the behavior is mirrored within it. If you gave someone a brick wall, they couldn’t analyze it to learn how to play tennis, but if you gave someone a chess program, they could deduce from it the rules of chess.
By what virtue is a chess game a chess game and not two people playing with statues? The rules by which the chess computer operates parallel the rules by which chess operates
I don’t think it’s quite that simple. If a couple of four-year-olds encounter a chess set, and start moving the pieces around on the board, they might happen to take turns and make only legal “moves” until they got bored. I don’t think they’d be playing chess. Similarly, if a couple of incompetent adults encounter a chess set and try to play chess, but because they aren’t very smart or paying very close attention, about a quarter of their moves aren’t actually legal, they’re playing chess—they’re just making mistakes in so doing.
The equivalence I’m proposing isn’t between results or actions, but the causal springs of the actions. In your example, the children making legal chess moves are only doing so by luck—the causal chains determining their moves at no point involve the rules of chess—whereas the adults playing chess badly are doing so by a causal chain which includes the rules of chess. If you changed those rules, it would not change the children’s moves, but it would change the adults’.
this is an old question in philosophy of computation: what is it that makes a physical computation about anything in particular.
And it’s one that’s overhyped, but actually not that complicated.
A computation is “about” something else if and to the extent that there exists mutual information between the computation and the something else. Old thread on the matter.
Does observing the results of a physical process tell you something about the result of the computation 2+3? Then it’s an implementation of the addition of 2 and 3. Does it consistently tell you something about addition problems in general? Then it’s an implementation of addition.
This doesn’t fall into the trap of “joke interpretations” where e.g. you apply complicated, convoluted transformations to molecular motions to hammer them into a mapping to addition. The reason is that by applying such a complicated (and probably ever-expanding) interpretation, the physical process is no longer telling you something about the answer; rather, the source of the output, by means of specifying the convoluted transformation, is you, and every result originates in you, not the physical process.
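A small sketch of the mutual-information criterion, under explicit assumptions: inputs are pairs of integers drawn uniformly from 0..3, the “computation” is their sum, and two hypothetical processes are compared, one whose output equals the sum and one whose output ignores the inputs. The choice of a uniform distribution over inputs, and the names `mutual_information`, `adder` and `constant`, are assumptions made for the example; the result depends on how that joint distribution is specified.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(pairs: list) -> float:
    """Mutual information in bits between X and Y, given equally likely (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


inputs = list(product(range(4), repeat=2))

# Each pair is (result of the computation a + b, observed output of the process).
adder = [(a + b, a + b) for a, b in inputs]  # output tracks the sum
constant = [(a + b, 0) for a, b in inputs]   # output ignores the inputs

if __name__ == "__main__":
    print(mutual_information(adder))
    print(mutual_information(constant))
```

Under these assumptions, observing the first process tells you the sum exactly, so its mutual information with the sum is positive, while the second process carries none. The sketch does not address the harder question of where the mapping from physical states to outputs comes from.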
A computation is “about” something else if and to the extent that there exists mutual information between the computation and the something else.
Mutual information is defined for two random variables, and random variables are mappings from a common sample space to the variables’ domains. What are the mappings for two “things”? Mutual information doesn’t just “exist”, it is given by mappings which have to be somehow specified, and which can in general be specified to yield an arbitrary result.
This doesn’t fall into the trap of “joke interpretations” where e.g. you apply complicated, convoluted transformations to molecular motions to hammer them into a mapping to addition. The reason is that by applying such a complicated (and probably ever-expanding) interpretation, the physical process is no longer telling you something about the answer; rather, the source of the output, by means of specifying the convoluted transformation, is you, and every result originates in you, not the physical process.
When you distinguish between the mappings “originating” in the interpreter versus in the “physical process itself”, you are judging the relevance of output of mutual information calculation in the same motion (“no true Scotsman”). Mutual information doesn’t compute your answer, deciding whether the mapping came from an approved source does.
Mutual information is defined for two random variables, and random variables are mappings from a common sample space to the variables’ domains. What are the mappings for two “things”? Mutual information doesn’t just “exist”, it is given by mappings which have to be somehow specified, and which can in general be specified to yield an arbitrary result.
I wasn’t as precise as I should have been. By “mutual information”, I mean “mutual information conditional on yourself”. (Normally, “yourself” is part of the background knowledge predicating any probability and not explicitly represented.) So, as per the rest of my comment, the kind of mutual information I meant is well defined here: Physical process R implements computation C if and to the extent that, given yourself, learning R tells you something about C.
Yes, this has the counterintuitive result that the existence of a computation in a process is observer-dependent (not unlike every other physical law).
When you distinguish between the mappings “originating” in the interpreter versus in the “physical process itself”, you are judging the relevance of output of mutual information calculation in the same motion (“no true Scotsman”). Mutual information doesn’t compute your answer, deciding whether the mapping came from an approved source does.
No, mutual information is still the deciding factor. As per my above remark, if the source of the computation is really you, by means of your ever-more-complex, carefully-designed mapping, then
P(C|self) = P(C|self,R)
i.e., learning about the physical process R didn’t change your beliefs about C. So, conditioning on yourself, there is no mutual information between C and R.
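For reference, the step from that equality to "no mutual information" can be written out. This is a sketch using the standard definition of conditional mutual information; the notation $I(C; R \mid \text{self})$ is assumed here and does not appear in the thread.

```latex
I(C; R \mid \text{self})
  = \sum_{c,\,r} P(c, r \mid \text{self})\,
    \log \frac{P(c \mid \text{self}, r)}{P(c \mid \text{self})}
```

If $P(c \mid \text{self}, r) = P(c \mid \text{self})$ for every $c$ and $r$, each logarithm is $\log 1 = 0$, so the whole sum is zero.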
If you are the real source of the computation, that’s one reason the equality above can hold, but not the only reason.
I wasn’t as precise as I should have been. By “mutual information”, I mean “mutual information conditional on yourself”. (Normally, “yourself” is part of the background knowledge predicating any probability and not explicitly represented.) So, as per the rest of my comment, the kind of mutual information I meant is well defined here: Physical process R implements computation C if and to the extent that, given yourself, learning R tells you something about C.
Vague and doesn’t seem relevant. What is the sample space, what are the mappings? Conditioning means restricting to a subset of the sample space, and seeing how the mappings from the probability measure defined on it redraw the probability distributions on the variables’ domains. You still need those mappings, it’s what relates different variables to each other.
Are you saying, then, that the meaning of a computation depends on what the user thinks or the programmer intends, rather than being intrinsic to the computation?
Well, it could depend on what the computation thinks.
But my point was that the brick wall doesn’t keep track of the ball.
Whether a robot tennis player keeps track of the ball or not doesn’t depend on what I think it does or how I thought I designed it. It is a fact of the matter.
This will probably be my last reply to Mitchell on consciousness or ontology.
It is now very highly probable that my differences with Mitchell in this thread stem from differences over epistemology. Specifically, Mitchell considers it epistemologically satisfactory to adhere to his current position until provided with strong evidence or strong argument against it. The best summary of that position in a few sentences is probably the following passage written by Mitchell less than 36 hours ago to be found in the parent of this comment:
My position is that what Mitchell considers outlandish is a perfectly normal and perfectly satisfactory hypothesis. None of Mitchell’s indictments of the hypothesis strike me as actual handicaps in a proper contest among hypotheses (i.e., in a proper epistemological process). If you want a summary in a few sentences of my position (which is the standard position round here) on how hypotheses should be judged, see the first paragraph of something I wrote less than 48 hours ago. I would gladly elaborate on it and how it applies to Mitchell’s concerns if anyone is interested. Alternatively, the interested reader could just wait for the top-level submission promised by jimrandomh in a sister to this comment.
The only way I can see to impose a quantitative framework upon this disagreement is to construct a Bayesian belief network encompassing all the key propositions in both your argument and my argument, and then we try to find where your probabilistic dependencies are different to mine. But I wonder if that’s even necessary.
Here’s my reasoning:
Colors exist.
Colors would not exist in a universe consisting solely of colorless particles in motion through colorless space. (In a nutshell: you can’t get color from noncolor.)
Therefore, this is not such a universe.
Your reasoning is something like:
We explained everything else in terms of colorless particles, etc, so far.
Therefore we’ll do it this time too.
To come around to your view, I have to deny my second premise. I see three ways you can try to make me do that. First, you can show me a specific way to get color from noncolor—but no-one has shown me that. Second, you can use historical analogy to argue that my intuition is wrong. But in my most recent comment to RobinZ I explained why consciousness is different. Third, you can appeal to consensus: everyone else here thinks we can get color from noncolor somehow. But that consensus can be explained psychologically, culturally and historically.
Like I said, we can set about the laborious task of formalizing all this. But do we need to?
I am going to ignore the parts of Mitchell’s comment where Mitchell repeats points I already responded to, which leaves us with one point:
I do not know what you mean by a Bayesian belief network about an argument. I humbly suggest that when you wrote that sentence, you were confused about how a quantitative framework (as you call it) or a formal epistemological treatment (as I would call it) would go. Please allow me to give the minimum amount of exposition necessary for present purposes. Although it is a clear improvement over what Mitchell wrote, there might be mistakes in the following exposition because I came to “technical epistemology” after the age of 40 and life circumstances have prevented me from giving it the study it deserves. If Eliezer, Anna Salamon or Steve Rayhawk takes exception to anything I say below, then believe them, not what I say below.
In the gospel according to Jaynes, Pearl, Solomonoff, etc, there is one Bayesian belief network out of all the possible Bayesian belief networks that is the accurate model of reality. If we knew which one it was, we would be able to use it to answer any question about any cause-and-effect relationship that is in principle answerable—or so it seems to me according to my untutored understanding. Parenthetically, in Causality Pearl opines that systems of equations similar to the structural equation models pioneered by the econometricians are probably a better representation than Bayesian belief networks. Hutter and Schmidhuber I think use Turing machines instead of Bayesian belief networks. Needless to say, if you have a formal model of reality in one representation (Bayesian belief network, say), it is a fairly easy mathematical exercise to put it into a different representation.
So, there is one “objectively true” model of reality, but I do not know which one it is. Consequently, what I have as my model of reality is a distribution over models (er, to be precise, a distribution over candidates for the One True model). (I think I used the word “hypothesis” in my previous comment, but right now I prefer “candidate model”.) By “distribution” I mean a mapping from candidates to real numbers in the interval between 0 and 1. I will refer to these real numbers as probabilities. There is a countably infinite number of candidates, and the only way to get the probabilities of the candidates to sum to 1 is if the probabilities of arithmetically longer candidates are geometrically smaller. This is the formal version of Occam’s Razor. Why do the probabilities need to sum to 1? Well, the short answer is the Kolmogorov axioms say so. Who made the Kolmogorov axioms God? Cox’s theorem did.
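A toy version of "arithmetically longer, geometrically smaller" can be checked numerically. The sketch below assumes, purely for illustration, that every non-empty bit string counts as a candidate and that each candidate of length L gets weight 4^-L; this is not the Solomonoff prior itself (which is defined over programs for a universal machine), and the function names are hypothetical.

```python
from fractions import Fraction

def prior(length_bits):
    """Prior weight of ONE candidate whose description is `length_bits` bits long.

    Assumption for this sketch: there are 2**L candidates of length L,
    and each of them gets weight 4**-L.
    """
    return Fraction(1, 4 ** length_bits)

def total_mass(max_length):
    """Total prior mass on all candidates of length 1..max_length."""
    return sum(2 ** L * prior(L) for L in range(1, max_length + 1))

# Partial sums stay below 1 and approach it as longer candidates are included.
print(total_mass(10), float(total_mass(30)))
```

Each extra bit of length multiplies the number of candidates by 2 but the weight of each by 1/4, so the mass at length L is 2^-L and the total converges to 1.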
Since physics is the study of fundamental reality, when I say “our civilization’s standard physical model” I refer to our civilization’s standard model of fundamental reality. The word “fundamental” is in there to indicate that “Fairbanks is the capital of Alaska” is not in the model. Our civilization’s model of fundamental reality remains _in_formal. To produce a formal model would require more than one generation of scientific effort in my humble estimate. In other words, to get it done would entail some community of scientists working on it till they became experts at the work, which I would think would take at least ten years. Then that first generation of scientists would have to train a second generation. But maybe I am wrong and it would take only one generation of scientific research to produce a formal model sufficiently useful that researchers wielding the model could compete with professional physicists trained the conventional way. (It would be a very cool achievement and parenthetically a potent way to remove human cognitive biases from scientific research it seems to me according to my untutored understanding of formal epistemology).
Even though I do not have a formal model of fundamental reality, my knowledge of formal epistemology which I have attempted to summarize briefly above is still useful because (probabilistically speaking, that is, excluding the freak case where all the air molecules go to one half of the room) any process that produces a true model of reality must approximate the process by which evidence updates a distribution on candidate models of reality outlined briefly above. (In particular, the process of natural selection that produced the scientists that produced our civilization’s physical models must approximate the process outlined above.)
Well, that should be enough to correct Mitchell’s false or confusing or easily-misinterpreted statements (quoted above) about formal epistemology—which is the only end for which I have any patience left in this top-level submission by Mitchell.
If I have The True Belief Network then I don’t need to predict cause-and-effect relationships. I just know the full state of the timeless universe. I mean to ask, why is a belief network constrained to representing physical laws and not physical state? After all, my current network has a bit of both...
I did not say it is constrained to represent physical laws, wedrifid.
Could it be that you believe that my use of “cause-and-effect relationship” implies that constraint? If so, I’m not conceding the implication.
I’m not asking you to concede anything. I’m trying to explore your meaning. What would you (or, for that matter, the experts you cite) say is the One True Model? I can imagine various types of mathematical abstractions but am not sure which kind you are referring to.
The casual reader might be saying to himself, “There goes Hollerith with another long comment about what he is calling formal epistemology. Why doesn’t he have the manners to refrain from injecting a long thread on an unrelated topic into Mitchell’s article on monads, consciousness, etc?”
Well, two replies to that. First, I say formal epistemology is not unrelated. Mitchell has been writing for many years around these parts on how consciousness presents a problem for standard physics. He has even solicited donations to support him in researching the matter further, saying that it is dangerous to have a singularity without having done that research. So, one of the ways formal epistemology enters naturally into this comment section is that I humbly submit that anyone engaged in such a project as Mitchell is engaged in should have as part of his technical background an education in formal epistemology. It leads to crisper thinking, and given how many resources Mitchell is devoting to the project, his dedicating some of those resources to learning formal epistemology is probably a good use of his time (and, oh, by the way, I’m not going to pay any more attention to his writings on consciousness, ontology, etc, till he does).
Second, now that it has become plain that my mention of formal epistemology might lead to a long thread of conversation, I will indeed move the conversation to this place. It might move back to Less Wrong in the form of a top-level article by me with a title something like Why most people here should probably learn technical epistemology, a.k.a., the math of rationality. This prospective article would cover no ground that Eliezer has not already covered, but when it is important to publicize some point, then it often wins to have more than one voice making that point.
It would be possible for a person to maintain that only the natural numbers exist, and that there is nothing else. They could point to all the things which can be described using natural numbers; and if you insisted that some particular thing was not actually a natural number, but merely had a relationship to the numbers, they would keep returning their focus to the numerical part of the description of everything, and handwave away every other aspect as not really real, or as itself just being another number.
In the discussion of whether color can be reduced to the motions of particles in space, I feel myself to be in a comparable situation. The discussion of color as such repeatedly turns into a discussion of particles in the light source or particles in the brain…
Perhaps someone out there has conducted the subjective experiment of attending to actual color for a moment, and asking themselves afresh whether this thing could “really” be just particles in motion. The first thing to ask yourself is whether this alleged identity derives any impetus at all from the intrinsic nature of particles in motion. If somehow you knew nothing of color experiences or of neuropsychology, would you have any reason to think, in contemplating any assortment of particles circulating in space, that “color” or “the experience of color” was there? I think not. The motivation for the identity comes entirely from the belief that the world in general has already been explained by a physics of this form, and so color (and everything else about consciousness) must, somehow, also reduce to particles in motion. There is nothing in the intrinsic nature of color or the intrinsic nature of particles in motion to make you think that it is even possible for one to be the other.
That is the sort of argument that you have to resort to with someone who thinks that color is particles, or that everything is a number. You have to draw their attention to their actual experience, and make them question from the very beginning whether what they are saying makes sense. But Richard, I have no idea how to do that within these epistemic formalisms you promote, which seem to mostly be good for arriving at the simplest possible causal structure for hidden causes, and say nothing about how to correctly think about appearances as such, or how to ensure that you are placing a thing in the right ontological category.
But can this formal epistemology be the whole of epistemology? What is your formal epistemic basis for thinking that something exists, or that you have experienced blueness, or that 1+1=2?
I look forward to that.
What I called a BBN (it may be a generalization of the standard concept) is a belief system schema constructed to be capable of representing your reasoning and my reasoning. Nodes are propositions and arrows are inferential steps. The schema must contain a node for every proposition that I use and every proposition that you use, and similarly must contain an arrow for every inferential step appearing in the argument of either person. Once we have that diagram, our two arguments may then be represented as each flowing through a portion of it. We arrive at opposite truth values for a common terminal proposition, so the arguments are in contradiction. To resolve the contradiction or at least identify its cause, we move upstream and try to identify where initial conditions differ.
This process will most likely require one to state opinions regarding certain implicit premises, used by the other person, which did not even play a role in one’s own argument, as well as to express differing opinions about the arrows, i.e., about the implications of one proposition for the truth of another. One of us may regard the truth of B as independent of the truth of A, whereas the other would say that if A is true, then B is definitely false—or probably false. It is merely a formal process meant to guarantee that the sources of disagreement are mutually understood, something which should happen anyway if the disagreement has developed in a lucid and orderly fashion.
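The procedure described in the last two paragraphs (lay out every proposition and inferential arrow, then walk upstream to where the two parties first diverge) can be sketched in a few lines. Everything below is hypothetical: the proposition names loosely paraphrase the arguments above, the credences for `person_a` and `person_b` are invented placeholders rather than anyone's stated numbers, and `root_disagreements` is just one simple way to define "furthest upstream".

```python
# Each proposition maps to the propositions it is inferred from (hypothetical encoding).
PARENTS = {
    "colors_exist": [],
    "past_reductions_succeeded": [],
    "no_color_from_noncolor": ["past_reductions_succeeded"],
    "universe_is_only_colorless_particles": ["colors_exist", "no_color_from_noncolor"],
}

# Invented credences, for illustration only.
person_a = {"colors_exist": 0.99, "past_reductions_succeeded": 0.90,
            "no_color_from_noncolor": 0.95, "universe_is_only_colorless_particles": 0.05}
person_b = {"colors_exist": 0.99, "past_reductions_succeeded": 0.90,
            "no_color_from_noncolor": 0.05, "universe_is_only_colorless_particles": 0.95}

def root_disagreements(parents, a, b, tol=0.2):
    """Disputed propositions none of whose parents are themselves disputed."""
    disputed = {n for n in parents if abs(a[n] - b[n]) > tol}
    return sorted(n for n in disputed
                  if not any(p in disputed for p in parents[n]))

print(root_disagreements(PARENTS, person_a, person_b))
```

When a proposition is disputed but all of its parents are agreed on, the disagreement lies in the arrow itself, that is, in what the agreed premises are taken to imply.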
I’m concerned that you want to dive down an explanatory hole with no bottom. Suppose we say: okay, the particles are colored; or space is colored; or otherwise, beneath the objects we perceive to be colored, there are colors. Won’t you want those colors explained too?
When I emphasize that the building blocks of physical ontology have no color, I’m not saying that everything would be solved if only they did. But if you start out without color, and your only way to make bigger things is through spatial and causal aggregation, color will not appear by itself—that is the message.
Certain properties can be described counterfactually—does that help? For instance, “fragility” can be a property of an object that never in fact breaks, as long as it would have been disposed to break under a greater proportion of conditions than many comparable objects. An object is a certain color if it is disposed to reflect light of one wavelength as light of a certain wavelength, which may be different from the original. An object can have this property even if it spends its lifetime in the dark.
The physical property to which you refer is deemed color only because it can induce an experience of color. And the experience of color can occur without that specific external stimulus. So the true nature of color must be sought within the brain.
Here is the confusion underlying this whole mess. There are three types of things which color can apply to: objects, light, and experiences. These are related causally: blue objects cause blue light which causes blue experiences; and evidentially: a blue experience is evidence that there was blue light, which is evidence that there was a blue object. However, color as it applies to objects, light, and experiences are three separate entities with different reductions. We use them interchangeably because the causal and evidential relationships allow them to substitute for each other in almost all contexts.
If you start with one of blue objects, light, or experience clearly defined, then you can use that definition plus the causal/evidential relationships to define the other two. The natural way to define them is to define all three only in relation to each other (i.e., refer only to the entire structure, and depend on the ability to compare the color of reference objects/light/experiences to keep the definition stable). Fortunately, some discoveries from physics have enabled a simple physical description of blue light. Blue light is any light made predominantly of photons with a wavelength close to 470nm. Based off that definition, a blue object is one that reflects or produces blue light, and a blue experience is one involving some particular set of neurons which I identify by their causal relationship to blue light. But stimulating these neurons without using light still makes a blue experience, and I could in principle identify those neurons some other way; for example, if I were to discover that protein X is found only in blue-experience neurons, then I could define a blue experience as an experience involving neurons containing protein X, and then define blue light and objects based on that.
There are some other strange entities which can have color because of causal relations, too. For example, the number 255 (#0000FF) is blue because it causes blue photons to be produced when written in the right part of a CSS file.
There is no confusion. I am disputing the adequacy of the reductions proposed for blue experience, specifically.
Let us suppose you have a definition of the physical correlate of blue experience which no longer refers to external causes, but just to intrinsic properties, such as the presence of protein X. That is an arrangement of atoms in space. My question remains, very simply, where is the blueness in this picture? I don’t see it.
Where is chess in Deep Blue?
In the midichlorians.
I can’t find the video online, but there was an SNL Star Wars sketch that this reminds me of.
Mace Windu: “Does Garry Kasparov have enough midichlorians to beat Deep Blue? Y’damn right he don’t!”
Let’s define what you mean by chess, first.
How about you explain what meanings you would find problematic so I can be sure you’re not just trying to time-suck me as a cover for your own insufficient effort in developing a coherent position, which, as others who have argued with you here would agree, is a likely hypothesis?
ETA: In any other circumstance, if someone had given the response in this comment, even and especially if it were me giving the comment, I would accuse the person of being evasive because of an inconsistent position.
So, just to prove that that’s not the case here, I’ll show that I can define “chess” non-arbitrarily but in a way that’s applicable to Deep Blue and the broader issues of reductionism. However, I’d also need for Mitchell to generate his answer independently of knowing mine.
So, I’m up for having us both submit our answers to some trusted third party, who then reveals the answers. I would be responding to “what is chess [and how does it relate to Deep Blue]?”, while Mitchell would be responding to the question, “What are the problematic definitions of chess when trying to apply them to Deep Blue [and what are the implications to the alleged irreducibility of phenomenal blue]?”
Silas, the answer to your original question really does depend on the definition. You presumably brought it up with some intent, why don’t you go ahead and make the point you intended to make?
ETA: I can add right now that no particular definition of chess will be problematic, in the sense of leaving me at a loss for an answer.
My point was a response to your point by means of reductio: if you can’t see the blueness in atoms moving around, can you see the chess in atoms moving around in a chess computer? It’s the wrong question—chess is identified by a large-scale pattern of behavior. It does not require there to be a low-level ontologically-fundamental chess-thing in the picture.
A counter-question I thought of asking: is there tennis in a brick wall? You can’t get a whole game out of it, but you can get return-of-serve, rally practice, and volley practice. A brick wall has some of the capabilities you’d want in an ‘artificial tennis player’, just as Deep Blue has some of the capabilities of an ‘artificial chess player’. The brick wall achieves this by inducing a particular systematic transformation of the game-state (it inverts one component of the tennis ball’s momentum), just as Deep Blue does. Is there a sense in which there is chess in Deep Blue but not tennis in the brick wall?
Yes. Playing against Deep Blue tells you something about the rules that define chess, while playing against a wall tells you nothing about the rules that define tennis.
If you knew nothing about chess, then by playing against Deep Blue, you can update your probability distribution about what sorts of behaviors (legal moves) count as chess. If you knew nothing about tennis, you’re not going to learn its rules by playing against a brick wall.
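The contrast drawn here can be made concrete with a small Bayesian update. The sketch below is an illustration under loud assumptions: three invented hypotheses about how a knight moves, a uniform prior, a Deep Blue that picks uniformly among the moves its rule allows, and a brick wall whose bounce is taken to be equally likely under every hypothesis (an assumption the thread itself goes on to dispute).

```python
from fractions import Fraction

# Hypothetical candidate rules for the knight, as sets of (dx, dy) displacements.
KNIGHT_RULES = {
    "L_shape": {(1, 2), (2, 1), (-1, 2), (-2, 1),
                (1, -2), (2, -1), (-1, -2), (-2, -1)},
    "one_step_diagonal": {(1, 1), (1, -1), (-1, 1), (-1, -1)},
    "one_step_any": {(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0)},
}

def update(prior, likelihood, observation):
    """One step of Bayes' rule over a finite set of hypotheses."""
    unnorm = {h: p * likelihood(h, observation) for h, p in prior.items()}
    total = sum(unnorm.values())
    return {h: w / total for h, w in unnorm.items()}

def deep_blue_likelihood(hypothesis, move):
    # Assumed: the machine picks uniformly among moves the rule allows.
    rule = KNIGHT_RULES[hypothesis]
    return Fraction(1, len(rule)) if move in rule else Fraction(0)

def brick_wall_likelihood(hypothesis, bounce):
    # Assumed: the bounce is equally likely whatever the rules are.
    return Fraction(1, 2)

prior = {h: Fraction(1, 3) for h in KNIGHT_RULES}
posterior_chess = update(prior, deep_blue_likelihood, (1, 2))
posterior_wall = update(prior, brick_wall_likelihood, "bounce")
print(posterior_chess, posterior_wall)
```

One observed knight move already separates the hypotheses, while the wall's bounce leaves the prior unchanged, which is the zero-mutual-information case.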
See my last comment about when a computation is about something.
So does that resolve the hot-shot zinger that just popped into your head? Are we ready to go back to how blue can arise from “atoms moving around”?
To learn about chess by experimenting with Deep Blue, you must already know that it is a game-playing device, and something about how to engage it appropriately, such that your interaction with it will be an instance of the game. If you don’t know that, it is just a complicated finite-state object which responds to its boundary conditions in a certain way. And conversely, if you know that a brick wall allows some aspects of tennis game-play to be reproduced, and you know the appropriate form of interaction, then you will be able to infer a little about tennis. Not much, but something.
However, this is a side issue, compared to your avowed subjectivism about computation. You say:
I appreciate that your personal theory of consciousness is a work in progress (and you may want to examine Giulio Tononi’s theory, which I discovered simply by a combined search on “consciousness” and “mutual information”), so you may not have an answer to this question yet, just an intended answer, but—is the existence of an observer going to be observer-dependent as well?
So, if you don’t know that it is playing chess but decide ‘hey, I want to maximise the amount of control I have over these little pieces here’ then learn to play chess anyway, you are really learning ‘Zombie Chess’ and not actually Chess.
Depending on how you define ‘maximum amount of control’ you may find yourself playing for something other than checkmate, since the game ends for both of you in a mate. For example, if we define ‘amount of control of the board’ by the number of moves open to you, divided by the number of moves open to your opponent—or perhaps the sum of that quantity over all positions throughout the game—then you will be playing for a drawn position in which you have as much freedom to move as possible and your opponent has as little freedom to move as possible. This also assumes that you don’t control the pieces by directly manipulating the screen image, and that you don’t intervene in Deep Blue’s computational processes.
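The "amount of control" measure proposed above is easy to state in code. This is a minimal sketch: a game is assumed to be given as a list of (my_moves, opponent_moves) counts per position, the function names are hypothetical, and the handling of a position where the opponent has no moves is my own assumption, since the comment does not specify it.

```python
def control_ratio(my_moves, opponent_moves):
    """Moves open to you divided by moves open to your opponent, for one position."""
    if opponent_moves == 0:
        # Assumption: the game is over here (mate or stalemate), so nothing is scored.
        return 0.0
    return my_moves / opponent_moves

def game_control(positions):
    """Sum of the per-position ratio over a whole game."""
    return sum(control_ratio(mine, theirs) for mine, theirs in positions)

# Made-up move counts: a quick win versus a long squeeze that never ends in mate.
quick_mate = [(20, 20), (30, 3), (35, 0)]
long_squeeze = [(20, 20)] + [(30, 3)] * 10
print(game_control(quick_mate), game_control(long_squeeze))
```

Under the summed version of the measure, the long squeeze scores far higher than the quick mate, which is the sense in which this utility function pulls away from ordinary chess.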
The game that you learnt while interacting with Deep Blue would depend on the utility function you brought to the experience, and on the range of interactions you permitted yourself. Of course there is a relationship between the game of chess and the state transition diagram for Deep Blue, but you cannot infer the former from the latter alone.
You’re right, you can’t. Now, assume I do in fact infer and adopt a utility function that so happens to be that of chess. This is not an unrealistic assumption: the guy has a crown and the game ends. In that case, is ‘Chess’ in the room, even though there’s just this silicon-powered thing and me who has decided to fiddle with it? Were I to grant that you can’t make Blue out of non-Blue I would assume I also couldn’t make Chess out of Deep Blue.
It’s a bit different because (from my perspective) the issue here is intentionality rather than qualia. You can’t turn something blue just by calling it blue. But you can make something part of a game by using it in the game. It has to be the right sort of thing to play the intended role, so its intrinsic properties do matter, but they only provide a necessary and not a sufficient condition. The other necessary condition is that it is being interpreted as playing the role, and so here we get back to the role of consciousness. If a copy of Deep Blue popped into being like a Boltzmann Brain and started playing itself in the intergalactic void, that really would be an instance of “zombie chess”.
I’m not talking about parts. I’m talking about the game Chess itself (or an instance thereof).
We will have to return to definitions then. Can you have a game without players? Can you have a player without intentions? It is like arguing whether the Face on Mars is really a face. It is not the product of intention, but it does indeed look like a face. Is looking like a face enough for it to be a face? Deep Blue “plays chess” if you define chess as occurring whenever there is a conformance to certain appearances. But if chess requires the presence of a mind possessing certain minimal concepts and intentions, then Deep Blue in itself does not play chess.
Given the assumption that the computer is optimizing something, and given the awareness of the possibility of a game, you can infer essentially the whole of chess from the program. Chess consists of three things: the board and pieces, the movement rules, and the winning criterion. Observing the game, you will find that the computer steers the chessboard into different final regions depending on whether it moves the black pieces or the white pieces, and this will tell you the criterion the computer uses to optimize its position. And these will tell you that checkmate favors the party moving last and draws are preferred to being checkmated but not to checkmating.
One obvious difference is that nowhere in the brick wall is a representation of tennis. Chess computers have models of chessboards, recognize legal and illegal moves, and have some judgement of how good or poor a position is for either side.
By virtue of what property do these representations have the content that you attribute to them?
I assume you can see where I’m going here—this is an old question in philosophy of computation: what is it that makes a physical computation about anything in particular.
By what virtue is a chess game a chess game and not two people playing with statues? The rules by which the chess computer operates parallel the rules by which chess operates—the behavior is mirrored within it. If you gave someone a brick wall, they couldn’t analyze it to learn how to play tennis, but if you gave someone a chess program, they could deduce from it the rules of chess.
I don’t think it’s quite that simple. If a couple of four-year-olds encounter a chess set, and start moving the pieces around on the board, they might happen to take turns and make only legal “moves” until they got bored. I don’t think they’d be playing chess. Similarly, if a couple of incompetent adults encounter a chess set and try to play chess, but because they aren’t very smart or paying very close attention, about a quarter of their moves aren’t actually legal, they’re playing chess—they’re just making mistakes in so doing.
The equivalence I’m proposing isn’t between results or actions, but the causal springs of the actions. In your example, the children making legal chess moves are only doing so by luck—the causal chains determining their moves at no point involve the rules of chess—whereas the adults playing chess badly are doing so by a causal chain which includes the rules of chess. If you changed those rules, it would not change the children’s moves, but it would change the adults’.
Wow, great minds think alike. ;-)
(No, I didn’t see your reply before posting.)
And it’s one that’s overhyped, but actually not that complicated.
A computation is “about” something else if and to the extent that there exists mutual information between the computation and the something else. Old thread on the matter.
Does observing the results of a physical process tell you something about the result of the computation 2+3? Then it’s an implementation of the addition of 2 and 3. Does it consistently tell you something about addition problems in general? Then it’s an implementation of addition.
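As a rough illustration of the criterion above, here is a sketch that estimates the mutual information between a device's output and the true sum for two made-up devices. Everything named here (`faithful_adder`, `noise_device`, the digit range, the trial count) is an assumption for the example, and the estimator is the simple plug-in one, which is biased slightly upward on finite samples. It also does not model the "conditional on yourself" refinement discussed further down the thread.

```python
import math
import random
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of mutual information (bits) between the two coordinates."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)
    right = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) simplifies to count * n / (left[x] * right[y])
        mi += p_xy * math.log2(count * n / (left[x] * right[y]))
    return mi

def faithful_adder(a, b, rng):
    """Hypothetical process whose output tracks the sum."""
    return a + b

def noise_device(a, b, rng):
    """Hypothetical process whose output ignores its inputs."""
    return rng.randint(0, 18)

def sample(device, trials=20000, seed=0):
    """Pair the true sum of two random digits with the device's output."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(trials):
        a, b = rng.randint(0, 9), rng.randint(0, 9)
        pairs.append((a + b, device(a, b, rng)))
    return pairs

mi_adder = mutual_information(sample(faithful_adder))
mi_noise = mutual_information(sample(noise_device))
print(f"adder: {mi_adder:.3f} bits, noise: {mi_noise:.3f} bits")
```

On this toy setup the adder's output carries several bits about the sum, while the noise device's estimate sits near zero, which is the sense in which only the former "tells you something" about addition.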
This doesn’t fall into the trap of “joke interpretations” where, e.g., you apply complicated, convoluted transformations to molecular motions to hammer them into a mapping to addition. The reason is that once you apply such a complicated (and probably ever-expanding) interpretation, the physical process is no longer telling you something about the answer; rather, the source of the output, by means of specifying the convoluted transformation, is you, and every result originates in you, not the physical process.
Mutual information is defined for two random variables, and random variables are mappings from a common sample space to the variables’ domains. What are the mappings for two “things”? Mutual information doesn’t just “exist”, it is given by mappings which have to be somehow specified, and which can in general be specified to yield an arbitrary result.
When you distinguish between the mappings “originating” in the interpreter versus in the “physical process itself”, you are judging the relevance of the output of the mutual information calculation in the same motion (“no true Scotsman”). Mutual information doesn’t compute your answer; deciding whether the mapping came from an approved source does.
I wasn’t as precise as I should have been. By “mutual information”, I mean “mutual information conditional on yourself”. (Normally, “yourself” is part of the background knowledge predicating any probability and not explicitly represented.) So, as per the rest of my comment, the kind of mutual information I meant is well defined here: Physical process R implements computation C if and to the extent that, given yourself, learning R tells you something about C.
Yes, this has the counterintuitive result that the existence of a computation in a process is observer-dependent (not unlike every other physical law).
No, mutual information is still the deciding factor. As per my above remark, if the source of the computation is really you, by means of your ever-more-complex, carefully-designed mapping, then
P(C|self) = P(C|self,R)
i.e., learning about the physical process R didn’t change your beliefs about C. So, conditioning on yourself, there is no mutual information between C and R.
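For readers who want the step from that equality to "no mutual information" spelled out, here is one way to write it, using the standard definition of conditional mutual information. The lowercase $c$ and $r$ (particular values of C and R) are notation introduced for this sketch, not the original commenter's.

```latex
\begin{align*}
I(C; R \mid \text{self})
  &= \sum_{c,\,r} P(c, r \mid \text{self})\,
     \log \frac{P(c \mid \text{self}, r)}{P(c \mid \text{self})} \\
\intertext{If $P(c \mid \text{self}, r) = P(c \mid \text{self})$ for every $c$ and every $r$ with nonzero probability, each ratio is $1$, so}
I(C; R \mid \text{self})
  &= \sum_{c,\,r} P(c, r \mid \text{self}) \,\log 1 = 0 .
\end{align*}
```

The converse also holds: since this quantity is a Kullback-Leibler divergence, it is zero only when learning R leaves the distribution over C unchanged.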
If you are the real source of the computation, that’s one reason the equality above can hold, but not the only reason.
Vague and doesn’t seem relevant. What is the sample space, what are the mappings? Conditioning means restricting to a subset of the sample space, and seeing how the mappings from the probability measure defined on it redraw the probability distributions on the variables’ domains. You still need those mappings; they are what relate different variables to each other.
“This is a lot easier to understand if you remember that the point of the system is to keep track of sheep.”
http://yudkowsky.net/rational/the-simple-truth
Are you saying, then, that the meaning of a computation depends on what the user thinks or the programmer intends, rather than being intrinsic to the computation?
Well, it could depend on what the computation thinks.
But my point was that the brick wall doesn’t keep track of the ball.
Whether a robot tennis player keeps track of the ball or not doesn’t depend on what I think it does or how I thought I designed it. It is a fact of the matter.
Suppose I dip the ball in paint before I start hitting it against the wall, so it leaves paintmarks there. Is the wall keeping track of the ball now?
You can’t keep track of sheep by dropping pebbles down a well.