I’m really surprised that on a site called “Less Wrong”, there isn’t more skepticism about an argument that one can’t be wrong about X, especially when X isn’t just one statement but a large category of statements. That doesn’t scream out “hold on a second!” to anyone?
Eyup. Humans can be wrong about anything. It’s like our superpower.
You could be wrong about that.
What if I couldn’t be wrong about that?
Then you would clearly be immune to hemlock, and therefore weigh the same as a duck.
Then you would be 100% certain—and 0 and 1 are not probabilities.
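A minimal numerical gloss on that slogan (my own illustration; the probabilities are arbitrary): expressed as log odds, every ordinary probability is a finite amount of evidence, but probability 1 corresponds to an infinite amount.

```python
import math

def log_odds(p):
    """Convert a probability to log odds: an amount of evidence, in nats."""
    return math.log(p / (1 - p))

print(log_odds(0.8))       # ~1.39 nats: a finite weight of evidence
print(log_odds(0.999999))  # ~13.8 nats: a lot of evidence, still finite
# log_odds(1.0) would divide by zero: certainty would take infinite evidence.
```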
It might be that he can’t be wrong about that, even though he doesn’t know for sure that he can’t be wrong about it. Infallibility and certainty are distinct concepts.
Fallibility is in the mind.
Certainty (confidence, etc.) is in the mind. Fallibility isn’t; you can be prone (or immune) to error even if no one thinks you are.
The point is that ‘What if I couldn’t be wrong about it?’ does not express ‘What if I could be certain that I couldn’t be wrong about it?’; the latter requires that 1 be a probability, but the former does not, since I might be unable to be wrong about X and yet only assign, say, a .8 probability to X’s being true (because I don’t assign probability 1 to my own infallibility).
Though no one could ever possibly know. Seriously: fallibility is in the mind. It’s a measure of how likely something is to fail; likelihoods are probabilities—and probabilities are (best thought of as being) in the mind.
Rigorously, I think the argument doesn’t stand up in its ultimate form. But it’s tiptoeing in the direction of a very interesting point about how to deal with changing utility functions, especially in circumstances where the changes might be predictable.
The simple answer is “judge everything in your future by your current utility function”, but that doesn’t seem satisfactory. Nor is “judge everything that occurs in your future by your utility function at the time”, because of lobotomies, addictive wireheading, and so on (the sketch after this comment contrasts the two rules on a wireheading case). Some people have utility functions that they expect will change; and the degree of change allowable may vary from person to person and subject to subject (e.g., people opposed to polygamy may have a wide range of reactions to the announcement “in fifty years’ time, you will approve of polygamy”). Some people trust their own CEV; I never would, but I might trust it one level removed.
It’s a difficult subject, and my upvote was in thanks for bringing it up. Subsequent posts on the subject I’ll judge more harshly.
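A minimal sketch of that contrast, with the futures and scores invented purely for illustration: the judge-by-the-function-at-the-time rule endorses the wireheaded future, because wireheading rewrites the very function doing the scoring, while the judge-by-the-current-function rule refuses it.

```python
# Hypothetical futures scored by my current utility function and by the
# utility function I would have at that time (all numbers made up).
futures = {
    "status quo": {"current_uf": 5, "uf_at_the_time": 5},
    "wirehead":   {"current_uf": 0, "uf_at_the_time": 99},  # the UF itself got rewritten
    "growth":     {"current_uf": 8, "uf_at_the_time": 9},
}

judge_by_current = max(futures, key=lambda f: futures[f]["current_uf"])
judge_by_then = max(futures, key=lambda f: futures[f]["uf_at_the_time"])
print(judge_by_current)  # growth
print(judge_by_then)     # wirehead
```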
It sounds satisfactory for agents that have utility functions. Humans don’t (unless you mean implicit utility functions under reflection, to the extent that different possible reflections converge), and I think it’s really misleading to talk as if we do.
Also, while this is just me, I strongly doubt our notional-utility-functions-upon-reflection contain anything as specific as preferences about polygamy.
That was just an example; people react differently to the idea that their values may change in the future, depending on the person and depending on the value.
How about “judge by both utility functions and use the most pessimistic result”?
If you take a utility function and multiply all the utilities by 0.01, is it the same utility function? In one sense it is, but by your measure it will always win a “most pessimistic” contest.
Update: thinking about this further, if the only allowable operations on utilities are comparison and weighted sum, then you can multiply by any positive constant or add and subtract any constant and preserve isomorphism. Is there a name for this mathematical object?
Positive affine transformations: utility functions are defined only up to positive affine transformation (see the sketch after this comment).
In particular, this means that nothing has “positive utility” or “negative utility”, only greater or lesser utility compared to something else.
ETA: If you want to compare two different people’s utilities, it can’t be done without introducing further structure to enable that comparison. This is required for any sort of felicific calculus.
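A minimal sketch of both points, with the outcomes, utilities, and lotteries all invented: a positive affine transformation u' = a*u + b with a > 0 preserves every preference and expected-utility comparison, yet “multiply everything by 0.01” reports lower raw numbers across the board, which is why it wins any “most pessimistic” contest without expressing different preferences.

```python
u = {"apple": 5.0, "banana": 2.0, "nothing": 0.0}
a, b = 0.01, -3.0  # any a > 0 and any b give an equivalent utility function
u2 = {o: a * v + b for o, v in u.items()}

def expected(utility, lottery):
    """Expected utility of a lottery: a probability-weighted sum."""
    return sum(p * utility[o] for o, p in lottery.items())

coin_flip = {"apple": 0.5, "nothing": 0.5}
sure_banana = {"banana": 1.0}

# The two functions agree on every comparison...
assert (expected(u, coin_flip) > expected(u, sure_banana)) == \
       (expected(u2, coin_flip) > expected(u2, sure_banana))
# ...but the rescaled one is uniformly "more pessimistic" in raw numbers.
assert all(u2[o] < u[o] for o in u)
```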
There’s a name I can’t remember for the “number line with no zero”, where you’re only able to refer to relative positions, not absolute ones. I’m looking for a name for the “number line with no zero and no scale”, which is invariant not just under translation but under any affine transformation with positive determinant (that is, x ↦ ax + b with a > 0).
I’m in an elementary statistics class right now and we just heard about “levels of measurement” which seem to make these distinctions: your first is the interval scale, and second the ordinal scale.
The “number line with no zero, but a uniquely preferred scale” isn’t in that list of measurement types; and by that list, the “number line with no zero and no scale” is the interval scale.
A utility function is just a representation of preference ordering. Presumably those properties would hold for anything that is merely an ordering making use of numbers.
You also need the conditions of the utility theorem to hold. A preference ordering only gives you conditions 1 and 2 of the theorem as stated in the link.
Good point. I was effectively entirely leaving out the “mathematical” in “mathematical representation of preference ordering”. As I stated it, you couldn’t expect to aggregate utiles.
You can’t aggregate utils; you can only take their weighted sums. You can aggregate changes in utils though.
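A small worked illustration of that point (the numbers are arbitrary): under u' = a*u + b, the b cancels out of any difference, so ratios of utility differences survive the transformation, while a raw total of utils does not.

```python
u = {"A": 10.0, "B": 4.0, "C": 3.0, "D": 1.0}
a, b = 2.0, 7.0
u2 = {o: a * v + b for o, v in u.items()}

# "The gain from A over B is three times the gain from C over D" survives:
assert (u["A"] - u["B"]) / (u["C"] - u["D"]) == (u2["A"] - u2["B"]) / (u2["C"] - u2["D"])
# "The total utility of everything" does not:
assert sum(u.values()) != sum(u2.values())  # 18.0 vs 64.0
```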
I completely agree. The argument may be wrong, but the point it raises is important: I sloppily assume things about which possible causal continuations of myself I care about.
My initial reaction: we can still use our current utility function, but make sure the CEV analysis or whatever doesn’t ask “what would you want if you were more intelligent, etc.?” but instead “what would you want if you were changed in a way you currently want to be changed?”
This includes “what would you want if we found fixed points of iterated changes based on previous preferences”. So if I currently want to value paperclips more but don’t care whether I value factories differently, and upon modifying me to value paperclips more it turns out I would want to value factories more, then changing my preferences to value factories more is acceptable.
The part where I’m getting confused right now (rather, the part where I notice I’m getting confused :)) is that calculating fixed points almost certainly depends on the order of alteration, so that there are lots of different future-mes that I prefer to current-me, each at a local maximum (the toy sketch after this comment illustrates the order dependence).
Also I have no idea how much we need to apply our current preferences to the fixed-point-mes. Not at all? 100%? Something in between? And what about the intermediate-state-mes?
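A toy model of that order dependence, entirely my own construction (the rules and caps are arbitrary): each self-modification rule fires only while the current values endorse it, and sweeping the same two rules in different orders settles into different fixed points.

```python
def settle(values, rules):
    """Sweep the rules in order until a full pass changes nothing: a fixed point."""
    while True:
        new = dict(values)
        for condition, update in rules:
            if condition(new):
                new = update(new)
        if new == values:
            return values
        values = new

# Rule A: while factories aren't valued much, grow the paperclip weight (capped at 3).
rule_a = (lambda v: v["factories"] <= 1 and v["paperclips"] < 3,
          lambda v: {**v, "paperclips": v["paperclips"] + 1})
# Rule B: while paperclips aren't valued much, grow the factory weight (capped at 3).
rule_b = (lambda v: v["paperclips"] <= 1 and v["factories"] < 3,
          lambda v: {**v, "factories": v["factories"] + 1})

start = {"paperclips": 1, "factories": 1}
print(settle(start, [rule_a, rule_b]))  # {'paperclips': 3, 'factories': 1}
print(settle(start, [rule_b, rule_a]))  # {'paperclips': 1, 'factories': 3}
```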
I don’t think the order issue is a big problem: there is no One Glowing Solution; we just need to find something nice and tolerable.
That is the question.
I think your heuristic is sound—that seemed screamingly wrong to me as well.
Incorrigibility is way too strong an assertion, but there’s a sense in which I cannot be completely wrong about my values, since I’m the only source of information about them; except perhaps to the extent that you can infer them from my fellow human beings, and to that extent humanity as a whole cannot be completely mistaken about its values.
I suspect there may be an analogy with Donaldson’s observation that if you think penguins are tiny burrowing insects that live in the Sahara, you’re not so much mistaken about penguins as not talking about them at all. However, I can’t completely make this analogy work.
How about if X is a set of assertions that logical tautologies are true:
http://en.wikipedia.org/wiki/Tautology_(logic)
http://en.wikipedia.org/wiki/Tautology_(logic)#Definition_and_examples
An example along similar lines to this post would be: you can’t be wrong about thinking you are thinking about X—if you are thinking about X.
http://www.spaceandgames.com/?p=27
Now that is an overconfidence/independent statements anecdote I’ll remember. The ‘7 is prime probability 1’ part too.
Nah, these are not “independent” statements; they are all much the same:
They are “I want X” statements.
P v -p is disputed, so someone is wrong there. Also, if you have ever done a 10+ line proof or a 10+ place truth table, you know it is trivially (pun intended) easy to get those wrong; see the brute-force sketch below.
I think the concept of a thought and what it is for a thought to be about something needs to be refined before we can say more about the second example. To begin with, if I see a dragonfly and mistake it for a fairy and then start to think about the fairy I saw, it isn’t clear that I really am thinking about a fairy.
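Re the 10+ place truth tables: brute-force machine checking is a minimal safeguard against exactly that class of error. A sketch (the formulas are just examples):

```python
from itertools import product

def is_tautology(formula, n_vars):
    """True iff the formula holds on every row of its truth table."""
    return all(formula(row) for row in product([True, False], repeat=n_vars))

# The law of the excluded middle, p or not-p:
print(is_tautology(lambda v: v[0] or not v[0], 1))    # True
# For contrast, p implies q is not a tautology:
print(is_tautology(lambda v: (not v[0]) or v[1], 2))  # False
```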