You write:
“it is dangerous to give an AI false beliefs (they may not be stable, for one)”
But the approach described here seems to give 100% identical results (at least for beliefs uncorrelated with the agent’s behavior). So why do you think that one is dangerous and the other is fine?
Can you describe a situation in which the two changes lead to different outcomes?
I’m thinking about what happens when the probability is close to zero, or converging to zero. “I am in an impossible world” seems more dangerous than “I am in a world I cannot improve or worsen”.
To be honest, the real justification was that the suggestions I’d heard for giving the AI false beliefs all seemed to fail, so this felt safer.
Again, can you describe any case where the two proposals do anything different?
Given that they do the same thing in every case, it seems highly unlikely that one is safe and the other is dangerous! At best you are obfuscating the problem.
For example, if this really dodges the failures associated with giving the AI false beliefs, then there should be a case where the two proposals do something different.
Now that I’ve rested a bit, let me think about this properly. One reason I was wary of changing a probability was all the other probabilities tied to it: conditional probabilities, AND and OR expressions, and so on. Changing one probability would mean adjusting the rest to keep them consistent, while changing the utility has consistency built in.
It feels like changing a prior might be equivalent; I’m not sure that there is any difference between changing a prior and changing the utility. But, again, there might be consistency worries to think through, e.g. how do we change priors over correlations between events, and so on? It still seems that changing a probability involves many choices while changing the utility doesn’t (it seems equivalent to finding a Bayes factor that provides evidence for that specific event?).
I will think more.
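To make the consistency worry concrete, here is a minimal sketch (the worlds and numbers are hypothetical) of Jeffrey conditionalization, one standard way to move P(X) to a new value while leaving every conditional probability within X and within not-X untouched:

```python
# Jeffrey conditionalization: move P(X) to a target value by rescaling
# X-worlds and not-X-worlds uniformly, so that all conditionals
# P(. | X) and P(. | not-X) are preserved.

def jeffrey_update(probs, in_x, target_px):
    """probs: dict world -> probability; in_x: dict world -> bool."""
    px = sum(p for w, p in probs.items() if in_x[w])
    return {
        w: p * (target_px / px if in_x[w] else (1 - target_px) / (1 - px))
        for w, p in probs.items()
    }

# Four worlds; X is true in w1 and w2, so P(X) = 0.3 initially.
probs = {"w1": 0.1, "w2": 0.2, "w3": 0.3, "w4": 0.4}
in_x = {"w1": True, "w2": True, "w3": False, "w4": False}

new = jeffrey_update(probs, in_x, target_px=0.6)
# P(X) is now 0.6, but P(w1 | X) is still 0.1/0.3 = 1/3 and
# P(w3 | not-X) is still 0.3/0.7: the conditionals survive.
```

Any other way of redistributing the mass changes some conditional probability, which is the “many choices” problem above.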
The normal way of modifying a probability distribution to make X more likely is to increase the probability of each world where X is true, e.g. by doubling it, and then renormalizing. This is equivalent to observing evidence for X. It’s also equivalent to your procedure for modifying utility functions.
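Here is a quick sketch of that equivalence (the worlds, actions, and numbers are hypothetical). If X is uncorrelated with the agent’s action, multiplying the probability of every X-world by a Bayes factor K and renormalizing rescales each action’s expected utility by the same constant as multiplying the utility of every X-world by K, so the two modified agents rank actions identically:

```python
# Two routes to "make X matter more", checked for decision-equivalence.
# Assumes P(X) is the same under every action (uncorrelated with behavior).

K = 3.0  # Bayes factor in favour of X

worlds = ["w1", "w2", "w3"]
in_x = {"w1": True, "w2": True, "w3": False}  # X holds in w1 and w2
# p[a][w]: probability of world w given action a; P(X) = 0.5 under both.
p = {
    "a1": {"w1": 0.2, "w2": 0.3, "w3": 0.5},
    "a2": {"w1": 0.4, "w2": 0.1, "w3": 0.5},
}
u = {"w1": 1.0, "w2": 4.0, "w3": 2.0}

def expected(dist, util):
    return sum(dist[w] * util[w] for w in worlds)

# Route 1: treat K as evidence for X and renormalize the beliefs.
def reweighted(dist):
    raw = {w: dist[w] * (K if in_x[w] else 1.0) for w in worlds}
    z = sum(raw.values())
    return {w: q / z for w, q in raw.items()}

# Route 2: leave beliefs alone and scale utility in X-worlds by K.
u_scaled = {w: u[w] * (K if in_x[w] else 1.0) for w in worlds}

for a in p:
    # The two expected utilities differ only by a constant factor
    # (the normalization z), so both agents prefer the same action.
    print(a, expected(reweighted(p[a]), u), expected(p[a], u_scaled))
```

Running this prints 2.6 vs 5.2 for a1 and 1.7 vs 3.4 for a2: a constant ratio of 2, so the preference order is identical either way.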
New post on these kinds of subjects: http://lesswrong.com/r/discussion/lw/myl/utility_probability_and_false_beliefs/