The world model is actually an integral, but it can be approximated by a search by searching for several good hypothesis instead of integrating over all hypothesis.
Can you tell me what you mean by this statement? When you say “integral” I think “mathematical integral (inverse of derivative)” but I don’t think that’s what you intend to communicate.
Yes integral is exactly what I intended to communicate.
Think of hypothesis space. A vast abstract space of all possibilities. Each hypothesis has a P(x) the probability of being true, and a Ua(x) the utility of action a if it is true.
To really evaluate an action, you need to calculate ∫P(x)Ua(x)dx an integral over all hypothesis.
If you don’t want to behave with maximum intelligence, just pretty good intelligence, then you can run gradient descent to find a point X by trying to maximize P(x). Then you can calculate Ua(X) to compare actions. More sophisticated methods would sum several points.
This is partly using the known structure of the problem. If you have good evidence, then the function P(x) is basically 0 almost everywhere. So if Ua(X) is changing fairly slowly over the region that is significantly nonzero, looking at any nonzero point of P(x) is a good estimate of the integral.
Can you tell me what you mean by this statement? When you say “integral” I think “mathematical integral (inverse of derivative)” but I don’t think that’s what you intend to communicate.
Yes integral is exactly what I intended to communicate.
Think of hypothesis space. A vast abstract space of all possibilities. Each hypothesis has a P(x) the probability of being true, and a Ua(x) the utility of action a if it is true.
To really evaluate an action, you need to calculate ∫P(x)Ua(x)dx an integral over all hypothesis.
If you don’t want to behave with maximum intelligence, just pretty good intelligence, then you can run gradient descent to find a point X by trying to maximize P(x). Then you can calculate Ua(X) to compare actions. More sophisticated methods would sum several points.
This is partly using the known structure of the problem. If you have good evidence, then the function P(x) is basically 0 almost everywhere. So if Ua(X) is changing fairly slowly over the region that is significantly nonzero, looking at any nonzero point of P(x) is a good estimate of the integral.