Beliefs at different timescales

Why is a chess game the opposite of an ideal gas? On short timescales an ideal gas is described by elastic collisions. And a single move in chess can be modeled by a policy network.

The difference is in long timescales: If we simulated elastic collisions for a long time, we'd end up with a complicated distribution over the microstates of the gas. But we can't run simulations for a long time, so we have to make do with the Boltzmann distribution, which is a lot less accurate.
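The Boltzmann distribution mentioned above assigns each microstate a probability proportional to $e^{-E/kT}$. A minimal sketch over a handful of made-up energy levels (the energies and temperature are illustrative, not physical data):

```python
import math

def boltzmann(energies, kT):
    """Boltzmann distribution: p_i proportional to exp(-E_i / kT), normalized."""
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)  # partition function
    return [w / z for w in weights]

# Three hypothetical energy levels at temperature kT = 1:
probs = boltzmann([0.0, 1.0, 2.0], kT=1.0)
```

Lower-energy states get more probability mass, and the distribution depends only on energies and temperature, not on the collision-by-collision history that a simulation would track.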

Similarly, if we rolled out our policy network to get a distribution over chess game outcomes (win/loss/draw), we'd get the distribution of outcomes under self-play. But if we're observing a game between two players who are better than us, we have access to a more accurate model based on their Elo ratings.
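The Elo model referenced here turns a rating difference directly into an expected score, skipping move-by-move prediction entirely. A sketch, with the caveat that basic Elo only gives an expected score; the split into win/loss/draw below assumes a fixed draw probability, which is an invented illustration rather than part of the Elo system:

```python
def elo_expected_score(r_a, r_b):
    """Standard Elo formula: expected score (win=1, draw=0.5, loss=0)
    for a player rated r_a against a player rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def outcome_distribution(r_a, r_b, p_draw=0.3):
    """Hypothetical win/loss/draw split consistent with the expected
    score, assuming a fixed draw probability (NOT part of basic Elo)."""
    e = elo_expected_score(r_a, r_b)  # e = p_win + p_draw / 2
    p_win = e - p_draw / 2
    return {"win": p_win, "draw": p_draw, "loss": 1.0 - p_win - p_draw}
```

For evenly matched players this gives a symmetric distribution; a 200-point rating gap pushes the favorite's expected score to about 0.76. The point is that this long-term model uses no chess knowledge at all, yet can beat rolling out our own (weaker) policy.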

Can we formalize this? Suppose we're observing a chess game with moves $x_1, x_2, \ldots$. Our beliefs about the next move are conditional probabilities of the form $P_1(x_{t+1} \mid x_1 \ldots x_t)$, and our beliefs about the next $n$ moves are conditional probabilities of the form $P_n(x_{t+1} \ldots x_{t+n} \mid x_1 \ldots x_t)$. We can transform beliefs of one type into the other using the rollout operator $R$ and the marginalization operator $M$:

$$(R P_1)(x_{t+1} \ldots x_{t+n} \mid x_1 \ldots x_t) = \prod_{i=1}^{n} P_1(x_{t+i} \mid x_1 \ldots x_{t+i-1})$$

$$(M P_n)(x_{t+1} \mid x_1 \ldots x_t) = \sum_{x_{t+2}, \ldots, x_{t+n}} P_n(x_{t+1} \ldots x_{t+n} \mid x_1 \ldots x_t)$$

If we're logically omniscient, we'll have $P_n = R P_1$ and $P_1 = M P_n$: the rollout of our short-term model agrees with our long-term model, and the marginal of our long-term model agrees with our short-term model. But in general we will not. A chess game is short enough that the rollout $R P_1$ is easy to compute, but the marginal $M P_n$ is too hard because it has exponentially many terms. So we can have a long-term model $P_n$ that is more accurate than the rollout $R P_1$, and a short-term model $P_1$ that is less accurate than the marginal $M P_n$. This is a sign that we're dealing with an intelligence: We can predict outcomes better than actions.
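A minimal sketch of the two operators on a toy three-symbol game (the one-step model `p1` and the move alphabet are made up for illustration; a real policy network would condition on the history):

```python
from itertools import product

MOVES = ("a", "b", "c")

def p1(history):
    """Hypothetical short-term model: next-move probabilities given the
    history. This toy version ignores the history entirely."""
    return {"a": 0.5, "b": 0.3, "c": 0.2}

def rollout(p1, history, n):
    """The rollout operator: turn a one-step model into an n-step model
    by multiplying conditional probabilities along every length-n
    sequence. Note the exponential cost: len(MOVES)**n sequences."""
    dist = {}
    for seq in product(MOVES, repeat=n):
        prob, h = 1.0, list(history)
        for move in seq:
            prob *= p1(h)[move]
            h.append(move)
        dist[seq] = prob
    return dist

def marginalize(pn):
    """The marginalization operator: collapse an n-step distribution to
    a next-move distribution by summing out everything after move one."""
    dist = {}
    for seq, prob in pn.items():
        dist[seq[0]] = dist.get(seq[0], 0.0) + prob
    return dist
```

For this toy model, `marginalize(rollout(p1, [], 3))` recovers `p1`'s one-step distribution exactly, which is the logical-omniscience condition; the enumeration inside `rollout` is what becomes infeasible when the alphabet is all legal chess moves.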

If instead of a chess game we're predicting an ideal gas, the relevant timescales are so long that we can't compute the rollout $R P_1$ or the marginal $M P_n$. Our long-term thermodynamic model is less accurate than a simulation $R P_1$ would be. This is often a feature of reductionism: Complicated things can be reduced to simple things that can be modeled more accurately, although more slowly.

In general, we can have several models at different timescales, with rollout and marginalization operators connecting adjacent levels. For example, we might have a short-term model describing the physics of fundamental particles; a medium-term model describing a person's motor actions; and a long-term model describing what that person accomplishes over the course of a year. The medium-term model will be less accurate than a rollout of the short-term model, and the long-term model may be more accurate than a rollout of the medium-term model if the person is smarter than us.