There I was just quoting from the Hintze paper, so it’s not clear to me what he meant. One interpretation is that the right-hand side is just the definition of what “UDT(s)” means; in that sense there wouldn’t be a type error, and UDT(s) would be a policy. But you’re also right that a decision theory should, in the end, output an action. The right notation comes down to what I said in the last paragraph of my previous comment: does UDT1.1/FDT-policy need to know the sense data s (or ‘observation x’, in the other notation) in order to condition on the agent using a particular policy? If the answer is yes, then UDT(s) is a policy and UDT(s)(s) is the action. If the answer is no, then UDT is the policy (confusing, because UDT is also the ‘decision algorithm’ that finds the policy in the particular decision problem you are facing) and UDT(s) is the action. My best guess is that the answer is ‘no’, so UDT is the policy, UDT(s) is the action, and your point about there being a type error is correct. But the notation f:s↦a in the Hintze paper makes it seem like s is somehow being used on the right-hand side, which is possibly what confused me when I wrote the post.
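To make the two readings concrete, here is a minimal Python sketch of the types involved. Everything in it is a made-up toy of mine (the two observations, the two actions, the payoff numbers, and the names `best_policy` and `best_policy_given`), not anything taken from the Hintze paper. It only shows where s enters under each reading.

```python
from itertools import product
from typing import Callable, Dict

Observation = str
Action = str
Policy = Callable[[Observation], Action]

# Hypothetical toy problem (an assumption for illustration only).
OBSERVATIONS = ["heads", "tails"]
ACTIONS = ["pay", "refuse"]


def expected_utility(table: Dict[Observation, Action]) -> float:
    """Made-up payoff that scores a whole policy, with no observation fixed."""
    score = 0.0
    score += 1.0 if table["heads"] == "pay" else 0.0
    score += 2.0 if table["tails"] == "refuse" else 0.0
    return score


def best_policy() -> Policy:
    """Reading 'no': the search over policies never looks at s."""
    tables = [
        dict(zip(OBSERVATIONS, choice))
        for choice in product(ACTIONS, repeat=len(OBSERVATIONS))
    ]
    best = max(tables, key=expected_utility)
    return lambda s: best[s]


def best_policy_given(s: Observation) -> Policy:
    """Reading 'yes': the search takes s as an input and returns a policy.

    In this toy the argument is ignored, so the function only illustrates
    the type Observation -> Policy, not a case where s changes the result.
    """
    return best_policy()


s = "tails"

# Reading 'no': UDT is the policy, UDT(s) is the action.
udt = best_policy()
action_no = udt(s)

# Reading 'yes': UDT(s) is a policy, UDT(s)(s) is the action.
action_yes = best_policy_given(s)(s)
```

Under the ‘no’ reading, writing UDT(s) = argmax over policies f puts an action on the left and a policy on the right, which is the type error. Under the ‘yes’ reading both sides are policies and the action needs a second application.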