But if you roll it 9 times and hide 3 of the rolls in a certain direction, then you don’t have log_2(6) ≈ 2.6 bits. That would be true if you had 6 honest rolls (looking like 2:2:2), but 3:3:0 is surely not the same amount of evidence. I’m generally not sure how best to understand the effects of biases of this sort, and want to think about that more.
The general formula is $-\sum_{\text{obs}} P(\text{obs}) \log P(\text{obs})$, where obs is the observation that you see. You need to calculate P(obs) based on the problem setup; if you are given the ground truth of how the 9 rolls happen as well as the algorithm by which the 6 dice rolls to reveal are chosen, you can compute P(obs) for each obs by brute-force simulation of all possible worlds.
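To make the brute-force approach concrete, here is a rough Python sketch. Everything specific in it is an assumption for illustration, not something stated above: a fair three-sided die (`FACES`), the observation taken to be the face counts of the 6 revealed rolls, and two hypothetical reveal rules, `honest` (show the first 6 rolls) and `biased` (hide up to three rolls showing face 2, then hide from the end).

```python
from collections import Counter
from itertools import product
from math import log2

FACES = (0, 1, 2)  # assumption: a fair three-sided die


def observation_entropy(n_rolls, reveal):
    """Entropy in bits of reveal(world), with worlds uniform over FACES**n_rolls."""
    p_world = 1 / len(FACES) ** n_rolls
    p_obs = Counter()
    # Enumerate every possible world and accumulate P(obs).
    for world in product(FACES, repeat=n_rolls):
        p_obs[reveal(world)] += p_world
    return -sum(p * log2(p) for p in p_obs.values())


def counts(rolls):
    """Observation: how many revealed rolls show each face."""
    return tuple(rolls.count(f) for f in FACES)


def honest(world):
    """Hypothetical rule: reveal the first 6 rolls regardless of outcome."""
    return counts(world[:6])


def biased(world):
    """Hypothetical rule: hide rolls showing face 2 first, then hide from the end."""
    kept = list(world)
    for _ in range(3):
        if 2 in kept:
            kept.remove(2)
        else:
            kept.pop()
    return counts(kept)


if __name__ == "__main__":
    print("honest:", observation_entropy(9, honest))
    print("biased:", observation_entropy(9, biased))
```

With 3^9 = 19,683 worlds the enumeration is instant. Swapping in a different reveal rule only requires writing another function that maps a world to a hashable observation.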