If you have lots of training data, the probability that the model assigns the training data is very small. You can’t represent such small numbers with the commonly used floating point types in Python/Java/etc.
It’s more practical to compute the log probability of the training data (by summing the log probabilities assigned to the training examples rather than multiplying the original probabilities).
If you have lots of training data, the probability that the model assigns the training data is very small. You can’t represent such small numbers with the commonly used floating point types in Python/Java/etc.
It’s more practical to compute the log probability of the training data (by summing the log probabilities assigned to the training examples rather than multiplying the original probabilities).