That’s a case of reducing high uncertainty (high entropy). The more classical Bayesian case where you learn a lot is when you were previously very certain about what the first data point would look like (i.e. you “know” a lot in your terminology, though knowledge implies truth, so that’s arguably the wrong term), but the first data point then turns out to be very different from what you expected.
So in summary, you will learn very little from a single example if a) you are very sure about what it will look like and b) it then actually looks very much like you expected.
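A minimal sketch of that point, using an illustrative Beta–Bernoulli setup (the Beta(50, 1) prior and the coin-flip framing are my own assumptions, not from the discussion above): the surprisal of an observation, measured in bits, is tiny when a confident prior is confirmed and large when it is contradicted.

```python
import math

def surprisal_bits(p_observed: float) -> float:
    """Information content (in bits) of an event assigned probability p_observed."""
    return -math.log2(p_observed)

# Hypothetical confident prior: Beta(alpha=50, beta=1) over a coin's heads probability,
# i.e. you are nearly certain the next flip comes up heads.
alpha, beta = 50.0, 1.0
p_heads = alpha / (alpha + beta)  # prior predictive P(heads) ~= 0.98

# Case a) + b): confident prior and the observation matches it -> almost no information gained
print(f"expected heads:   {surprisal_bits(p_heads):.3f} bits")      # ~0.028 bits

# Confident prior but the observation contradicts it -> a large amount of information gained
print(f"surprising tails: {surprisal_bits(1.0 - p_heads):.3f} bits")  # ~5.672 bits
```

The asymmetry of the two printed values is the whole argument in miniature: the same single data point carries orders of magnitude more information when it violates a confident expectation than when it confirms one.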