[Question] By what metric do you judge a reference class?

The reference class problem arises when you have a singular phenomenon (e.g. your friend Josh) and want to extrapolate from data to make predictions about it: to do so, you have to put it in a reference class of similar phenomena. The question then becomes how you quantify similarity. Everything has an indefinite number of properties that could serve as the basis for selecting a reference class (Josh is male, likes jazz, is an animal, was born in Germany, has a freckle on his toe, etc.). You can almost always select a reference class in such a way that you get the results you want to see. So how do you judge a reference class?
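
To make the worry concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the eight-person population, the property names, and the generic `outcome` being predicted are all invented, and `base_rate` is a hypothetical helper, not an established method. It only shows that the same individual gets a different base rate depending on which of his properties define the class.

```python
# Toy population: every record here is invented purely for illustration.
PEOPLE = [
    {"male": True,  "likes_jazz": True,  "born_in_germany": True,  "outcome": True},
    {"male": True,  "likes_jazz": True,  "born_in_germany": False, "outcome": True},
    {"male": True,  "likes_jazz": False, "born_in_germany": True,  "outcome": False},
    {"male": True,  "likes_jazz": False, "born_in_germany": False, "outcome": True},
    {"male": False, "likes_jazz": True,  "born_in_germany": True,  "outcome": False},
    {"male": False, "likes_jazz": True,  "born_in_germany": False, "outcome": False},
    {"male": False, "likes_jazz": False, "born_in_germany": True,  "outcome": False},
    {"male": False, "likes_jazz": False, "born_in_germany": False, "outcome": False},
]

# The singular case we want a prediction for (hypothetical properties).
JOSH = {"male": True, "likes_jazz": True, "born_in_germany": True}


def base_rate(population, reference_class):
    """Fraction of the reference class for which the outcome is true.

    reference_class is a tuple of property names; a person belongs to the
    class if they match Josh on every listed property.
    """
    members = [
        person for person in population
        if all(person[prop] == JOSH[prop] for prop in reference_class)
    ]
    if not members:
        return None  # an empty class gives no estimate at all
    return sum(person["outcome"] for person in members) / len(members)


# Same Josh, same data, different reference classes.
for ref_class in [("male",), ("likes_jazz",), ("born_in_germany",)]:
    print(ref_class, base_rate(PEOPLE, ref_class))
```

In this toy data the three single-property classes give estimates of 0.75, 0.5, and 0.25 for the same person, which is the problem in miniature: the data do not say which class is the right one.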

EDIT: I have put up a $100 bounty for anyone who can solve it before 2022.