There is a chance that one can avoid having to solve ontology identification in general if one punts the problem to simulated humans. I.e., it seems one could train a human simulator without solving ontology identification, and then use the simulated humans to solve it. One may still have to solve some specific ontology identification problems to make sure one gets an actual human simulator and not e.g. a malign AI simulator, but this might be easier than solving the problem in full generality.
Minor comment: regarding the RLHF example, one could solve the problem implicitly if one were able to directly define a likelihood function over utility functions expressed in the AI's ontology, given human behavior. That said, you are probably right to assume that e.g. cognitive science would instead produce a likelihood function over utility functions in the human ontology, in which case ontology identification still has to be solved explicitly.
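For concreteness, here is a minimal sketch of what "directly define a likelihood function" could look like; the Boltzmann-rationality form below is my illustrative assumption, not something implied by the original comment. Let $U$ range over utility functions in the AI's ontology and let the observed human behavior be a sequence of state–action pairs $(s_t, a_t)$. Then one could posit

$$p(a_t \mid s_t, U) \propto \exp\!\big(\beta\, U(s_t, a_t)\big), \qquad p(U \mid \text{behavior}) \propto p(U)\,\prod_t p(a_t \mid s_t, U),$$

so that the posterior over $U$ is obtained directly from behavior, without ever translating between the human ontology and the AI's ontology.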