After more discussion with bmk, I appended the following edit:
In this post, I wrote about the Arbital article’s unsupported jump from “Build an AI which cares about a simple object like diamonds” to “Let’s think about ontology identification for AIXI-tl.” The point is not that there is no valid reason to consider the latter, but that the jump, as written, seemed evidence-starved. For separate reasons, I currently think that ontology identification is unattractive in some ways, but this post isn’t meant to argue against that framing in general. The main point of the post is that humans provide tons of evidence about alignment, by virtue of containing guaranteed-to-exist mechanisms which produce e.g. their values around diamonds.