Got it, thanks. We’re planning to avoid testing systems that are isomorphic to real-world examples, in the interest of drawing a crisp distinction between reasoning and knowledge. That said, if we come up with a principled way to characterize system complexity (especially the complexity of the underlying mathematical laws), and if (big if!) that turns out to match what LLMs find harder, then we could certainly compare results to the complexity of real-world laws. I hadn’t considered that, thanks for the idea!