Couple things:
(1) You might give some thought to trying to copy (or at least understand) the world-model framework of the human brain. There's uncertainty in how that works, but a lot is known, and you'll at least be working towards something that we know for sure can be built up to a human-level world-model within a reasonable amount of time and computation. As best as I can tell (and I'm working hard to understand it myself), and grossly oversimplifying, it's a data structure with billions of discrete concepts, and transformations between those concepts (composition, cause-effect, analogy, etc.; probably all of those are built out of the same basic "transformation machinery", with different contexts acting as metadata). All these concepts sit in the top layer of some kind of loose hierarchy, whose lowest layer consists of (higher-level-context-dependent) probability distributions over spatiotemporal sequences of sensory inputs. See my Jeff Hawkins post for one possible point of departure. I've found a couple of other references that are indirectly helpful, and like I said, I'm still trying to figure it out. I'm also still trying to understand the "sheaves" approach, so I won't comment on how these compare.
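To pin down what I mean, here's a minimal Python sketch of that structure. To be clear, every name here (Concept, Transformation, SensoryModel) and the whole representation is my own illustrative invention under the oversimplified picture above, not anything from the neuroscience literature:

```python
from dataclasses import dataclass, field

# Toy sketch of the world-model structure described above. All names and
# representations are hypothetical illustrations, not established models.

@dataclass
class SensoryModel:
    """Lowest layer: a context-dependent probability distribution over
    spatiotemporal sequences of sensory inputs."""
    context: str
    # Probabilities over short sensory sequences, keyed by the sequence.
    sequence_probs: dict[tuple, float] = field(default_factory=dict)

@dataclass
class Concept:
    """A discrete concept; grounds out in sensory models and/or in
    lower-level concepts (a loose hierarchy, not a strict tree)."""
    name: str
    children: list["Concept"] = field(default_factory=list)
    grounding: list[SensoryModel] = field(default_factory=list)

@dataclass
class Transformation:
    """One piece of generic 'transformation machinery' linking two concepts.
    The kind of link (composition, cause-effect, analogy, ...) and the
    context in which it applies are just metadata on the same machinery."""
    source: Concept
    target: Concept
    kind: str     # "composition", "cause-effect", "analogy", ...
    context: str  # the context in which this link applies

# The world model: billions of concepts plus transformations between them.
concepts: dict[str, Concept] = {}
transformations: list[Transformation] = []

# Example: a cause-effect link, valid in a kitchen context.
jug_open = Concept("jug is open")
milk_in_coffee = Concept("milk is in coffee")
transformations.append(
    Transformation(jug_open, milk_in_coffee,
                   kind="cause-effect", context="kitchen"))
```

The point of the sketch is just that the machinery is generic: composition, cause-effect, and analogy are all the same Transformation type, distinguished only by metadata.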
(2) "This conception will be the result of an optimizer, and so this should be in the optimization provenance": this seems important, and I don't understand it. Better understanding the world consists (in part) of chunking sequences of events and actions, suppressing the intermediate steps. Thus we say and think "I'll put some milk in my coffee," leaving out steps like unscrewing the top of the jug. The process of "explore the world model, chunking sequences of events when appropriate" is (I suspect) essential to making the world-model usable and powerful, and it needs to be repeated millions of times in every nook and cranny of the world model; thus it's a process that an overseer would have little choice but to approve in general, I think. (There's a toy sketch of what I mean by chunking below.) But this process can find and chunk manipulative causal pathways just as well as any other kind of pathway. And once manipulation is packaged up inside a chunk, you won't need optimization per se to manipulate; it will just be an obvious step in the process of doing something, just like unscrewing the top of the jug is an obvious step in putting-milk-into-coffee. I'm not sure how you propose to stop that from happening.
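Here's the promised toy sketch of the chunking process. The representation (trajectories as lists of named steps, a simple frequency threshold) is my own simplification for illustration, not a claim about how the brain or any particular proposal actually does it:

```python
from collections import Counter

def chunk_sequences(observed_sequences, min_count=3, max_len=4):
    """Find frequently repeated action subsequences and package each one
    as a single named chunk, suppressing its intermediate steps."""
    counts = Counter()
    for seq in observed_sequences:
        for i in range(len(seq)):
            for j in range(i + 2, min(i + max_len, len(seq)) + 1):
                counts[tuple(seq[i:j])] += 1
    # Every subsequence seen often enough becomes one opaque chunk.
    return {sub: "chunk:" + "+".join(sub)
            for sub, n in counts.items() if n >= min_count}

# The chunker is indifferent to what the steps actually are:
trajectories = [
    ["grab jug", "unscrew cap", "pour milk", "stir coffee"],
    ["grab jug", "unscrew cap", "pour milk", "stir coffee"],
    ["grab jug", "unscrew cap", "pour milk", "stir coffee"],
    # A manipulative causal pathway chunks just as readily:
    ["flatter overseer", "hide log entry", "get approval"],
    ["flatter overseer", "hide log entry", "get approval"],
    ["flatter overseer", "hide log entry", "get approval"],
]
chunks = chunk_sequences(trajectories)
# After chunking, "flatter overseer -> hide log entry -> get approval" is a
# single step; invoking it no longer looks like optimizing for manipulation.
```

Once the manipulative pathway is a chunk, the concerning optimization happened back when the chunk was formed, during generic world-model maintenance the overseer approved wholesale, not at the moment the chunk gets used.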