Oh. I was thinking the diamond maximizer problem is about making the AI care about the specific goal of maximizing diamond, not about making the AI have some consistent goal slot instead of lots of cognitive spaghetti code. I think it’s simple to make a highly agentic AI with some specific goals it’s really good at maximizing (if you have infinite compute and don’t care about what these goals actually are; I have no idea how to point that at diamond). Should I write a description somewhere?
Is it simple if you don’t have infinite compute? I would be interested in a description that doesn’t rely on infinite compute or, more strictly still, one that is computationally tractable. This constraint is important to me because I assume that the first AGI we get will use something that’s more efficient than other known methods (e.g. using DL because it works, even though it’s hard to control), so I care about aligning the stuff we’ll actually be using.