Thanks again for this piece. I’ll follow your daily posts and comment on them regularly!
I have a few clarification questions for you:
If an AGI could quasi-perfectly simulate a human brain, with human knowledge encoded inside, would your utility function be satisfied?
Is the goal of understanding all there is to the utility function? What would the AGI do once it could precisely model the way humans encode knowledge? If the AGI has the keys to the observable universe, what does it do with them?
Thanks again for this piece. I’ll follow your daily posts and comment on them regularly!
Very kind of you!
If an AGI could quasi-perfectly simulate a human brain, with human knowledge encoded inside, would your utility function be satisfied?
Interesting thought experiment. I would say no, but it depends on how it simulates the human brain.
If brain scans let us capture the position of every cell in a brain quasi-perfectly, and we knew how to model their interactions, we would have an electrical circuit without knowing much about the information inside it.
So we would have the code but nothing about the meaning, and the AGI would not “understand how the knowledge is encoded”.
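To make the distinction concrete, here is a toy sketch (all names and parameters are hypothetical, chosen only for illustration): a tiny recurrent network whose wiring we “scanned” perfectly. We can simulate its dynamics exactly, yet nothing in the simulation tells us what the activity means.

```python
import math
import random

random.seed(0)
n = 5

# The "scan": we know every connection weight exactly.
weights = [[random.gauss(0, 0.5) for _ in range(n)] for _ in range(n)]

def step(state):
    # Simulate the dynamics perfectly from the scanned wiring...
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in weights]

state = [random.gauss(0, 1) for _ in range(n)]
for _ in range(10):
    state = step(state)

# ...yet nothing in `weights` or `state` tells us what the activity MEANS:
# we hold the circuit, not the code book mapping activity patterns to concepts.
print(len(state))  # prints 5
```

The point of the sketch: having a perfect executable copy (the weights and update rule) is separate from having a decoder for the representations it contains.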
Is the goal of understanding all there is to the utility function? What would the AGI do once it could precisely model the way humans encode knowledge? If the AGI has the keys to the observable universe, what does it do with them?
You’re absolutely right in the sense that it does not constitute a valid utility function for the alignment problem, or for anything really useful, if it is used as the final goal.
My point was that Yudkowsky showed the encoding utility function is limited because there is a trivially simple way to maximize it; but if we changed it to my “understanding the encoding” version, it could become much more interesting, perhaps enough to lead to AGI.
Once the AGI knows how humans encode knowledge, it can acquire at least the same model of reality as humans by encoding knowledge the same way (assuming it has equivalent input sensors, e.g. eyes and ears, which is not very difficult). And then, because it runs in silico (and not in a biological prison), it can do everything humans do, but much faster (basically an ASI).
I guess that if it reached this point, it would be equivalent to humans living for thousands of subjective years, and so it would be able to do whatever humans would consider useful if we were given more time to think about it.
Should we implement ought statements inside it in addition to the “understanding the encoding” utility function? Or is the “faster human” enough? I don’t know.