Thank you for actually engaging with the idea (pointing out problems and whatnot) rather than just suggesting reading material.
Btw, would you count a data packet as an object you move through space?
A couple of points:
I only assume the AI models the world as “objects” moving through space and time, without restricting what those objects could be. So yes, a data packet might count.
“Fundamental variables” don’t have to capture all typical effects of humans on the world; they only need to capture typical human actions which humans themselves can easily perceive and comprehend. So the fact that a human can send an Internet message at 2/3 the speed of light doesn’t mean that “2/3 the speed of light” should be included in the range of fundamental variables, since humans can’t move and react at such speeds.
Conclusion: data packets can be seen as objects, but there are many other objects which are much easier for humans to interact with.
Also note that fundamental variables are not meant to be some kind of “moral speed limits”, prohibiting humans or AIs from acting at certain speeds. Fundamental variables are only needed to figure out what physical things humans can most easily interact with (because those are the objects humans are most likely to care about).
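To make that concrete, here’s a toy sketch of how I picture the role of fundamental variables (everything below, from the names to the numbers, is invented for illustration): the ranges act as a filter for finding human-scale objects, not as a limit on how fast anything is allowed to move.

```python
# Toy sketch only: invented names and numbers, not a real definition of
# "fundamental variables". The ranges are used to pick out objects whose
# typical motion humans can easily perceive and react to; nothing here
# penalizes or forbids motion outside the ranges.

from dataclasses import dataclass

@dataclass
class WorldObject:
    name: str
    typical_speed_m_s: float  # how fast the object usually moves relative to a human observer

# Hypothetical human-scale range: from standing still up to roughly the speed
# of a fast car. Purely illustrative numbers.
HUMAN_SPEED_RANGE_M_S = (0.0, 50.0)

def easy_to_interact_with(obj: WorldObject) -> bool:
    """True if the object's typical motion falls inside the human-scale range."""
    lo, hi = HUMAN_SPEED_RANGE_M_S
    return lo <= obj.typical_speed_m_s <= hi

objects = [
    WorldObject("coffee mug", typical_speed_m_s=1.0),
    WorldObject("data packet in a fiber link", typical_speed_m_s=2e8),  # roughly 2/3 c
]

for obj in objects:
    status = ("easy for humans to track and care about"
              if easy_to_interact_with(obj)
              else "still an object, just not human-scale")
    print(f"{obj.name}: {status}")
```

The only thing the sketch is meant to show is that the ranges select objects; they don’t restrict what the AI (or a human) is allowed to do with them.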
This range is quite huge. In certain contexts, you’d want to be moving through space at high fractions of the speed of light, rather than walking speed. Same goes for moving other objects through space.
What contexts do you mean? Maybe my point about “moral speed limits” addresses this.
Hopefully the AI knows you mean moving in sync with Earth’s movement through space.
Yes, relativity of motion is a problem which needs to be analyzed. Fundamental variables should refer to relative speeds/displacements or something.
The paper is surely at least partially relevant, but what’s your own opinion on it? I’m confused about this part (4.2 Defining Utility Functions in Terms of Learned Models):

For example a person may be specified by textual name and address, by textual physical description, and by images and other recordings. There is very active research on recognizing people and objects by such specifications (Bishop, 2006; Koutroumbas and Theodoris, 2008; Russell and Norvig, 2010). This paper will not discuss the details of how specifications can be matched to structures in learned environment models, but assumes that algorithms for doing this are included in the utility function implementation.
Does it just completely ignore the main problem?
I know Abram Demski wrote about Model-based Utility Functions, but I couldn’t fully understand his post either.
(Disclaimer: I’m almost mathematically illiterate, except for knowing a lot of mathematical concepts from popular materials: the halting problem, Gödel, uncountability, ordinals vs. cardinals, etc.)
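To check my reading of the quoted passage, here’s the shape I think it’s assuming, spelled out in Python (every name below is mine, not the paper’s): utility gets evaluated on the learned model’s internal state, and the step that matches a human-given specification to a structure inside that model is simply assumed to exist.

```python
# My paraphrase of what section 4.2 seems to assume; none of these names or
# types come from the paper. The utility function is defined over the learned
# environment model's state, and the specification-matching step is treated
# as a black box that is "included in the utility function implementation".

from typing import Any, Callable

LearnedModelState = Any   # whatever internal state the learned environment model produces
Specification = str       # e.g. a textual name/address, a physical description, an image reference

def match_specification(spec: Specification, state: LearnedModelState) -> Any:
    """The assumed-to-exist step: locate the structure in the learned model
    that corresponds to the human-given specification. The paper points at
    recognition research for this and otherwise leaves it unspecified."""
    raise NotImplementedError("this is the part the quoted passage does not spell out")

def utility(state: LearnedModelState,
            spec: Specification,
            score: Callable[[Any], float]) -> float:
    """Utility over the learned model: find the specified thing inside the
    model's state, then score the situation it is in."""
    thing_in_model = match_specification(spec, state)
    return score(thing_in_model)
```

If that reading is right, then the “main problem” I’m asking about is exactly the body of match_specification.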
Also note that fundamental variables are not meant to be some kind of “moral speed limits”, prohibiting humans or AIs from acting at certain speeds. Fundamental variables are only needed to figure out what physical things humans can most easily interact with (because those are the objects humans are most likely to care about).
Ok, that clears things up a lot. However, I still worry that if it’s at the AI’s discretion when and where to sidestep the fundamental variables, we’re back at the regular alignment problem. You have to be reasonably certain what the AI is going to do in extremely out-of-distribution scenarios.
The subproblem of environmental goals is just to make AI care about natural enough (from the human perspective) “causes” of sensory data, not to align AI to the entirety of human values. Fundamental variables have no (direct) relation to the latter problem.
However, fundamental variables would be helpful for defining impact measures if we had a principled way to differentiate “times when it’s OK to sidestep fundamental variables” from “times when it’s NOT OK to sidestep fundamental variables”. That’s where the things you’re talking about definitely become a problem. Or maybe I’m confused about your point.
Thanks. That makes sense.