Charlie Steiner comments on Characterizing Real-World Agents as a Research Meta-Strategy

Charlie Steiner 9 Oct 2019 20:34 UTC
LW: 2 AF: 1
Somehow I missed that second post of yours. I’ll try out the subscribe function :)
Do you also get the feeling that you can sort of see where this is going in advance?
When asking what computations a system instantiates, it seems you’re asking what models (or what fits to an instantiated function) perform surprisingly well, given the amount of information used.
To talk about humans wanting things, you need to locate their “wants.” In the simple case this means knowing in advance which model, or which class of models, you are using. I think there are interesting predictions we can make by taking a known class of models and asking “does one of these do a surprisingly good job at predicting a system in this part of the world, humans included?”
The answer is going to be yes, several times over: humans, and human-containing parts of the environment, are pretty predictable systems at multiple different levels of abstraction. This is true even if you assume there’s some “right” model of humans and you get to start with it, because this model would also be surprisingly effective at predicting e.g. the human+phone system, or humans at slightly lower or higher levels of abstraction. So now you have a problem of underdetermination. What to do? The simple answer is to pick whichever model has the highest “surprising power” (predictive performance relative to the information used), but I think that’s not only simple but also wrong.
Anyhow, since you mention you’re not into hand-coding models of humans where we know where the “wants” are stored, I’d be interested in your thoughts on that step too, since just looking for all computations that humans instantiate is going to return a whole lot of answers.
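One way to make “surprisingly well, given the amount of information used” concrete is a description-length style score: bits saved on the data relative to a baseline, minus the bits needed to state the model. The sketch below is only an illustration of that reading, not something proposed in the comment above. The names `surprising_power`, `previous_state` and `clock`, the toy alternating sequence, and the 8-bit model costs are all assumptions made up for the example.

```python
import math

def code_length_bits(seq, predict):
    """Bits needed to encode seq when predict(history) returns P(next symbol == 1)."""
    total = 0.0
    for i, sym in enumerate(seq):
        p_one = predict(seq[:i])
        p_sym = p_one if sym == 1 else 1.0 - p_one
        total += -math.log2(p_sym)
    return total

def surprising_power(seq, predict, model_bits):
    """Bits saved versus a 1-bit-per-symbol baseline, net of the model's own cost.

    model_bits is a hand-assigned stand-in for "information used" (an assumption).
    """
    baseline = float(len(seq))
    return baseline - code_length_bits(seq, predict) - model_bits

# Toy "system": a strictly alternating binary sequence 0, 1, 0, 1, ...
seq = [i % 2 for i in range(64)]

def coin(history):
    # Baseline model: knows nothing, always predicts 50/50.
    return 0.5

def previous_state(history):
    # Model A: predicts the next symbol from the previous symbol.
    if not history:
        return 0.5
    return 0.9 if history[-1] == 0 else 0.1

def clock(history):
    # Model B: ignores the symbols and predicts from the time index alone.
    return 0.9 if len(history) % 2 == 1 else 0.1

scores = {
    "coin": surprising_power(seq, coin, model_bits=0),
    "previous_state": surprising_power(seq, previous_state, model_bits=8),
    "clock": surprising_power(seq, clock, model_bits=8),
}

for name, score in scores.items():
    print(f"{name}: {score:.2f} bits")
```

In this toy case two structurally different models both score far above the baseline and land within a bit or so of each other, which is the underdetermination worry in miniature: taking the maximum picks one of them, but for reasons that have little to do with which description is the “right” one.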
- johnswentworth 9 Oct 2019 21:56 UTC
  LW: 2 AF: 1
  I think it will turn out that, with the right notion of abstraction, the underdetermination is much less severe than it looks at first. In particular, I don’t think abstraction is entirely described by a Pareto curve of information thrown out vs predictive power. There are structural criteria, and those dramatically cut down the possibility space.
  Consider the Navier-Stokes equations for fluid flow as an abstraction of (classical) molecular dynamics. There are other abstractions which keep around slightly more or slightly less information, and make slightly better or slightly worse predictions. But Navier-Stokes is special among these abstractions: it has what we might call a “closure” property. The quantities which Navier-Stokes predicts in one fluid cell (average density & momentum) can be fully predicted from the corresponding quantities in neighboring cells plus generic properties of the fluid (under certain assumptions/approximations). By contrast, imagine if we tried to also compute the skew or heteroskedasticity or other statistics of particle speeds in each cell. These would have bizarre interactions with higher moments, and might not be (approximately) deterministically predictable at all without introducing even more information in each cell. Going the other direction, imagine we throw out info about density & momentum in some of the cells. Then that throws off everything else, and suddenly our whole fluid model needs to track multiple possible flows.
  So there are “natural” levels of abstraction where we keep around exactly the quantities relevant to prediction of the other quantities. Part of what I’m working on is characterizing these abstractions: for any given ground-level system, how can we determine which such abstractions exist? Also, is this the right formulation of a “natural” abstraction, or is there a more/less general criterion which better captures our intuitions?
  All this leads into modelling humans. I expect that there is such a natural level of abstraction which corresponds to our usual notion of “human”, and specifically humans as agents. I also expect that this natural abstraction is an agenty model, with “wants” built into it. I do not think that there are a large number of “nearby” natural abstractions.
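The “closure” property described for Navier-Stokes in the reply above can be illustrated with a much simpler stand-in. The sketch below is a toy, assuming random walkers on a ring of cells rather than real molecular dynamics, and diffusion rather than fluid flow. It shows a microscopic rule that tracks every walker and a macroscopic rule that updates each cell's density using only that cell and its two neighbors. All names and parameter values (`step_walkers`, `step_density`, 10 cells, hop probability 0.25) are made up for the illustration.

```python
import random

def step_walkers(positions, n_cells, p, rng):
    """Microscopic rule: each walker hops left or right with probability p each."""
    moved = []
    for x in positions:
        r = rng.random()
        if r < p:
            x = (x - 1) % n_cells
        elif r < 2 * p:
            x = (x + 1) % n_cells
        moved.append(x)
    return moved

def step_density(rho, p):
    """Macroscopic rule: a cell's new density depends only on itself and its two neighbors."""
    n = len(rho)
    return [
        rho[i] + p * (rho[i - 1] - 2 * rho[i] + rho[(i + 1) % n])
        for i in range(n)
    ]

def densities(positions, n_cells):
    """Coarse-grain walker positions into the fraction of walkers in each cell."""
    counts = [0] * n_cells
    for x in positions:
        counts[x] += 1
    return [c / len(positions) for c in counts]

rng = random.Random(0)
n_cells, n_walkers, p, n_steps = 10, 20000, 0.25, 20

positions = [0] * n_walkers          # every walker starts in cell 0
rho = densities(positions, n_cells)  # the abstraction starts from the same state

for _ in range(n_steps):
    positions = step_walkers(positions, n_cells, p, rng)  # tracks 20000 walkers
    rho = step_density(rho, p)                            # tracks 10 numbers

# Largest disagreement between the coarse-grained simulation and the closed model.
max_gap = max(abs(a - b) for a, b in zip(densities(positions, n_cells), rho))
print(f"largest per-cell gap: {max_gap:.4f}")
```

The density update never looks at individual walkers, yet it stays close to the coarse-grained simulation up to sampling noise. That is the sense in which the cell densities are “closed” here: nothing beyond the neighboring densities and one generic parameter is needed to predict them.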