My current thinking about how to implement this without having to build full-sized agents: make little stateful reinforcement-learner-type things in a really simple agent-world, something like a typed-message-passing setup, possibly with 2d or 3d locations and falloff of action effects by distance. Then each agent can take actions, learn to map agent to reward, etc.
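A minimal sketch of what that world loop could look like (all the names here, `World`, `Action`, and the inverse-square falloff, are invented for illustration, not settled choices):

```python
import math
from dataclasses import dataclass

@dataclass
class Action:
    """A typed message: one agent directing an effect at another."""
    kind: str          # e.g. "give", "take", "signal"
    source: int        # acting agent's id
    target: int        # receiving agent's id
    magnitude: float   # raw effect size before distance falloff

@dataclass
class Agent:
    pos: tuple[float, float]   # 2d location
    reward: float = 0.0

class World:
    def __init__(self, agents: list[Agent]):
        self.agents = agents

    def falloff(self, a: Agent, b: Agent) -> float:
        # arbitrary choice: effects decay roughly inverse-square with distance
        d = math.dist(a.pos, b.pos)
        return 1.0 / (1.0 + d * d)

    def step(self, actions: list[Action]) -> None:
        for act in actions:
            src, tgt = self.agents[act.source], self.agents[act.target]
            tgt.reward += act.magnitude * self.falloff(src, tgt)
```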
Make other agents' reward states observable, maybe with a gating where an agent can choose to make its reward state non-observable to other agents, in exchange for that choice itself being visible somehow.
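One way the gating might work, assuming the `Agent` from the sketch above grows a `reward_visible: bool` field (again just a guess at the mechanism):

```python
def observe(world, observer_id: int) -> dict:
    """What one agent sees of the others: their reward if exposed,
    otherwise only the publicly visible fact that they chose to hide."""
    obs = {}
    for i, a in enumerate(world.agents):
        if i == observer_id:
            continue
        if a.reward_visible:
            obs[i] = {"reward": a.reward}
        else:
            obs[i] = {"hiding": True}  # the gating action itself is what's observable
    return obs
```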
Make some sort of game out of the available actions. Something like: agents have resources they need to live, can take them from each other, value being close to each other, value stability, etc. The point is to create different contexts in which an agent can be cooperatey or defecty.
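A toy reward shaping along those lines, with every weight and term pulled out of thin air, and assuming agents track `resources` and `prev_resources`:

```python
def intrinsic_reward(agent, world, need=5.0, w_res=1.0, w_prox=0.1, w_stab=0.1):
    """Toy shaping combining survival resources, proximity, and stability."""
    # resources only matter up to the survival need, so hoarding saturates
    r = w_res * min(agent.resources, need)
    # mild pull toward being near other agents, reusing the world's falloff
    r += w_prox * sum(world.falloff(agent, b)
                      for b in world.agents if b is not agent)
    # stability: penalize big swings in resources between steps
    r -= w_stab * abs(agent.resources - agent.prev_resources)
    return r
```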
Hardcode or preinitialize-from-code the level 3 stuff. Hardcode into the world the identification of which agent took an action at you? IRL there's ambiguity about cause, and without that ambiguity some patterns probably won't arise.
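If attribution ends up being a dial rather than a binary, it could be as small as this (purely illustrative):

```python
import random

def attribute(action, p_ambiguous=0.0):
    """Perceived source of an action. With hardcoded attribution
    (p_ambiguous=0) the true source is always known; otherwise the
    cause is sometimes ambiguous, as IRL."""
    if random.random() < p_ambiguous:
        return None  # observer can't tell who did it
    return action.source
```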
Could use really small neural networks, I guess, or maybe just linear matrices of [agents, actions] and then MCMC-sample from the actions taken?
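For the matrix version, one concrete reading: model reward as Gaussian around a per-(agent, action) table and run random-walk Metropolis over the table given logged experience. A sketch, with the Gaussian model and all hyperparameters being assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(W, data, sigma=1.0, prior_scale=1.0):
    # Gaussian prior on entries plus Gaussian likelihood over logged
    # (agent, action, reward) tuples
    lp = -0.5 * np.sum((W / prior_scale) ** 2)
    for agent, action, reward in data:
        lp += -0.5 * ((reward - W[agent, action]) / sigma) ** 2
    return lp

def mcmc_sample(n_agents, n_actions, data, steps=1000, step_size=0.1):
    """Random-walk Metropolis over the [agents, actions] reward table."""
    W = np.zeros((n_agents, n_actions))
    lp = log_post(W, data)
    for _ in range(steps):
        prop = W + step_size * rng.standard_normal(W.shape)
        lp_prop = log_post(prop, data)
        if np.log(rng.random()) < lp_prop - lp:  # Metropolis accept rule
            W, lp = prop, lp_prop
    return W  # one posterior sample of expected reward per (agent, action)
```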
I'm confused about precisely how to implement deservingness… it seems like deservingness is something like a minimum control target for others' reward, and retribution is a penalty that supersedes it? Maybe?
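One hedged formalization of that reading: deservingness is a floor agent i tries to keep agent j's reward above, and an active retribution target overrides the floor with a lower target. Entirely an assumption about what those words should mean:

```python
def social_term(other_reward, deservingness, retribution=None, w=1.0):
    """Contribution of agent j's welfare to agent i's objective.
    Deservingness is a minimum control target: i is penalized while
    j sits below the floor. An active retribution target supersedes
    it: i is penalized while j sits above the (lower) target."""
    if retribution is not None:
        return -w * max(other_reward - retribution, 0.0)  # want j at/below target
    return -w * max(deservingness - other_reward, 0.0)    # want j at/above floor
```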
If using neural networks, implementing the power thing on level 3 is a fairly easy prediction task; with Bayesian MCMC or whatever it's much harder. Maybe that's an OK place to use NNs? Trying to use NNs in a model like this feels like a bad idea unless they're extremely regularized… also, the inference needed for level 4 is hard without NNs.
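A sketch of what "extremely regularized" might mean in practice: one tiny hidden layer with heavy L2 weight decay, trained to predict something power-like (how much each agent's actions move your reward). The sizes and decay constant are guesses:

```python
import numpy as np

rng = np.random.default_rng(1)

class TinyNet:
    """One hidden layer; heavy L2 weight decay is the regularizer."""
    def __init__(self, n_in, n_hidden=4, decay=0.1, lr=0.01):
        self.W1 = rng.standard_normal((n_in, n_hidden)) * 0.1
        self.W2 = rng.standard_normal((n_hidden, 1)) * 0.1
        self.decay, self.lr = decay, lr

    def forward(self, x):
        self.h = np.tanh(x @ self.W1)
        return self.h @ self.W2

    def train_step(self, x, y):
        # gradient step on squared error plus the L2 penalty
        err = self.forward(x) - y
        gW2 = self.h.T @ err + self.decay * self.W2
        gW1 = x.T @ ((err @ self.W2.T) * (1 - self.h ** 2)) + self.decay * self.W1
        self.W2 -= self.lr * gW2
        self.W1 -= self.lr * gW1
```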