I would be interested in seeing the learning process for Pac-Man run further. I guess the algorithm only ran for a couple of iterations. Also, could we run an experiment with more complicated games, like Doom? There is also an obvious thing to count there, namely the number of enemies killed. Chess? Maybe even some poker?
Also, could we run an experiment with more complicated games, like Doom? There is also an obvious thing to count there, namely the number of enemies killed.
There is an obvious way to count, yes, but can it be easily located in the raw RAM of a running Doom instance? That’s the point here: automatically inferring a measure of progress from somewhere in the raw binary blob.
If you have to explicitly define a reward counter, then you’re just doing normal reinforcement-learning or AI kinda stuff, and you might as well go use a non-joke agent like AIXI-MC (already plays Pacman) or something with a more straightforward and less hilariously awesomely bizarre design.
Plus, from the paper, it sounds like there are serious performance and scalability issues even on the small NES games he was using, so it may not be feasible to run on a program as big as Doom without real work like using a cluster.
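To make "inferring a measure of progress from the raw binary blob" a bit more concrete, here is a toy sketch. It is not the paper's actual method, just a much simpler stand-in that I am assuming for illustration: take a series of RAM snapshots recorded while a human plays, and flag every byte offset whose value never goes down and goes up at least once. The function name `find_progress_candidates` and the four-byte "RAM" are made up.

```python
from typing import List, Sequence


def find_progress_candidates(snapshots: Sequence[bytes]) -> List[int]:
    """Return byte offsets whose value never decreases across the
    snapshots and increases at least once (a crude 'score-like' test)."""
    if len(snapshots) < 2:
        return []
    size = len(snapshots[0])
    if any(len(s) != size for s in snapshots):
        raise ValueError("all snapshots must have the same length")

    candidates = []
    for offset in range(size):
        values = [s[offset] for s in snapshots]
        # Compare each snapshot with the next one at this offset.
        non_decreasing = all(a <= b for a, b in zip(values, values[1:]))
        if non_decreasing and values[-1] > values[0]:
            candidates.append(offset)
    return candidates


# Hypothetical 4-byte "RAM" captured at four points in time.
snapshots = [
    bytes([0, 7, 3, 200]),
    bytes([1, 7, 2, 200]),
    bytes([1, 7, 5, 201]),
    bytes([4, 7, 1, 203]),
]
print(find_progress_candidates(snapshots))  # [0, 3]
```

Offsets 0 and 3 behave like counters, offset 1 never changes, and offset 2 bounces around. Even this toy version has to scan every byte of every snapshot, which hints at why a program with as much state as Doom would be a lot more expensive than a small NES game.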
I will have to read that paper.