Vis-a-vis selecting inputs freely: OpenAI also included a large dump of unconditioned text generation in their GitHub repo.
Nice review, I enjoyed it. I read the books a while ago and it was good to see I’m not alone in seeing it as deeply conservative. As far as that goes, I wondered how much of that is a sort of general Chinese attitude vs. a non-Chinese attitude, and how much of it is unique to the author.
One thing that keeps bothering me about the book is I can’t make sense of Wade.
Wade was the ideal swordholder, because he could stick to commitments. Is he supposed to be absolutely bound by them, though, and is that why he inexplicably obeys Cheng, because he agreed to in the past? That’s a coherent notion of character, but it hardly feels explicable; it makes sense if Wade is an AI, but not really as a human. Or at least that’s how I felt.
My overall impression looking at this is still more or less summed up by what Francois Chollet said a bit ago.
Any problem can be treated as a pattern recognition problem if your training data covers a sufficiently dense sampling of the problem space. What’s interesting is what happens when your training data is a sparse sampling of the space—to extrapolate, you will need intelligence.
Whether an AI that plays StarCraft, DotA, or Overwatch succeeds or fails against top players, we’d have learned nothing from the outcome. Wins—congrats, you’ve trained on enough data. Fails—go back, train on 10x more games, add some bells & whistles to your setup, succeed.
Some of the stuff DeepMind talks about a lot (for instance, the AlphaLeague) seems like a clever technique designed simply to ensure that you have a sufficiently dense sampling of the space, which would normally not occur in a game with unstable equilibria. And this seems to me more like “clever technique applicable to a domain where we can generate infinite data through self-play” than “stepping stone on the way to AGI.”
That being said, I haven’t yet read through all the papers in the blog post, and I’d be curious which of them people think might be / definitely are potential steps towards actually engineering intelligence.
Fair. For (1), more than 50%, because that is how they’ve been defining victories in these tournaments. For (2), no unplanned interventions; i.e., it’s fine if they want to drive it on a gravel driveway that they know the thing cannot handle, or fill it up at the supercharger because the car clearly cannot handle that, but in general no interventions made because the car would otherwise potentially crash in a situation it (ostensibly) should handle. And for (3), meh, “can beat the native scripted AI” seems reasonable.
So if I understand you, for (1) you’re proposing a “hard” attention over the image, rather than the “soft” differentiable attention which is typically meant by “attention” for NNs.
You might find interesting “Recurrent Models of Visual Attention” by DeepMind (https://arxiv.org/pdf/1406.6247.pdf). They use a hard attention over the image with RL to train where to attend. I found it interesting—there’s been subsequent work using hard attention (I thiiink this is a central paper for the topic, but I could be wrong, and I’m not at all sure what the most interesting recent one is) as well.
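To make the soft vs. hard distinction concrete, here is a minimal sketch in plain Python. It is not the paper's model: the function names, the toy patch vectors, and the per-patch scores are all assumptions made up for illustration. Soft attention blends every patch with softmax weights, so it is differentiable end to end; hard attention samples a single patch, which is why it is usually trained with an RL-style estimator rather than plain backprop.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(patches, scores):
    """Soft attention: a weighted average over ALL patches.

    Every patch contributes, so the output is a smooth function of the
    scores and gradients can flow through it.
    """
    weights = softmax(scores)
    dim = len(patches[0])
    return [sum(w * p[i] for w, p in zip(weights, patches)) for i in range(dim)]


def hard_attention(patches, scores, rng):
    """Hard attention: sample ONE patch index and look only there.

    The sampling step is discrete, so it is not differentiable; this is
    the part that needs something like a policy-gradient estimator.
    """
    weights = softmax(scores)
    idx = rng.choices(range(len(patches)), weights=weights, k=1)[0]
    return idx, patches[idx]


# Hypothetical toy data: three image patches, each a 2-d feature vector.
patches = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
scores = [0.0, 0.0, 0.0]  # equal scores, so uniform attention weights

blended = soft_attention(patches, scores)
idx, glimpse = hard_attention(patches, scores, random.Random(0))
```

With equal scores the soft version returns the plain mean of the patches, while the hard version returns exactly one of them, chosen at random.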
...you were an ancient being, with a mind vast and unsympathetic, concerned with all the events in the path of the light-cone, who has through some mistake been trapped in a smaller, duller mind, forgetting most of the wisdom natural to it, becoming encumbered by fleshy bounds, and who now must decide what to do with the potential it has left.
...the “you” listening to this was one of several complete agents inhabiting a body, each of which has its own plans, goals, and strategies, each of which jockeys for control over the actions of that body, and each of which can wage war or form alliances with the others to try to gain more control over that body over the course of a lifetime?