I appreciate your posting this here, and I do agree that our ability to apply any information from AlphaGo Zero to forecasting things like AGI is limited.
That said, this whole article is very defensive: it comes up with ways in which the evidence might not apply, rather than ways in which it isn’t evidence.
I don’t think Eliezer’s article was a knock-down argument, and I don’t think anyone, including him, believes that. But I do think the situation is some weak evidence in favor of his position over yours.
I also think it’s stronger evidence than you seem to think according to the framework you lay down here!
For example, a previous feature of AI for playing games like Chess or Go was to capture information about the structure of the game via some complex combination of tools. However, in AlphaGo Zero, very little specific information about Go is required. The change in architecture actually subsumes some amount of the combination of tools needed.
Again, I don’t think this is a knock-down argument or very strong or compelling evidence, but it looks as though you are treating it as essentially zero evidence, which seems unjustified to me.
As I said, I’m treating it as the difference between learning N simple general tools and learning N+1 such tools. Do you think it is stronger evidence than that, or do you think I’m not acknowledging how big that is?
I think it is evidence that ‘simple general tools’ can be different from one another along multiple dimensions.
...we solve most problems via complex combinations of simple tools. Combinations so complex, in fact, that our main issue is usually managing the complexity, rather than including the right few tools.
This is a specific instance of complex details being removed to improve performance, where using the central correct tool was the ONLY moving part.
And thus the first team to find the last simple general tool needed might “foom” via having an enormous advantage over the entire rest of the world put together. At least if that one last tool were powerful enough. I disagree with this claim, but I agree that neither view can be easily and clearly proven wrong.
I am interpreting your disagreement here to mean that you disagree that any single simple tool will be powerful enough in practice, not in theory. I hope you agree that if someone acquired all magic powers ever written about in fiction, with no drawbacks, they would be at an enormous advantage over the rest of the world combined. If that were the simple tool, it would be big enough.
Then if the question is “how big of an advantage can a single simple tool give,” and the observation is, “this single simple tool gives a bigger advantage on a wider range of tasks than we have seen with previous tools,” then I would be more concerned about bigger, faster-moving simple tools in the future having different types or scales of impact.
I disagree with the claim that “this single simple tool gives a bigger advantage on a wider range of tasks than we have seen with previous tools.”