That is the sampling part of AlphaGo. When you ask it a question, the more samples you let it take as potential solutions, the more likely you are to get a correct answer. The success rate seems to grow log-linearly with the number of samples, which is pretty poor.
Oh, I see, I thought you meant AlphaCode had some sort of recurrence (does it?), i.e. iterating over its answer like a human would when debugging code. Log-linear doesn’t sound bad; you wouldn’t expect a human to write a correct program on the first try without debugging either, would you?
It is pretty bad because the constant matters here. If you have a method whose success rate (in %) is C log(1 + k), where k is the number of samples, then with C = 1 you’d need 2^100 ≈ 10^30 samples per inference to get all the competitive programming questions right. As it stands, C seems pretty small: at k = 100,000 you solve maybe 33% of problems, and that isn’t much better than at k = 1,000.
That’s leaving aside the question of whether you will continue to get log-linear improvements, which I find fairly unlikely. Though if you could get C to something like 10 or 20 with one or two innovations, I’d be scared or even terrified.
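A quick back-of-the-envelope check of those numbers. This takes the C log(1 + k) form at face value as the made-up model it is (log base 2, as the reply further down clarifies); the function names are just illustrative, nothing here comes from the AlphaCode paper itself.

```python
import math

# Toy model from the comment above (the commenter's made-up example, not the
# AlphaCode paper): solve rate in % = C * log2(1 + k), where k is the number
# of samples drawn per problem.

def solve_rate_pct(C: float, k: int) -> float:
    """Percent of problems solved under the made-up log-linear model (capped at 100)."""
    return min(100.0, C * math.log2(1 + k))

def samples_needed(C: float, target_pct: float = 100.0) -> float:
    """Samples k needed to reach target_pct under the same model."""
    return 2 ** (target_pct / C) - 1

# With C = 1, reaching 100% takes ~2^100, i.e. roughly 1e30 samples per problem:
print(f"{samples_needed(C=1):.2e}")            # -> 1.27e+30

# Working backwards from the quoted data point (33% solved at k = 100,000),
# the implied constant really is small:
print(round(33 / math.log2(1 + 100_000), 1))   # -> 2.0
```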
To check my understanding:
If C were 20, then 100,000 samples would be enough to solve approximately 100% of these problems?
But instead C is something like 6-7, because 100,000 samples get you to 33%?
Yeah, basically, but I was using log_2(x) instead of log_10(x) in my made-up example. Here’s some actual data. What they mean by “unlimited attempts” is that there is no filter on the sampled solutions, so all of them are submitted for evaluation. I expect false positives or incredibly slow solutions to be more likely than performance actually increasing without limit.
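For what it’s worth, the log_2 vs. log_10 difference is exactly the gap between the “6-7” reading and the original example. A quick check under the same toy-model assumptions, using only the single data point quoted in this thread:

```python
import math

# Same single data point (33% solved at k = 100,000 samples), fit under each base.
# Purely illustrative: one number from the thread, not a fit to the actual curve.
k, pct = 100_000, 33

C_log2  = pct / math.log2(1 + k)    # ~2.0  (the base the original example used)
C_log10 = pct / math.log10(1 + k)   # ~6.6  (the base behind the "6-7" reading)

# Under the log10 reading, C = 20 would indeed put 100,000 samples right at 100%;
# under log2 it would saturate long before that.
print(round(20 * math.log10(1 + k)))   # -> 100
print(round(20 * math.log2(1 + k)))    # -> 332
```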