That is the sampling part of AlphaGo. When you ask it a question, the more samples you let it take as potential solutions, the more likely you are to get a correct answer. The success rate seems to grow log-linearly with the number of samples, which is pretty poor.
Oh, I see, I thought you meant AlphaCode had some sort of recurrence (does it?), i.e. iterating over its answer like a human would when debugging code. Log-linear doesn’t sound bad; you wouldn’t expect a human to write a correct program on the first try without debugging either, would you?
It is pretty bad because the constant matters here. If you have a method whose success rate (in %) is C log(1 + k), where k is the number of samples, then with C = 1 you’d need 2^100 ≈ 10^30 samples per inference to get all the competitive programming questions right. As it stands, C seems pretty small: at k = 100,000 you solve maybe 33% of problems, and that isn’t much better than at k = 1,000.
That’s leaving aside the question of whether you will continue to get log-linear improvements, which I find fairly unlikely. Though if you could get C to something like 10 or 20 with one or two innovations, I’d be scared or even terrified.
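A quick back-of-the-envelope check of those numbers. This takes the C log(1 + k) form at face value as the made-up model it is (log base 2, as the reply further down clarifies); the function names are just illustrative, nothing here comes from the AlphaCode paper itself.

```python
import math

# Toy model from the comment above (the commenter's made-up example, not the
# AlphaCode paper): solve rate in % = C * log2(1 + k), where k is the number
# of samples drawn per problem.

def solve_rate_pct(C: float, k: int) -> float:
    """Percent of problems solved under the made-up log-linear model (capped at 100)."""
    return min(100.0, C * math.log2(1 + k))

def samples_needed(C: float, target_pct: float = 100.0) -> float:
    """Samples k needed to reach target_pct under the same model."""
    return 2 ** (target_pct / C) - 1

# With C = 1, reaching 100% takes ~2^100, i.e. roughly 1e30 samples per problem:
print(f"{samples_needed(C=1):.2e}")            # -> 1.27e+30

# Working backwards from the quoted data point (33% solved at k = 100,000),
# the implied constant really is small:
print(round(33 / math.log2(1 + 100_000), 1))   # -> 2.0
```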
To check my understanding:
If C were 20, then 100,000 samples would be enough to solve approximately 100% of these problems?
But instead C is something like 6-7, because 100,000 samples get you to 33%?
Yeah, basically, but I was using log_2(x) instead of log_10(x) in my made-up example. Here’s some actual data. What they mean by “unlimited attempts” is that there is no filter on the sampled solutions, so all of them are submitted for evaluation. I expect false positives or incredibly slow solutions to be more likely than performance actually increasing without limit.
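For what it’s worth, the log_2 vs. log_10 difference is exactly the gap between the “6-7” reading and the original example. A quick check under the same toy-model assumptions, using only the single data point quoted in this thread:

```python
import math

# Same single data point (33% solved at k = 100,000 samples), fit under each base.
# Purely illustrative: one number from the thread, not a fit to the actual curve.
k, pct = 100_000, 33

C_log2  = pct / math.log2(1 + k)    # ~2.0  (the base the original example used)
C_log10 = pct / math.log10(1 + k)   # ~6.6  (the base behind the "6-7" reading)

# Under the log10 reading, C = 20 would indeed put 100,000 samples right at 100%;
# under log2 it would saturate long before that.
print(round(20 * math.log10(1 + k)))   # -> 100
print(round(20 * math.log2(1 + k)))    # -> 332
```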