Conceding a short timelines bet early

Matthew Barnett 16 Mar 2023 21:49 UTC

132 points

Last year I bet some people about short AI timelines. While I don’t think I’ve lost the bet yet, I think it’s clear at this point that I will lose with high probability. I’ve outlined the reasons why I think that in a retrospective here. Even if I end up winning, I think it will likely be the result of a technicality, and that wouldn’t be very interesting.

Because I would prefer to settle this matter without delay, I have decided to concede the bet now. Note, however, that I am not asking Tamay to do the same. I have messaged the relevant parties and asked them to send me details on how to pay them.

I congratulate Nathan Helm-Burger and Tomás B. for taking the other side of the bet.

16 comments · 1 min read · LW link

Betting, AI Timelines, AI

Tomás B. 16 Mar 2023 22:46 UTC
48 points
8
My emotional state right now: https://twitter.com/emojimashupbot/status/1409934745895583750?s=46
Nathan Helm-Burger 17 Mar 2023 3:58 UTC
21 points
1
I will accept the early resolution, but I’d like to reserve the option to reverse the decision and the payment should the world turn out unexpectedly in our favor.

Also, I’d like to state that I commit to using the money to buy more equipment for my AI safety research. [Edit: Matthew paid up!] [Edit 2: So did Tamay!]
Liron 16 Mar 2023 23:02 UTC
13 points
2
Bravo.

Which 2+ outcomes from the list do you think are most likely to lead to your loss?
- Matthew Barnett 17 Mar 2023 0:07 UTC
  17 points
  0
  Parent
  I suspect the MMLU and the MATH milestones are the easiest to achieve. They will probably be reached after a GPT-4-level model is specialized to perform well in mathematics, as Minerva was.
- hold_my_fish 16 Mar 2023 23:34 UTC
  1 point
  0
  Parent
  I’m curious about this too. The retrospective covers weaknesses in each milestone, but a collection of weak milestones doesn’t necessarily aggregate to a guaranteed loss, since performance ought to be correlated (due to an underlying general factor of AI progress).
  - Gerald Monroe 16 Mar 2023 23:54 UTC
    1 point
    −2
    Parent
    Hmm? The $10 billion funding increase to OpenAI and the arms race with Google pretty much guaranteed that the 10^30 / 1 billion USD machine for training would be satisfied. So we can mark that one as “almost certainly” satisfied by EOY 2023. The only way it isn’t is a shortage of GPUs/TPUs.
    
    GPT-4 likely satisfies the MMLU condition. So two conditions are “almost certain” to be met, and even if by some fluke they aren’t met by 2026, there are still several other ways Matt can lose the bet.
    - Matthew Barnett 17 Mar 2023 0:04 UTC
      7 points
      2
      Parent
      I think you’re overconfident here. I’m quite skeptical that GPT-4 already got above 80% on every single task in the MMLU since there are 57 tasks and it got 86.4% on average. I’m also skeptical that OpenAI will very soon spend >$1 billion to train a single model, but I definitely don’t think that’s implausible. “Almost certain” for either of those seems wrong.
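To make the gap between an average score and a per-task threshold concrete, here is a minimal Python sketch. The per-task scores are hypothetical, chosen only so that the mean over 57 tasks lands near 86.4%. They are not GPT-4's actual per-task MMLU results, and the exact wording of the bet's MMLU condition is taken from the comment above rather than from the bet itself.

```python
from statistics import mean

NUM_TASKS = 57      # number of MMLU tasks, as stated in the comment above
THRESHOLD = 80.0    # per-task bar, in percent, as stated in the comment above

# Hypothetical per-task accuracies in percent. These are NOT real GPT-4
# results; they only illustrate that a high mean can hide weak tasks.
hypothetical_scores = [88.0] * 50 + [75.0] * 7
assert len(hypothetical_scores) == NUM_TASKS

average = mean(hypothetical_scores)

# "Above 80% on every single task" fails if any task is at or below the bar.
tasks_below = [s for s in hypothetical_scores if s <= THRESHOLD]
meets_every_task = not tasks_below

print(f"average: {average:.1f}%")
print(f"tasks at or below {THRESHOLD:.0f}%: {len(tasks_below)}")
print(f"every task above threshold: {meets_every_task}")
```

Under these made-up numbers the average rounds to 86.4% while seven tasks sit below the bar, so the reported average alone cannot settle whether the per-task condition holds.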
      - Gerald Monroe 17 Mar 2023 0:11 UTC
        1 point
        0
        Parent
        There’s GPT-5 though, or GPT-4.math.finetune. You saw the Minerva results. You know there will be a significant gain with a fine-tune, likely enough to satisfy 2-3 of your conditions.
        
        As I said it’s ridiculous to think someone either in the Google or OAI camp won’t have more than 1 billion USD in training hardware, in service for a single model (training many instances in parallel) by OpenAI.
        
        Think about what that means. One A100 is about $25k. The cluster Meta uses has 2,048 of them, so about $50 million.
        
        Why would you not go for the most powerful model possible as soon as you can? Either the world’s largest tech giant is about to lose it all, or they are going to put the proportional effort in.
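The arithmetic in the comment above can be checked with a short Python sketch. It uses only the figures quoted there (about $25k per A100, a 2,048-GPU cluster, a $1 billion threshold) and counts GPU purchase price alone. Whether hardware cost is the right way to measure the bet's spending condition is itself disputed in this thread, so treat this as a rough scale comparison, not a reading of the bet's terms.

```python
# Figures as quoted in the comment above; all are rough estimates.
A100_PRICE_USD = 25_000
META_CLUSTER_GPUS = 2_048
BET_THRESHOLD_USD = 1_000_000_000

# Cost of the quoted cluster, counting GPUs only.
cluster_cost_usd = A100_PRICE_USD * META_CLUSTER_GPUS

# GPUs that $1 billion would buy at the quoted unit price.
gpus_for_threshold = BET_THRESHOLD_USD // A100_PRICE_USD

# How many times larger than the quoted cluster that would be.
scale_up = BET_THRESHOLD_USD / cluster_cost_usd

print(f"cluster cost: ${cluster_cost_usd:,}")
print(f"GPUs for $1B: {gpus_for_threshold:,}")
print(f"scale-up over quoted cluster: {scale_up:.1f}x")
```

With these inputs the quoted cluster comes to $51.2 million, and $1 billion of GPUs would be 40,000 A100s, roughly 20 times that cluster.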
        Matthew Barnett 17 Mar 2023 0:16 UTC
        5 points
        2
        Parent
        “As I said it’s ridiculous to think someone either in the Google or OAI camp won’t have more than 1 billion USD in training hardware, in service for a single model (training many instances in parallel) by OpenAI.”
        I think you’re reading this condition incorrectly. The $1 billion would need to be spent for a single model. If OpenAI buys a $2 billion supercomputer but they train 10 models with it, that won’t necessarily qualify.
        Gerald Monroe 17 Mar 2023 0:18 UTC
        1 point
        0
        Parent
        Then why did you add the term? I assume you meant that the entire supercomputer is working on instances of the same model at once. Obviously training is massively parallel.
        
        Once the model is done, obviously the supercomputer will be used for other things.
Evan R. Murphy 17 Mar 2023 1:41 UTC
7 points
0
“I congratulate Nathan Helm-Burger and Tomás B. for taking the other side of the bet.”
Just for the record, I also took your bet. ;)
- Matthew Barnett 17 Mar 2023 2:07 UTC
  16 points
  4
  Parent
  Congratulations. However, unless I’m mistaken, you simply said you’d be open to taking the bet. We didn’t actually take it with you, did we?
  - Evan R. Murphy 17 Mar 2023 21:17 UTC
    7 points
    3
    Parent
    Yeah, I guess I was a little unclear on whether your post constituted a bet offer where people could simply reply to accept as I did, or if you were doing specific follow-up to finalize the bet agreements. I see you did do that with Nathan and Tomás, so it makes sense you didn’t view our bet as on. It’s ok, I was more interested in the epistemic/forecasting points than the $1,000 anyway. ;)
    I commend you for following up and for your great retrospective analysis of the benchmark criteria. Even though I offered to take your bet, I didn’t realize just how problematic the benchmark criteria were for your side of the bet.
    Most importantly, it’s disquieting and bad news that long timelines are looking increasingly implausible. I would have felt less worried about a world where you were right about that.
Ahmdal Oberth 23 Mar 2023 0:30 UTC
4 points
0
Darn. Who should I defer to now if I want to believe longer timelines?
- Ben Pace 23 Mar 2023 0:42 UTC
  11 points
  0
  Parent
  lol
Lone Pine 17 Mar 2023 12:48 UTC
4 points
1
Wild that this bet lasted less than a year.

If you were interested in rebetting, maybe you can make the threshold 3 or 4 items.