I feel the issue with your GPT-5 prediction is that it specifies both “no massive advance” and “no GPT-5”. When there was a massive advance but no GPT-5, it makes it ambiguous which half of the prediction is more important.
It’s slightly weird to have the correctness of it depend on OpenAI’s branding choices, though. If we decided that the GPT part of the prediction was more important, then in an alternative world that was otherwise identical to our own but where OAI had chosen to call one of their reasoning models GPT-5, the prediction would flip from correct to false. So that makes me lean a bit toward weighting the “no massive advance” part more, though I also wouldn’t think it unreasonable to split the difference and give you half credit for having one part of a two-part prediction correct.