Do you have details of this analysis published anywhere? I’m curious to take a look. Thanks!
I never published it because it seemed clear that SWE-bench Verified saturates well below 100%, and without a known saturation point the derived time horizons could really be anything. I wasn’t even sure that getting 93.9% was possible.
Of course, one doesn’t need to convert the 93.9% into time horizons to see that this is a huge discontinuous jump.