it has to go to infinity when we get AGI / superhuman coder.
This isn’t necessarily true. Even an AGI or a superhuman coder might do worse on tasks that take humans longer than on tasks that take humans less time (this seems pretty likely given constant-error-rate considerations). An extremely capable AI might be 99.999% reliable on 1-hour tasks but only 99.9% reliable on 10,000-hour tasks, in which case the logistic fit still crosses 50%; it just does so at a very large task length.
In order for the 50% intercept to approach infinity, you’d need a performance curve which approaches a flat line, and this seems very hard to pull off and probably requires wildly superhuman AI.
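To put a rough number on "very high", here is a minimal sketch that draws a logistic curve through the two reliability figures above and extrapolates to the 50% crossing. It assumes success probability is logistic in log10 of task length and that two points pin the curve down exactly; the function name `log10_horizon_50` is made up for illustration, and this is not the actual fitting procedure used for time-horizon estimates.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def log10_horizon_50(t1, p1, t2, p2):
    """log10 of the task length (same units as t1, t2) at which a
    two-point logistic curve crosses 50% success.

    Assumes p1 > 0.5 and t1 < t2. Returns infinity when the curve is
    flat or rising, since it then never drops to 50%.
    """
    x1, x2 = math.log10(t1), math.log10(t2)
    slope = (logit(p2) - logit(p1)) / (x2 - x1)  # log-odds per decade of task length
    if slope >= 0:
        return math.inf
    # Solve logit(p1) + slope * (x - x1) = 0 for x.
    return x1 - logit(p1) / slope

# 99.999% at 1 hour, 99.9% at 10,000 hours (the numbers from the comment).
print(log10_horizon_50(1, 0.99999, 10_000, 0.999))

# A flatter curve: 99.999% at 1 hour, 99.99% at 10,000 hours.
print(log10_horizon_50(1, 0.99999, 10_000, 0.9999))

# A perfectly flat curve never crosses 50%.
print(log10_horizon_50(1, 0.99999, 10_000, 0.99999))
```

Under these assumptions the first case crosses 50% at roughly 10^10 hours, the flatter case at roughly 10^20 hours, and only the exactly flat curve gives an infinite horizon. The log10 value is returned rather than the horizon itself so that nearly flat curves don't overflow a float.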
Under the logistic methodology where we don’t actually have long enough tasks to measure the 50% point, sure. But if we actually have years-long tasks, a true superhuman coder should be able to do them more reliably than humans can. That means a success rate above 50%, provided we filter the problem distribution to things humans can do with more than about 50% probability. There are other methodologies that I think are more meaningful, where it might also make sense to have the SC’s time horizon be infinity.