I expect a delay even in the infinite data case, I think?
Although I’m not quite sure what you mean by “infinite data” here: if the argument is that every data point will have been seen during training, then I agree that there won’t be any delay. But yes, training on the test set (even via “we train on everything so there is no possible test set”) counts as cheating for this purpose.