What are your priors, or what is your base rate calculation, for how often promising new ML architectures end up scaling and generalizing well to new tasks?
What are your priors, or what is your base rate calculation, for how often promising new ML architectures end up scaling and generalizing well to new tasks?