Thanks so much for the feedback!

1. I agree, I think many providers have a blended price that is still optimized for low latency. We hope that selecting the lowest-priced provider among all providers of a given model mitigates this (or at least selects a consistent set of providers that make consistent choices).
2. I’d be really interested in such data. I don’t know of any such empirical results, but Erdil et al.’s Inference Economics has a good theoretical model.