the gears to ascension comments on Memory bandwidth constraints imply economies of scale in AI inference