One piece of low-hanging fruit: running LLMs on general-purpose hardware is inefficient. Once the Taalas team scales up, it should be possible to cut token prices by at least another order of magnitude, though I'm not sure frontier models will fit on their chips (yet). https://taalas.com/the-path-to-ubiquitous-ai/