How long will AI token prices, at a fixed level of intelligence, keep declining 70-90% per year? What makes them fall so fast?
One piece of low-hanging fruit: running LLMs on general-purpose hardware is inefficient. Once companies building model-specific silicon, like Taalas, scale up, it should be possible to cut token prices by at least another order of magnitude, though I'm not sure frontier models will fit on their chips (yet). https://taalas.com/the-path-to-ubiquitous-ai/
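To make the compounding concrete, here is a minimal sketch (my own illustration, not from the linked post) of how a constant yearly decline stacks up:

```python
def price_after(p0: float, annual_decline: float, years: int) -> float:
    """Price after `years` of a constant fractional decline per year."""
    return p0 * (1 - annual_decline) ** years

# Hypothetical numbers: $10 per million tokens, 80% decline per year.
# After 3 years: 10 * 0.2**3 = 0.08, i.e. 8 cents per million tokens.
print(price_after(10.0, 0.8, 3))
```

At 90% per year the same starting price drops below a cent per million tokens within two years, which shows why the question of how long the trend can last matters so much.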