[Question] How to Model the Future of Open-Source LLMs?
I previously expected open-source LLMs to lag far behind the frontier because they’re very expensive to train, and naively it doesn’t make business sense to spend on the order of $10M to (soon?) $1B training a model only to give it away for free.
But this expectation has been repeatedly challenged, most recently by Meta’s Llama 3. Meta seems to be pursuing something like a “commoditize your complement” strategy: https://twitter.com/willkurt/status/1781157913114870187
As models become orders of magnitude more expensive to train, can we expect companies to continue to open-source them?
In particular, can we expect this of Meta?