Two years ago my coworkers (not in any kind of software field) were asking me: shouldn’t targeted models be able to work better than generalist models? And I said, in principle, yes, but the general frontier models are currently advancing so fast that no one has time or incentive to make many specialist models before they’re already out of date. As long as this is the case, new things will spontaneously become low-hanging fruit every couple of months, and efforts to push the frontier by anyone except the frontier labs will usually be wasted and overpriced.
If that stops being the case, and we stick with a given model, set of tools, and harness for years before moving on, then we open up a whole host of other pathways that haven’t generally been worthwhile to date.
Maybe you would fine-tune a model on each particular large codebase, its history, its documentation, and its institutional context, so that the knowledge is in its weights instead of its context window. This could provide quite a bit of the tacit knowledge humans struggle to convey to each other, let alone to LLMs.
Maybe you would put in the effort to really optimize the organization of the knowledge base you give it.
Maybe you would hire an army of I/O psych types to figure out more precisely the shape of what does and doesn’t work well for AI, and adapt workflows accordingly. In other words, we could put in the actual effort to create an environment where AI can do its best work, the way organizations that need high quality and high reliability do for humans today. This includes helping the humans adapt to the AI as well.