I think a big problem with LLMs as we know them is that they are “god models”, so large as to be essentially incomprehensible. Smaller models are much easier to modify. We need something like the Drosophila of AI models.
Something I’ve started to do is build toy models that exhibit certain large-model behaviors. I suspect a lot of what the large models do can be trained into small models if we can figure out which parts of the massive datasets create the behavior we want.
Thank you for the suggestion! I have found running experiments with small models a lot easier than I expected.