Of course, no matter how flawed the comparison to evolution is, there doesn’t seem to be any competing analogy which makes the same argument in a more defensible manner. And people love analogies. A friend, reading this post, told me (paraphrased): “give a better analogy, then, if this one isn’t good”. I have to admit, I have no better analogy.
A better, more mechanistically relevant analogy is within-lifetime human learning: reward circuitry (outer) and learned human values (inner). However, it doesn’t yield the same conclusions (which I think is good). I think it’s more relevant because the mechanism is more similar to LLMs (randomly initialized networks updated by local learning rules combining predictive and reinforcement learning, also trained on a lot of language data), but it is still not as relevant as actual LLM experiments.
I agree that we should stop with the analogies. Gather evidence to learn how it actually works. Let go of these old arguments that we don’t need anymore.