I am afraid that diffusion-based models will be far more powerful than LLMs trained with similar compute on a similar amount of data, especially at generating a new and diverse set of ideas, which is vitally important[1] for automating research. While LLMs are expected to become comparable to scatterbrained employees who thrive under careful management, diffusion-based models seem to work more like human imagination. See also the phenomenon of humans dreaming up ideas.
Unfortunately, as Peter Barnett remarks, there isn’t a clear way to get a (faithful) Chain-of-Thought from diffusion-based models. The backtracking in diffusion-based models appears to resemble the neuralese[2] of LLMs, which are thought to become far more powerful and very hard to interpret.
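To illustrate what that backtracking looks like mechanically, here is a minimal sketch of masked-diffusion text generation, assuming a hypothetical `model` that scores every position in parallel; the remasking schedule is my own simplification, not any particular lab’s implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def diffusion_generate(model, seq_len, steps=8, mask_id=0):
    """Toy masked-diffusion sampler: every position is denoised in parallel."""
    tokens = np.full(seq_len, mask_id)                # start fully masked
    for step in range(1, steps + 1):
        logits = model(tokens)                        # (seq_len, vocab) scores
        probs = softmax(logits)
        guess = probs.argmax(axis=-1)
        conf = probs.max(axis=-1)
        # Keep only the top-confidence positions this round; everything else,
        # including tokens committed earlier, is remasked and may be revised.
        # This revision is the "backtracking": there is no left-to-right
        # chain of intermediate text to read off as a CoT.
        keep = max(1, (seq_len * step) // steps)
        threshold = np.sort(conf)[::-1][keep - 1]
        tokens = np.where(conf >= threshold, guess, mask_id)
    return tokens
```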
Another complication of using diffusion-based LLMs is that they think big-picture about the entire response. That might also make it easier for these models to develop a worldview and to think about their long-term goals and how best to achieve them. Unlike in the AI-2027 scenario, the phase where Agent-3 is misaligned but not adversarially so would be severely shortened or absent outright.
Therefore, a potentially safe strategy would be to test only how the capabilities of diffusion-based models depend on model size and data quantity, develop interpretability tools[3] for them by using the less capable but better-interpretable LLMs, and only then develop interpretable diffusion-based models.
For example, the optimistic scenario, unlike Kokotajlo’s takeoff forecast, assumes that the AI’s creative process fails to differ sufficiently from instance to instance.
CoT-using models are currently bottlenecked by the inability to transfer more than one token between steps. Neuralese recurrence makes interpretability far more difficult. I proposed a technique where, instead of a neuralese memo, the LLM receives a text memo from itself, and the LLM is trained to ensure that the text stays related to the prompt’s subject.
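As a sketch of that memo idea under my own assumptions (the `generate_with_memo` and `embed` interfaces are hypothetical placeholders, and the relevance term is just embedding cosine similarity), a training step could add an auxiliary loss like this:

```python
import torch.nn.functional as F

def memo_step(model, embed, prompt, prev_memo, relevance_weight=0.1):
    """One step of the memo loop: the model carries state only as plain text."""
    # The model sees the prompt plus its own previous *text* memo;
    # no hidden activations (neuralese) are passed between steps.
    answer, new_memo = model.generate_with_memo(prompt, prev_memo)

    # Auxiliary training signal: penalise memos that drift off-topic,
    # approximated here by cosine similarity of memo and prompt embeddings.
    sim = F.cosine_similarity(embed(new_memo), embed(prompt), dim=-1)
    relevance_loss = relevance_weight * (1.0 - sim).mean()

    return answer, new_memo, relevance_loss
```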
The race ending of the AI-2027 scenario has Agent-4 develop such tools and construct Agent-5 on an entirely different architecture. But the forecasters didn’t know about Google’s experiments with diffusion-based models and assumed that it would be an AI that came up with Agent-5’s architecture.
What is this based on? Is there, currently, any reason to expect diffusion LLMs to have a radically different set of capabilities from LLMs? Is there a reason to expect them to scale better?
I have seen one paper arguing that diffusion is stronger.
The most interesting result there: a small 6M(!) parameter model is able to solve 9x9 Sudokus with 100% accuracy. In my own experiments, using a Llama 3B model and a lot of finetuning and engineering, I got up to ~50% accuracy with autoregressive sampling.[1]
For Sudoku this seems pretty obvious: in many cases there is a cell that is very easy to fill, so you can solve the problem piece by piece in whatever order is easiest. On the other hand, autoregressive models sometimes have to keep the full solution “in their head” before filling in the first cell.
(As an aside, I tried some Sudokus in Gemini Diffusion, and it couldn’t solve them.)
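To make the order-of-filling point concrete, here is a toy sketch (my own, not from the paper): a solver that may pick any cell can always go for the one with the fewest legal candidates, while a strict left-to-right filler must commit to the first empty cell even when it is still ambiguous.

```python
def candidates(grid, r, c):
    """Legal digits for cell (r, c) of a 9x9 grid (0 = empty)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]

def fill_easiest_first(grid):
    """Any-order strategy: repeatedly fill the cell with the fewest candidates."""
    while True:
        empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
        if not empties:
            return grid
        r, c = min(empties, key=lambda rc: len(candidates(grid, *rc)))
        opts = candidates(grid, r, c)
        if len(opts) != 1:
            return grid  # stuck: this greedy pass would need search/backtracking
        grid[r][c] = opts[0]
```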
So yeah, there are some tasks that should definitely be easier for diffusion LLMs to solve. Their scaling behaviour is not well researched yet, afaict.
I don’t fully trust the results of this paper, as they seem too strong and I have not yet been able to replicate them well. But the principle holds and the idea is good.