As an example of the kind of capability diffusion models offer that autoregressive LLMs don't: you aren't limited to predicting the tokens that come after a piece of text; you can natively edit somewhere in the middle.
Unfortunately, it still remains hard to insert a bunch of new tokens into the middle of the text, shifting everything else to the right. If that capability were added, I visualise it as adding asides/footnotes/comments to parts of the text, then iterating on the meaning of the text as a whole (so the comments can influence each other too), and finally folding the extended structure back into plain text.
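To make the distinction concrete, here is a minimal toy sketch of what the two operations could look like: in-place editing of a middle span via iterative mask-and-refill, and insertion by splicing in extra mask tokens before running the same loop. Nothing here uses a real diffusion LM API; `fill_masks` is a random stand-in for an actual denoiser that would condition on both left and right context, and all names are hypothetical.

```python
import random

MASK = "<mask>"
VOCAB = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"]


def fill_masks(tokens, rng):
    """Stand-in for a diffusion denoiser: fill every masked position at once.

    A real model would predict tokens conditioned on the whole sequence;
    here we just sample from a toy vocabulary.
    """
    return [rng.choice(VOCAB) if t == MASK else t for t in tokens]


def edit_middle(tokens, start, end, steps, seed=0):
    """In-place edit: mask a middle span and iteratively re-denoise it."""
    rng = random.Random(seed)
    tokens = tokens[:start] + [MASK] * (end - start) + tokens[end:]
    for _ in range(steps):
        proposal = fill_masks(tokens, rng)
        # Re-mask part of the edited span so later steps can revise it in
        # light of the rest of the sequence (the "comments influencing each
        # other" idea above).
        tokens = [
            MASK if start <= i < end and rng.random() < 0.5 else t
            for i, t in enumerate(proposal)
        ]
    return fill_masks(tokens, rng)


def insert_tokens(tokens, position, n_new, steps, seed=0):
    """Insertion: splice n_new mask tokens in, shifting everything else to
    the right, then let the same denoising loop fill them."""
    tokens = tokens[:position] + [MASK] * n_new + tokens[position:]
    return edit_middle(tokens, position, position + n_new, steps, seed)


if __name__ == "__main__":
    text = "the quick fox over the dog".split()
    print(edit_middle(text, 2, 3, steps=4))   # rewrite one middle token
    print(insert_tokens(text, 2, 2, steps=4)) # grow the text in the middle
```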
Nice to see that the diffusion-model hypothesis is being tested, and even better to hear that it works as I anticipated!