From Skeptic to Believer: How Diffusion Models Are Reshaping Language Generation
When I first encountered diffusion models back in 2020, I dismissed them as elegant solutions for continuous domains like images but fundamentally incompatible with the discrete nature of language. Like many in the field, I was convinced that autoregressive models (ARMs) were the only sensible architecture for text generation. After all, language is inherently sequential, and the causal attention mechanism in models like GPT seemed perfectly designed for this constraint. ...