Systems | Information | Learning | Optimization
 

SILO: Faster Diffusion Language Models

Abstract:

Diffusion language models (DLMs) are a nascent but promising alternative to GPT-style autoregressive (AR) language models: instead of generating one token at a time, left to right, DLMs start from a set of noise tokens that they iteratively refine into text. Any-order generation can potentially yield more consistent text, while parallel generation can potentially be faster. In practice, however, parallel generation causes large drops in output quality, and DLMs currently match AR models only when used in one-token-at-a-time mode.
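As a rough illustration (not taken from the talk), the toy loop below sketches the kind of parallel iterative refinement described above: start from an all-noise (masked) sequence and, at each step, decode several positions in parallel from per-position token distributions. The stand-in model, constants, and confidence-based schedule are all hypothetical, not the speaker's architecture.

    # Minimal sketch of parallel iterative refinement in a DLM (illustrative only).
    # toy_model is a stand-in that returns random per-position distributions;
    # a real DLM would be a trained network conditioned on the current sequence.
    import numpy as np

    VOCAB, LENGTH, MASK, STEPS = 50, 16, -1, 4
    rng = np.random.default_rng(0)

    def toy_model(tokens):
        # Stand-in forward pass: one categorical distribution per position.
        logits = rng.normal(size=(LENGTH, VOCAB))
        return np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

    tokens = np.full(LENGTH, MASK)          # start from an all-noise (masked) sequence
    for step in range(STEPS):
        probs = toy_model(tokens)
        conf = probs.max(axis=1)            # per-position confidence of the argmax token
        conf[tokens != MASK] = -np.inf      # positions decoded earlier stay fixed
        k = LENGTH // STEPS                 # decode k positions in parallel each step
        chosen = np.argsort(conf)[-k:]
        tokens[chosen] = probs[chosen].argmax(axis=1)   # each position filled independently
    print(tokens)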

In this talk we identify two issues with current DLMs: (a) parallel generation samples from the product of per-token marginals rather than from the true joint distribution of tokens, and (b) early errors are the primary cause of drops in accuracy. We then develop a new architecture for better sampling, as well as a new self-training procedure, that together substantially mitigate these issues.
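A hedged toy example of issue (a), with numbers chosen purely for illustration: if each position is sampled independently from its marginal, combinations the true joint never produces can receive substantial probability.

    # Suppose the true joint over two adjacent tokens puts probability 0.5 on
    # ("new", "york") and 0.5 on ("los", "angeles"). The per-position marginals
    # are then uniform, so independent per-position sampling yields an
    # incoherent pair ("new angeles" or "los york") half of the time.
    joint = {("new", "york"): 0.5, ("los", "angeles"): 0.5}
    p1 = {"new": 0.5, "los": 0.5}          # marginal of position 1
    p2 = {"york": 0.5, "angeles": 0.5}     # marginal of position 2
    p_incoherent = sum(p1[a] * p2[b] for a in p1 for b in p2 if (a, b) not in joint)
    print(p_incoherent)                    # 0.5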

No prior knowledge of DLMs is assumed.

 

Bio: 

Sujay Sanghavi is the Fluor Centennial Fellow and Professor at UT Austin, where his research focuses on machine learning and optimization with applications to search, recommendations, and large-model training. At UT Austin he heads the EnCore and IFDS TRIPODS Institutes and is the Associate Director of the Amazon Science Hub. He has been a Principal Research Scientist at Amazon and is currently an Amazon Scholar there. He holds three degrees from the University of Illinois Urbana-Champaign: an MS in ECE, an MS in Math, and a PhD in ECE.

October 29, 2025
12:30 pm (1h)

Orchard View Room

Sujay Sanghavi, UT Austin