Systems | Information | Learning | Optimization
 

SILO: Toward improving LLM English perplexity

Abstract

Over the past decade there has been a significant effort to improve the performance of Large Language Models (LLMs). The fundamental task in training LLMs is next-word prediction over a large corpus, as measured by the resulting perplexity. Models with lower perplexity consistently perform better across a variety of downstream tasks, including reasoning, coding, and question answering. In this talk we review existing perplexity-reduction approaches and show how information-theoretic, syntactic, statistical, and diversification-based techniques may help further reduce LLM perplexity on the common WikiText-103 benchmark. Based on joint work with Esen Ergun.
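For reference, the perplexity mentioned above is the standard language-modeling metric: the exponentiated average negative log-likelihood the model assigns to each token of a held-out corpus. For a corpus of tokens w_1, ..., w_N and model probabilities p, it is commonly written as

$$
\mathrm{PPL} = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p\left(w_i \mid w_1,\ldots,w_{i-1}\right)\right),
$$

so lower perplexity corresponds to the model assigning higher probability to the observed text.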

Bio

Alon Orlitsky received B.Sc. degrees in Mathematics and Electrical Engineering from Ben Gurion University, and M.Sc. and Ph.D. degrees in Electrical Engineering from Stanford University. After a decade with the Communications Analysis Research Department at Bell Laboratories and a year at D.E. Shaw and Company, he joined the University of California San Diego, where he is currently a professor of Electrical and Computer Engineering and of Computer Science and Engineering, holds the Qualcomm Chair for Information Theory and its Applications, and heads the Information Theory and Applications Center. His research spans information theory, statistical modeling, and machine learning, and focuses on fundamental limits and practical algorithms for extracting knowledge from data. Among other distinctions, Alon is a recipient of the 2021 Information Theory Society Claude E. Shannon Award and a co-recipient of the 2017 ICML Best Paper Honorable Mention Award, the 2015 NeurIPS Best Paper Award, the 2006 Information Theory Society Paper Award, and the 1992 IEEE W.R.G. Baker Award.

April 15, 2026
12:30 pm (1h)

Orchard View Room

Alon Orlitsky, UCSD