Kangwook Lee

SILO: Beyond Decoder-Only Next Token Prediction

Abstract: This talk presents two distinct approaches that expand the potential of Transformer architectures beyond the traditional decoder-only, causal-attention models for next-token prediction. In the first half, we will examine looped Transformers with an adaptive iteration mechanism, demonstrating that these models can learn highly length-generalizable solutions for algorithmic tasks. The …

SILO: Theoretical Exploration of Foundation Model Adaptation Methods

TBA

SILO: Score-based Generative Modeling Secretly Minimizes the Wasserstein Distance

Kangwook Lee, University of Wisconsin–Madison

Make-or-break issues in fair classification

Kangwook Lee, University of Wisconsin–Madison

Learning with scarce data: The role of side information, simulators, and GANs

In this talk, I will present the role of side information, simulators, and GANs in learning with scarce data. In the first part, I will talk about the role of similarity graphs in recommendation systems. In the second part, I will discuss the role of simulators and GANs.