Systems | Information | Learning | Optimization
 

SILO: Do Large Language Models Need Statistical Foundations?

Abstract: In this talk, we advocate for the development of rigorous statistical foundations for large language models (LLMs). We begin by elaborating on two key features that motivate statistical perspectives for LLMs: (1) the probabilistic, autoregressive nature of next-token prediction, and (2) the complexity and black-box nature of Transformer architectures. …

SILO: Variational inference – reconciling statistical and convergence guarantees

Abstract: As a computational alternative to Markov chain Monte Carlo approaches, variational inference (VI) is becoming increasingly popular for approximating intractable posterior distributions in large-scale Bayesian models due to its comparable efficacy and superior efficiency. Several recent works provide theoretical justifications of VI by proving its statistical optimality for parameter …

SILO: Characterizing the power of MCMC methods for sparse estimation

Abstract: Markov chain Monte Carlo (MCMC) and local-search optimization methods have been used extensively in statistical practice for many decades. However, their exact theoretical performance has remained strikingly elusive, even for simple parametric estimation tasks. This is in stark contrast to other classes of estimators such …

SILO: On counterfactual inference with unobserved confounding via exponential family

Abstract: We are interested in the problem of unit-level counterfactual inference in the presence of unobserved confounders owing to the increasing importance of personalized decision-making in many domains: consider a recommender system interacting with a user over time where each user is provided recommendations based on observed demographics, prior engagement …

SILO: Polynomial Graph Neural Networks: Theoretical Limits and Graph Noise Impact

Abstract: This talk examines the theoretical foundations of Graph Neural Networks (GNNs), focusing on polynomial GNNs (Poly-GNNs). We start with empirical evidence challenging the need for complex GNN architectures in semi-supervised node classification, showing simpler methods often perform comparably. We then analyze Poly-GNNs within a contextual stochastic block model, addressing …

SILO: Towards Secure Large Language Models: From Model to System

Abstract: We are witnessing a paradigm shift in AI, transitioning from deep learning models to the era of Large Language Models (LLMs). This shift signifies a transformative advancement in AI, enabling it to be applied to diverse real-world safety-critical applications. Despite these impressive achievements, a fundamental question remains: are …