Location: Orchard View Room
SILO: Bayesian Optimization Beyond the Black Box: Leveraging Computational Structure for Efficient and Scalable Decision-Making
Abstract: Bayesian optimization (BO) is a principled framework for optimizing expensive, noisy objective functions, but traditional BO treats the system as a black box and learns only through input-output queries. In many scientific and engineering settings, this assumption is unnecessarily restrictive; valuable computational structure is often available, even if the …
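For context on the black-box baseline the abstract contrasts with, here is a minimal sketch of a standard BO loop (Gaussian-process surrogate plus expected improvement) on a toy 1-D objective. The objective, bounds, and query budget are illustrative placeholders; this does not use the computational structure the talk is about.

```python
# Minimal black-box Bayesian optimization loop: GP surrogate + expected improvement.
# The toy objective, bounds, and budget below are illustrative placeholders.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Stand-in for an expensive, noisy black-box function.
    return np.sin(3 * x) + 0.1 * np.random.randn(*x.shape)

def expected_improvement(mu, sigma, best_y):
    # EI for minimization: expected amount by which a candidate beats the incumbent.
    sigma = np.maximum(sigma, 1e-9)
    z = (best_y - mu) / sigma
    return (best_y - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=(3, 1))            # initial design
y = objective(X).ravel()

for _ in range(20):                               # query budget
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    cand = np.linspace(0.0, 2.0, 500).reshape(-1, 1)
    mu, sigma = gp.predict(cand, return_std=True)
    x_next = cand[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next.reshape(1, -1)).ravel())

print("best x:", X[np.argmin(y)], "best y:", y.min())
```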
SILO: How to Use Synthetic Data for Improved Statistical Inference?
Abstract: The rapid proliferation of high-quality synthetic data — generated by advanced AI models or collected as auxiliary data from related tasks — presents both opportunities and challenges for statistical inference. Here, we introduce the GEneral Synthetic-Powered Inference (GESPI) framework that wraps around any statistical inference procedure to safely enhance …
SILO: High-dimensional Optimization with Applications to Compute-Optimal Neural Scaling Laws
Abstract: Given the massive scale of modern ML models, we now only get a single shot to train them effectively. This restricts our ability to test multiple architectures and hyper-parameter configurations. Instead, we need to understand how these models scale, allowing us to experiment with smaller problems and then apply …
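As a toy illustration of extrapolating from small-scale runs, the sketch below fits a commonly used power-law-plus-constant form, L(N) = a * N^(-b) + c, to hypothetical (parameter count, loss) pairs and extrapolates to a larger model. Both the functional form and the data points are assumptions for illustration, not results from the talk.

```python
# Illustrative scaling-law fit: extrapolate loss from small models to a larger one.
# The form L(N) = a * N**(-b) + c and the data points are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

def loss_vs_params(N, a, b, c):
    return a * N ** (-b) + c

# Hypothetical (parameter count, validation loss) pairs from small-scale runs.
N_small = np.array([1e6, 3e6, 1e7, 3e7, 1e8])
L_small = np.array([4.8, 4.2, 3.7, 3.3, 3.0])

(a, b, c), _ = curve_fit(loss_vs_params, N_small, L_small,
                         p0=[100.0, 0.3, 2.0], maxfev=10000)
print(f"fitted: a={a:.2f}, b={b:.3f}, c={c:.2f}")
print("predicted loss at 1B params:", loss_vs_params(1e9, a, b, c))
```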
SILO: Theory and practice of LLM quantization
Abstract: Modern LLMs process information by repeatedly applying one basic primitive: matrix multiplication. Estimates show that about 60-84% of the energy consumed by LLMs goes into memory load/store operations. How can we reduce this power consumption? An LLM converts text into a sequence of tokens (which can be thought of as …
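To make the memory-traffic argument concrete, here is a minimal sketch of per-row symmetric int8 round-to-nearest weight quantization in NumPy, which cuts weight storage 4x relative to fp32. This is a generic scheme for illustration, not the specific quantization method the talk analyzes.

```python
# Minimal per-row symmetric int8 weight quantization (round-to-nearest).
# Generic illustration of trading precision for memory traffic; not the
# particular quantization scheme discussed in the talk.
import numpy as np

def quantize_int8(W):
    # One scale per output row: scale = max|w| / 127.
    scale = np.abs(W).max(axis=1, keepdims=True) / 127.0
    scale = np.maximum(scale, 1e-12)                  # avoid division by zero
    Wq = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
    return Wq, scale

def dequantize(Wq, scale):
    return Wq.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8)).astype(np.float32)    # toy weight matrix
x = rng.standard_normal(8).astype(np.float32)         # toy activation

Wq, scale = quantize_int8(W)
y_fp = W @ x                                          # full-precision matmul
y_int8 = dequantize(Wq, scale) @ x                    # dequantized matmul
print("max abs error:", np.abs(y_fp - y_int8).max())
print("memory: fp32 bytes =", W.nbytes, " int8 bytes =", Wq.nbytes)
```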
SILO: Learning from the Right Teacher in Knowledge Distillation
Abstract: Knowledge distillation has become a central technique for training small language models, yet a fundamental question remains unresolved: what characterizes an effective teacher for a given student? This talk presents two complementary results that shed light on this problem. First, I will examine progressive distillation, where a student learns not only from …
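For readers unfamiliar with the basic setup, the sketch below computes a standard distillation loss (temperature-softened KL divergence to the teacher, mixed with cross-entropy on the hard label) in NumPy. The logits, temperature, and mixing weight are placeholders, and this does not capture the progressive-distillation analysis described in the talk.

```python
# Minimal knowledge-distillation loss: KL(teacher || student) on temperature-
# softened logits plus cross-entropy with the hard label. All inputs are
# illustrative placeholders.
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    p_t = softmax(teacher_logits, T)     # soft teacher targets
    p_s = softmax(student_logits, T)     # soft student predictions
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))) * T * T
    ce = -np.log(softmax(student_logits)[label] + 1e-12)
    return alpha * kl + (1.0 - alpha) * ce

student_logits = np.array([1.2, 0.3, -0.5, 0.1])
teacher_logits = np.array([2.5, 0.4, -1.0, 0.0])
print("loss:", distillation_loss(student_logits, teacher_logits, label=0))
```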
SILO: Towards discrete diffusion models for language and image generation
Abstract: We discuss discrete diffusion models that offer a unified framework for jointly modeling categorical data such as text and images. We present a new model that we have developed for language generation called the Anchored Diffusion Language Model (ADLM). ADLM is grounded in a novel two-stage framework that first …
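As background, the sketch below shows the forward corruption step of a generic absorbing-state ("masking") discrete diffusion model, where each token is independently replaced by a MASK symbol with probability tied to the noise level. This is the standard setup, not ADLM's two-stage framework, which the truncated abstract does not spell out.

```python
# Forward corruption for an absorbing-state ("masking") discrete diffusion model:
# each token is independently replaced by MASK with probability t. A generic
# illustration, not ADLM itself.
import numpy as np

MASK = -1                                  # placeholder id for the mask token

def corrupt(tokens, t, rng):
    """Replace each token with MASK with probability t (0 = clean, 1 = all masked)."""
    tokens = np.asarray(tokens)
    keep = rng.random(tokens.shape) >= t
    return np.where(keep, tokens, MASK)

rng = np.random.default_rng(0)
x0 = np.array([12, 7, 31, 5, 19, 2])       # toy token ids for one sequence
for t in (0.25, 0.5, 0.9):
    print(f"t={t}:", corrupt(x0, t, rng))
# A denoiser is then trained to predict the original tokens at masked positions;
# generation runs the reverse process starting from an all-MASK sequence.
```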
SILO: Bayesian Preference Exploration: Making Optimization Accessible to Non-Experts
Abstract: Optimization problems are everywhere — routing trucks, buying groceries, building a datacenter. Yet optimization methodology is hard to use. It requires the user to write down their objective and constraints as mathematical functions. In practice, the objective and constraints are unknown and must be tuned iteratively. An expert presents …
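As a generic illustration of learning an unknown objective from a user's pairwise comparisons, the sketch below fits a linear utility with a Bradley-Terry-style logistic model by gradient ascent. The features, comparisons, and step size are invented for illustration; this is not the preference-exploration method presented in the talk.

```python
# Generic pairwise-preference learning: fit a linear utility u(x) = w @ x from
# "option A preferred to option B" answers with a Bradley-Terry / logistic model.
# All data and hyperparameters are illustrative placeholders.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_preferences(A, B, steps=500, lr=0.1):
    """A[i] was preferred to B[i]; estimate w by gradient ascent on the log-likelihood."""
    w = np.zeros(A.shape[1])
    for _ in range(steps):
        p = sigmoid((A - B) @ w)                    # P(A preferred | w)
        w += lr * (A - B).T @ (1.0 - p) / len(A)    # log-likelihood gradient
    return w

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])                 # hidden "true" preferences
X1 = rng.standard_normal((40, 3))
X2 = rng.standard_normal((40, 3))
a_wins = sigmoid((X1 - X2) @ w_true) > rng.random(40)
# Arrange each comparison so the first argument holds the winner.
A = np.where(a_wins[:, None], X1, X2)
B = np.where(a_wins[:, None], X2, X1)
w_hat = fit_preferences(A, B)
print("recovered utility direction:", w_hat / np.linalg.norm(w_hat))
```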
SILO: Qualia Optimization: Exploring Mathematical Formulations of AI Experience
Abstract: This talk explores the speculative question: what if current or future AI systems have qualia, such as pain or pleasure? It does so by assuming that AI systems might someday possess qualia—and that the quality of these subjective experiences should be considered alongside performance metrics. Concrete mathematical problem settings, …