Orchard View Room – Page 3

SILO: First-Order Algorithms for Large-Scale Optimization

Abstract: It is well known that for nonconvex unconstrained optimization with Lipschitz smoothness, gradient descent and stochastic gradient descent are the optimal first-order algorithms in the deterministic and stochastic settings, respectively. This naturally raises two questions: In the constrained setting, is it possible to design algorithms that achieve the same …

SILO: Searching for architectures and BERT moments in specialized AI applications

Abstract: In 2018, advances in architecture design and self-supervised learning led to the “BERT moment” in natural language processing, in which supervised learning workflows were permanently supplanted by the pretraining and fine-tuning of massive Transformer models. This spurred scientists in more specialized areas—e.g. genomics, satellite imaging, and time series forecasting—to develop …

SILO: Some Online Combinatorial Optimization and Dynamic Pricing Problems

Abstract: Optimizing subsets of items arises in many contexts, from designing antibiotic cocktails, to bundling cable channels or streaming services, to selecting the tap list at a pub. Such problems often exhibit diminishing returns: adding a third antibiotic may improve efficacy, but not as much as adding the second. Prior …

SILO: Stable Estimators for Fast Private Statistics

Abstract: We will discuss a new set of techniques for stable statistical estimation, leading to fast and near-optimal private algorithms for mean estimation, covariance estimation, and linear regression. The analysis proceeds by constructing a stabilizing wrapper around a greedy outlier-removal process. We will also discuss connections with a recent line …

SILO: Recovering Dantzig-Wolfe Bounds by Cutting Planes

Abstract Dantzig-Wolfe (DW) decomposition is a well-known technique in mixed-integer programming (MIP) for decomposing and convexifying constraints to obtain potentially strong dual bounds. We investigate cutting planes that can be derived using the DW decomposition algorithm and show that these cuts can provide the same dual bounds as DW decomposition. …

SILO: Optimizing Optimization Methods, To and Beyond Minimax Optimality

Abstract: This talk will take up the task of designing the provably best possible gradient method for smooth convex optimization. Methods with big-O optimal worst-case guarantees were (famously) discovered in the 80s by Nesterov. Methods with exactly minimax optimal worst-case guarantees were developed in the last decade. As a first …

SILO: Automating Statistical Inference for Modern Probabilistic Models

Abstract: How do you reconstruct an image of a black hole using only noisy telescope measurements in the Fourier domain? How do you predict the cure state of a carbon fibre aircraft wing as it cures in an autoclave using a few faulty thermocouples? In theory, Bayesian probabilistic models are …

SILO: Do Large Language Models Need Statistical Foundations?

Abstract: In this talk, we advocate for the development of rigorous statistical foundations for large language models (LLMs). We begin by elaborating two key features that motivate statistical perspectives for LLMs: (1) the probabilistic, autoregressive nature of next-token prediction, and (2) the complexity and black box nature of Transformer architectures. …

SILO: Variational inference – reconciling statistical and convergence guarantees

Abstract: As a computational alternative to Markov chain Monte Carlo approaches, variational inference (VI) is becoming increasingly popular for approximating intractable posterior distributions in large-scale Bayesian models due to its comparable efficacy and superior efficiency. Several recent works provide theoretical justifications of VI by proving its statistical optimality for parameter …

SILO: Reinforcement Learning and Bayesian Optimization for Nuclear Fusion

Abstract: Nuclear fusion holds the promise of limitless clean energy and would solve many of the world’s grand challenges. The most promising approach to date uses tokamaks, but we have not yet been able to sustain plasmas at the temperatures, pressures, and durations needed to make fusion power viable. This …