location: Orchard View Room
SILO: Minimizing quadratics over integers
Abstract: Mixed integer quadratic programming is the problem of minimizing a quadratic polynomial over points in a polyhedral region with some integer components. It is a natural extension of mixed integer linear programming, and it has a wide array of applications. In this talk, I will survey some recent theoretical …
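For readers new to the problem, the description in this abstract can be written as the following optimization problem. This is a sketch of one common textbook form, not necessarily the formulation used in the talk; the symbols $H$, $h$, $A$, $b$, and the index set $I$ are assumed names that do not appear in the abstract.

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & x^\top H x + h^\top x \\
\text{s.t.} \quad & A x \le b, \\
& x_i \in \mathbb{Z} \quad \text{for all } i \in I \subseteq \{1, \dots, n\}.
\end{aligned}
```

Setting $H = 0$ recovers mixed integer linear programming, which is the sense in which the quadratic problem extends it.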
SILO: Neural Operators for Scientific Applications: Learning on Function Spaces
Abstract: Applying AI to scientific problems like weather forecasting and aerodynamics is an active research area, promising to accelerate model development and enable faster scientific discovery and engineering design. In practice, these applications require learning spatiotemporal processes and solutions to partial differential equations on continuous domains at multiple scales – …
SILO: Self-Improving Transformers: Overcoming Length Generalization Challenges
Abstract: Large language models can perform algorithmic tasks through test-time computation but struggle to generalize far beyond the task difficulty of the training distribution. These limitations manifest across even simple tasks like arithmetic, string manipulation, and maze solving, where transformers learn shortcuts rather than the underlying algorithms. While prior solutions …
SILO: Efficiently Searching for Distributions
Abstract: How efficiently can we search distributions? The problem is modeled as follows: we are given knowledge of k discrete distributions v_i for 1 <= i <= k over the domain [n] = {1,…,n} which we can preprocess. Then we get samples from an unknown discrete distribution p, also over …
SILO: Theory for Diffusion Models
Abstract: In this talk I will survey our recent efforts to develop a rigorous theory for understanding diffusion generative modeling. The first part will cover discretization analyses that prove that diffusion models can approximately sample from arbitrary probability distributions provided one has a sufficiently accurate estimate of the score …
SILO: Learning Dynamics for Nash and Coarse Correlated Equilibria in Bimatrix Games
Abstract: In this talk, we will focus on learning in two-player games. First, we will provide a brief introduction to the possible behaviors of learning algorithms and mention various techniques that have been extensively used to guarantee convergence to Nash equilibria in zero-sum games. Finally, we will demonstrate how these …
SILO: American Family Funding Initiative short talks
On the Effectiveness of Dataset Alignment for Fake Image Detection
Anirudh Sundara Rajan, MS student, Computer Sciences
Abstract: As latent diffusion models (LDMs) democratize image generation capabilities, there is a growing need to detect fake images. A good detector should focus on the generative model’s fingerprints while ignoring image properties …
SILO: Acceleration by Stepsize Hedging
Abstract: Can we accelerate the convergence of gradient descent without changing the algorithm — just by optimizing stepsizes? Surprisingly, we show that the answer is yes. Our proposed Silver Stepsize Schedule optimizes strongly convex functions in $k^{\log_p 2} = k^{0.7864}$ iterations, where $p=1+\sqrt{2}$ is the silver ratio and $k$ is …
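The exponent quoted in this abstract can be checked numerically. The short Python sketch below only verifies the arithmetic $\log_p 2 \approx 0.7864$ for $p = 1 + \sqrt{2}$; the variable names are illustrative, and it does not implement the stepsize schedule itself.

```python
import math

# Silver ratio, as defined in the abstract: p = 1 + sqrt(2).
silver_ratio = 1 + math.sqrt(2)

# Exponent log_p(2), computed with the change-of-base formula.
exponent = math.log(2) / math.log(silver_ratio)

print(f"log_p 2 = {exponent:.4f}")  # prints "log_p 2 = 0.7864"
```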
SILO: Beyond Decoder-Only Next Token Prediction
Abstract: This talk presents two distinct approaches that expand the potential of Transformer architectures beyond the traditional decoder-only, causal-attention models for next-token prediction. In the first half, we will examine looped Transformers with an adaptive iteration mechanism, demonstrating that these models can learn highly length-generalizable solutions for algorithmic tasks. The …