Title: Near-Optimal Bayesian Active Learning at Scale: Submodular Surrogates and Beyond
Abstract: In this talk, I will introduce the decision-theoretic value of information problem in the context of Bayesian active learning, where the goal is to learn the value of some unknown target variable (e.g., a classifier) through a sequence of informative, noisy tests. We show that for structured problems where the test outcomes are conditionally dependent given the target variable, common greedy heuristics, such as uncertainty sampling or myopic value of information, often perform poorly. We then devise efficient surrogate objectives that are amenable to greedy optimization, while still achieving strong approximation guarantees. A key property we seek in the design of such greedy heuristics is submodularity, a natural diminishing returns condition common to a broad class of decision-making problems. We discuss a few practical challenges for these approaches when training large models (e.g., it can be challenging to construct such surrogate functions, or computationally prohibitive to collect data in large batches). We introduce approximation algorithms to enable efficient data acquisition at scale. We demonstrate our algorithms on a variety of batched and sequential optimization tasks, including active learning, robotic manipulation, and sequential experimental design for protein engineering.
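For readers unfamiliar with the submodularity property mentioned in the abstract, the following is a minimal illustrative sketch (not the talk's surrogate objectives): the classic greedy algorithm applied to a set-coverage objective, which is monotone submodular and therefore carries the well-known (1 - 1/e) approximation guarantee. The function names `coverage` and `greedy_select` are hypothetical, chosen for this example.

```python
def coverage(selected, sets):
    """Submodular objective: number of elements covered by the chosen sets."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)

def greedy_select(sets, budget):
    """Pick up to `budget` sets, each step adding the one with the largest
    marginal gain. Diminishing returns (submodularity) is what makes this
    myopic rule provably near-optimal for monotone objectives."""
    selected = []
    for _ in range(budget):
        current = coverage(selected, sets)
        best, best_gain = None, 0
        for i in range(len(sets)):
            if i in selected:
                continue
            gain = coverage(selected + [i], sets) - current
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no remaining set adds new coverage
            break
        selected.append(best)
    return selected

# Toy instance: four candidate "tests", each revealing a subset of items.
sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
picked = greedy_select(sets, budget=2)
print(picked, coverage(picked, sets))  # → [0, 2] 6
```

The greedy choice at each step is exactly the "myopic" strategy the abstract contrasts with; the point of the talk is that when test outcomes are conditionally dependent, naive objectives lose this guarantee and carefully designed surrogates recover it.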
Orchard View Room, Virtual