Systems | Information | Learning | Optimization
 

SILO: Unsupervised Learning: Validation beyond Visualization

Abstract:

While machine learning is many times faster than humans at finding patterns in scientific data, the task of validating these patterns as “meaningful” is still left to the scientist, or to ad hoc methods such as visualization. To effectively accelerate scientific discovery with machine learning, human validation must be replaced with automated validation to the extent possible. Otherwise, instead of drowning in data, one risks drowning in hypotheses. In this talk I will present instances in which unsupervised learning tasks can be augmented with data-driven guarantees of reproducibility and correctness.
In the case of clustering, I will introduce a new framework for proving that a clustering is approximately “correct”. The guarantees provided are distribution-free, but unlike the PAC bounds in supervised learning, the bounds for clustering can be calculated exactly by solving a convex program and can be of direct practical utility.
In the case of non-linear dimension reduction by manifold learning, I will demonstrate some of my group’s contributions to making the output of ML algorithms reproducible and interpretable. Surprisingly, some of the results bring us back to familiar machine learning methods such as sparse recovery.
Joint work with Dominique Perrault-Joncas, James McQueen, Yu-Chia Chen, Samson Koelle, Hanyu Zhang, Weicheng Wu, Ioannis Kevrekidis

Bio:

Marina Meila is Professor of Statistics at the University of Washington and Senior Fellow of the University of Washington’s eScience Institute. Her long-term interest is in statistical learning, particularly the discovery of geometric and combinatorial structure in data, efficient algorithms, and developing guarantees and validation methods for unsupervised learning with minimal or no assumptions about the data-generating process. She has collaborated with scientists in applied inverse problems, materials science and theoretical chemistry. Meila holds an MS degree in Electrical Engineering from the Polytechnic Institute of Bucharest, and a PhD in Computer Science and Electrical Engineering from the Massachusetts Institute of Technology.

March 13 @ 12:30 pm (1h)

Discovery Building, Orchard View Room

Marina Meila, UWash