Systems | Information | Learning | Optimization
 

The surprising reasonableness of the earth movers distance in high dimensions

The earth mover’s distance (EMD) is a scalar measure of dissimilarity between histograms.  Introduced over 200 years ago, the EMD has played a central role in linear programming, information retrieval, and is emerging as useful objective in machine learning.  During the 1990’s, the EMD was generalized from a functional that acts …

Learning under Uncertainty: Jumping out of the traditional stochastic optimization framework

Uncertainty penetrates in every corner of machine learning, ranging from data and adversary uncertainty, model uncertainty, all the way to dynamics uncertainty, even task uncertainty, and beyond. When faced with complicated machine learning tasks under various forms of uncertainty, the traditional empirical risk minimization framework, along with the rich off-the-shelf …

Fundamental Perspectives on Machine Learning: Strategic Agents & Contemporary Models

Through recent advances in machine learning (ML) technology, we are getting closer to realizing the broadly stated goal of “articially intelligent”, autonomous agents. In many cases — like cognitive radio, swarm robotics, and e-commerce — these agents will not be acting in isolation, and it is critical for them to …

Advancing genome-scale phylogenomics through Disjoint Tree Merger methods

The estimation of evolutionary trees (called phylogenies) is an essential step in biological research; however large-scale phylogeny estimation continues to be computationally challenging. Many of the current leading methods are heuristics for NP-hard optimization problems, and these methods typically have limited parallelism for improving scalability to larger numbers of species. In this talk, I …

Unsolved Subspace Mysteries

Abstract: In this talk I will introduce two new subspace models: Mixture Matrix Completion, and Subspace Splitting. Both models are motivated by metagenomics, recommender systems, and dimensionality reduction, and are tightly related to low-rank matrix completion, subspace clustering, robust matched subspace detection, and the maximum feasible subsystem problem. I will …