Systems | Information | Learning | Optimization
 

The surprising reasonableness of the earth mover's distance in high dimensions

The earth mover’s distance (EMD) is a scalar measure of dissimilarity between histograms. Introduced over 200 years ago, the EMD has played a central role in linear programming and information retrieval, and is emerging as a useful objective in machine learning. During the 1990s, the EMD was generalized from a functional that acts …
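For one-dimensional histograms, the EMD coincides with the first Wasserstein distance, which SciPy exposes directly. A minimal sketch (the bin centers and weights below are made up for illustration, not taken from the talk):

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two 1-D histograms defined over the same bin centers
bins = np.array([0.0, 1.0, 2.0, 3.0])
hist_u = np.array([0.5, 0.5, 0.0, 0.0])  # mass concentrated on the left
hist_v = np.array([0.0, 0.0, 0.5, 0.5])  # the same mass shifted right by 2

# EMD = minimum total "work" (mass x distance moved) to morph hist_u into hist_v
emd = wasserstein_distance(bins, bins, u_weights=hist_u, v_weights=hist_v)
print(emd)  # 2.0: each unit of mass travels a distance of 2
```

In higher dimensions the optimal-transport plan is no longer closed-form and the EMD is typically computed by solving a linear program, which is part of what makes its behavior in high dimensions interesting.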

Learning under Uncertainty: Jumping out of the traditional stochastic optimization framework

Uncertainty permeates every corner of machine learning, ranging from data and adversarial uncertainty to model uncertainty, all the way to dynamics uncertainty, even task uncertainty, and beyond. When faced with complicated machine learning tasks under these various forms of uncertainty, the traditional empirical risk minimization framework, along with the rich off-the-shelf …
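As a point of reference for the framework the abstract mentions, empirical risk minimization (ERM) simply minimizes the average loss over the observed sample. A minimal sketch with synthetic least-squares data (all numbers here are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = X @ w_true + noise
n, d = 200, 3
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

def empirical_risk(w):
    """Average squared loss over the training sample."""
    return np.mean((X @ w - y) ** 2)

# Plain gradient descent on the empirical risk
w = np.zeros(d)
for _ in range(500):
    grad = 2.0 * X.T @ (X @ w - y) / n
    w -= 0.1 * grad

print(np.round(w, 1))  # close to w_true
```

ERM implicitly trusts the empirical distribution of the sample; the forms of uncertainty listed above (shifted data, adversaries, misspecified models) are exactly the settings where that assumption breaks down.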

Fundamental Perspectives on Machine Learning: Strategic Agents & Contemporary Models

Through recent advances in machine learning (ML) technology, we are getting closer to realizing the broadly stated goal of autonomous, “artificially intelligent” agents. In many cases, such as cognitive radio, swarm robotics, and e-commerce, these agents will not be acting in isolation, and it is critical for them to …

Advancing genome-scale phylogenomics through Disjoint Tree Merger methods

The estimation of evolutionary trees (called phylogenies) is an essential step in biological research; however, large-scale phylogeny estimation remains computationally challenging. Many of the current leading methods are heuristics for NP-hard optimization problems, and these methods typically offer limited parallelism, which restricts their scalability to larger numbers of species. In this talk, I …

Unsolved Subspace Mysteries

In this talk I will introduce two new subspace models: Mixture Matrix Completion and Subspace Splitting. Both models are motivated by metagenomics, recommender systems, and dimensionality reduction, and are tightly related to low-rank matrix completion, subspace clustering, robust matched subspace detection, and the maximum feasible subsystem problem. I will …

Optimization over nonconvex constraints & Gradient Coding via Sparse Random Graphs

Many problems in modern statistics can be formulated as optimization problems with structured constraints, where the constraints often exhibit nonconvexity, such as sparsity or low rank. However, working with nonconvex constraints presents challenges from both a theoretical and a practical point of view. In this talk, we discuss a convergence …
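A classic example of optimizing over a nonconvex constraint set is projected gradient descent onto the sparsity constraint {w : ||w||_0 <= s}, better known as iterative hard thresholding. A minimal sketch on synthetic noiseless data (the problem sizes and coefficients below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Sparse linear regression: recover an s-sparse w_true from y = X @ w_true
n, d, s = 200, 50, 3
w_true = np.zeros(d)
w_true[:s] = [2.0, -1.5, 1.0]
X = rng.normal(size=(n, d)) / np.sqrt(n)  # columns roughly unit norm
y = X @ w_true

def project_sparse(w, s):
    """Euclidean projection onto the nonconvex set {w : ||w||_0 <= s}:
    keep the s largest-magnitude entries, zero out the rest."""
    out = np.zeros_like(w)
    top = np.argsort(np.abs(w))[-s:]
    out[top] = w[top]
    return out

# Projected gradient descent (iterative hard thresholding)
w = np.zeros(d)
step = 1.0
for _ in range(200):
    w = project_sparse(w - step * X.T @ (X @ w - y), s)

print(np.round(w[:s], 2))  # recovered nonzero entries, close to [2.0, -1.5, 1.0]
```

The projection itself is cheap, but the constraint set is nonconvex, so the usual convex analysis does not apply; recovery guarantees for schemes like this instead rely on conditions such as the restricted isometry property.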

Causal discovery with high dimensional non-Gaussian data & Scalable Generalized Linear Bandits: Online Computation and Hashing

In this talk, we will consider linear structural equation models (SEMs) which correspond to directed acyclic graphs (DAGs). These models assume that each observed variable is a linear function of the other variables plus an error term. It has previously been shown for DAGs that, when the error terms in a SEM …
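To make the model concrete, here is a sketch of sampling from a small linear SEM whose coefficient matrix respects a DAG; the graph, edge weights, and choice of uniform (non-Gaussian) errors are illustrative assumptions, not details from the talk:

```python
import numpy as np

rng = np.random.default_rng(2)

# DAG over (X1, X2, X3): X1 -> X2, X1 -> X3, X2 -> X3.
# Row i of B holds the coefficients of Xi's parents, so B is strictly
# lower triangular under the causal ordering (X1, X2, X3).
B = np.array([[0.0,  0.0, 0.0],   # X1 has no parents
              [0.8,  0.0, 0.0],   # X2 = 0.8*X1 + e2
              [0.4, -0.5, 0.0]])  # X3 = 0.4*X1 - 0.5*X2 + e3

n = 5000
# Non-Gaussian errors (here uniform); with Gaussian errors the causal
# direction of a linear SEM is generally not identifiable.
E = rng.uniform(-1.0, 1.0, size=(n, 3))

# The SEM reads x = B x + e per sample, so x = (I - B)^{-1} e;
# for row-stacked data this is X = E (I - B)^{-T}.
X = E @ np.linalg.inv(np.eye(3) - B).T

print(X.shape)  # (5000, 3)
```

Causal discovery runs this construction in reverse: given only samples X, infer which strictly-triangular B (equivalently, which DAG) generated them.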