Given a data set $\mathcal{X}=\{x_1, \dots, x_n\}$ and a weighted graph structure $\Gamma= (\mathcal{X},W)$ on $\mathcal{X}$, graph based methods for learning use analytical notions like graph Laplacians, graph cuts, and Sobolev semi-norms to formulate optimization problems whose solutions serve as sensible approaches to machine learning tasks. When the data set consists of samples from a distribution supported on a manifold (or at least approximately so), and the weights depend inversely on the distance between the points, a natural question to study concerns the behavior of those optimization problems as the number of samples goes to infinity. In this talk I will focus on optimization problems closely connected to clustering and supervised regression that involve the graph Laplacian. For clustering, the spectrum of the graph Laplacian is the fundamental object used in the popular spectral clustering algorithm. For regression, the solution to a semilinear elliptic PDE on the graph provides the minimizer of an energy balancing regularization and data fidelity, a sensible object to use in non-parametric regression.

Using tools from optimal transport, calculus of variations, and analysis of PDEs, I will discuss a series of results establishing the asymptotic consistency (with rates of convergence) of many of these analytical objects, as well as provide some perspectives on future research directions.

Using tools from optimal transport, calculus of variations, and analysis of PDEs, I will discuss a series of results establishing the asymptotic consistency (with rates of convergence) of many of these analytical objects, as well as provide some perspectives on future research directions.

February 13 @ 12:30

12:30 pm (1h)

Discovery Building, Orchard View Room

NIcolas Garcia Trillos