Systems | Information | Learning | Optimization
Backward Feature Correction: How can Deep Learning perform Deep Learning

How does a 110-layer ResNet learn a high-complexity classifier from relatively few training examples in a short amount of training time? We present a theory that explains this learning process in terms of hierarchical learning. By hierarchical learning we mean that the learner represents a complicated target function by decomposing it into a …
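
A minimal sketch of the decomposition idea in NumPy (the toy target, architecture, and hyperparameters below are my assumptions for illustration, not the construction from the talk): a two-layer network is trained end to end on a composed target g(h(x)), so gradients flowing back from the upper layer keep correcting the lower layer's features as training proceeds.

# Toy illustration of hierarchical learning: a two-stage model fits a
# composed target g(h(x)) jointly, so errors in the lower-level features
# are corrected by gradients from the layer above.
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    h = np.tanh(x)        # simple "first-level" feature
    return np.sin(3 * h)  # higher-level function composed on top

X = rng.uniform(-2, 2, size=(256, 1))
y = target(X)

W1 = rng.normal(scale=0.5, size=(1, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)

lr = 0.05
for step in range(2000):
    H = np.tanh(X @ W1 + b1)   # learned first-level features
    pred = H @ W2 + b2
    err = pred - y
    # Backpropagation: the gradient reaching W1 carries information from the
    # layer above, refining the low-level features rather than freezing them.
    gW2 = H.T @ err / len(X); gb2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1 - H ** 2)
    gW1 = X.T @ dH / len(X); gb1 = dH.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print("final mse:", float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2)))

Trained greedily one layer at a time instead, the lower layer would have to commit to its features before ever seeing the residual left for the layer above.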

Advances in Gradient Descent Methods for Non-Convex Optimization

Thanks to a flurry of recent research motivated by applications in machine learning, the convergence of gradient descent methods for smooth, non-convex, unconstrained optimization is now well understood in the centralized setting. In this talk I will discuss our progress towards understanding how the convergence of gradient descent methods (including SGD and acceleration) is …
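
For reference, the centralized baseline alluded to above is a textbook fact rather than a result from this talk: if f is L-smooth and bounded below by f^*, then gradient descent with step size 1/L,

    x_{t+1} = x_t - \tfrac{1}{L} \nabla f(x_t),

satisfies

    \min_{0 \le t < T} \|\nabla f(x_t)\|^2 \le \frac{2L \left( f(x_0) - f^* \right)}{T},

so an \epsilon-approximate stationary point is reached within O(1/\epsilon^2) iterations.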

Biologically interpretable machine learning modeling for understanding functional genomics

Robust phenotype-genotype associations have been established for a number of human diseases, including brain disorders (e.g., schizophrenia, bipolar disorder). However, the cellular and molecular mechanisms linking genotype to phenotype remain elusive. To address this, recent scientific projects have generated large multi-omic datasets; for example, the PsychENCODE consortium generated ~5,500 genotype, …

Learning to do Structured Inference in Natural Language Processing

Many tasks in natural language processing, computer vision, and computational biology involve predicting structured outputs. Researchers are increasingly applying deep representation learning to these problems, but the structured component of these approaches is usually quite simplistic. For example, neural machine translation systems use unstructured training of local factors followed by …
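
To make the "structured component" concrete, here is a minimal example of exact structured inference (my illustration in NumPy, not the speaker's system): Viterbi decoding over a label chain, where local emission scores and pairwise transition scores are combined by dynamic programming instead of each output position being predicted independently.

# Viterbi decoding over a chain of T positions with K labels: the output is
# a whole sequence scored by local (emission) and pairwise (transition)
# factors. Scores here are random placeholders for, e.g., neural-net outputs.
import numpy as np

rng = np.random.default_rng(1)
T, K = 6, 4
emission = rng.normal(size=(T, K))    # score of label k at position t
transition = rng.normal(size=(K, K))  # score of label j following label i

best = np.zeros((T, K))               # best[t, k]: best prefix ending in k
back = np.zeros((T, K), dtype=int)    # backpointers for path recovery
best[0] = emission[0]
for t in range(1, T):
    cand = best[t - 1][:, None] + transition  # cand[i, j]: extend i with j
    back[t] = cand.argmax(axis=0)
    best[t] = cand.max(axis=0) + emission[t]

labels = [int(best[-1].argmax())]     # backtrack the highest-scoring path
for t in range(T - 1, 0, -1):
    labels.append(int(back[t, labels[-1]]))
labels.reverse()
print("best label sequence:", labels)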

A function space view of overparameterized neural networks

Contrary to classical bias/variance trade-offs, deep learning practitioners have observed that vastly overparameterized neural networks with the capacity to fit virtually any labels nevertheless generalize well when trained on real data. One possible explanation of this phenomenon is that complexity control is being achieved by implicitly or explicitly controlling the magnitude of …
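
One standard way to formalize "controlling the magnitude" in function space (stated here as background; the talk's precise results are truncated above) is min-norm interpolation:

    \min_{f \in \mathcal{F}} \|f\| \quad \text{subject to} \quad f(x_i) = y_i, \quad i = 1, \dots, n,

where \|f\| is a complexity measure such as an RKHS norm or a norm induced by the magnitudes of the network's weights; generalization then depends on the norm of the learned interpolant rather than on the raw parameter count.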

Statistics meets computation: Trade-offs between interpretability and flexibility

Modeling and tractable computation form two fundamental but competing pillars of data science; indeed, fitting models to data is often computationally challenging in modern applications. At the same time, a “good” model is one that imposes the right kind of structure on the underlying data-generating process, and this involves trading …