Statistics meets computation: Trade-offs between interpretability and flexibility

Modeling and tractable computation form two fundamental but competing pillars of data science; indeed, fitting models to data is often computationally challenging in modern applications. At the same time, a “good” model is one that imposes the right kind of structure on the underlying data-generating process, and this involves trading off the competing objectives of interpretability and flexibility. With a focus on balancing these tensions, I present tractable methodological solutions for fitting flexible models in some canonical machine learning tasks.

The bulk of the talk will focus on a class of “permutation-based” models, which present a flexible alternative to parametric modeling in a host of inference problems involving data generated by people. I introduce a set of algorithmic tools that handle structured missing data and break a conjectured computational barrier, demonstrating that carefully chosen non-parametric structure can significantly improve robustness to misspecification while maintaining interpretability. To conclude the talk, I draw on this perspective to study two vignettes in high-dimensional regression and reinforcement learning. A focus on exploiting structure in these contexts leads to novel statistical and algorithmic insights.

March 25, 2020

12:30 pm (1h)

Discovery Building, Orchard View Room

Ashwin Pananjady