Systems | Information | Learning | Optimization

Causal Inference and the Data-Fusion Problem


Causal inference is usually divided into two categories, experimental (Fisher) and observational (Pearl), which, by and large, are studied separately. Reality is more demanding. Experimental and observational studies are but two extremes of a rich spectrum of research designs that generate the bulk of the data available in practical, large-scale situations. In typical medical explorations, for example, data from multiple observations and experiments are collected, coming from distinct experimental setups, different sampling conditions, and heterogeneous populations.

In this talk, I will discuss some of the latest results in the field of causal inference that attempt to make sense of large and heterogeneous amounts of data. In particular, I will introduce the data-fusion problem, which is concerned with piecing together multiple datasets collected under disparate conditions (to be defined) so as to obtain statistically valid answers to queries of interest. The availability of multiple heterogeneous datasets presents new opportunities to data analysts, since combined data can yield knowledge that could not be obtained from any individual source alone. However, the biases that emerge in heterogeneous environments require new analytical tools. Some of these biases, including confounding, sampling selection, and cross-population biases, have been addressed in isolation, largely in restricted parametric models. I will present my work on a general, non-parametric framework for handling these biases and, ultimately, a theoretical solution to the problem of data-fusion in causal inference tasks. I will end the talk by discussing some of the implications of this new framework for decision-making, including the new, sharp boundary between population-level and individual-level (personalized) inferences.

Suggested readings:

E. Bareinboim and J. Pearl, Causal Inference and the Data-Fusion Problem, Proceedings of the National Academy of Sciences, 113(27): 7345-7352, 2016.

E. Bareinboim, A. Forney, and J. Pearl, Bandits with Unobserved Confounders: A Causal Approach, Proceedings of Neural Information Processing Systems (NIPS), 2015.


Elias Bareinboim is an assistant professor in the Department of Computer Science at Purdue University, with a courtesy appointment in Statistics. His research focuses on causal and counterfactual inference and their applications to data-driven fields. Bareinboim received a Ph.D. in Computer Science from UCLA, where he worked with Judea Pearl. His doctoral thesis was the first to propose a general solution to the problem of “data-fusion” and provided practical methods for combining datasets generated under different experimental conditions. Bareinboim’s recognitions include IEEE AI’s 10 to Watch, the Dan David Prize Scholarship, the Yahoo! Key Scientific Challenges Award, and the 2014 AAAI Outstanding Paper Award.

March 15 @ 12:30 pm (1h)

Discovery Building, Orchard View Room

Elias Bareinboim