Learning Large-Scale Poisson DAG Models based on OverDispersion Scoring

We address identifiability and learning algorithms for large-scale Poisson directed acyclic graphical (DAG) models. We define general Poisson DAG models as models in which each node is a Poisson random variable whose rate parameter depends on the values of its parents in the underlying DAG. First, we prove that Poisson DAG models are identifiable from observational data, and we present a polynomial-time algorithm that learns the Poisson DAG model under suitable regularity conditions. The main idea behind our algorithm is overdispersion: variables that are conditionally Poisson are overdispersed relative to variables that are marginally Poisson. Exploiting overdispersion allows us to learn the causal ordering, after which we use ideas from learning large-scale regression models to reduce computational complexity. We provide theoretical guarantees and simulation results for both small and large-scale DAGs to validate our algorithm.
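The overdispersion idea can be illustrated with a minimal sketch on a two-node chain X1 -> X2. This is not the algorithm presented in the talk; the function names, the rates, and the linear rate function 1 + 0.5 * X1 are all assumptions chosen for illustration. A marginally Poisson variable has variance equal to its mean, while a variable that is only conditionally Poisson has variance exceeding its mean, so the node with the smallest "variance minus mean" score is a candidate for the first position in the causal ordering.

```python
import numpy as np


def simulate_chain(n, seed=0):
    """Draw n samples from a hypothetical two-node Poisson DAG X1 -> X2."""
    rng = np.random.default_rng(seed)
    # X1 has no parents, so it is marginally Poisson with rate 2.
    x1 = rng.poisson(2.0, size=n)
    # X2 is Poisson given X1, with an assumed rate of 1 + 0.5 * X1.
    x2 = rng.poisson(1.0 + 0.5 * x1)
    return np.column_stack([x1, x2])


def overdispersion_scores(samples):
    """Return sample variance minus sample mean for each column."""
    return samples.var(axis=0, ddof=1) - samples.mean(axis=0)


samples = simulate_chain(100_000)
scores = overdispersion_scores(samples)
# The node with the smallest score is the candidate root of the ordering.
root = int(np.argmin(scores))
print("scores:", scores, "candidate root:", root)
```

Under these assumed rates, the law of total variance gives Var(X2) = E[1 + 0.5 X1] + 0.25 Var(X1) = 2.5 against a mean of 2, so the score of X2 should be near 0.5 while the score of X1 should be near 0.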

September 9, 2015

12:30 pm (1h)

Discovery Building, Orchard View Room

Gunwoong Park