Systems | Information | Learning | Optimization

A Matrix Factorization Approach to Multiple Imputation and DCM Bandits: Learning to Rank with Multiple Clicks

A Matrix Factorization Approach to Multiple Imputation:

Almost all empirical analysis in the social sciences is plagued by the problem of missing data. For instance, in opinion surveys some respondents choose not to answer certain questions, and in longitudinal surveys respondents in the pilot round may drop out in subsequent rounds. Dealing with missing values often distracts from the main goal of a study, and many researchers use ad hoc methods to handle them, which then raises concerns about the validity of their inferences.

The Multiple Imputation technique developed by Rubin (1987) provides a structured and statistically valid technique for inference in the presence of missing values. The basic idea is that the inferences should reflect the uncertainty inherent in imputation. The chief caveat of the framework is the strong parametric assumptions involved. Nevertheless, the current state (and implementation) of the literature on missing values in the social sciences remains firmly centered around the Multiple Imputation framework.

On the other hand, computer scientists have long used Matrix Factorization approaches to address the problem of missing data, in particular Low Norm and Low Rank approximations of the data matrix. In this paper we extend work by Srebro (2004) and Udell et al. (2014) to the Multiple Imputation framework.
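To make the low-rank idea concrete, here is a minimal sketch of imputation by iterative truncated SVD. It is an illustration only, not the method of the talk and not the 'LowRankModels' implementation; the function name `low_rank_impute` and its parameters are hypothetical, and it assumes every column has at least one observed entry.

```python
import numpy as np


def low_rank_impute(X, rank=2, n_iters=100, tol=1e-6):
    """Fill NaN entries of X using a rank-`rank` approximation.

    Hypothetical helper: an iterative truncated-SVD sketch, assuming
    every column of X has at least one observed (non-NaN) entry.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Initialise missing entries with the mean of the observed values
    # in the same column.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    for _ in range(n_iters):
        # Best rank-`rank` approximation of the current filled matrix.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]

        # Keep observed entries fixed; update only the missing ones.
        new_filled = np.where(missing, approx, X)
        change = np.linalg.norm(new_filled - filled)
        filled = new_filled
        if change < tol:
            break

    return filled


# Usage: a rank-1 matrix with two entries hidden.
truth = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
observed = truth.copy()
observed[0, 0] = np.nan
observed[2, 1] = np.nan
completed = low_rank_impute(observed, rank=1)
```

Observed entries are never altered; only the missing cells are replaced by the low-rank reconstruction. A single call yields one imputation, so it does not by itself capture the imputation uncertainty that Multiple Imputation is designed to reflect.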

As empirical applications, we consider two social science datasets, the General Social Survey and the National Longitudinal Survey of Youth, where missing data is often an issue. For these datasets we compare single imputations from the traditional Multiple Imputation R packages ‘Amelia’ and ‘MICE’ with those from the Julia package ‘LowRankModels’. We find that ‘LowRankModels’ imputations dominate in terms of our error metric.

Next steps involve implementing a Multiple Imputation version of our methods and extending our approach to longitudinal data (possibly using tensor analysis).

DCM Bandits: Learning to Rank with Multiple Clicks:

Search engines recommend a list of web pages. The user examines this list, from the first page to the last, and may click on multiple attractive pages. This type of user behavior can be modeled by the dependent click model (DCM). In this work, we propose DCM bandits, an online learning variant of the DCM where the objective is to maximize the probability of recommending a satisfactory item. The main challenge of our problem is that the learning agent does not observe the reward. It only observes the clicks. This imbalance between the feedback and rewards makes our setting challenging. We propose a computationally efficient learning algorithm for our problem, which we call dcmKL-UCB; derive gap-dependent upper bounds on its regret under reasonable assumptions; and prove a matching lower bound up to logarithmic factors. We experiment with dcmKL-UCB on both synthetic and real-world problems. Our algorithm outperforms a range of baselines and performs well even when our modeling assumptions are violated. To the best of our knowledge, this is the first regret-optimal online learning algorithm for learning to rank with multiple clicks in a cascade-like model.
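The gap between observed clicks and the unobserved reward can be illustrated with a small simulation of the dependent click model. This is a sketch under assumptions, not the dcmKL-UCB algorithm: the names `simulate_dcm` and `satisfaction_probability` are hypothetical, each item is assumed to have an attraction probability, and each position is assumed to have a termination probability (the chance that a click at that position satisfies the user and ends the session).

```python
import random


def simulate_dcm(ranked_items, attraction, termination, rng):
    """Simulate one user session under a dependent click model.

    Hypothetical helper. Returns (clicks, satisfied): `clicks` is what a
    learner would observe; `satisfied` is the hidden reward.
    """
    clicks = []
    satisfied = False
    for pos, item in enumerate(ranked_items):
        # The user clicks if the examined item attracts them.
        clicked = rng.random() < attraction[item]
        clicks.append(clicked)
        # After a click, the user may be satisfied and stop examining.
        if clicked and rng.random() < termination[pos]:
            satisfied = True
            break
    # Positions that were never examined receive no click.
    clicks += [False] * (len(ranked_items) - len(clicks))
    return clicks, satisfied


def satisfaction_probability(ranked_items, attraction, termination):
    """Probability that the user is satisfied by at least one position."""
    p_unsatisfied = 1.0
    for pos, item in enumerate(ranked_items):
        p_unsatisfied *= 1.0 - termination[pos] * attraction[item]
    return 1.0 - p_unsatisfied


# Usage with made-up numbers for three items shown in two positions.
attraction = {"a": 0.6, "b": 0.3, "c": 0.1}
termination = [0.8, 0.5]
clicks, satisfied = simulate_dcm(["a", "b"], attraction, termination,
                                 random.Random(0))
```

A session whose last click is at the final position looks the same to the learner whether or not the user left satisfied, which is one way to see why clicks alone do not reveal the reward.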

This is joint work with Branislav Kveton, Csaba Szepesvari, and Zheng Wen.

February 10 @ 12:30 pm (1h)

Discovery Building, Orchard View Room

Nandana Sengupta, Sumeet Katariya