Systems | Information | Learning | Optimization
 

On the Existence of Compact Epsilon-Approximated Formulations for Knapsack in the Original Space

We show that there exists a family of Knapsack polytopes such that for each P in the family and each epsilon>0, any epsilon-approximated formulation of P in the original space R^n requires a number of inequalities that is super-polynomial in n. This answers a question by Bienstock and McClosky (2012). …

Quantifying Ad Fraud – Contamination Estimation via Convex Relaxations

Identifying contamination in datasets is important in a wide variety of settings, including view and click fraud in online advertising. After a brief overview of digital ad fraud, I’ll describe a technique for estimating contamination in large, categorical datasets. The technique involves solving a series of convex programs, resulting in …

Internet Device Graphs

Digital adverting is arguably the largest and most ubiquitous application of machine learning. Learning algorithms pick the ads we see by inferring information about who we are and what we might buy. Graph datasets, due to their simplicity, play a central role in facilitating this inference. Internet Device Graphs are …

An Active Learning System with applications to Psychology Research

Today, machine learning is responsible for most of what we perceive as the personalization of the web: automatic recommendations for movies (Netflix) or music (Spotify,Last.fm), personalized search results based on your recent searches or email (Google), automatic credit card fraud detection (Chase), social network friend identification (Facebook,Linked-in), and, of course, …