In this work, we aim to create a data marketplace: a robust matching mechanism for buying and selling data efficiently while optimizing social welfare and maximizing revenue. Although the monetization of data and pre-trained models is a central focus of many industries and vendors today, no existing market mechanism can price data and match buyers to vendors while also addressing the computational and other complexity involved in building a market platform. The challenge in creating such a marketplace stems from the very nature of data as an asset: (i) it can be replicated at zero marginal cost; (ii) its value to a firm is inherently combinatorial (i.e., the value of a particular dataset depends on which other, potentially correlated, datasets are available); (iii) its value to a firm depends on which other firms get access to the same data; (iv) prediction tasks, and the value of an increase in prediction accuracy, vary widely between firms, so it is not obvious how to set prices for a collection of datasets with correlated signals; and (v) the authenticity and truthfulness of data are difficult to verify a priori, without first applying the data to a prediction task. Our proposed marketplace takes a holistic view of this problem and provides an algorithmic solution that combines concepts from statistical machine learning, the economics of data across various application domains, algorithmic market design, and mathematical optimization under uncertainty. We will discuss some examples motivating this work.
This is joint work with Anish Agarwal, Tuhin Sarkar, and Devavrat Shah.
Discovery Building, Orchard View Room