Abstract:
Optimizing subsets of items arises in many contexts, from designing antibiotic cocktails, to bundling cable channels or streaming services, to selecting the tap list at a pub. Such problems often exhibit diminishing returns: adding a third antibiotic may improve efficacy, but not as much as adding the second. Prior work models this phenomenon through submodularity, a property that allows for favorable computational guarantees. We study the online setting, where nothing is known a priori about item values. At each round, the learner proposes a set and observes the resulting outcome (e.g., prescribing a drug cocktail and seeing whether the patient recovers). The goal is to maximize cumulative reward, or equivalently, minimize cumulative loss. While related problems have been studied, we establish new upper and lower bounds that match for the first time and take an unusual but enlightening form.
We also consider the challenge of jointly optimizing a set and a scalar decision (e.g., choosing a cocktail and a dosage or price). This brings forward questions about modeling human choices among multiple options. We show that, while pricing a single item online is tractable under minimal assumptions (cf. Kleinberg and Leighton, 2003), the problem becomes intractable when buyers have arbitrary unknown valuations. We therefore characterize the minimal assumptions under which tractable learning is possible when pricing multiple items for sequentially arriving heterogeneous buyers, comparing against the best fixed prices in hindsight. In essence, tractability requires that expected behavior vary smoothly with posted prices. Finally, if time permits, I will briefly highlight recent work on multi-seller marketplaces and mechanisms that provably prevent non-competitive pricing.