Processing math: 100%

Cascading Contextual Assortment Bandits

Part of Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track

Bibtex Paper Supplemental

Authors

Hyun-jun Choi, Rajan Udwani, Min-hwan Oh

Abstract

We present a new combinatorial bandit model, the \textit{cascading contextual assortment bandit}. This model serves as a generalization of both existing cascading bandits and assortment bandits, broadening their applicability in practice. For this model, we propose our first UCB bandit algorithm, UCB-CCA. We prove that this algorithm achieves a T-step regret upper-bound of ˜O(1κdT), sharper than existing bounds for cascading contextual bandits by eliminating dependence on cascade length K. To improve the dependence on problem-dependent constant κ, we introduce our second algorithm, UCB-CCA+, which leverages a new Bernstein-type concentration result. This algorithm achieves ˜O(dT) without dependence on κ in the leading term. We substantiate our theoretical claims with numerical experiments, demonstrating the practical efficacy of our proposed methods.