Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Dirk van der Hoeven
We present \textproc{Gaptron}, a randomized first-order algorithm for online multiclass classification. In the full information setting we provide expected mistake bounds for \textproc{Gaptron} with respect to the logistic loss, hinge loss, and the smooth hinge loss with O(K) regret, where the expectation is with respect to the learner's randomness and K is the number of classes. In the bandit classification setting we show that \textproc{Gaptron} is the first linear time algorithm with O(K√T) expected regret. Additionally, the expected mistake bound of \textproc{Gaptron} does not depend on the dimension of the feature vector, contrary to previous algorithms with O(K√T) regret in the bandit classification setting. We present a new proof technique that exploits the gap between the zero-one loss and surrogate losses rather than exploiting properties such as exp-concavity or mixability, which are traditionally used to prove logarithmic or constant regret bounds.