Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Akifumi Okuno, Hidetoshi Shimodaira
k-nearest neighbour (k-NN) is one of the simplest and most widely used methods for supervised classification; it predicts a query's label by taking a weighted ratio of the observed labels of the k objects nearest to the query. The weights and the parameter k∈N regulate its bias-variance trade-off, and this trade-off implicitly affects the convergence rate of the excess risk of the k-NN classifier; several existing studies have considered selecting optimal k and weights to obtain a faster convergence rate. Whereas k-NN with non-negative weights has been developed extensively, it has also been proved that negative weights are essential for eradicating the bias terms and attaining the optimal convergence rate. In this paper, we propose a novel multiscale k-NN (MS-k-NN), which extrapolates unweighted k-NN estimators from several k≥1 values to k=0, thus giving an imaginary 0-NN estimator. Our method implicitly computes optimal real-valued weights that are adaptive to the query and its neighbour points. We theoretically prove that MS-k-NN attains an improved rate, which coincides with the existing optimal rate under some conditions.
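The following is a minimal sketch of the extrapolation idea for binary classification, assuming scikit-learn is available. The particular choice of k values, the polynomial degree, and the regressor (k/n)^{2/d} used for the extrapolation are illustrative assumptions based on the usual smoothness-driven bias expansion, not the paper's exact formulation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def ms_knn_predict(X_train, y_train, X_query, ks=(8, 16, 32, 64), degree=2):
    """Sketch of multiscale k-NN: compute unweighted k-NN estimates at several
    values of k, then extrapolate them to k = 0 (the imaginary 0-NN estimator)."""
    n, d = X_train.shape
    nn = NearestNeighbors(n_neighbors=max(ks)).fit(X_train)
    _, idx = nn.kneighbors(X_query)  # indices of the max(ks) nearest neighbours per query

    preds = []
    for neighbours in idx:
        # Unweighted k-NN estimates of P(Y = 1 | x) for each k in ks.
        estimates = np.array([y_train[neighbours[:k]].mean() for k in ks])
        # Assumed regressor: (k/n)^{2/d}, the scale at which the leading bias
        # term is expected to grow; fit a low-degree polynomial in this variable
        # and read off its value at 0, i.e. the extrapolated "0-NN" estimate.
        t = (np.array(ks) / n) ** (2.0 / d)
        coefs = np.polyfit(t, estimates, deg=degree)
        p0 = np.polyval(coefs, 0.0)
        preds.append(int(p0 >= 0.5))
    return np.array(preds)
```

The extrapolation to k=0 is what implicitly assigns real-valued (possibly negative) weights to the neighbours, since the fitted intercept is a linear combination of the unweighted estimates at the chosen k values.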