Part of Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track
Akshaykumar Gattani, Sharath Raghvendra, Pouyan Shirzadian
Algorithms for the minimum-cost bipartite matching can be used to estimate Wasserstein distance between two distributions.Given two sets A and B of n points in a 2-dimensional Euclidean space, one can use a fast implementation of the Hungarian method to compute a minimum-cost bipartite matching of A and B in ˜O(n2) time. Let Δ be the spread, i.e., the ratio of the distance of the farthest to the closest pair of points in A∪B. In this paper, we present a new algorithm to compute a minimum-cost bipartite matching of A and B with a similar worst-case execution time of ˜O(n2logΔ). However, when A and B are drawn independently and identically from a fixed distribution that is not known to the algorithm, the execution time of our algorithm is, in expectation, ˜O(n7/4logΔ).To the best of our knowledge, our algorithm is the first one to achieve a sub-quadratic execution time even for stochastic point sets with real-valued coordinates.Our algorithm extends to any dimension d, where it runs in ˜O(n2−12dΦ(n)) time for stochastic point sets A and B; here Φ(n) is the query/update time of a dynamic weighted nearest neighbor data structure. Our algorithm can be seen as a careful adaptation of the Hungarian method in the geometric divide-and-conquer framework.