Covariance-Aware Private Mean Estimation Without Private Covariance Estimation

Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

Authors

Gavin Brown, Marco Gaboardi, Adam Smith, Jonathan Ullman, Lydia Zakynthinou

Abstract

We present two sample-efficient differentially private mean estimators for $d$-dimensional (sub)Gaussian distributions with unknown covariance. Informally, given $n$ samples from such a distribution with mean $\mu$ and covariance $\Sigma$, our estimators output $\tilde\mu$ such that $\|\tilde\mu - \mu\|_{\Sigma} \leq \alpha$, where $\|\cdot\|_{\Sigma}$ is the Mahalanobis distance. All previous estimators with the same guarantee either require strong a priori bounds on the covariance matrix or require $\Omega(d^{3/2})$ samples. Each of our estimators is based on a simple, general approach to designing differentially private mechanisms, but with novel technical steps to make the estimator private and sample-efficient. Our first estimator samples a point with approximately maximum Tukey depth using the exponential mechanism, restricted to the set of points of large Tukey depth. Proving that this mechanism is private requires a novel analysis. Our second estimator perturbs the empirical mean of the data set with noise calibrated to the empirical covariance. Only the mean is released, however; the covariance is used only internally. Its sample complexity guarantees hold more generally for subgaussian distributions, albeit with a slightly worse dependence on the privacy parameter. For both estimators, careful preprocessing of the data is required to satisfy differential privacy.
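
As a rough, non-authoritative sketch of the two core ideas described above (not the paper's actual mechanisms), one might write the following. `approx_tukey_depth` Monte Carlo-approximates the Tukey depth that scores candidates in the first estimator's exponential mechanism, and `cov_calibrated_mean` perturbs the empirical mean with noise shaped by the empirical covariance, as in the second estimator. The function names, the parameters `n_dirs` and `noise_scale`, and the omission of the paper's preprocessing and privacy calibration are all assumptions for illustration; neither function is differentially private as written.

```python
import numpy as np

rng = np.random.default_rng(0)

def approx_tukey_depth(y, X, n_dirs=200):
    """Monte Carlo approximation of the Tukey (halfspace) depth of y in X:
    for each sampled direction u, take the smaller fraction of data points
    on either side of the hyperplane through y with normal u, then minimize
    over directions. (Exact Tukey depth is hard to compute in high dimension.)"""
    n, d = X.shape
    U = rng.standard_normal((n_dirs, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)   # random unit directions
    frac_below = (X @ U.T <= y @ U.T).mean(axis=0)  # fraction of X projected below y
    return float(np.minimum(frac_below, 1.0 - frac_below).min())

def cov_calibrated_mean(X, noise_scale):
    """Sketch of the second estimator's core step: release the empirical mean
    plus Gaussian noise shaped by the empirical covariance, so the error is
    controlled in Mahalanobis distance. Sigma_hat is used only internally and
    never released. `noise_scale` is a hypothetical stand-in for the paper's
    privacy-calibrated noise magnitude."""
    n, d = X.shape
    mu_hat = X.mean(axis=0)                 # empirical mean
    Sigma_hat = np.cov(X, rowvar=False)     # empirical covariance (internal only)
    z = rng.multivariate_normal(np.zeros(d), noise_scale ** 2 * Sigma_hat)
    return mu_hat + z

# Toy usage on synthetic data.
X = rng.standard_normal((1000, 5))
print(approx_tukey_depth(X.mean(axis=0), X))   # near 0.5 for a deep, central point
print(cov_calibrated_mean(X, noise_scale=0.05))
```

In the first estimator, a depth score like this would then feed an exponential mechanism restricted to high-depth points; making that step private, and choosing `noise_scale` for the second estimator, is exactly where the paper's novel analysis and preprocessing come in.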