
Faculty of Economics


Klochkov, Y., Kroshnin, A. and Zhivotovskiy, N.

Robust k-means Clustering for Distributions with Two Moments

Annals of Statistics, forthcoming

(2020)

Abstract: We consider robust algorithms for the k-means clustering problem, where a quantizer is constructed from N independent observations. Our main results are median-of-means-based non-asymptotic excess distortion bounds that hold under the assumption of two bounded moments in a general separable Hilbert space. In particular, our results extend the renowned asymptotic result of Pollard (1981), who showed that the existence of two moments is sufficient for strong consistency of an empirically optimal quantizer in R^d. In the special case of clustering in R^d, under two bounded moments, we prove matching (up to constant factors) non-asymptotic upper and lower bounds on the excess distortion, which depend on the probability mass of the lightest cluster of an optimal quantizer. Our bounds have a sub-Gaussian form, and the proofs are based on versions of uniform bounds for robust mean estimators.
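
To illustrate the general median-of-means idea the abstract refers to, here is a minimal Python sketch that estimates the distortion of a fixed set of centers. It is not the authors' procedure: the function names, the `n_blocks` tuning parameter, and the choice to shuffle and drop leftover points are assumptions made only for this illustration.

```python
import random
import statistics


def squared_distance(x, c):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def point_distortion(x, centers):
    """Squared distance from x to its nearest center."""
    return min(squared_distance(x, c) for c in centers)


def mom_distortion(points, centers, n_blocks, seed=0):
    """Median-of-means estimate of the distortion of fixed centers.

    n_blocks is a hypothetical tuning parameter. Points that do not fit
    into equal-sized blocks are dropped (an assumption of this sketch).
    """
    if not 1 <= n_blocks <= len(points):
        raise ValueError("n_blocks must be between 1 and len(points)")
    # Per-point loss: squared distance to the nearest center.
    losses = [point_distortion(x, centers) for x in points]
    # Shuffle so that block membership does not depend on input order.
    random.Random(seed).shuffle(losses)
    block_size = len(losses) // n_blocks
    # Average the loss inside each block, then take the median of the averages.
    block_means = [
        statistics.fmean(losses[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    return statistics.median(block_means)


# Eight points at the origin plus one far outlier, with a single center.
points = [(0.0, 0.0)] * 8 + [(1000.0, 0.0)]
centers = [(0.0, 0.0)]
robust = mom_distortion(points, centers, n_blocks=3)
plain = mom_distortion(points, centers, n_blocks=1)  # one block: ordinary mean
```

With three blocks, the outlier can contaminate only one block mean, so the median ignores it, whereas the single-block estimate (the ordinary empirical distortion) is dominated by the outlier.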

Author links: Yegor Klochkov  

PDF Link: https://www.researchgate.net/profile/Alexey-Kroshnin/publication/339088843_Robust_k-means_Clustering_for_Distributions_with_Two_Moments/links/5e4337cf458515072d932588/Robust-k-means-Clustering-for-Distributions-with-Two-Moments.pdf


Papers and Publications



Recent Publications


Carvalho, V. M., Nirei, M., Saito, Y. U. and Tahbaz-Salehi, A. Supply Chain Disruptions: Evidence from the Great East Japan Earthquake. Quarterly Journal of Economics [2021]

Fruehwirth, J., Iyer, S. and Zhang, A. Religion and Depression in Adolescence. Journal of Political Economy [2019]

Ai, C., Linton, O., Motegi, K. and Zhang, Z. A Unified Framework for Efficient Estimation of General Treatment Models. Quantitative Economics, forthcoming [2021]

Todd, P. E. and Zhang, W. A Dynamic Model of Personality, Schooling, and Occupational Choice. Quantitative Economics [2020]