Statistics for Unsupervised Learning for Large-Scale Data