Optimal detection of a sparse principal component

Sparse Principal Component Analysis (SPCA) is a remarkably useful tool for practitioners who had been relying on ad-hoc thresholding methods. Our analysis aims at providing a a framework to test whether the data at hand indeed contains a sparse principal component. More precisely we propose an optimal test procedure to detect the presence of a sparse principal component in a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general and we describe a computationally efficient alternative test using convex relaxations. Our relaxation is also proved to detect sparse principal components at near optimal detection levels and performs very well on simulated datasets. Moreover, we exhibit some evidence from average case complexity that this slight deterioration may be unavoidable.

October 31, 2012

12:30 pm (1h)

Discovery Building, Orchard View Room

Philippe Rigollet