A large covariance matrix estimator under intermediate spikiness regimes

Farnè, Matteo; Montanari, Angela

This paper concerns large covariance matrix estimation via composite minimization under the assumption of a low rank plus sparse structure. In this approach, the low rank plus sparse decomposition of the covariance matrix is recovered by least squares minimization under nuclear norm plus $l_1$ norm penalization. We propose a new estimator of that family, based on an additional least-squares re-optimization step that un-shrinks the eigenvalues of the low rank component estimated at the first step. We prove that this un-shrinkage brings the final estimate as close as possible to the target in Frobenius norm while exactly recovering the underlying low rank and sparsity pattern. Consistency is guaranteed when $n$ is at least $O(p^{\frac{3}{2}\delta})$, provided that the maximum number of non-zeros per row in the sparse component is $O(p^{\delta})$ with $\delta \leq \frac{1}{2}$. Consistent recovery is ensured if the latent eigenvalues scale to $p^{\alpha}$, $\alpha \in [0,1]$, while rank consistency is ensured if $\delta \leq \alpha$. The resulting estimator is called UNALCE (UNshrunk ALgebraic Covariance Estimator) and is shown to outperform state-of-the-art estimators, especially in terms of fitting properties and sparsity pattern detection. The effectiveness of UNALCE is illustrated on a real example using ECB banking supervisory data.
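The two-step idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the alternating solver, the function names, the thresholds `psi` and `rho`, and the simulated data are all assumptions made for illustration. The first step alternates eigenvalue soft-thresholding (the low rank update) with entrywise soft-thresholding (the sparse update); the second step re-fits the retained eigenvalues by least squares, holding the eigenvectors and the sparse part fixed.

```python
import numpy as np

def soft_threshold(x, t):
    # Entrywise soft-thresholding: proximal operator of t * l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def eig_threshold(m, t):
    # Soft-threshold the eigenvalues of a symmetric matrix and keep
    # only the strictly positive ones (this fixes the estimated rank).
    w, v = np.linalg.eigh((m + m.T) / 2.0)
    w_shrunk = np.maximum(w - t, 0.0)
    keep = w_shrunk > 0
    vecs, vals = v[:, keep], w_shrunk[keep]
    return (vecs * vals) @ vecs.T, vecs, vals

def low_rank_plus_sparse(sigma_n, psi, rho, n_iter=500):
    # Step 1 (illustrative solver): alternate exact minimization over the
    # low rank and sparse blocks of
    #   0.5 * ||sigma_n - L - S||_F^2 + psi * ||L||_* + rho * ||S||_1.
    p = sigma_n.shape[0]
    sparse = np.zeros((p, p))
    for _ in range(n_iter):
        low, vecs, vals = eig_threshold(sigma_n - sparse, psi)
        sparse = soft_threshold(sigma_n - low, rho)
    return low, sparse, vecs, vals

def refit_eigenvalues(sigma_n, sparse, vecs):
    # Step 2 (assumed form of the un-shrinkage): least-squares re-fit of
    # the eigenvalues with eigenvectors and sparse part held fixed.
    # The minimizer is the diagonal of vecs' (sigma_n - sparse) vecs.
    target = sigma_n - sparse
    vals = np.einsum("ij,ij->j", vecs, target @ vecs)
    return (vecs * vals) @ vecs.T, vals

# Simulated example: two strong latent factors plus unit noise variance.
rng = np.random.default_rng(0)
p, r, n = 20, 2, 200
loadings = 2.0 * rng.normal(size=(p, r))
true_cov = loadings @ loadings.T + np.eye(p)
data = rng.multivariate_normal(np.zeros(p), true_cov, size=n)
sigma_n = data.T @ data / n

low, sparse, vecs, vals = low_rank_plus_sparse(sigma_n, psi=1.0, rho=0.05)
low_unshrunk, vals_unshrunk = refit_eigenvalues(sigma_n, sparse, vecs)
```

Because the shrunk eigenvalues are themselves a feasible choice in the second step, the re-fit can never increase the Frobenius distance between `sigma_n` and the sum of the low rank and sparse parts, while the rank and the sparsity pattern from the first step are left unchanged.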

Matteo Farnè, Angela Montanari (2018). A large covariance matrix estimator under intermediate spikiness regimes.