A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares

TitleA Statistical Perspective on Randomized Sketching for Ordinary Least-Squares
Publication TypeUnpublished
Year of Publication2014
AuthorsRaskutti, G., & Mahoney M. W.
Other Numbers3676

We consider statistical aspects of solving large-scale least-squares (LS) problems using randomized sketching algorithms. Prior work has typically adopted an \emph{algorithmic perspective}, in that it has made no statistical assumptions on the input X and Y, and instead it has assumed that the data (X,Y) are fixed and worst-case. In this paper, we adopt a \emph{statistical perspective}, and we consider the mean-squared error performance of randomized sketching algorithms, when data (X,Y) are generated according to a statistical linear model Y=X?+?, where ? is a noise process. To do this, we first develop a framework for assessing, in a unified manner, algorithmic and statistical aspects of randomized sketching methods. We then consider the statistical predicition efficiency (SPE) and the statistical residual efficiency (SRE) of the sketched LS estimator; and we use our framework to provide results for several types of random projection and random sampling sketching algorithms. Among other results, we show that the SRE can be bounded when p?r?n but that the SPE typically requires the sample size r to be substantially larger. Our theoretical results reveal that, depending on the specifics of the situation, leverage-based sampling methods can perform as well as or better than projection methods. Our empirical results reveal that when r is only slightly greater than p and much less than n, projection-based methods out-perform sampling-based methods, but as r grows, sampling methods start to out-perform projection methods.

Bibliographic Notes

Technical Report, Preprint: arXiv:1406.5986

Abbreviated Authors

G. Raskutti and M. W. Mahoney

ICSI Research Group

Big Data

ICSI Publication Type