A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark

Title: A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Gittens, A., Kottalam J., Yang J., Ringenburg M. F., Chhugani J., Racah E., Singh M., Yao Y., Fischer C., Ruebel O., Bowen B. P., Lewis N. G., Mahoney M., Krishnamurthy V., & Prabhat
Published in: Proceedings of the 5th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics
Keywords: data analytics, high performance computing, matrix factorization
Abstract

We investigate the performance and scalability of the randomized CX low-rank matrix factorization and demonstrate its applicability through the analysis of a 1TB mass spectrometry imaging (MSI) dataset, using Apache Spark on an Amazon EC2 cluster, a Cray XC40 system, and an experimental Cray cluster. We implemented this factorization both as a parallelized C implementation with hand-tuned optimizations and in Scala using the Apache Spark high-level cluster computing framework. We obtained consistent performance across the three platforms: using Spark, we were able to process the 1TB dataset in under 30 minutes with 960 cores on all systems, with the fastest times obtained on the experimental Cray cluster. In comparison, the C implementation was 21X faster on the Amazon EC2 system, due to careful cache optimizations, bandwidth-friendly access of matrices, and vector computation using SIMD units. We report these results and their implications for the hardware and software issues arising in supporting data-centric workloads in parallel and distributed environments.

URL: http://www.stat.berkeley.edu/~mmahoney/pubs/cx_parlearn_ipdps_2016.pdf
ICSI Research Group: Big Data