Technical Reports

HPL-2009-38


GPU-Accelerated Large Scale Analytics

Wu, Ren; Zhang, Bin; Hsu, Meichun
HP Laboratories


Keyword(s): Data-mining, Clustering, Parallel Algorithm, GPU, GPGPU, K-Means, Multi-core, Many-core

Abstract: In this paper, we report our research on using GPUs as accelerators for Business Intelligence (BI) analytics, with particular interest in very large datasets, which are common in today's real-world BI applications. While many published works have shown that GPUs can be used to accelerate various general-purpose applications with respectable performance gains, few attempts have been made at very large problems. Our goal here is to investigate whether GPUs can be useful accelerators for BI analytics on very large datasets that exceed the GPU's onboard memory capacity. Using a popular clustering algorithm, K-Means, as an example, our results have been very positive. For datasets smaller than the GPU's onboard memory, the GPU-accelerated version is 6-12x faster than our highly optimized CPU-only version running on an 8-core workstation, or 200-400x faster than the popular benchmark program, MineBench, running on a single core. This is also 2-4x faster than the best reported work. For large datasets that exceed the GPU's memory capacity, we further show that by carefully overlapping the computation on both CPU and GPU, as well as the data transfers between them, the GPU-accelerated version can still offer a dramatic performance boost. For example, for a dataset with 100 million 2-d data points and 2 thousand clusters, the GPU-accelerated version took about 6 minutes, compared to 58 minutes for the CPU-only version running on an 8-core workstation. Compared to other approaches, GPU-accelerated analytics potentially provides the best raw performance, the best cost-performance ratio, and the best energy-performance ratio.
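
To illustrate how K-Means can be run over a dataset that does not fit in a device's memory at once, the following is a minimal CPU-only sketch in plain Python. It is not the authors' implementation and involves no GPU: it only shows the chunked structure, in which each chunk yields per-cluster partial sums and counts that are merged before the centroids are updated. The names `assign_chunk`, `kmeans_chunked`, and `chunk_size` are hypothetical. In a GPU setting, one would presumably transfer the next chunk while the current one is being processed; that overlap is not modeled here.

```python
from typing import List, Sequence, Tuple


def assign_chunk(
    chunk: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[int]]:
    """Assign each point in one chunk to its nearest centroid.

    Returns per-cluster coordinate sums and point counts for this chunk only.
    """
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point in chunk:
        # Nearest centroid by squared Euclidean distance.
        best = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])),
        )
        counts[best] += 1
        for d in range(dim):
            sums[best][d] += point[d]
    return sums, counts


def kmeans_chunked(
    points: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
    chunk_size: int,
    iterations: int,
) -> List[List[float]]:
    """Run a fixed number of K-Means iterations, one chunk at a time."""
    current = [list(c) for c in centroids]
    k = len(current)
    dim = len(current[0])
    for _ in range(iterations):
        total_sums = [[0.0] * dim for _ in range(k)]
        total_counts = [0] * k
        # Stream the dataset in chunks (stand-in for chunks sized to fit
        # device memory) and merge the partial results.
        for start in range(0, len(points), chunk_size):
            sums, counts = assign_chunk(points[start:start + chunk_size], current)
            for j in range(k):
                total_counts[j] += counts[j]
                for d in range(dim):
                    total_sums[j][d] += sums[j][d]
        # Update only non-empty clusters; empty ones keep their old centroid.
        for j in range(k):
            if total_counts[j] > 0:
                current[j] = [s / total_counts[j] for s in total_sums[j]]
    return current


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans_chunked(data, [(0.0, 0.0), (10.0, 10.0)], chunk_size=3, iterations=5))
```

Because the partial sums and counts are simply added together, which points fall into which chunk does not affect the assignment of any point within an iteration, which is what makes this kind of streaming possible in principle.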

10 Pages

External Posting Date: March 6, 2009. Approved for External Publication
Internal Posting Date: March 6, 2009
