

Technical Reports



HP Labs


Architectural Sensitive Application Characterization: The Approach of High-Performance Index-Set (HP-Set)

Zhang, Zheng

HPL-2001-75

Keyword(s): shared memory multiprocessor architecture; performance evaluation

Abstract: Good simulation tools that provide architecturally relevant insights play a vital role in building complex systems such as shared-memory multiprocessors. In this report, we discuss HP-Set, a simulation tool that takes the core scheduling component of CIAT and integrates it with a set of statistics-gathering probes that generate the corresponding indexes. HP-Set stands for High-Performance Index-Set. In a nutshell, HP-Set is a portfolio whose major indexes are the following: general statistics, coherence misses, data reuse and locality, granularity, and the IO index. The objective of HP-Set is to be architecturally sensitive without evolving into a full functional simulator. We achieve this goal by omitting fancy statistics and by actually implementing relevant protocols that aim to optimize certain aspects of an index. By comparing an index with and without the perturbation of the protocols, we learn not only how large an impact the index has on overall performance, but also how likely it is that we can improve it architecturally. Using HP-Set, we analyzed several commercial applications and obtained insights not available before. For example, our overall analysis points out that it is a common misconception that TPCC is more memory intensive than TPCD; the difference is rather due to their pressures on the memory system. Our communication index indicates that third-party dirty hits dominate, and thus faster directory lookup and cache-to-cache transfer optimizations should be encouraged. On the other hand, the significant number of false-sharing misses is, and will continue to be, a dominant performance factor. Our granularity analysis suggests that the spatial locality of coherent objects is rather limited, and that blind sequential prefetching might do more harm than good. Our IO-Memory analysis finds that IO contributes a non-negligible share of total system traffic and is the major cause of cache misses.
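
The false-sharing observation in the abstract depends on separating coherence misses by cause. As a rough illustration only (the abstract does not describe how HP-Set does this, so this is not its implementation), the following Python sketch classifies the accesses in a memory trace. Everything in it is an assumption made for the example: the function name `classify_coherence_misses`, the `(cpu, op, addr)` trace format, the 64-byte line and 8-byte word sizes, infinite per-CPU caches (so no capacity misses), and a write-invalidate protocol. A miss on a previously invalidated line counts as true sharing if the accessed word was written by another CPU since the invalidation, and as false sharing otherwise. This is a simplification of the usual definitions.

```python
from collections import defaultdict


def classify_coherence_misses(trace, line_size=64, word_size=8):
    """Classify each access in `trace` as hit, cold, true or false sharing.

    `trace` is an iterable of (cpu, op, addr) tuples with op in {"R", "W"}.
    Assumes infinite caches and a write-invalidate protocol (illustrative only).
    """
    valid = defaultdict(set)   # line -> CPUs currently holding a valid copy
    seen = defaultdict(set)    # line -> CPUs that have ever cached the line
    # (line, cpu) -> words written by other CPUs since cpu lost its copy
    remote_writes = defaultdict(set)
    counts = {"hit": 0, "cold": 0, "true_sharing": 0, "false_sharing": 0}

    for cpu, op, addr in trace:
        line = addr // line_size
        word = (addr % line_size) // word_size

        if cpu in valid[line]:
            counts["hit"] += 1
        else:
            if cpu not in seen[line]:
                counts["cold"] += 1
            elif word in remote_writes[(line, cpu)]:
                # The word we need was modified remotely: real communication.
                counts["true_sharing"] += 1
            else:
                # Only other words in the line changed: false sharing.
                counts["false_sharing"] += 1
            # Fetch the line and reset the record of remote writes.
            valid[line].add(cpu)
            seen[line].add(cpu)
            remote_writes[(line, cpu)].clear()

        if op == "W":
            # Invalidate every other copy of the line.
            valid[line] = {cpu}
            # Record the written word for every CPU without a valid copy.
            for other in seen[line]:
                if other != cpu:
                    remote_writes[(line, other)].add(word)

    return counts
```

For instance, if CPU 0 writes the first word of a line while CPU 1 only reads the second word, CPU 1's next access is counted as a false-sharing miss; if CPU 0 then writes the second word, CPU 1's following read is counted as a true-sharing miss.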

20 Pages

© 2009 Hewlett-Packard Development Company, L.P.