Technical Reports

HPL-2009-175

Enhancing and Optimizing a Data Protection Solution

Cherkasova, Ludmila; Lau, Roger; Burose, Harald; Kappler, Bernhard
HP Laboratories

Keyword(s): backup, workload analysis, dynamics, enterprise information assets analysis, optimization, performance modelling

Abstract: Analyzing and managing large amounts of unstructured information is a high-priority task for many companies. To implement content management solutions, companies need a comprehensive view of their unstructured data. Providing a new level of intelligence and control over data resident within the enterprise requires a chain of tools and automated processes that enable the evaluation, analysis, and visibility of information assets and their dynamics during the information life-cycle. We propose a novel framework that utilizes the existing backup infrastructure by integrating additional content analysis routines and extracting already available filesystem metadata over time. This metadata is used to perform the data analysis and trending required for adding performance optimization and self-management capabilities to backup and information management tasks. Backup management faces serious challenges of its own: processing ever-increasing amounts of data while meeting the timing constraints of backup windows may require adaptive changes in backup scheduling routines. We revisit traditional backup job scheduling and demonstrate that random job scheduling may lead to inefficient backup processing and increased backup time. In this work, we use historic information about object backup processing times and propose an additional job scheduling policy, together with automated parameter tuning, that may significantly reduce the overall backup time. Under this scheduling policy, called LBF, the longest backups (the objects with the longest backup times) are scheduled first. We evaluate the performance benefits of the introduced scheduling using a realistic workload collected from seven backup servers at HP Labs. A significant reduction in backup time (up to 30%) and improved quality of service can be achieved under the proposed job assignment policy.
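
The LBF idea described in the abstract can be illustrated with a minimal sketch. It is not taken from the report: it assumes that each object's previous backup duration is known, that a fixed number of identical drives process jobs concurrently, and that the next job in the queue always goes to whichever drive becomes free first. The names `makespan` and `lbf_order`, the object names, and the durations are all hypothetical.

```python
import heapq


def makespan(durations, num_drives):
    """Total backup time when jobs are taken in the given order and each
    job is assigned to the drive that becomes free first (assumed model)."""
    if num_drives < 1:
        raise ValueError("num_drives must be at least 1")
    free_at = [0.0] * num_drives  # time at which each drive becomes idle
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)          # earliest available drive
        heapq.heappush(free_at, start + duration)
    return max(free_at)


def lbf_order(history):
    """Order object names by historic backup duration, longest first."""
    return sorted(history, key=lambda name: history[name], reverse=True)


# Hypothetical historic backup durations, in hours.
history = {"home_a": 3, "home_b": 3, "mail": 3, "wiki": 3, "db_dump": 10}

arrival_order = list(history)      # one unlucky ordering: longest job last
lbf = lbf_order(history)           # longest job first

unlucky_time = makespan([history[n] for n in arrival_order], num_drives=2)
lbf_time = makespan([history[n] for n in lbf], num_drives=2)
print(f"unlucky order: {unlucky_time} h, LBF order: {lbf_time} h")
```

In this toy input, starting the longest job last leaves one drive idle while the other finishes it; starting it first lets the short jobs fill the second drive in parallel. The numbers are illustrative only and say nothing about the gains measured in the report.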

10 Pages

Additional Publication Information: Published in Proceedings of the 17th IEEE/ACM International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS'2009), London, UK, September 21-23, 2009.

External Posting Date: December 16, 2009. Approved for External Publication
Internal Posting Date: August 6, 2009
