Selecting RAID levels for disk arrays

Eric Anderson, Ram Swaminathan, Alistair Veitch, Guillermo Alvarez and John Wilkes

Abstract:

Disk arrays have a myriad of configuration parameters that interact in counter-intuitive ways, and those interactions can have significant impacts on cost, performance, and reliability. Even after values for these parameters have been chosen, there are exponentially many ways to map data onto the disk array's logical units. Meanwhile, the importance of correct choices is increasing: storage systems represent a growing fraction of total system cost, they need to respond more rapidly to changing needs, and there is less and less tolerance for mistakes. We believe that automatic design and configuration of storage systems is the only viable solution to these issues. To that end, we present a comparative study of a range of techniques for programmatically choosing the RAID levels to use in a disk array.

Our simplest approaches are modeled on existing, manual rules of thumb: they tag data with a RAID level before determining the configuration of the array to which it is assigned. Our best approach simultaneously determines the RAID levels for the data, the array configuration, and the layout of data on that array. It operates as an optimization process with the twin goals of minimizing array cost while ensuring that storage workload performance requirements will be met. This approach produces robust solutions with an average cost/performance 14-17% better than the best results for the tagging schemes, and up to 150-200% better than their worst solutions.

We believe that this is the first presentation and systematic analysis of a variety of novel, fully automatic RAID-level selection techniques.


Last modified: Tue Jul 10 21:36:53 PDT 2001 by Alistair Veitch (aveitch@hpl.hp.com)