Saturday, June 6, 2009

What Is Our Current Data Storage Capacity?

Storage capacity, current data volume, and rate of growth are the three minimum pieces of information required to determine when more storage capacity is needed. Organizations should keep close tabs on these three metrics. Many people equate storage capacity with disk drive capacity, but hard disks are not the only media on which data is stored. Hard drives are commonly used for primary online storage, with tape or other removable media (e.g., optical disk) used for backup. Vendors offer compelling technologies for storing data online or near-line on tape and optical media, but hard drives remain the most prevalent. When conducting enterprise capacity planning, planners should examine all tiers of storage. Ultimately, data should be categorized by business value and stored in a tier of storage that provides the appropriate level of performance and protection.
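As a rough illustration of how the three metrics named above combine, the sketch below estimates how many months remain before current data volume reaches capacity. The function names and the two growth models (a fixed amount per month, or a fixed percentage per month) are assumptions made for this example, not a method taken from the white paper; real growth is rarely this regular.

```python
import math


def months_until_full(capacity_tb, used_tb, monthly_growth_tb):
    """Months until used data reaches capacity, assuming a constant
    amount of growth per month (linear model, an assumption)."""
    if used_tb >= capacity_tb:
        return 0.0          # already at or over capacity
    if monthly_growth_tb <= 0:
        return math.inf     # no growth, so capacity is never reached
    return (capacity_tb - used_tb) / monthly_growth_tb


def months_until_full_compound(capacity_tb, used_tb, monthly_growth_rate):
    """Same estimate when data grows by a fixed fraction each month,
    e.g. 0.03 for 3 percent (compound model, an assumption)."""
    if used_tb >= capacity_tb:
        return 0.0
    if monthly_growth_rate <= 0 or used_tb <= 0:
        return math.inf
    # Solve used * (1 + rate) ** months = capacity for months.
    return math.log(capacity_tb / used_tb) / math.log(1 + monthly_growth_rate)


if __name__ == "__main__":
    # Hypothetical figures: 25 TB usable, 15 TB stored, growing 0.5 TB/month.
    print(months_until_full(25, 15, 0.5))
```

With the hypothetical figures shown, the linear model gives 20 months of headroom. Capacity here should be usable capacity for the tier in question, and the estimate should be refreshed whenever any of the three inputs changes.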

Considering that data storage costs comprise as much as 15 percent of IT operational spending and up to 20 percent of IT capital spending[i], storage capacity and data volumes are worthy of management attention. Enterprise IT usage policies and storage quotas are key controls for managing the growth of data. Strategic planning and development of an enterprise data life cycle policy and associated procedures can help organizations eliminate unneeded data and reclaim data storage space.

Management of storage capacity requires the ability to routinely monitor it. Manual monitoring of storage capacity is impractical in large environments. A storage resource management tool can provide automated monitoring and control of many storage resources. For more information on storage resource management tools, see my paper “How Much Data Do We Have?”
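To give a sense of what the simplest form of automated monitoring looks like, the sketch below checks one mounted file system using Python's standard shutil.disk_usage call. The function name usage_report and the 80 percent warning threshold are assumptions chosen for illustration; a storage resource management tool does far more than this, across many arrays and tiers.

```python
import shutil


def usage_report(path="/", warn_at_percent=80.0):
    """Return total, used, and free space for the file system holding
    `path`, plus a flag when usage crosses the warning threshold.
    The 80 percent default is an illustrative assumption."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    percent_used = (usage.used / usage.total * 100) if usage.total else 0.0
    return {
        "path": path,
        "total_gb": usage.total / 1e9,
        "used_gb": usage.used / 1e9,
        "free_gb": usage.free / 1e9,
        "percent_used": percent_used,
        "warn": percent_used >= warn_at_percent,
    }


if __name__ == "__main__":
    report = usage_report(".")
    print(f"{report['path']}: {report['percent_used']:.1f}% used, "
          f"{report['free_gb']:.1f} GB free, warn={report['warn']}")
```

Run on a schedule and logged, a check like this also produces the history needed to estimate rate of growth.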

When examining hard drive media capacity, IT management needs to understand that there is a significant difference between raw capacity and usable capacity. This is an extremely important fact to keep in mind when dealing with storage equipment vendor sales personnel. The Storage Networking Industry Association (SNIA) defines usable capacity of a disk as “…the total formatted capacity of the disk.” Formatted capacity does not include raw capacity reserved for metadata, disk size equalization, or check data. Usable capacity is the number to focus on when shopping for additional disk-based storage.

The amount of usable capacity available from a given raw capacity varies depending upon how the disk or array of disks is configured. RAID (Redundant Array of Independent Disks) configuration has a significant impact on the ratio of raw capacity to usable capacity. For example, an array of disks configured for RAID 1, in which all data is mirrored, will use two units of raw capacity for every unit of usable storage. Therefore, an array with 50 TB of raw capacity configured for RAID 1 will yield approximately 25 TB of usable capacity.
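The RAID 1 arithmetic above can be sketched in a few lines. Only the RAID 1 case comes from this article; the RAID 0, 5, 6, and 10 ratios are the standard textbook ones, added here for comparison. The function name is hypothetical, and the sketch ignores formatting overhead, hot spares, and vendor-specific reserves, so results are upper bounds on usable capacity.

```python
def usable_capacity_tb(raw_tb, raid_level, disks=None):
    """Approximate usable capacity for an array of equal-sized disks.

    raw_tb     -- total raw capacity of the array in TB
    raid_level -- 0, 1, 5, 6, or 10
    disks      -- number of disks (needed only for RAID 5 and 6)
    """
    if raid_level == 0:
        return float(raw_tb)            # striping only, no redundancy
    if raid_level in (1, 10):
        return raw_tb / 2               # two-way mirror: half of raw
    if raid_level in (5, 6):
        parity_disks = 1 if raid_level == 5 else 2
        if disks is None or disks <= parity_disks:
            raise ValueError("RAID 5/6 needs a disk count above the parity count")
        return raw_tb * (disks - parity_disks) / disks
    raise ValueError(f"unsupported RAID level: {raid_level}")


if __name__ == "__main__":
    print(usable_capacity_tb(50, 1))            # the article's RAID 1 example
    print(usable_capacity_tb(50, 5, disks=5))   # same raw capacity as RAID 5
```

For the same 50 TB of raw capacity, the mirrored configuration yields about 25 TB, while a five-disk RAID 5 set yields about 40 TB, which is why the RAID level belongs in any vendor capacity comparison.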

To read my entire white paper on this topic, go to http://www.storagestrategies.com/data_storage_strategies_whitepapershome.html

[i] Corporate Executive Board