RAID and Disk Size - Search for Performance

Centralizing your storage is always a very good idea - you can manage storage requirements of most servers through a central storage system, without the hassle of juggling local disks within servers.

But centralizing storage opens up a whole new world of hassles:

  • Physical limits - depending on your choice of vendor and class of storage, you may be limited by the number of available drive slots
  • Technical limits - depending on your choice of vendor and class of storage, the system may support hundreds of drives, but not with your current CPUs or cache memory
  • Higher costs - everything within the storage system costs money: physical drives, CPUs, cache memory, drive bays, licenses for storage management software. And all of these usually carry exorbitant prices.
So when shopping for a storage system, there is always a tug of war: a limited budget versus functionality, drive space and performance.

Let's discuss all three elements countering the budget:

  • Functionality - this area covers overall management, non-disruptive OS upgrades, point-in-time snapshots, point-in-time clones, replication and so on. These are very easy for the client to declare as requirements, and they leave storage vendors very little 'wiggle room' to sell something else or to lower the price at the RFP by reducing functionality.
  • Drive Space and Performance - here is the conflict between storage vendors and clients: storage vendors do not sell space and rarely sell performance; they sell hard drives. Everything in their portfolio (cache, slots, licenses) is based on physical drives, so they will always push the client into a 'number of drives' mentality. This is wrong. The client needs to think in terms of usable space and Input/Output Operations per Second (IOPS), because at the end of the day the servers do not care that you have 20 drives if all they see is a 100 GB partition and 200 IOPS when they need 1000. And here we hit the problem of balance: as you are well aware, a storage system can provide different levels of data protection through redundancy or parity, at the cost of physical capacity and performance.
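To make the capacity side of that balance concrete, here is a minimal Python sketch of how usable space shrinks as protection is added. It uses the textbook overheads for each RAID level and ignores hot spares, formatting overhead and vendor-specific reservations. The function name and the 300 GB drive size are hypothetical; the 16 drives match the small storage system used as an example later in this post.

```python
def usable_capacity_gb(drives: int, drive_gb: float, raid: str) -> float:
    """Rough usable capacity of one RAID group (no spares, no formatting overhead)."""
    if raid == "RAID0":
        data_drives = drives          # striping only, no protection
    elif raid in ("RAID1", "RAID10"):
        data_drives = drives / 2      # every block is mirrored
    elif raid == "RAID5":
        data_drives = drives - 1      # one drive's worth of parity
    elif raid == "RAID6":
        data_drives = drives - 2      # two drives' worth of parity
    else:
        raise ValueError(f"unsupported RAID level: {raid}")
    return data_drives * drive_gb


# Hypothetical example: 16 drives of 300 GB each.
for level in ("RAID0", "RAID10", "RAID5", "RAID6"):
    print(level, usable_capacity_gb(16, 300, level), "GB usable")
```

The same 16 drives give very different usable space depending on the protection level, which is why a 'number of drives' figure alone tells the servers nothing.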
When declaring your usable space, you need to declare either the number of IOPS that it must support (which is very difficult) or a RAID level. Since estimating the actual IOPS requirement is difficult, you can always approach it as 'I need better performance than I have at the moment'. This is very easy to work out with Wmarow's IOPS calculator:

  1. Enter the number of drives and the RAID level currently serving your server.
  2. Then enter the estimated number of drives and the organization (RAID level) that you are thinking of buying.
  3. Compare the IOPS results.
  4. If you are migrating several servers to one RAID group, add up all their current IOPS and compare the sum to the resulting IOPS of the target.
  5. The target needs to achieve a better IOPS result than the current setup, by at least 50%.
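The steps above can be sketched in a few lines of Python. This is not the calculator's own algorithm; it is the commonly used 'write penalty' approximation, and the penalty values, the 150 IOPS per drive figure, the 70% read mix and all function names are assumptions made for illustration.

```python
# Commonly quoted back-end I/Os per front-end write (assumed values).
WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6}


def estimated_iops(drives: int, iops_per_drive: float, raid: str,
                   read_fraction: float) -> float:
    """Front-end IOPS a RAID group can sustain for a given read/write mix."""
    raw_iops = drives * iops_per_drive
    write_fraction = 1.0 - read_fraction
    return raw_iops / (read_fraction + write_fraction * WRITE_PENALTY[raid])


def meets_target(current_iops: list[float], target_iops: float,
                 margin: float = 0.5) -> bool:
    """Steps 4 and 5: sum the current groups and require at least 50% headroom."""
    return target_iops >= sum(current_iops) * (1.0 + margin)


# Steps 1 and 4: two servers, each currently on 4 drives in RAID5 (hypothetical).
current = [estimated_iops(4, 150, "RAID5", 0.7) for _ in range(2)]
# Step 2: the configuration being considered, 16 drives in RAID10 (hypothetical).
target = estimated_iops(16, 150, "RAID10", 0.7)
# Steps 3 and 5: compare the results.
print(f"current total: {sum(current):.0f} IOPS, target: {target:.0f} IOPS")
print("meets the 50% rule:", meets_target(current, target))
```

Treat the numbers as a way to compare configurations against each other, not as a prediction of what a real system will deliver.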

The results will vary wildly, based on the number and type of drives, as well as the RAID level. We have calculated a sample of IOPS results for a 2 TB capacity drive using different RAID levels and disk drives, assuming a small storage system with only 16 slots for disks.

Please note that the actual IOPS of a given storage system may differ in absolute value because of processor power, advanced algorithms and cache memory. But regardless of these attributes, the relative ratio between the produced IOPS will remain the same for a given workload: RAID0 will always be 3 times faster than RAID5 on the same drives.
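One way to see why the ratio holds is the usual write-penalty approximation. This is an assumed model, not necessarily what any particular calculator or array uses. Here N is the number of drives, d the IOPS of a single drive, r and w the read and write fractions of the workload, and p the write penalty (commonly taken as 1 for RAID0 and 4 for RAID5):

```latex
\[
\mathrm{IOPS}_{\mathrm{RAID}} = \frac{N \cdot d}{r + w \cdot p}, \qquad r + w = 1
\]
\[
\frac{\mathrm{IOPS}_{\mathrm{RAID0}}}{\mathrm{IOPS}_{\mathrm{RAID5}}}
  = \frac{N d / (r + w \cdot 1)}{N d / (r + w \cdot 4)}
  = r + 4w
  = 1 + 3w
\]
```

The drive count and per-drive speed cancel out, so under this model the ratio depends only on the share of writes: it approaches 1 for a read-only workload and reaches 4 for a write-only one.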

Also, please note that no matter what the abilities of the storage system that you are looking at, there are physical limitations to each disk, and these cannot be overcome by any amount of cache, intelligent algorithms or processing power of the storage system.
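As a rough illustration of that physical limit, the random IOPS of a single spinning disk can be approximated from its average seek time and rotational speed. The sketch below uses that standard approximation; the function name and the seek times are illustrative assumptions, not figures for any specific drive model.

```python
def max_random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Approximate random IOPS of one spinning disk, ignoring transfer time."""
    # On average the platter must turn half a revolution to reach the sector.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    service_time_ms = avg_seek_ms + rotational_latency_ms
    return 1000.0 / service_time_ms


# Illustrative (assumed) seek times for two classes of drive.
print(f"15000 rpm, 3.5 ms seek: {max_random_iops(15000, 3.5):.0f} IOPS")
print(f" 7200 rpm, 8.5 ms seek: {max_random_iops(7200, 8.5):.0f} IOPS")
```

Cache and clever algorithms can hide this limit for a while, but under sustained random load each disk still has to seek and rotate for every operation.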

In conclusion, since absolute values differ between storage systems, what is the best way for a client to be certain of receiving the balance of protection and performance that is needed? There are two options:
  1. Test the configuration. Anyone who wants to sell you a storage system should be able to build an identically configured system in a lab environment, where you can then run full performance and load tests against it.
  2. Ask for a guarantee. Give the salespeople the parameters of the services running on your servers (database, file servers etc.). These can be collected through performance monitor and database tools. Then make the vendor guarantee, with financial penalties, that each of these functions will perform two times faster (or meet any other agreed parameter) with the same servers.

Talkback and comments are most welcome
