RAID and Disk Size - Search for Performance

Centralizing your storage is always a very good idea - you can manage storage requirements of most servers through a central storage system, without the hassle of juggling local disks within servers.

But centralizing storage opens a whole new world of hassles:

  • Physical limits - depending on your choice of vendor and class of storage, you may be limited by the number of available drive slots.
  • Technical limits - depending on your choice of vendor and class of storage, the system may support hundreds of drives, but not with your current CPUs or cache memory.
  • Higher costs - everything within the storage system costs money: physical drives, CPUs, cache memory, drive bays, licenses for storage management software. And all of these usually carry exorbitant prices.
So when shopping for a storage system, there is always a tug of war: a limited budget vs. functionality, drive space and performance.

Let's discuss all three elements countering the budget:

  • Functionality - this area covers overall management, non-disruptive OS upgrades, point-in-time snapshots, point-in-time clones, replication and so on. These are very easy for the client to declare as requirements, and they leave very little 'wiggle room' for storage vendors to try to sell something else, or to reduce the price at the RFP by reducing functionality.
  • Drive Space and Performance - here is the conflict between storage vendors and clients: storage vendors do not sell space and rarely sell performance; they sell hard drives. Everything in their portfolio (cache, slots, licenses) is based on physical drives, so they will always push the client into a 'number of drives' mentality. This is wrong. The client needs to think in terms of usable space and Input/Output Operations per Second (IOPS), because at the end of the day the servers do not care that you have 20 drives if they see only a 100 GB partition and only 200 IOPS when they need 1000. And here we hit the problem of balance: as you are well aware, a storage system can provide different levels of data protection through redundancy or parity, at the cost of physical capacity and performance.
When declaring your usable space, you need to declare either the number of IOPS that it needs to support (which is very difficult to estimate) or a RAID level. Since estimating the actual IOPS requirement is difficult, you can always approach it as 'I need better performance than I have at the moment'. This is very easy to work out with Wmarow's IOPS calculator:

  1. Input the number of drives and the RAID level currently serving your server.
  2. Then input the estimated number of drives and the organization (RAID level) that you are thinking of buying.
  3. Compare the IOPS results.
  4. If you are migrating several servers to one RAID group, add up all their current IOPS and compare the sum to the resulting IOPS of the target group.
  5. The target needs to achieve a better IOPS result than the current setup, by at least 50%.
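
The comparison in the steps above can be approximated in a few lines of Python. This is only a minimal sketch, not the calculator's own algorithm: it assumes the commonly used rule-of-thumb write penalties (RAID 0 = 1, RAID 10 = 2, RAID 5 = 4, RAID 6 = 6), and the per-drive IOPS figures, read fraction and drive counts in the example are hypothetical.

```python
# Rule-of-thumb write penalties (back-end I/Os per host write).
# These are common approximations, not vendor-specific figures.
WRITE_PENALTY = {"RAID0": 1, "RAID10": 2, "RAID5": 4, "RAID6": 6}


def raid_group_iops(drives, iops_per_drive, raid_level, read_fraction):
    """Estimate the host-visible IOPS of one RAID group for a read/write mix."""
    raw = drives * iops_per_drive
    write_fraction = 1.0 - read_fraction
    return raw / (read_fraction + write_fraction * WRITE_PENALTY[raid_level])


def meets_target(current_groups, target_group, margin=0.5):
    """Sum the current IOPS and require the target to beat the sum by `margin`."""
    current = sum(raid_group_iops(*group) for group in current_groups)
    target = raid_group_iops(*target_group)
    return current, target, target >= current * (1.0 + margin)


# Hypothetical example: two servers, each on 4 x 7.2k drives (~75 IOPS each)
# in RAID 5, migrating to one group of 12 x 15k drives (~175 IOPS each)
# in RAID 10, with an assumed 70% read workload.
current_groups = [(4, 75, "RAID5", 0.7), (4, 75, "RAID5", 0.7)]
target_group = (12, 175, "RAID10", 0.7)

current, target, ok = meets_target(current_groups, target_group)
print(f"current: {current:.0f} IOPS, target: {target:.0f} IOPS, 50% better: {ok}")
```

The `margin` parameter encodes the 50% rule from step 5; change it if you want a larger safety buffer.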

The results will vary wildly, based on the number and type of drives as well as the RAID level. We have calculated a sample of IOPS results for a 2 TB capacity drive using different RAID levels and disk drives, assuming a small storage system with only 16 disk slots.
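
The capacity side of this trade-off can be sketched the same way. The snippet below estimates how many drives a single RAID group needs to reach 2 TB and whether that fits in 16 slots. Reading the 2 TB figure as the usable capacity target is an assumption, as are the drive sizes; hot spares and formatting overhead are ignored.

```python
import math

SLOTS = 16        # small storage system from the example above
TARGET_TB = 2.0   # usable capacity wanted (assumed interpretation)


def drives_needed(raid_level, drive_tb, target_tb=TARGET_TB):
    """Smallest drive count giving at least target_tb usable in one RAID group."""
    data_drives = math.ceil(target_tb / drive_tb)
    if raid_level == "RAID0":
        return max(data_drives, 2)
    if raid_level == "RAID10":
        return max(2 * data_drives, 4)   # every data drive is mirrored
    if raid_level == "RAID5":
        return max(data_drives + 1, 3)   # one drive's worth of parity
    if raid_level == "RAID6":
        return max(data_drives + 2, 4)   # two drives' worth of parity
    raise ValueError(f"unknown RAID level: {raid_level}")


# Hypothetical drive sizes in TB (146 GB, 300 GB, 600 GB).
for drive_tb in (0.146, 0.3, 0.6):
    for level in ("RAID0", "RAID10", "RAID5", "RAID6"):
        count = drives_needed(level, drive_tb)
        verdict = "fits" if count <= SLOTS else "does not fit"
        print(f"{drive_tb} TB drives, {level}: {count} drives ({verdict} in {SLOTS} slots)")
```

Small, fast drives in a mirrored layout can run out of slots before they reach the capacity target, which is exactly the balance problem described above.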

Please note that the actual IOPS of a particular storage system may differ in absolute value because of processor power, advanced algorithms and cache memory. But regardless of these attributes, the relative ratio between the IOPS figures will remain the same: for the same workload, RAID 0 will always be about 3 times faster than RAID 5 on the same drives.
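
A rough way to see why the ratio does not depend on the drives is the rule-of-thumb write-penalty model (an assumption here, not something taken from the calculator): the per-drive IOPS cancels out, and only the share of writes in the workload remains. The drive count and IOPS values below are hypothetical.

```python
def host_iops(drives, iops_per_drive, write_penalty, write_fraction):
    """Host-visible IOPS under the simple write-penalty model."""
    raw = drives * iops_per_drive
    return raw / ((1.0 - write_fraction) + write_fraction * write_penalty)


def raid0_over_raid5(drives, iops_per_drive, write_fraction):
    """Ratio of RAID 0 (penalty 1) to RAID 5 (penalty 4) on identical drives."""
    raid0 = host_iops(drives, iops_per_drive, 1, write_fraction)
    raid5 = host_iops(drives, iops_per_drive, 4, write_fraction)
    return raid0 / raid5


# The ratio is the same for slow and fast drives; it moves only with the
# share of writes in the workload.
for iops_per_drive in (75, 125, 175):
    ratios = [round(raid0_over_raid5(8, iops_per_drive, w), 2)
              for w in (0.0, 0.3, 2 / 3, 1.0)]
    print(iops_per_drive, ratios)
```

In this simplified model the ratio works out to 1 + 3 × (write fraction), so a factor of 3 corresponds to a workload of roughly two-thirds writes.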

Also, please note that no matter what the abilities of the storage system that you are looking at, there are physical limitations to each disk, and these cannot be overcome by any amount of cache, intelligent algorithms or processing power of the storage system.
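
The physical limit of a single spinning disk can be estimated from its mechanics: each random I/O costs roughly one average seek plus half a rotation. The sketch below uses that approximation; the seek times are typical ballpark values chosen for illustration, not figures from any datasheet, and transfer time is ignored.

```python
def max_random_iops(rpm, avg_seek_ms):
    """Approximate random IOPS ceiling of one spinning disk."""
    rotational_latency_ms = 0.5 * 60_000.0 / rpm   # half a revolution, in ms
    service_time_ms = avg_seek_ms + rotational_latency_ms
    return 1000.0 / service_time_ms


# Hypothetical drive classes: (rotation speed in rpm, average seek in ms).
DRIVES = {
    "7.2k SATA": (7200, 8.5),
    "10k SAS": (10000, 4.5),
    "15k SAS": (15000, 3.5),
}

for name, (rpm, seek_ms) in DRIVES.items():
    print(f"{name}: ~{max_random_iops(rpm, seek_ms):.0f} IOPS")
```

Cache and clever algorithms can hide this ceiling for a while, but sustained random I/O that misses the cache ends up bounded by it.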

In conclusion, since the absolute values for different storage systems may differ, what is the best way for a client to be certain that he/she will receive the balance of protection and performance that is needed? There are two options:
  1. Test the configuration. If someone wants to sell you a storage system, he/she should be able to build an identically configured system in a lab environment, where you can then run full performance and load tests against it.
  2. Ask for a guarantee. Give the salespeople the parameters of the services running on the servers (database, file servers etc.). These can be collected through performance monitor and database tools. Then make the vendor guarantee, with financial penalties, that each of the functions will perform two times faster (or meet any other agreed parameter) with the same servers.

Talkback and comments are most welcome


Maintaining quality in outsourcing telco services

More and more IT services are being outsourced. And as telco services are now easily integrated and transported over IP, outsourcing is becoming well established in telco as well.

But the issue with telco services is that quality is very difficult to define properly. This is because some of the parameters are difficult to track: sound quality, response of the system to tone-dial menu selection in an IVR, unexpected intermittent interruptions of voice communication, temporarily unavailable service.
And when part of the telco service is outsourced, it becomes even more difficult to manage the quality of such services.

Here are some elements that will affect the quality of outsourced telco services:

  1. Oversubscription of the outsourcing service – the service may be of variable quality, alternating between periods when it is poor and periods when it is great. This is usually connected to oversubscription of the outsourcing service: when its systems are overloaded, the customer-facing service is of poor quality.
  2. Availability of the outsourcing servers – simple and straightforward: power outages, server outages and cooling outages all create failures that interrupt service. Even if there are secondary servers, the switchover will drop all active connections.
  3. Connectivity to the outsourcing service - most outsourcing services are far away, most often in Asia, so internet links will be the primary connectivity medium to them. But the internet as a medium has a lot of possible issues, and failures of connectivity paths are not that rare.

When the outsourcing service is part of your call management, things get very interesting. Parts of the call management process that are easily outsourced include ringback tone, voice mail, auto-answer etc.

How to solve this issue of quality when outsourcing? There is no magic bullet, but here are some experiences and pointers:
  • Of course, you will create the standard contract with availability, packet loss and jitter criteria (see related posts).
  • You can also include criteria for call disconnects or failures to connect.
  • It would be very good to tie this to the number of customer complaints, but the outsourcing service will be very reluctant to accept a quality of service condition that is connected to a very subjective criterion which cannot be measured and confirmed by both parties independently.
  • Create a criterion based on complaints to the outsourcing service - for example, if the telco customer detects issues so large that they need to send a complaint to their outsourcing service more than 4 times every quarter, that would be a basis for a contract review. This clause is especially wise to include in the first year of use of the outsourcing service, when you are still learning their weak points.
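
The contract criteria above can be tracked with a simple quarterly check. The following is a minimal sketch: the field names and all numeric thresholds are hypothetical placeholders, and only the rule of more than 4 complaints per quarter comes from the list above.

```python
from dataclasses import dataclass


@dataclass
class QuarterReport:
    availability_pct: float   # measured service availability
    packet_loss_pct: float    # average packet loss on the link
    jitter_ms: float          # average jitter
    failed_call_pct: float    # calls dropped or never connected
    complaints_sent: int      # formal complaints sent to the provider


# Hypothetical thresholds; only the complaint limit comes from the text above.
MIN_AVAILABILITY_PCT = 99.9
MAX_PACKET_LOSS_PCT = 1.0
MAX_JITTER_MS = 30.0
MAX_FAILED_CALL_PCT = 0.5
MAX_COMPLAINTS_PER_QUARTER = 4


def evaluate(report):
    """Return (list of breached criteria, whether a contract review is due)."""
    breaches = []
    if report.availability_pct < MIN_AVAILABILITY_PCT:
        breaches.append("availability")
    if report.packet_loss_pct > MAX_PACKET_LOSS_PCT:
        breaches.append("packet loss")
    if report.jitter_ms > MAX_JITTER_MS:
        breaches.append("jitter")
    if report.failed_call_pct > MAX_FAILED_CALL_PCT:
        breaches.append("failed calls")
    review_due = report.complaints_sent > MAX_COMPLAINTS_PER_QUARTER
    return breaches, review_due


# Example quarter with made-up measurements.
q1 = QuarterReport(99.95, 0.4, 42.0, 0.2, 5)
breaches, review_due = evaluate(q1)
print(breaches, review_due)
```

Keeping the complaint counter separate from the measurable criteria mirrors the point above: the measured values can be confirmed by both parties, while the complaint count only triggers a review.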

Talkback and comments are most welcome

Related posts
Telco SLA - parameters and penalties
Is the Phone Working? - Alternative Telephony SLA
5 SLA Nonsense Examples - Always Read the Fine Print
