A while back we had a couple of posts talking about application virtualization and server virtualization as part of our “What is Virtualization” series. We continue with our series now with the exploration of storage virtualization.
Storing data has been and will always be an issue for most companies, particularly given the rate at which storage requirements are increasing. Data storage is much more difficult than, say, storage of your own personal physical stuff.
To help me understand a bit better how storage virtualization works I go back to my condo reference. Assume that I live in a condominium complex with a bunch of other people. Since the units are a bit small, the complex offers storage units for the tenants’ use…a place to hold that extra bookcase, the leaf to your dining table, or your velvet Elvis painting. Unfortunately, each tenant gets only one storage unit. This means that if your unit is full it does you no good to know that your neighbor’s unit has nothing in it, because you have no access to it.
Now I like to travel, and when traveling I tend to pick up some knickknacks and tchotchkes from wherever I end up. I get back to my condo and find that my storage unit is already so full there is a note on my door from the fire marshal. So what do I do? I do what most folks do: No, I don’t get rid of anything (are you kidding?) – instead I rent a storage unit from the self-storage guys down the street. Now I have stuff in 2 different places. After a few years I have to get a third unit – but the unit down the street is full so I have to rent yet another storage unit across town. Clearly I have a problem: Aside from the obvious problem of being a pack rat, I have 3 storage units in 3 different locations. What happens if I need to find something that I’ve stored? Will I even know where to look? (Clue: Probably not, because if I was sufficiently organized to keep a record of what stuff is in which storage unit, I probably wouldn’t be the kind of person to accumulate that much stuff in the first place!)
How does this apply to your business data? Well, one of the biggest problems we run into when trying to organize data in the traditional way, where each server has its own local storage, is that Murphy’s Law dictates that free storage space never exists in the server that needs it. If my Exchange Server is running out of disk space, it does me no good to have 200 GB free in my file server – just as it does me no good to know that your condo storage space is empty if mine is full. In fact, it’s even worse. If my condo storage space is full, and I really trust you, I might make a deal with you to use some of your condo storage space – but when each server has only its own local disks, there’s just no practical way to make that free space in the file server available to the Exchange Server.
“Storage virtualization” refers to the process of taking a bunch of physical disks and turning them into a central “storage pool,” portions of which can then be allocated back to your individual servers in such a way that they believe that the storage is local to them when, in fact, it is not. This separation of the drives from the individual servers is the key to storage virtualization and its benefits.
Since the drives are now managed as one large pool it is possible to perform tasks that previously were not possible, such as migrating data between drives without downtime, or allocating storage on demand to the servers that need it. Storage virtualization allows you to perform these helpful tasks from a single management point. We generally refer to this kind of storage virtualization system as a “Storage Area Network,” or “SAN.”
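To make the pooling idea a little more concrete, here is a toy sketch in Python. It is not how any real SAN is implemented; the class name `StoragePool`, the server names, and the disk sizes are all made up for illustration. It only shows the bookkeeping idea: several physical disks are added up into one pool, and a server can be handed a slice that is bigger than any single disk.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    """Toy model: several physical disks presented as one pool (sizes in GB)."""
    disk_sizes_gb: list[int]
    # Hypothetical bookkeeping: server name -> GB carved out of the pool.
    allocations: dict[str, int] = field(default_factory=dict)

    @property
    def capacity_gb(self) -> int:
        # The pool is as big as all of its disks combined.
        return sum(self.disk_sizes_gb)

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.allocations.values())

    def allocate(self, server: str, size_gb: int) -> None:
        """Give a server more space, as long as the pool as a whole has room."""
        if size_gb > self.free_gb:
            raise ValueError(f"only {self.free_gb} GB free in the pool")
        self.allocations[server] = self.allocations.get(server, 0) + size_gb


# Illustrative numbers only.
pool = StoragePool(disk_sizes_gb=[500, 500, 1000])
pool.allocate("exchange", 600)    # bigger than either 500 GB disk on its own
pool.allocate("fileserver", 300)
pool.allocate("exchange", 200)    # grow the Exchange volume on demand
print(pool.free_gb)
```

The servers never see which disk their data lands on; they only see the size they were given, which is the same "one big closet" view described in the condo analogy.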
“Thin provisioning” makes it even better. Instead of trying to guess in advance how much storage to allocate to each server, and then potentially having to adjust things later, I can tell each server that it has way more storage than is actually available. For example, I might tell ten different servers that they each have access to a terabyte of storage when, in fact, I only have a total of two or three terabytes of physical space in my storage pool. I then let my SAN dynamically allocate the physical storage to the servers that need it – but only allocate as much physical storage as necessary to store the actual data. The SAN will then alert me when I get close to running out of physical space so I can increase the size of my storage pool.
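The arithmetic behind thin provisioning can be sketched the same way. Again, this is a simplified illustration rather than a real SAN's logic: the `ThinPool` class, the 80 percent alert threshold, and the amounts written are assumptions chosen to match the example above (ten servers, one terabyte promised to each, three terabytes of real disk).

```python
class ThinPool:
    """Toy thin-provisioned pool: promises can exceed the physical space."""

    def __init__(self, physical_gb: int, alert_percent: int = 80):
        self.physical_gb = physical_gb
        self.alert_percent = alert_percent      # assumed threshold, not a standard
        self.promised_gb: dict[str, int] = {}   # what each server thinks it has
        self.used_gb: dict[str, int] = {}       # what each server actually stores

    def provision(self, server: str, promised_gb: int) -> None:
        # Promising space costs nothing physically until data is written.
        self.promised_gb[server] = promised_gb
        self.used_gb[server] = 0

    def total_used(self) -> int:
        return sum(self.used_gb.values())

    def needs_attention(self) -> bool:
        # Integer math avoids floating-point surprises at the threshold.
        return self.total_used() * 100 >= self.alert_percent * self.physical_gb

    def write(self, server: str, size_gb: int) -> bool:
        """Store data for a server; return True if the admin should add disks."""
        if self.used_gb[server] + size_gb > self.promised_gb[server]:
            raise ValueError(f"{server} would exceed its promised size")
        if self.total_used() + size_gb > self.physical_gb:
            raise RuntimeError("pool is physically full")
        self.used_gb[server] += size_gb
        return self.needs_attention()


thin = ThinPool(physical_gb=3000)
servers = [f"server{i}" for i in range(10)]
for name in servers:
    thin.provision(name, 1000)            # ten servers, 1 TB promised to each

alerts = [thin.write(name, 200) for name in servers]   # 2000 GB actually used
late_alert = thin.write("server0", 500)                 # 2500 GB, past 80% of 3000
```

In this sketch 10,000 GB has been promised against 3,000 GB of physical disk, and nothing goes wrong until actual usage approaches the physical limit, which is when the alert flag flips and it is time to grow the pool.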
If my condo implemented storage virtualization, I imagine it would work like this. I would have one key and one drop off spot for all my storage items. Once I drop off my velvet Elvis I wouldn’t have to worry about where it would be stored and how much space it would take up. The storage management elves would find a place for it, and fetch it for me again when I requested it. Since I have so much stuff and my neighbor hardly has any we may end up sharing a storage closet, but neither of us knows, or cares. Heck, our stuff may be scattered across every storage closet in the complex.
After a bit of traveling, I may have enough stuff to fill up 2 storage closets all by myself. But I can still bring it to the general drop off location and not need to worry about which closet it goes in…because to me it looks like there is only one big closet. If management decides it would be easier to manage my stuff if it was all together in one room, they may elect to put all my stuff in a new, larger storage unit. All this would happen without me knowing or caring where my stuff is physically located.
Storage virtualization is not the newest technology out there – in fact, people were deploying SANs for all of the reasons listed above long before server virtualization became a big deal. But storage virtualization enables many of the coolest server virtualization features, such as live migration – the ability to move a running VM from one virtualization host to another. And we haven’t even begun to talk about the additional tools you may have for data protection, backup, and disaster recovery, such as the ability to leverage SAN replication to automatically send a copy of your critical data off-site. At the end of the day storage virtualization is a great tool to save time, improve hardware utilization, increase agility, and most importantly save you money.
Your data needs to be organized and secure…just like my personal stuff. In fact, protecting your data is arguably more important, because if the condo burns down I can probably get an insurance check for my physical stuff. But monetary damages alone may not be sufficient if you lose your data. (Can you even put a price tag on your data? Hint: How much is your business worth? Hint #2: What’s it worth to stay out of jail for violating laws on record retention?) Storage virtualization gives you another big toolbox full of tools to help you organize and secure it.
However, it is still no excuse not to throw a few things out every now and then.