This is the 290th article in the Spotlight on IT series. If you'd be interested in writing an article for the series on the subject of backup, security, storage, virtualization, mobile, networking, wireless, cloud and SaaS, or MSPs, PM Eric to get started.

Someone actually said to me, “What could go wrong?” Then I laughed. Not because something had gone wrong, but because something might. We’ve all been there. We feel confident about a new system or collection of infrastructure we’ve deployed. It’s a great feeling, isn’t it? We’ve solved a real problem for our organization with the new system or software. Or maybe we’ve solved a real space, power and cooling issue by virtualizing our data center.

I never want to discredit anyone’s hard work, but the reality is that it is one thing to set up something new and a totally different thing to deal with it over time. One of my good friends in the virtualization field, Brian Kirsch, is a virtualization architect in Milwaukee. He said it best: “I’m an architect. Supporting virtualized environments when things go wrong is a different expertise.”

Brian is exactly right. Sure, we design for failure with great principles like multipathing of storage systems, dual controllers on almost everything, drive resiliency via RAID, redundant power supplies and countless other steps. But what happens when something DOES go wrong? What do we do?

Well, in many situations we are at the mercy of our backup strategy. That can be a good place to be, if we know what our options are. I say this from experience working very hard to identify different ways out of a jam. I’ve done this with hardware failures, data migrations and more. Simple brainstorming is very effective for normal situations, such as, “To move this datacenter from here to there, we’ll use this equipment and P2V this and that.” But does the same apply to getting out of a real problem when you need your backup solution to save the day?

That’s the piece of advice I’m going to focus on: Let’s have options. I’m assuming we are all mostly virtualized here with vSphere or Hyper-V, as that opens a lot of doors in our favor. We can move things around a lot easier, we have a lot more hardware independence, and, most importantly, it’s just cool.

For backup and restore scenarios, it pays to take this same situational approach. If we deleted a single file from a system by accident, do we need to bring back the whole virtual machine? What if that user AGAIN deleted an important email attachment? Again, do we restore the whole system to pluck out just what we need?

This is an opportunity for versatility to come to the rescue when things do indeed go wrong. And they will. One example is if the storage network fails. If that same network also carries the backup traffic, you can see the issue: we’ve knocked out our primary resources, and now the backups are unreachable too. Of course, facility issues are far-reaching as well. What about when the issue is something global, like an Active Directory domain? That can be catastrophic: computer accounts can no longer log in, and key services such as email can’t send to users or administrators.

The takeaway here is to know what your options are. If you need to restore a file, make sure you can restore a file. If your backups are portable, make sure they can easily be made available on a different storage resource, ideally on a different storage network. One example here is to use a SAN protocol such as Fibre Channel for production virtual machines and use iSCSI for backup storage. And when it comes to the backup software you use, make sure you take the time to identify all of the recovery scenarios it provides. Don’t just know that they are there; go through the drill, even on meaningless virtual machines. It will do the following key things for you:

You will have a reasonable expectation of the behavior for that restore scenario

You will be familiar with the process

You’ll know it works

You’ll be less hesitant when it comes time to do it for real
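
For a file-level restore drill, one simple way to back up the “you’ll know it works” point is to compare a checksum of the restored file against the original. What follows is only a minimal Python sketch, not tied to any particular backup product. The function names sha256_of and restore_matches are made up for illustration, and the sketch assumes the original file has not changed since the backup was taken.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """Hypothetical helper: True when both files have identical contents.

    Assumption: the original is unchanged since the backup was taken.
    """
    return sha256_of(original) == sha256_of(restored)
```

After restoring a test file to a scratch location, you might call restore_matches(Path("/data/report.docx"), Path("/restore-test/report.docx")), where both paths are placeholders. A False result during a drill is a lot cheaper than finding out during a real outage.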

I don’t like peeling back the onion and seeing what indeed could go wrong, but it is a worthwhile drill. What have you found along the way? What products make this easy for you? What applications are sticking points for you to get where you want to be? Share your stories below.