Evaluating Hyper-V Backup Storage Solutions

It’s not difficult to find recommendations about what storage to use with your backup solution, at least from the people who make backup storage solutions. I certainly can’t begrudge a company trying to turn a coin by promoting their products, but it’s also nice to get some neutral assistance. What I won’t do is throw a list of manufacturers at you and send you on your way; that doesn’t help anyone except the manufacturers on the list. What I am going to do is give you guidance on how to analyze your situation and determine which solutions are most applicable to you, so that you can select the manufacturer(s) on your own terms. I’m also going to show you some possible backup designs that might inspire you in your own endeavors.

Needs Assessment

The very first thing to do is determine what your backup storage needs are. Most people are not going to be able to work from a simple formula such as: we have 1 TB of data so we need 1 TB of backup storage. Figure out the following two items first:

How long does any given bit of data need to be stored?

Do we only need the most recent copy of that data or do we need a historical record of the original and changed versions? For instance, if your CRM application is tracking all customer interactions and you do not purge data from it, how many backups of that data are necessary to meet your data retention goals?

As you are considering these, be mindful of any applicable legal regulations. This is especially true in finance and related fields, such as insurance. Do not try to get everything absolutely perfect in this first wave. This is the part where you prioritize your data and determine what its lifespan should be. You’ll need to have a decent understanding of the concepts in the next section before you can begin architecting your solution.
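To make the point about simple formulas concrete, here is a minimal sketch of a rough sizing calculation. The function name, the parameters, and every number in the example are assumptions for illustration only; it ignores compression and deduplication, so treat the result as a starting point for the assessment and not as a quote.

```python
def estimate_backup_storage_gb(data_gb, full_copies, daily_change_rate, delta_days):
    """Rough backup storage estimate in GB (no compression or deduplication).

    data_gb           -- size of the protected data
    full_copies       -- number of full backups retained at the same time
    daily_change_rate -- assumed fraction of the data that changes per day
    delta_days        -- number of daily delta backups retained
    """
    fulls = data_gb * full_copies
    deltas = data_gb * daily_change_rate * delta_days
    return fulls + deltas


# Hypothetical example: 1 TB of data, 3 retained fulls, 2% daily change, 14 deltas.
print(estimate_backup_storage_gb(1000, 3, 0.02, 14))
```

Even with modest assumptions, the answer lands well above the size of the source data, which is why the two retention questions above come first.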

If you need to brush up on any of the basics to help you complete your needs assessment, we have an article that covers them.

Backup Storage Options

Alongside a needs assessment, you need to know what storage options are available to you. This will guide you to your final design. At a high level, the options are:

Non-hard disk media, such as tape and optical

Portable disk drives

Solid state media

Permanently-placed disk drives

Over-the-network third-party provider storage

Non-Hard Disk Media

Disk drives have precision internal mechanical components and electronics that fail. The conventional decades-old wisdom has been to copy data to some other type of media in order to protect it from these shortcomings. The two primary media types that fall into this category are tapes and optical systems.

Pros of Non-Hard Disk Media

Tried-and-true

Vendors have specialized to the particular needs of backup and restoration

Portable

Durable long-term storage (tape)

Relatively inexpensive long-term storage

Cons of Non-Hard Disk Media

Expense (tape)

Special drives and software are needed, which may fail and/or become obsolete while the media is still viable

Easily damaged

Very slow recovery process

Tape is the traditional king of backup media and is going strong today. It’s not very fast, but it’s highly portable, well-understood, and usually provides a solid ratio of expense, risk, and protection. It can be very expensive, however, and it’s typically the tape drive that pushes the cost up the most. Media costs vary; smaller is cheaper, obviously, but there is also a difference in formats. DAT drives are cheaper than LTO drives, but even the highest capacity DAT tapes are nowhere near as large as most same-generation LTO tapes.

Tape must be cared for properly — it absolutely must be kept away from electromagnetic fields and heat. Tapes should be stored upright, preferably in a shielded container designed specifically for holding backup tapes. If these precautions are followed, tapes can easily last a decade. That said, the drives that can read a particular tape have a much shorter lifespan and you might have trouble finding a working drive that can read those old tapes. I’ve also run into issues where I had a good tape and a tape drive that was probably good enough to read it, but we couldn’t locate the software that recorded it. If you’re looking to hold onto backups for a very long time, tape has the highest shelf lifespan-to-cost ratio of all backup media.

Optical backup media popped up as an inexpensive alternative to tape. Optical drives are much cheaper than tape drives and optical media provides the same capacity at a fraction of the price of tape. However, optical media’s star never burned very brightly and dimmed very quickly. Optical media backups are very slow, the capacity-per-unit is not ideal, and durability is questionable. Optical media does have the ability to survive in electromagnetic conditions that would render tape useless, but is otherwise inferior. Unless you only have an extremely small amount of data to protect and your retention needs are no more than a few years, I would recommend skipping optical media.

A very large problem with tape and other non-disk media is that restoring data is a time-consuming process. If you want to restore just a few items, it will almost undoubtedly take far longer to locate that data on media than it will to restore it.

Portable Disk Drives

In my mind, portable disk drives are a relative newcomer in the backup market, although it has occurred to me that many of you have probably used them your entire career.

Pros of Portable Disk Drives

Inexpensive

Reasonably durable

Portable

Common interfaces that are likely to still be usable in the years to come

Relatively quick recovery process

Cons of Portable Disk Drives

Mechanical and electronic failure can render data inaccessible except by specialized, expensive processes

Long-term offline storage capability is not well-known

Drive manufacturers do not tailor their products to the backup market (although some enclosure manufacturers do)

Fairly expensive long-term storage

The expense and physical size of portable drives have shrunk while their bandwidth and storage capacity have grown substantially, making them a strong contender against tape. Their great weakness is a reliance on internal mechanisms that are more delicate and complicated than tape, not to mention their electronic circuitry. Most should be well-shielded enough that minor electromagnetic and static electricity fields should not be of major concern.

What you must consider is that tapes have been designed around the notion of holding their magnetic state for extended periods of time; if kept upright in a container with even modest shielding, they can easily last a decade. Hard drives are not designed or built to such standards. You’ll hear many stories of really old disks pulled out of storage and working perfectly with no data loss; I have several myself. The issue is that those old platters did not have anything resembling the ultra-high bit densities that we enjoy today. What that means is that the magnetic state for any given bit might have degraded a small amount without affecting the ability of the read/write head to properly ascertain its original magnetic state. The effects of magnetic field degradation will be more pronounced on higher density platters. I do not have access to any statistics, primarily because these ultra-high capacity platters haven’t been in existence long enough to gather such information, but at this time, I personally would not bank on a very large stored hard drive keeping a perfect record for more than a few years.

Hard disks that are rotated often will suffer mechanical or electronic failure long before magnetic state becomes a concern. A viable option is to simply swap new physical drives in periodically. If you want to use hard drives for very long-term offline storage, add it to your maintenance cycle to spin up old drives and transfer their contents to new drives that replace them.
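When old drives are spun up and their contents moved to replacements, it is worth confirming that the copy matches before the old drive is retired. The following is a minimal sketch of one way to do that with SHA-256 checksums; the function names and the idea of comparing two mounted folders are assumptions for illustration, not a feature of any particular backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backup files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_copy_mismatches(old_root, new_root):
    """Return relative paths whose copy under new_root is missing or differs."""
    old_root, new_root = Path(old_root), Path(new_root)
    mismatches = []
    for old_file in sorted(p for p in old_root.rglob("*") if p.is_file()):
        relative = old_file.relative_to(old_root)
        new_file = new_root / relative
        if not new_file.is_file() or sha256_of(old_file) != sha256_of(new_file):
            mismatches.append(relative.as_posix())
    return mismatches
```

An empty list from find_copy_mismatches means every file on the old drive has an identical counterpart on the new one. It does not check for extra files on the new drive.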

Solid State Media

The latest entry in the backup market is solid state media. The full impact of solid state on the backup market has not yet been felt. I expect that it will cause great changes in the market as costs decline.

Pros of Solid State Media

Extremely durable

Fast (newer types)

Very portable

Cons of Solid State Media

Its high cost-to-capacity ratio is the primary reason that it has not overtaken all other media types. It is far more durable and some types are faster than both disk and tape. If you can justify the expense, solid state is the preferred option.

Permanently-Placed Disk Drives

Another option that has only become viable within the last few years is storage that never physically moves, such as NAS devices.

Pros of Permanently-Placed Drives

Very high reliability and redundancy — dependent upon manufacturer/model

High performance

Can be physically secured and monitored

Cons of Permanently-Placed Drives

High equipment expense

Best used with multi-site facilities

Dependent upon speed and reliability of inter-site connections

Loss of the primary business location and theft of backup media are ever-present concerns; the traditional solution has been to transport backup tapes offsite to a secured location, such as a bank safety deposit box (or somebody’s foyer credenza; that sort of thing happens a lot more often than many will admit). With the cost of Internet bandwidth declining, we now have the capability to transmit backup data over the wire to remote locations in a timely and secured fashion.

While I do not recommend it, it would theoretically be acceptable to use on-premises permanent disk drives for very short-term backup storage. This would allow for a very short RTO to address minor accidents. As long as it is made abundantly clear to all interested parties that such a solution is equally vulnerable to anything that threatens the safety and security of the site, there are viable applications for such a solution.

Over-the-Network Third Party Provider Storage

The primary distinguishing factor between this category and the prior entry is ownership. You can pay someone else to maintain the facility and the equipment that houses your offsite copies.

Pros of Third-Party Offsite Providers

In theory, it is a predictable recurring expense

Potential for additional included services at a lower expense than a do-it-yourself solution

Full-time subject-matter experts maintain your data for you

Cons of Third-Party Offsite Providers

In theory, providers could make dramatic changes in pricing and service offerings and effectively hold your data and reliability of storage hostage

Trust and integrity concerns

You may not be able to control the software and some other components of your backup strategy

There are several enticing reasons to work with offsite backup providers. Many offer additional services, such as hosting your data in a Remote Desktop Services environment as a contingency plan. Truthfully, I believe that the primary barrier in the cloud-based storage market is trust. Several of the organizations offering these services are “fly-by-night” operations trying to turn a fast dollar by banking on the fact that almost none of their customers will ever need to rely on their restoration or hosting services. I also don’t think that the world is soon going to forget how Microsoft did everything but make it a requirement that we sync our Windows profiles into OneDrive and then radically increased the costs of using the service. Large service providers can do that sort of thing to their customers and survive the fallout.

You can approach third-party offsite storage in two ways:

A full-service provider that supplies you with software that operates on your equipment and transmits to theirs

A no-/low-frills provider that supplies you with a big, empty storage space for you to use as you see fit

What you receive will likely have great correlation with what you spend.

Designing a Backup Storage Solution

At this point, you know what you need and you know what’s available. All that’s left is to put your knowledge to work by designing a solution that suits your situation.

Backup Strategies

Let’s start with a few overall strategies that you can take.

Disk-to-Tape (or other Non-Disk Media)

This is the oldest and most common backup method. You have a central system that dumps all of your data on tape (or some other media, such as optical) using a rotation method that you choose.

Disk-to-Disk

A more recent option is disk-to-disk. Your backup application transfers data to portable disks which are then transferred offsite or to a permanent installation, hopefully in another physical location.

Disk-to-Disk-to-Tape

A somewhat less common method is to first place regular backups on disk. At a later time these backups, or a subset of them, can be transferred to tape. This gives you the benefit of rapidly recovering recent items while keeping fairly inexpensive long-term copies. You wouldn’t need to rotate as many tapes through, and the constant rewriting of the disks means that they won’t be expected to retain their data for long.

Disk-to-Local-to-Offsite

Another recent option that can serve as a viable alternative to tape is first backing up data locally, then transferring it to offsite long-term storage, whether it’s a site that you control or one owned by a third-party provider. This type of solution eliminates the need to manually move data by entrusting someone to physically carry media. In order for this type of solution to be viable, you must have sufficient outbound bandwidth to finish backups in a sufficiently small time frame.
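The bandwidth requirement can be checked with simple arithmetic before committing to this design. Below is a minimal sketch; the function names, the 80% efficiency factor, and the example sizes and link speed are all assumptions for illustration.

```python
def transfer_hours(data_gb, uplink_mbps, efficiency=0.8):
    """Hours needed to send data_gb (decimal GB) over an uplink of uplink_mbps.

    efficiency is an assumed usable fraction of the line rate, allowing for
    protocol overhead and other traffic sharing the link.
    """
    data_megabits = data_gb * 8 * 1000  # 1 GB = 8,000 megabits
    seconds = data_megabits / (uplink_mbps * efficiency)
    return seconds / 3600


def fits_window(data_gb, uplink_mbps, window_hours, efficiency=0.8):
    """True if the transfer finishes inside the available backup window."""
    return transfer_hours(data_gb, uplink_mbps, efficiency) <= window_hours


# Hypothetical example: a 20 Mbps uplink and an 8-hour overnight window.
print(fits_window(300, 20, 8))  # a 300 GB full backup
print(fits_window(5, 20, 8))    # a 5 GB nightly change set
```

With these assumed numbers, the 300 GB full backup needs well over a day on the wire while the 5 GB change set finishes in under an hour, so the size of the nightly transfer matters far more than the total size of the data.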

Disk-to-Offsite

You could also opt to transfer your data directly offsite without keeping a local copy. This approach is essentially the same as the previous, but there’s nothing left at the primary location.

Backup Storage Examples

Let’s consider a few real-world-style examples.

Scenario 1

4 virtual machines: 1 domain controller, 1 file/print VM, 1 application VM, 1 SQL VM

300 GB total data

Cloud or ISP-based e-mail provider

No particular retention requirements

Uses line-of-business software with a database

This example is a fair match for a large number of small businesses. Some might have mixed their roles into fewer VMs and most will have somewhat different total backup data requirements, but this scenario should have a large applicability base.

I would recommend using a set of portable hard drives in a rotation. I’d want a solid monthly full backup and a weekly full with at least two drives rotated daily. If using a delta-style application like Altaro VM Backup, the daily deltas are going to be very small so you won’t need large drives. Keeping historical data is probably not going to be helpful as long as at least one known good copy of the database exists.
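One possible reading of that rotation can be sketched as a small schedule function. The choice of Sunday for full backups, the first Sunday of the month for the monthly, and the drive labels are all assumptions made for illustration; adjust them to match how your office actually operates.

```python
from datetime import date


def plan_for(day, daily_drives=("Drive A", "Drive B")):
    """Return (backup_type, drive_label) for a calendar day.

    Assumed scheme: full backups on Sundays (the first Sunday of the month
    is the monthly full), daily deltas Monday through Saturday, alternating
    between the portable drives in daily_drives.
    """
    if day.weekday() == 6:  # Sunday
        if day.day <= 7:
            return ("monthly full", "Monthly drive")
        return ("weekly full", "Weekly drive")
    drive = daily_drives[day.weekday() % len(daily_drives)]
    return ("daily delta", drive)


print(plan_for(date(2016, 7, 3)))  # ('monthly full', 'Monthly drive')
print(plan_for(date(2016, 7, 4)))  # ('daily delta', 'Drive A')
print(plan_for(date(2016, 7, 5)))  # ('daily delta', 'Drive B')
```

Writing the schedule down this way makes it easy to print a calendar for whoever swaps the drives, so the drive that was just written is the one that leaves the building.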

If budget allows, I would strongly encourage using an offsite or third party storage-only provider to hold the monthly backups.

Probably the biggest thing to note here is that retention isn’t really an issue.

Scenario 2

4 virtual machines: 1 domain controller, 1 file/print VM, 1 application VM, 1 SQL VM

300 GB total data

Cloud or ISP-based e-mail provider

No particular retention requirements

Uses line-of-business software with a database

The layout here is the same as Scenario 1. As small as this is, it would be a good candidate for direct offsite transmission. Most backup applications that allow for such a thing allow a “seed” backup. You copy everything to a portable disk, have the disk physically transported to the destination backup site, then place that backup onto permanently-placed storage. From then on, nightly backups are just deltas from that original. Small businesses typically do not have a great deal of daily data churn, so this is a viable solution.

Scenario 3

4 virtual machines: 1 domain controller, 1 file/print VM, 1 application VM, 1 SQL VM

300 GB total data

Cloud or ISP-based e-mail provider

5-year retention requirements for financial data

Uses line-of-business software with a database

This is the same as the first scenario, only now we have a retention requirement. To figure out how to deal with that, we need to know what qualifies as “financial data”. If your accountant keeps track of everything in Excel, then those Excel files probably qualify. If it’s all in the line-of-business app and it holds financial records in the database for at least five years before purging, then you probably don’t need to worry about retention in backup.

I want to take a moment here to talk about retention, because I’ve had some issues getting customers to understand it in the past. If you’ve got a 5-year retention requirement, that typically means that you must be able to produce any data that was generated within the last five years. It does not necessarily mean that you need to have every backup ever taken for the last five years. If I created a file in December of 2012 and that file is still sitting on my file server, then it was included in the full backup that I took on Sunday, July 3rd, 2016. I don’t need to produce an old backup. Retention mainly applies to deleted and changed data. So, in more real-world terms, if all of the data that is in scope for your retention plan is handled by your line-of-business application and it is tracking changes in the database for at least as far back as the retention policy, then the only thing that you need old backups for is if you suspect that people are purging data before it reaches its five-year lifespan. That’s a valid reason and I won’t discount it, but I also think it’s important for customers to understand how retention works.

Let’s say that the data applicable to the long-term retention plan is file-based and is not protected in the database. In that case, I would recommend investigating options for capturing annual backups. Retain twelve monthly backups and keep one per year. Annual backups can be discarded after five years. My preference for storage of annual backups:

Third-party offsite storage provider

Self-hosted offsite permanent disk storage

Portable hard disk

Tape

Remember that we’re talking about up to 5 TB of long-term storage, although I wouldn’t recommend trying to keep 100% of the 300 GB in each annual backup. 5 TB of offline storage is not expensive (unless you’re buying a tape drive just for that purpose), so this should be a relatively easily attainable goal.
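The retention rule described above (twelve monthlies, one annual per year, annuals discarded after five years) can be sketched as a pruning function. This is an illustration under stated assumptions: the function name is hypothetical, and treating the last monthly backup of each completed calendar year as that year’s annual is one choice among several.

```python
from datetime import date


def backups_to_keep(monthly_backups, today, monthly_count=12, annual_years=5):
    """Return the sorted dates of the backups that the retention rule keeps.

    Keeps the newest monthly_count monthly backups, plus the last backup of
    each completed calendar year that is no more than annual_years old.
    """
    ordered = sorted(monthly_backups)
    keep = set(ordered[-monthly_count:])
    last_of_year = {}
    for stamp in ordered:
        last_of_year[stamp.year] = stamp  # later dates overwrite earlier ones
    for year, stamp in last_of_year.items():
        if year < today.year and today.year - year <= annual_years:
            keep.add(stamp)
    return sorted(keep)


# Worst case: 12 monthlies and 5 annuals, each a full 300 GB copy.
print((12 + 5) * 300)  # 5100 (GB)
```

Seventeen full-size copies come to about 5.1 TB, which is one way to arrive at a figure of roughly 5 TB; in practice the newest annual often overlaps with one of the twelve monthlies, and trimmed annual backups will be smaller still.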

Scenario 4

7 virtual machines: 2 domain controllers, 1 file/print VM, 2 application VMs, 1 SQL VM, 1 Exchange VM

1.2 TB total data

5-year retention requirements for financial data

This is a larger company than the preceding and it’s got some different requirements. The first thing to sort out will be what the 5-year retention requirement applies to and if it can be met just by ensuring that there is a solid copy of the database. Read the expanded notes on retention in scenario 3, as they would apply here.

Truthfully, I would follow generally the same plan as in scenario 3. The drives would need to be larger, of course, but 1.2 TB in a single backup is very small these days. With applications such as Altaro VM Backup able to target multiple storage drives simultaneously, this system could grow substantially before portable disks become too much of a burden for a nightly rotation. This is in contrast to my attitude from only a few years ago, when I would have almost undoubtedly installed a tape drive and instituted something akin to a GFS rotation.

Scenario 5

Let’s look at a larger institution.

25 virtual machines: multiple domain controllers, a large central file server, multiple application servers, highly available Exchange, highly available SQL

10 TB total data

5-year retention plan; financial only by law, but CTO has expanded the scope to all data

Honestly, even though it seems like there is a lot going on here, 10 TB is much more than most installations that fit this description will realistically be using. But, I wanted to aim large. This scenario is probably not going to be well-handled by portable drives unless you have someone on staff that enjoys carting them around and plugging them in. Even tape is going to struggle with this unless you’ve got the money for a multi-drive library.

Here’s what I would recommend:

A data diet. 10 TB? Really?

A reassessment of the universal 5-year retention goal

2 inexpensive 8-bay NAS devices, filled with 3 TB SATA disks in RAID-6, with a plan in the budget to bring in a third and fourth NAS

Part of this exercise is to encourage you to really work on assessing your environment, not just nodding and smiling and playing the ball as it lies. Ask the questions, do the investigations, find out what is really necessary. The last thing that you want to do is back up someone’s pirated Blu-Ray collection and then store it somewhere that you’re responsible for. “Employment gap to fulfill a prison sentence due to activities at a previous employer” is an unimpressive entry on a resume. Also, be prepared to gently challenge retention expectations. Blanket statements are often issued in very small and very large institutions because it sometimes costs them more to carefully architect around an issue than it does to just go all-in. Organizations in the middle can often benefit from mixed retention policies. So, before you just start drawing up plans to back up that 10 TB and keep it for 5 years, find out if that’s truly necessary.

My third bullet point assumes that you discover that you have 10 TB of data that needs to be kept for 5 years. That does happen. I’m also working from the assumption that any organization that needs to hold on to 10 TB of data has the finances to make that happen. I would configure the first NAS as a target for a solid rotation scheme similar to GFS with annuals. Use software deltas and compression to keep from overrunning your storage right away. All data should be replicated to the second NAS which should live in some far away location connected by a strong Internet connection. As space is depleted on the devices, bring in the second pair of NAS devices — by that time, 4 TB drives (or larger) might be a more economical choice.
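To see how much room the recommended NAS configuration gives you, the RAID-6 arithmetic is short enough to sketch. RAID-6 reserves the equivalent of two disks for parity; the function name is an assumption for illustration, and the result is raw decimal capacity before file system overhead.

```python
def raid6_usable_tb(disk_count, disk_tb):
    """Usable capacity of a RAID-6 array: two disks' worth goes to parity."""
    if disk_count < 4:
        raise ValueError("RAID-6 needs at least four disks")
    return (disk_count - 2) * disk_tb


print(raid6_usable_tb(8, 3))  # 18 (TB) with 3 TB disks
print(raid6_usable_tb(8, 4))  # 24 (TB) with 4 TB disks
```

Against 10 TB of source data, 18 TB per device does not leave room for many uncompressed full copies, which is why deltas, compression, and the budgeted second pair of devices matter in this design.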

I would also recommend bringing in a second tier of backup for long-term storage. That might take the form of an offsite provider or tape.

Hopefully, though, you discover that you really don’t need to back up 10 TB of storage and can just follow a plan similar to scenario 3.