Changing Data Center Workloads

February 21, 2014

Networking-wise, I’ve spent my career in the data center. I’m pursuing the CCIE Data Center. I study virtualization, storage, and DC networking. Right now, the landscape in the network is constantly changing, as it has been for the past 15 years. However, with SDN, merchant silicon, overlay networks, and more, the rate of change in a data center network seems to be accelerating.

Things are changing fast in data center networking. You get the picture

Whenever you have a high rate of change, you’ll end up with a lot of questions such as:

Where does this leave the current equipment I’ve got now?

Would SDN solve any of the issues I’m having?

What the hell is SDN, anyway?

I’m buying vendor X, should I look into vendor Y?

What features should I be looking for in a data center networking device?

I’m not actually going to answer any of these questions in this article. I am, however, going to profile some of the common workloads that you find in data centers currently. Your data center may have one, a few, or all of these workloads. It may not have any of them. Your data center may have one of the workloads listed, but my description and/or requirements is way off. All certainly possible. These are generalizations, and with all generalizations your mileage may vary. With that disclaimer out of the way, strap in. Let’s go for a ride.

Traditional Virtualization

It’s interesting to say that something which only exploded into the data center in a big way in about 2008 as now being “traditional”, rather than “new-fangled”. But that’s the situation we have here. Traditional virtualization workload is centered primarily around VMware vSphere. There are other traditional virtualization products of course, such as Red Hat’s RHEV, Xen, and Microsoft Hyper-V, but VMware has the largest market share for this by far.

Latency is not a huge concern (30 usecs not a big deal)

Layer 2 adjacencies are mandatory (required for vMotion)

Large Layer 2 domains (thousands of hosts layer 2 adjacent)

Converged infrastructures (storage and data running on the same wires, FCoE, iSCSI, NFS, SMB3, etc.)

Buffer requirements aren’t typically super high. Bursting isn’t much of an issue for most workloads of this type.

Fibre Channel is often the storage protocol of choice, along with NFS and some iSCSI as well

Cisco has been especially successful in this realm with the Nexus line because of vPC, FabricPath, OTV and (to a much lesser extent) LISP, as they address some of the challenges with workload mobility (though not all of them, such as the speed of light). Arista, Juniper, and many others also compete in this particular realm, but Cisco is the market leader.

With the multi-pathing Layer 2 technologies such as SPB, TRILL, Cisco FabricPath, and Brocade VCS (the latter two are based on TRILL), you can build multi-spine leaf/spine networks/CLOS networks that you can’t with spanning-tree based networks, even with MLAG.

This type of network is what I typically see in data centers today. However, there is a shift towards Layer 3 networks and cloud workloads over traditional virtualization, so it will be interested to see how long traditional virtualization lasts.

VDI

VDI (Virtual Desktops) are a workload with the exact same requirements as traditional virtualization, with one main difference: The storage requirements are much, much higher.

Latency is not as important (most DC-grade switches would qualify), especially since latency is measured in milliseconds for remote desktop users

Layer 2 adjacencies are mandatory (required for vMotion)

Large Layer 2 domains

Converged infrastructures

Buffer requirements aren’t typically very high

High-end storage backends. All about the IOPs, y’all

For storage here, IOPs are the biggest concern. VDI eats IOPs like candy.

Legacy Workloads

This is the old, old school. And by old school, I mean late 90s, early 2000s. Before virtualization changed the landscape. There’s still quite a few crusty old servers, with uptimes measured in years, running long-abandoned applications. The problem is, these types of applications are usually running something mission critical and/or significant revenue generating. Organizations just haven’t found a way out of it yet. And hey, they’re working right now. Often running on proprietary Unix systems, they couldn’t or wouldn’t be migrated to a virtualized environment (where it would be much easier to deal with).

The hardware still works, so why change something that works? Because it would be tough to find more. It’s also probably out of vendor-supported service.

Latency? Who cares. Is it less than 1 second? Good enough.

Layer 2 adjacencies, if even required, are typically very small, typically just needed for the local clustering application (which is usually just stink-out-loud awful)

100 megabit and gigabit Ethernet typically. 10 Gigabit? That’s science-fiction talk!

Buffers? You mean like, what shines the floor?

My own personal opinion is that this is the only place where Cisco Catalyst switches belong in a data center, and even then only because they’re already there. If you’re going with Cisco, I think everything else (and everything new) in the DC should be Nexus.

Cloud Workloads (Private Cloud)

If you look at a cloud workload, it looks very similar to the previous traditional virtualization workload. They both use VMs sitting on top of hypervisors. They both have underlying infrastructure of compute, network, and storage to support these VMs. The difference is primarily is in the operational model.

It’s often described as the difference between pets and cattle. With traditional virtualization, you have pets. You care what happens to these VMs. They have HA and DRS and other technologies to care for them. They’re given clever names, like Bart and Lisa, or Happy and Sleepy. With cloud VMs, they’re not given fun names. We don’t do vMotion/Live Migration with them. When we need them, they’re spun up. When they’re not, they’re destroyed. We don’t back them up, we don’t care if the host they reside on dies so long as there are other hosts carrying the workload. The workload is automatically sharded across the available hosts using logic in the application. Instead of backups, templates are used to create new VMs when the workload increases. And when the workload decreases, some of the VMs get destroyed. State is not kept on any single VM, instead the state of the application (and underlying database) is sharded to the available systems.

This is very different than traditional virtualization. Because the workload distribution is handled with the application, we don’t need to do vMotion and thus have Layer 2 adjacencies. This makes it much more flexible for the network architects to put together network to support this type of workload. Storage with this type of workload also tends to be IP-based (NFS, iSCSI) rather than FC-based (native Fibre Channel or FCoE).

With cloud-based workloads, there’s also a huge self-service component. VMs are spun-up and managed by developers or end-users, rather than the IT staff. There’s typically some type of portal that end-users can use to spin up/down resources. Chargebacks are also a component, so that even in a private cloud setting, there’s a resource cost associated and can be tracked.

OpenStack is a popular choice for these cloud workloads, as is Amazon and Windows Azure. The former is a private cloud, with the later two being public cloud.

Latency requirements are mostly the same as traditional virtualization

Because vMotion isn’t required, it’s all Layer 3, all the time

Storage is mostly IP-based, running on the same network infrastructure (not as much Fibre Channel)

Buffer requirements are typically the same as traditional virtualization

VXLAN/NVGRE burned into the chips for SDN/Overlays

You can use much cheaper switches for this type of network, since the advanced Layer 2 features (OTV, FabricPath, SPB/TRILL, VCS) aren’t needed. You can build a very simple Layer 3 mesh using inexpensive and lower power 10/40/100 Gbit ports.

However, features such as VXLAN/NVGRE encap/decap is increasingly important. The new Trident2 chips from Broadcom support this now, and several vendors, including Cisco, Juniper, and Arista all have switches based on this new SoC (switch-on-chip) from Broadcom.

High Frequency Trading

This is a very specialized market, and one that has very specialized requirements.

Latency is of the utmost concern. To the point of making sure ports are on the same ASIC. Latency is measured in nano-seconds, microseconds are an eternity

10 Gbit at the very least

Money is typically not a concern

Over-subscription is non-existent (again, money no concern)

Buffers are a trade off, they can increase latency but also prevent packet loss

This is a very niche market, one that Arista dominates. Cisco and a few other vendors have small inroads here, using the same merchant silicon that Arista uses, however Arista has had huge experience in this market. Every tick of the clock can mean hundreds of thousands of dollars in a single trade, so companies have no problem throwing huge amounts of money at this issue to shave every last nanosecond off of latency.

Hadoop/Big Data

Latency is of high concern

Large buffers are critical

Over-subscription is low

Layer 2 adjacency is neither required nor desired

Layer 3 Leaf/spine networks

Storage is distributed, sharded over IP

Arista has also extremely successful in this market. They glue PC RAM onto their switch boards to provide huge buffers (around 760 MB) to each port, so it can absorb quite a bit of bursty traffic, which occurs a lot in these types of setups. That’s about .6 seconds of buffering a 10 Gbit link. Huge buffers will not prevent congestion, but they do help absorb situations where you might be overwhelmed for a short period of time.

Since nodes don’t need to be Layer 2 adjacent, simple Layer 3 ECMP networks can be created using inexpensive and basic switches. You don’t need features like FabricPath, TRILL, SPB, OTV. Just fast, inexpensive, low power ports. 10 Gigabit is the bare minimum for these networks, with 40 and 100 Gbit used for connectivity to the spines. Arista (especially with their 7500E platform) does very well in this area. Cisco is moving into this area with the Nexus 9000 line, which was announced late last year.

Conclusions

Understanding the requirements for the various workloads may help you determine the right switches for you. It’s interesting to see how quickly the market is changing. Perhaps 2 years ago, the large-Layer 2 networks seemed like the immediate future. Then all of a sudden Layer 3 mesh networks became popular again. Then you’ve got SDN like VMware’s NSX and Cisco’s ACI on top of that. Interesting times, man. Interesting times.