Christoph Jaggi asked me a few questions about using VXLAN with EVPN to build data center fabrics and data center interconnects (including active/active data centers). The German version was published on Inside-IT, here’s the English version.

He started with an obvious one:

What is an active-active data center and why would I want to use an active-active data center?

Numerous organizations have multiple data centers for load sharing or disaster recovery purposes. They could use one of their data centers and have the other(s) as warm or cold standby (active/backup setup) or use all data centers at the same time (active/active).

Wondering whether you could use VXLAN with EVPN to solve your data center design challenge? Attend my workshop in Zurich (Switzerland) on December 5th.

It’s also possible to have a hybrid architecture where a subset of workloads is running in each data center, but none of the workloads runs in multiple data centers at the same time. Data centers are active/standby from the workload perspective (simplifying the application architecture or infrastructure requirements) but still fully utilized.

We covered these concepts in more detail in the Designing Active-Active and Disaster Recovery Data Centers webinar.

What is VXLAN and what is EVPN? Are they independent of each other or do they always come as combination?

VXLAN is a simple data plane Ethernet-over-IP encapsulation scheme that enables us to tunnel Ethernet traffic across IP networks. It’s commonly used to implement overlay virtual networking in large-scale data center environments and private and public clouds.
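To make the "simple encapsulation" claim concrete, here is a minimal Python sketch of the 8-byte VXLAN header as defined in RFC 7348 (this example is illustrative and not from the original text): a flags byte with the I bit set to mark a valid VNI, reserved fields, and a 24-bit VXLAN Network Identifier. The header sits inside a UDP datagram (destination port 4789), followed by the original Ethernet frame.

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header (RFC 7348).

    Byte 0: flags, with the I bit (0x08) marking a valid VNI.
    Bytes 1-3: reserved. Bytes 4-6: the 24-bit VNI. Byte 7: reserved.
    """
    assert 0 <= vni < 2**24, "VNI is a 24-bit value"
    flags_and_reserved = struct.pack("!I", 0x08000000)
    vni_and_reserved = struct.pack("!I", vni << 8)
    return flags_and_reserved + vni_and_reserved

# The outer packet is: outer Ethernet + outer IP + UDP (dport 4789)
# + this header + the original (inner) Ethernet frame.
hdr = vxlan_header(5000)
```

The 24-bit VNI is what gives VXLAN its ~16 million virtual segments, compared to 4096 VLAN IDs in 802.1Q.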

EVPN is the control plane for layer-2 and layer-3 VPNs. It’s similar to the well-known MPLS/VPN control plane, with added support for layer-2 (MAC) addresses, layer-3 (IPv4 and IPv6) host addresses, and IPv4/IPv6 prefixes.
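The MAC-plus-IP support mentioned above maps onto a small set of BGP route types. As a quick reference (summarized from RFC 7432 and RFC 9136, not from the original text):

```python
# EVPN BGP route types (RFC 7432; type 5 added by RFC 9136)
# and the information each one carries.
EVPN_ROUTE_TYPES = {
    1: "Ethernet Auto-Discovery (per-segment reachability, fast convergence)",
    2: "MAC/IP Advertisement (layer-2 MAC with optional IP binding)",
    3: "Inclusive Multicast Ethernet Tag (BUM traffic replication list)",
    4: "Ethernet Segment (multihoming, designated-forwarder election)",
    5: "IP Prefix (layer-3 routes for inter-subnet forwarding)",
}
```

Type-2 routes carry the MAC (and optionally IP) addresses, while type-5 routes carry the IP prefixes, which is how one protocol covers both layer-2 and layer-3 VPN services.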

VXLAN can be used without a control plane, and EVPN works with numerous data plane encapsulations including VXLAN and MPLS. Running VXLAN with EVPN is just the most popular combination.

For more details watch VXLAN and EVPN technical deep dive webinars.

What benefits does EVPN offer that might not be achieved when using a straight Ethernet underlay without additional VXLAN overlay?

Traditional Ethernet solutions have two challenges: stability (due to spanning tree protocol challenges) and scalability (every switch has to know about every MAC address in the network).

VXLAN (but also VPLS, PBB, TRILL, SPB…) improves scalability - customer MAC addresses are not visible in the transport core.

Compared to most other solutions, VXLAN also improves network stability, as the core network uses well-tested IP transport and IP routing protocols.

EVPN is the icing on the cake - it makes VXLAN or MPLS deployments better by replacing dynamic MAC address learning used in traditional Ethernet networks with a deterministic control-plane protocol tested across the global Internet (BGP).
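The difference between the two learning models can be sketched in a few lines of Python (a simplified illustration, not an actual switch implementation):

```python
# Flood-and-learn populates the MAC table as a side effect of received
# traffic; EVPN populates it deterministically from BGP updates.

def learn_from_frame(mac_table: dict, src_mac: str, ingress_port: str) -> None:
    # Traditional Ethernet: an entry appears only after a frame from that
    # MAC is seen, and it ages out later, triggering fresh flooding.
    mac_table[src_mac] = ingress_port

def learn_from_evpn(mac_table: dict, bgp_update: dict) -> None:
    # EVPN: a type-2 (MAC/IP) route announces the MAC address and the
    # egress VTEP up front, before any data-plane traffic flows.
    mac_table[bgp_update["mac"]] = bgp_update["next_hop_vtep"]

table = {}
learn_from_evpn(table, {"mac": "00:11:22:33:44:55", "next_hop_vtep": "10.0.0.1"})
```

With EVPN, a MAC address that disappears is withdrawn by a BGP update rather than silently aging out, which is what makes the behavior deterministic.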

For more details watch the layer-2 fabrics part of leaf-and-spine fabric architectures webinar.

What are the limitations and disadvantages of using VXLAN and EVPN?

Implementing VXLAN on hardware devices is still way more expensive than implementing simple Ethernet switching or 802.1ad (Q-in-Q). Hardware supporting VXLAN is therefore more expensive than simpler switching hardware.

EVPN is a complex control-plane protocol based on BGP. It also requires a well-functioning underlay (IP transport). It’s therefore harder to understand and implement than simpler VLAN-based solutions.

Wonder how complex EVPN is? Dinesh Dutt spent over four hours describing its data center intricacies without even mentioning EVPN-with-MPLS or typical service provider use cases.

Which vendors do offer VXLAN and EVPN support? Are the implementations fully interoperable?

Every single data center switching vendor (including Arista, Cisco, Cumulus, Extreme, and Juniper) has dropped whatever proprietary solution they were praising in the past (including FabricPath, DFA, VCS Fabric, IRF…) and implemented VXLAN encapsulation and EVPN control plane.

Unfortunately, EVPN has many options, and vendors implemented different subsets of those options, resulting in what I call the SIP of virtual networking (have you ever tried to get SIP working between VoIP products of different vendors?). While vendors claim their products interoperate (and participate in testing events to prove it), there are still a lot of ways things can go wrong.

For example, the vendors can’t agree on simple things like what the best routing protocol is for the underlay, and how you should run BGP in the overlay. Each vendor is promoting a slightly different approach, making a designer’s job a true nightmare.
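To give a flavor of one such approach: the following FRRouting-style sketch (a hypothetical configuration, assuming the interface name `swp1` and AS number 65001; other vendors advocate quite different designs) runs eBGP over unnumbered underlay links and carries EVPN routes over the same sessions.

```
router bgp 65001
 neighbor SPINE peer-group
 neighbor SPINE remote-as external
 neighbor swp1 interface peer-group SPINE
 !
 address-family l2vpn evpn
  neighbor SPINE activate
  advertise-all-vni
 exit-address-family
```

Alternatives you’ll encounter include OSPF or IS-IS in the underlay with iBGP EVPN toward route reflectors, which is precisely the disagreement described above.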

Is the implementation straightforward or are there things that can go wrong?

We’re combining a hard problem (large-scale bridging) with a complex protocol (BGP) often implemented on a code base that was built to support totally different solutions (Internet routing with BGP or MPLS/VPN). There are tons of things that can go wrong, including subtle hardware problems and software bugs, more so if you’re trying to build an architecture that goes against what your vendor believes to be the right way of doing things.

What are the key things to look at when designing a data center solution using VXLAN and EVPN?

As always, start with simple questions:

What problem am I trying to solve?

Do I really have to solve it, or would redesigning some other part of the system (for example, the application architecture) give better overall results?

What options do I have to solve the problem?

Assuming you have to build large-scale virtual Ethernet networks spanning one or a few data centers, VXLAN might be better than the alternatives.

You’ll find more information in the Define the Services and Requirements module of Building Next-Generation Data Center online course.

Next, you have to figure out whether to implement VXLAN in hardware (on top-of-rack switches, the approach taken by data center switching vendors) or in software (in hypervisors, the approach used by VMware, Microsoft, Juniper Contrail, many OpenStack implementations and most large-scale public cloud providers). We covered this dilemma in the VMware NSX, Cisco ACI or EVPN webinar.

If it turns out that it’s best to implement VXLAN in hardware (for example, due to a significant number of non-virtualized servers), you have to decide whether to use static configurations potentially augmented with simple automation, or deploy EVPN.
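The static (no control plane) alternative can be as simple as a few iproute2 commands on a Linux VTEP; a configuration sketch, assuming example VTEP addresses from the 192.0.2.0/24 documentation range:

```
# Create a VXLAN interface with dynamic learning disabled; remote VTEPs
# are configured statically instead of being discovered by a protocol.
ip link add vxlan100 type vxlan id 100 dstport 4789 \
   local 192.0.2.1 nolearning

# Head-end replication: flood BUM traffic to each configured remote VTEP
# (the all-zeros MAC entry is the flood list).
bridge fdb append 00:00:00:00:00:00 dev vxlan100 dst 192.0.2.2
bridge fdb append 00:00:00:00:00:00 dev vxlan100 dst 192.0.2.3
ip link set vxlan100 up
```

This works, but every VTEP must be told about every other VTEP, which is exactly the state-distribution problem EVPN automates, and why “simple automation” often ends up reinventing a control plane.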

Keep in mind what I told you about EVPN interoperability challenges. Don’t mix-and-match different vendors in the same fabric. Build smaller pods (self-contained units of compute, storage and virtualization) and connect them with traditional technologies like IP routing or VLANs.

Last but definitely not least, while the EVPN control plane used with VXLAN is becoming more stable, it’s not yet a rock-solid technology. Do thorough testing.

Interested? We’ll discuss all these questions in way more detail during the day-long workshop in Zurich (Switzerland) on December 5th.