In this article, we will start configuring NSX-T so that it is ready for us to install PKS and consume the networking and security services provided by NSX-T. The result is that PKS can deliver on-demand provisioning of all NSX-T components (Container Network Interface (CNI), NSX-T Container Plugin (NCP) POD, NSX Node Agent POD, etc.) automatically when a new Kubernetes (K8S) Cluster is requested, all done with a single CLI or API call. In addition, PKS provides a unique capability through its integration with NSX-T: network micro-segmentation at the K8S namespace level. This allows Cloud/Platform Operators to manage access between application and/or tenant users at a much finer-grained level than was possible before, which is really powerful!

As mentioned in the previous blog post, I will not be walking through a step-by-step NSX-T installation and I will assume that you already have a basic NSX-T environment deployed which includes a few ESXi hosts prepped as Transport Nodes and at least 1 NSX-T Controller and 1 NSX-T Edge. If you would like a detailed step-by-step walkthrough, you can refer to the NSX-T documentation here or you can even leverage my Automated NSX-T Lab Deployment script to set up the base environment and modify it based on the steps in this article. In fact, this is the same script I have used to deploy my own NSX-T environment for PKS, with some minor modifications which I will be sharing at a later date.

If you missed any of the previous articles, you can find the complete list here:

PKS supports the following network deployment topologies:

PKS running outside of NSX-T:

PKS VMs (Ops Manager, BOSH, PKS Control Plane, Harbor) are deployed to either a VSS or VDS backed portgroup

Connectivity between PKS VMs, K8S Cluster Management and T0 Router is through a physical or virtual router

NAT is only configured on T0 to provide POD networks access to associated K8S Cluster Namespace



PKS running inside of NSX-T:

PKS VMs (Ops Manager, BOSH, PKS Control Plane, Harbor) are deployed to an NSX-T Logical Switch (behind T0)

Connectivity between PKS VMs, K8S Cluster Management and T0 Router is through a physical or virtual router

Both PKS Admin Network & K8S Cluster Management Network Logical Switch can either be routed or NAT'ed



From an operational and troubleshooting standpoint, the NAT deployment is going to be the most complex due to the number of SNAT/DNAT rules that need to be configured to ensure proper communication between the Management components (outside of NSX-T) and the PKS Admin & K8S Cluster Management networks (inside of NSX-T). Having a routable network for the PKS VMs to reside on, whether that is VSS/VDS or an NSX-T Logical Switch, is the preferred option.

Note: I would like to thank both Francis Guillier (Technical Product Manager) and Gaetano Borione (PKS Architect) from the CNABU who really helped me understand some of the networking nuances and constraints when deciding how to deploy PKS from a networking standpoint.

Step 1 - Verify that you have prepped the ESXi hosts which will be used to deploy the K8S workload VMs. In my example below, I have two vSphere Clusters: Primp-Cluster which is my "Management" Cluster and will NOT be prepped with NSX-T and PKS-Cluster which will be prepped with NSX-T.

Note: Make sure that both your Management and Compute Clusters are managed by the same vCenter Server. This will impact the placement of your PKS Management VMs, whether they live in a dedicated Management Cluster or reside in the Compute Cluster. For lab and education purposes, use a single vCenter Server managing both the Management and Compute Cluster.

Step 2 - Create a new IP Pool which will be used to allocate Virtual IPs for the exposed K8S Services (e.g. Load Balancer for application deployments). To do so, navigate to Inventory->Groups->IP Pool and provide the following:

Name: Load-Balancer-Pool

IP Range: 10.20.0.10 - 10.20.0.50

CIDR: 10.20.0.0/24
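Before entering these values, it can help to sanity-check that the range actually falls inside the CIDR and to see how many Virtual IPs it provides. The following is a minimal sketch using only Python's standard `ipaddress` module; the helper name `pool_summary` is hypothetical and is not part of NSX-T or PKS.

```python
import ipaddress


def pool_summary(start: str, end: str, cidr: str) -> dict:
    """Check an IP Pool range against its CIDR (hypothetical helper)."""
    network = ipaddress.ip_network(cidr)
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    return {
        # Both ends of the range must live inside the pool's CIDR.
        "in_cidr": first in network and last in network,
        # The start address must not come after the end address.
        "ordered": first <= last,
        # Inclusive count of addresses available as Virtual IPs.
        "size": int(last) - int(first) + 1,
    }


# Values taken from the Load-Balancer-Pool defined in Step 2.
summary = pool_summary("10.20.0.10", "10.20.0.50", "10.20.0.0/24")
print(summary)
```

With the values above, the range is valid and yields 41 addresses, which caps the number of K8S Load Balancer Virtual IPs this pool can hand out.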



Step 3 - Create a new IP Block which will be used by PKS on-demand to carve up into smaller /24 networks and assign those to each K8S namespace. This IP block should be sized sufficiently to ensure you do not run out of addresses; currently it is recommended to use a /16 network (non-routable). To do so, navigate to DDI->IPAM and provide the following:

Name: PKS-IP-Block

CIDR: 172.16.0.0/16
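To get a feel for the sizing, the sketch below carves the /16 block into /24 networks with Python's standard `ipaddress` module. It only illustrates the arithmetic; the order in which NSX-T actually hands out subnets to namespaces is not something this example claims to reproduce.

```python
import ipaddress
from itertools import islice

# The IP Block from Step 3.
block = ipaddress.ip_network("172.16.0.0/16")

# subnets() returns a generator, so only take the first few for display.
first_three = [str(net) for net in islice(block.subnets(new_prefix=24), 3)]

# Number of /24 networks that fit into the block: 2^(24 - 16).
total = 2 ** (24 - block.prefixlen)

print(first_three)
print(total)
```

A /16 block therefore leaves room for 256 namespace-sized /24 networks before it is exhausted.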



Step 4 - Create a new T0 Router which will be used to communicate with your external physical network. Make sure you have already created an Edge Cluster (it can contain a single Edge), or create one now if you have not. The HA mode must be Active/Standby as NAT is used by the NCP service within the K8S Management POD. To do so, navigate to Routing->Routers and provide the following:

Name: T0-LR

Edge Cluster: Edge-Cluster-01

High Availability Mode: Active-Standby

Preferred Member: edge-01



Step 5 - Create a static route on the T0 which will enable all traffic from the K8S Management PODs to communicate outbound to our Management components. This is needed because, for example, the NCP POD will need to talk to NSX-T to create new networks and/or Load Balancer services based on application deployments from the K8S Clusters. In my example, 172.30.50.1 is the intermediate network's gateway which will be used to route traffic from within T0 to my virtual router (pfSense). To do so, click on the T0 Router you just created and navigate to Routing->Static Routes and provide the following:

Network: 0.0.0.0/0

Next Hop: 172.30.50.1



Step 6 - Next, we need to create two Logical Switches, one that will be used for the T0 uplink and the other for K8S Management Cluster (also known as the K8S Service network) which is used to run the K8S Management POD. To do so, navigate to Switching->Switches and add the following:

Name: Uplink-LS

Transport Zone: TZ-VLAN

VLAN: 0

Name: K8S-Management-Cluster-LS

Transport Zone: TZ-Overlay

VLAN: 0



After this step, you should have two Logical Switches as shown in the screenshot below. The Uplink-LS should be on TZ-VLAN and the K8S-Management-Cluster-LS should be on TZ-Overlay.



Step 7 - Now we need to configure the Uplink Router Port and assign it an address from the intermediate network so that we can route from the T0 to our physical or virtual router. To do so, navigate to Routing and then click on the T0 Router we had created earlier and select Configuration->Router Ports and provide the following:

Name: Uplink-1

Type: Uplink

Transport Node: edge-01

Logical Switch: Uplink-LS

Logical Switch Port: Uplink-1-Port

IP Address/mask: 172.30.50.2/24
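The static route from Step 5 only works if its next hop (172.30.50.1) is directly reachable from this uplink address. As a rough check, assuming nothing beyond the two addresses given in this article, Python's standard `ipaddress` module can confirm that both sit on the same intermediate network:

```python
import ipaddress

# Uplink Router Port address from Step 7.
uplink = ipaddress.ip_interface("172.30.50.2/24")

# Next hop of the default static route from Step 5.
next_hop = ipaddress.ip_address("172.30.50.1")

# The next hop must be inside the uplink's subnet and must not be
# the uplink address itself.
on_link = next_hop in uplink.network and next_hop != uplink.ip

print(uplink.network, on_link)
```

If `on_link` came back `False`, the T0 would have no directly connected path to the gateway and the default route would not forward traffic.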



Step 8 - Create a new T1 Router which will be used for the K8S Management Cluster POD. To do so, navigate to Routing->Routers and provide the following:

Name: T1-K8S-Mgmt-Cluster

Tier-0 Router: T0-LR

Failover Mode: Preemptive

Edge Cluster: Edge-Cluster-01

Edge Cluster Members: edge-01

Preferred Member: edge-01



Step 9 - Configure the Downlink Router Port for the K8S Management Cluster, which is where you will define the network that NSX-T will use for these VMs. In my example, I decided to use 10.10.0.0/24. To do so, click on the T1 Router that you had just created and navigate to Configuration->Router Ports and provide the following:

Name: Downlink-1

Logical Switch: K8S-Management-Cluster-LS

Logical Switch Port: Downlink-1-Port

IP Address/mask: 10.10.0.1/24
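At this point every network used in this walkthrough has been defined, so it is worth confirming that none of them overlap, since overlapping ranges are an easy way to end up with confusing routing problems. The sketch below uses Python's standard `ipaddress` module; the labels in the dictionary and the helper name `find_overlaps` are hypothetical, and the CIDRs are the ones from my example environment.

```python
import ipaddress
from itertools import combinations

# CIDRs used in this article's example environment.
networks = {
    "K8S Management Cluster": "10.10.0.0/24",
    "Load Balancer pool": "10.20.0.0/24",
    "PKS IP Block": "172.16.0.0/16",
    "Intermediate (T0 uplink)": "172.30.50.0/24",
}


def find_overlaps(named_cidrs: dict) -> list:
    """Return every pair of network names whose CIDRs overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(parsed.items(), 2)
        if net_a.overlaps(net_b)
    ]


overlaps = find_overlaps(networks)

# The Downlink Router Port address doubles as the gateway of its network.
gateway = ipaddress.ip_interface("10.10.0.1/24")

print(overlaps)
print(gateway.network)
```

An empty list means the four networks are disjoint; substitute your own ranges if your environment differs.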



Step 10 - To ensure the K8S Management Cluster network will be accessible from the outside, we need to advertise these routes. To do so, click on the T1 Router you had created earlier and navigate to Routing->Route Advertisement and enable the following:

Status: Enabled

Advertise All NSX Connected Routes: Yes

Advertise All NAT Routes: Yes



Step 11 - This step may be optional depending on how you have configured your physical networking, in which case you will need to use BGP instead of static routes to connect your physical/virtual network to NSX-T's T0 Router. In my environment, I am using a virtual router (pfSense), and the easiest way to enable the networks hosting my management systems, vCenter Server, ESXi hosts, NSX-T VMs and PKS Management VMs to communicate with PKS is to set up a few static routes. We need to create two static routes to reach both our K8S Management Cluster Network (10.10.0.0/24) and our K8S Load Balancer Network (10.20.0.0/24). All traffic destined to either of these 10.x networks should be forwarded to our T0's uplink address which, if you recall from Step 7, is 172.30.50.2. Depending on the physical or virtual router solution, you will need to follow your product documentation to set up either BGP or static routes.
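To make the intent of these two static routes concrete, here is a minimal longest-prefix-match sketch in Python. It is only a model of what the router does with the two routes from my environment, not pfSense configuration; the names `ROUTES` and `next_hop_for` are hypothetical.

```python
import ipaddress

# The two static routes from Step 11: (destination network, next hop).
ROUTES = [
    ("10.10.0.0/24", "172.30.50.2"),  # K8S Management Cluster Network
    ("10.20.0.0/24", "172.30.50.2"),  # K8S Load Balancer Network
]


def next_hop_for(destination: str, routes=ROUTES):
    """Return the next hop of the most specific matching route, or None."""
    address = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if address in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # Longest prefix wins when more than one route matches.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(next_hop_for("10.10.0.1"))
print(next_hop_for("10.20.0.10"))
print(next_hop_for("192.168.0.10"))
```

Both NSX-T backed networks resolve to the T0 uplink address, while any other destination falls through to whatever routes the router already has.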



Step 12 - At this point, we have completed all the NSX-T configurations and we can run through a few validation checks to ensure that when we go and deploy the PKS Management VMs (Ops Manager, BOSH and PKS Control Plane), we will not run into networking issues. This is a very critical step and if you are not successful here, you should go back and troubleshoot prior to moving on.

To verify Overlay network connectivity between the ESXi hosts and the Edge VM, you should be able to ping using the VXLAN netstack between all ESXi hosts as well as to the Edge VM's overlay interface. Below is a table of the IPs that were automatically allocated from the VTEP's IP Pool. You can discover these by logging onto each ESXi host, but they should be sequential from the starting range of your defined IP Pool. Also make sure you have your physical and virtual switches configured to use MTU 1600 for overlay traffic.

Host       IP Address
esxi-01    192.168.0.10
esxi-02    192.168.0.11
esxi-03    192.168.0.12
edge-01    192.168.0.13

You can SSH to each ESXi host and run the following:

vmkping ++netstack=vxlan [IP]

or you can also do this remotely via ESXCLI by running the following:

esxcli network diag ping --netstack=vxlan --host=[IP]
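Since the overlay requires MTU 1600, a plain ping can succeed even when large frames are being dropped. One common way to catch that is to ping with the don't-fragment bit set and a payload sized to fill the MTU. The sketch below only builds the command strings for the hosts in the table above; the `-d` (don't fragment) and `-s` (payload size) options are standard `vmkping` flags but are my addition rather than something used earlier in this article, and the helper name `vmkping_command` is hypothetical.

```python
OVERLAY_MTU = 1600   # MTU recommended above for overlay traffic
IPV4_HEADER = 20     # bytes, IPv4 header without options
ICMP_HEADER = 8      # bytes, ICMP echo header

# Overlay addresses from the table above.
TEP_ADDRESSES = {
    "esxi-01": "192.168.0.10",
    "esxi-02": "192.168.0.11",
    "esxi-03": "192.168.0.12",
    "edge-01": "192.168.0.13",
}


def vmkping_command(target: str, mtu: int = OVERLAY_MTU) -> str:
    """Build a vmkping command that fills the MTU without fragmenting."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    return f"vmkping ++netstack=vxlan -d -s {payload} {target}"


for name, address in TEP_ADDRESSES.items():
    print(f"{name}: {vmkping_command(address)}")
```

Run the printed commands from each ESXi host against the other addresses; if the small ping works but the full-size one fails, the MTU on a physical or virtual switch along the path is the likely culprit.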

To verify connectivity to NSX-T Networks as well as routing between physical/virtual networking to NSX-T, use your PKS Client VM which we had deployed from Part 2 and you should be able to ping the following addresses from that system:

10.10.0.1 (Downlink Router Port to our K8S Management Cluster network)

172.30.50.2 (Uplink Router Port)

172.30.50.1 (Intermediate Network's Gateway)

If you are unable to reach the Edge VM's overlay interface and/or some of the NSX-T interfaces, a very common mistake (which I had made myself) is mixing up the networking on the Edge VM deployment. When configuring my Edge Transport Node and assigning it to the correct HostSwitch, I thought fp-eth0 = Network Adapter 1, fp-eth1 = Network Adapter 2, and so forth. It turns out NSX-T automatically ignores the Edge Management interface, which is Network Adapter 1, and hence fp-eth0 starts on the 2nd vNIC of the VM. So make sure that whatever network you are using for the Overlay traffic for your ESXi hosts is also configured on Network Adapter 2 as shown in the screenshot below. The 3rd vNIC will be used for Edge Uplink traffic and should be connected to your intermediate network, which in my case is VLAN 3250. Once I made this change, I was able to successfully verify connectivity following the steps outlined above.
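The off-by-one mapping is easy to get wrong, so here it is spelled out as a tiny Python sketch. The function name `edge_interface_for_adapter` is hypothetical, and the mapping simply restates the behavior described above.

```python
def edge_interface_for_adapter(adapter_number: int) -> str:
    """Map a vSphere Network Adapter number to the Edge VM interface role."""
    if adapter_number < 1:
        raise ValueError("Network Adapter numbers start at 1")
    if adapter_number == 1:
        # Network Adapter 1 is the Edge Management interface and is
        # skipped when the fp-eth interfaces are numbered.
        return "management"
    # Network Adapter 2 -> fp-eth0, Network Adapter 3 -> fp-eth1, ...
    return f"fp-eth{adapter_number - 2}"


for adapter in (1, 2, 3):
    print(f"Network Adapter {adapter} -> {edge_interface_for_adapter(adapter)}")
```

In my setup that means Network Adapter 2 (fp-eth0) carries the Overlay traffic and Network Adapter 3 (fp-eth1) carries the Edge Uplink traffic.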



In the next blog post, we will start our PKS deployment starting with Ops Manager and BOSH. Stay tuned!