vSAN Stretched Cluster Networking

For a customer of mine, we are in the process of implementing a vSAN environment that uses stretched clustering. They use a witness location, which is (independently) connected to both data locations. For the vSAN networking, they do not use VRRP or another way to stretch their routing, but they do use stretched layer-2 networking.

So their environment looks like this:

vSAN Netwerk-Connectiviteit - Witness

To make sure that HA is handled correctly, it is advised to create a gateway address within the vSAN network, so that connectivity at the storage level is used to determine isolation. More information on this can be found in Duncan Epping's blog:

http://www.yellow-bricks.com/2017/11/08/vsphere-ha-heartbeat-datastores-isolation-address-vsan/ 

So we created a VRF (Virtual Routing and Forwarding, a method to create a virtual router within a physical entity, in this case a switch) on the switch stacks at both data locations, so that connectivity from the host to the switch stack can be tested. By using a VRF, we still keep the vSAN network disconnected from the rest of the network.

The switch stack at data location 1 uses IP address 10.17.31.253 and the switch stack at location 2 uses IP address 10.17.31.254 (fictional IP addresses, of course). Both addresses were then added as isolation addresses, to test the connectivity of the hosts to the environment. So when a host is unable to connect to either IP address, it will assume it is isolated.
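
To illustrate how these addresses end up in the cluster configuration, here is a minimal pyVmomi sketch (not the exact steps we used) that sets them as HA isolation addresses and disables the default isolation address. The vCenter host name, credentials and cluster name are placeholders, and the das.* advanced options should be checked against the blog post linked above and your vSphere version.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def find_cluster(content, name):
    # Walk the inventory and return the first cluster with the given name.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    try:
        return next(c for c in view.view if c.name == name)
    finally:
        view.DestroyView()

ctx = ssl._create_unverified_context()            # lab only: skip certificate checks
si = SmartConnect(host="vcenter.lab.local",       # placeholder vCenter and credentials
                  user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
cluster = find_cluster(si.RetrieveContent(), "Stretched-Cluster")  # placeholder cluster name
spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(option=[
        # Do not use the default gateway of the management network for isolation...
        vim.option.OptionValue(key="das.usedefaultisolationaddress", value="false"),
        # ...but the VRF addresses in the vSAN network instead.
        vim.option.OptionValue(key="das.isolationaddress0", value="10.17.31.253"),
        vim.option.OptionValue(key="das.isolationaddress1", value="10.17.31.254"),
    ]))
cluster.ReconfigureComputeResource_Task(spec, modify=True)
Disconnect(si)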

Then we needed to add the witness network, but to make sure that connectivity was configured correctly, we needed a way to use the right connection between the witness and the hosts at the different data locations. Traffic between the witness location and location 1 needed to use the connection to location 1, and traffic between the witness location and location 2 needed to use the connection to location 2.

We created two transit VLANs, one from the witness location to location 1 and one from the witness location to location 2. In the routing table on the witness-location switch, we created static routes to the hosts at location 1 through the transit network towards location 1, and the same for location 2.

On the VRFs at the data locations, we created static routes to the witness VLAN through the directly connected transit networks. So when a host in location 1 tries to connect to the witness location, it will use the transit VLAN from location 1 to the witness location (and vice versa), and when a host in location 2 tries to connect to the witness location, it will use the transit VLAN from location 2 to the witness location (and vice versa).

Of course, we also needed to take into account the failure of a connection from the witness location to one of the data locations, so we also created static routes to the same destinations with a higher cost, through the “other” data location. When one of the connections to the witness location fails, traffic is rerouted through the other data location.
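
To make the failover behaviour concrete, here is a small Python model of what network folks often call floating static routes: every destination has a primary route over its direct transit VLAN and a backup route with a higher cost via the other data location, and the lowest-cost route whose next hop is still reachable wins. This is a toy illustration with symbolic names, not switch configuration.

from dataclasses import dataclass

@dataclass
class StaticRoute:
    destination: str   # e.g. the host subnet at a data location
    next_hop: str      # transit-VLAN address of the neighbouring switch
    cost: int          # lower is preferred; backups get a higher cost

# Routing table on the witness-location switch (symbolic names only).
witness_routes = [
    StaticRoute("location1-hosts", "transit-to-location1", cost=1),
    StaticRoute("location1-hosts", "transit-to-location2", cost=10),   # backup
    StaticRoute("location2-hosts", "transit-to-location2", cost=1),
    StaticRoute("location2-hosts", "transit-to-location1", cost=10),   # backup
]

def best_route(routes, destination, reachable_next_hops):
    # Pick the lowest-cost route whose next hop is still up.
    candidates = [r for r in routes
                  if r.destination == destination and r.next_hop in reachable_next_hops]
    return min(candidates, key=lambda r: r.cost, default=None)

# Normal situation: traffic to location 1 uses the direct transit VLAN.
print(best_route(witness_routes, "location1-hosts",
                 {"transit-to-location1", "transit-to-location2"}))
# Link between the witness and location 1 down: traffic is rerouted via location 2.
print(best_route(witness_routes, "location1-hosts", {"transit-to-location2"}))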

In the end, the environment looks like this:

vSAN Netwerk-Connectiviteit - Witness - 2

So, it might not win the beauty contest, but it gets the job done (in Dutch that’s a saying ;)). It is also not very scalable, but for a small environment, it works.

 

New virtual LAB on Ravello (part 1)

So, a couple of months ago (two, to be precise) I wrote about a new lab I created at Ravello. The lab was set up mainly to try out the functionality of Ravello, and especially the bare-metal vSphere support it offers.

Unfortunately, a couple of days later the lab was no longer accessible. I tried recovery and restoration, but both failed. So I had to rebuild the lab (and this time make sure I save it as a blueprint once everything is configured the way I want it).

But looking on the bright side, it gave me an opportunity to rebuild it slightly differently, to better fit my needs and wants.

So this time, I took the time to draw up a “design” of how the environment needed to look. From a network perspective, I had my heart set on something like this:

Ravello Netwerk

I wanted to use an ESG as a routing device for the physical network, so I could fairly easily connect NSX to the “physical” world with BGP or OSPF (to be able to test the protocols; the Ravello-based routing function does not offer these). I did need to use the Ravello router to access my management station, which has static routes pointing to the ESG, but a default route towards the Ravello router for all non-lab traffic.

The reason for connecting the vSAN network to the Ravello router as well is that I wanted an available gateway address that I could use as an isolation address for vSAN. Since the Ravello router is available before the other parts of the lab, this seems like the appropriate place to put an isolation address. More info on this can be found in Duncan's excellent blog: http://www.yellow-bricks.com/2017/11/08/vsphere-ha-heartbeat-datastores-isolation-address-vsan/

So I ended up setting up one ESG for North/South routing to and from the virtual network, and another ESG with just connectivity to the VLAN-based dPGs.

After the routers were created and BGP was set up, all routes were nicely advertised and received throughout the network.

The “physical” ESG:

route-table-physical

The virtual ESG:

route-table-virtual

and finally, the DLR:

route-table-dlr

So, the first phase of the reinstall of the lab environment is completed. I have deployed vCenter, vSphere, vSAN and NSX in a pretty basic setup. Next up are vROps, vRealize Log Insight and vRealize Network Insight, and maybe I will start playing around with vRealize Automation as well, since the combination of NSX and vRA is pretty powerful.

The last thing I did today was save it as a blueprint. It is sometimes nice to start over, but not every time ;).

 

 

Achievement: Unlocked…

In the past two-and-a-half years since I joined PQR, a lot has changed in my job. I have been more and more involved in pre-sales and evangelization, mainly of VMware SDDC products (with a strong focus on NSX). Today was sort of a special day in that respect, since I had the privilege of presenting a session at the largest VMUG in the world, the NLVMUG in Den Bosch.

Together with my colleague Viktor van den Berg (https://www.viktorious.nl), I did a session on the combination of VMware vRealize Automation and NSX.

VMUG-1

In this presentation we showed how to use the combination of these products, and what is necessary from a (virtual) networking and security perspective to automate the deployment of complete blueprints, including the network components.

I had fun doing it and I would like to thank everyone for their kind words afterwards. Who knows, maybe I will try this again next year ;).

For people interested in the topic, the presentation can be downloaded here:

NLVMUG-UserCon-NSX-vRA

And for anyone who wants to see this presentation in the flesh, I invite you to join us at the IT Galaxy event on April 17th: https://www.it-galaxy.nl/

 

 

 

Multi-User (RDSH) Identity Firewall in NSX

One of the main new features in VMware NSX 6.4 is the Multi-User (or RDSH) Identity Firewall. With this feature, it is possible to microsegment traffic based on user ID. I have been looking forward to this feature, since a couple of my customers have been waiting for it, and it is a very powerful addition to the already pretty packed microsegmentation functionality.

I already wrote about this on my company-blog (in Dutch: http://www.pqr.com/blogs/nieuwe-versie-nsx-met-grote-stappen-voorwaarts) and in this article I will dive a little deeper into the underlying technology.

In order to use IDFW (with or without RDSH), it is necessary to connect NSX to an AD domain. There are two methods of retrieving the user information:

  • Log-scraping
  • Guest Introspection

With log-scraping the logs of the Domain Controllers are “scraped” and logons are thus detected. With Guest Introspection, the information is retrieved by a VM that needs to run on each host.

Since I wanted to use the Guest Introspection method, I created a new domain with a new domain controller running on Windows Server 2016, which is a prerequisite for this method. Another prerequisite is the Guest Introspection feature within NSX itself, which means deploying the GI VMs. More information about this can be found here: Install Guest Introspection on Host Clusters

So, I created a couple of users and a couple of groups:

Users-groups

User Piet is a member of the HR group, user Janine is a member of the Financiën group and user Ben is a member of the Administratie group.

After creating the users and groups, I connected the domain to NSX:

domain connection

When the connection between the domain and NSX is successfully synchronized, it becomes possible to create a new Security Group within NSX with the AD-groups in it. I created 4 security groups:

  • Administratie
  • HR
  • Financiën
  • HR+Financiën

The “HR+Financiën” group contains both the HR group and the Financiën group, using nesting.
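
As a quick sanity check after the synchronization, the security groups (including the nested one) can also be listed over the NSX REST API. The sketch below uses the security-group listing endpoint as I recall it from the NSX 6.x API guide; treat the URL, host name and credentials as assumptions to verify against the API guide for your version.

import requests

NSX_MANAGER = "https://nsxmanager.lab.local"   # placeholder NSX Manager FQDN
AUTH = ("admin", "VMware1!")                   # placeholder lab credentials

resp = requests.get(
    NSX_MANAGER + "/api/2.0/services/securitygroup/scope/globalroot-0",
    auth=AUTH, verify=False)                   # lab only: self-signed certificate
resp.raise_for_status()
print(resp.text)   # XML listing of the security groups, including HR+Financiën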

It is important that the section the rules are placed in is enabled for “User Identity at source”:

multi-user DFW

Within the DFW I created a couple of rules, based on the security groups that were created. I used a couple of client VMs to test the connectivity against:

DFW-regels

I created rules for “Administratie” to connect to Client03 and Client04 on HTTP and RDP, and for “HR + Financiën” to connect to Client01 and Client02.

Then I created rules to block traffic from the other users to the clients.

So when logging in to my RDSH host, I should be able to RDP into Client03 only if I am Ben, and into Client01 and Client02 only if I am Piet or Janine. When accessing Client01 on HTTP, I should have a connection if I am Piet or Janine, but not if I am Ben.
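
To check this from within the RDSH session, a simple TCP connect test per client and port is enough: an allowed rule completes the handshake, while a blocked rule does not. Here is a small sketch; the client host names are assumptions for this lab.

import socket

CLIENTS = ["client01", "client02", "client03", "client04"]   # assumed lab names
PORTS = {"RDP": 3389, "HTTP": 80}

def can_connect(host, port, timeout=3):
    # Return True if a TCP connection to host:port succeeds within the timeout.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for client in CLIENTS:
    status = ", ".join(
        name + ": " + ("open" if can_connect(client, port) else "blocked")
        for name, port in PORTS.items())
    print(client, "->", status)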

When I click on the security group in the DFW, I get information about the user sessions that correspond to the security group:

HR+Financien-SG-loggedin

So, when Piet tries RDP-ing to Client01 or accessing its website (default IIS ;)), it looks like this:

Piet-client01

When I am Ben, RDP-ing into Client01 or accessing its website gets denied, but RDP-ing into Client03 is allowed:

Ben-client01

And when I look at Ben’s activity in vRealize Log Insight (don’t microsegment without it), I see the rules being applied as expected:

logging

Reading from the bottom, the first line indicates the allowed RDP traffic to the “.3” (which is Client03). The 3 lines above that show that RDP traffic to the “.1” (Client01) is not allowed. And the 4 lines above that show that HTTP traffic to Client01 is also not allowed.

When looking at Piet’s log:

logging2

We can see that both RDP and HTTP to Client01 are allowed.

So, all in all a very nice addition to NSX.

 

 

My Lab environment in Ravello’s Cloud on bare metal

For one of my customers, I have created a nested PoC environment to be run in their own virtual environment (completely nested). The environment consists of three hosts, with all components running as virtual machines within those hosts. Within the environment I used vSAN as the storage layer and of course NSX (the latest version, 6.4), in order to demonstrate the micro-segmentation functionality the customer wanted to test out.

I used to own a physical lab (consisting of a couple of cheap Proliant-servers (ML150G5) and my own Nexenta hybrid SAN), but the hardware was no longer “top notch” and so I decided to decommission it completely and rethink my options.

I created this PoC environment to be fully self-contained. I used my own desktop computer for this, with one (not too large) SSD, a pretty slow HDD, 32 GB of memory and one processor. As you can imagine, this didn’t really fly… But in the end I created three hosts with all the VMs on them, to run a complete SDDC.

Since becoming a vExpert (back in 2016), I have known about the possibility of using Ravello’s cloud infrastructure (https://cloud.oracle.com/en_US/ravello) for free, and I had been planning to take some time to use it. With the lab built, I decided to try the Ravello offering to expand the PoC environment and create my own cloud-based lab. The Ravello service gives you the opportunity to upload your VMs (which I had exported as OVFs) and use them in the cloud.

For the nested ESXi hosts I used the nested image that was created by William Lam (https://www.virtuallyghetto.com/nested-virtualization).

After uploading the OVA files to the Ravello cloud, I created an Application called Lab Omgeving (which translates to Lab Environment, for non-Dutch readers ;)).

Since I used my own private IP range when creating the hosts and management components (I didn’t want to add a DNS server to the nested environment), I created the needed network constructs within the environment. The network management interface is very easy to use:

Ravello Network

When I started the hosts and tried to power on the vCenter Server and the NSX Manager, performance was not what I had hoped for. I did tweak the hosts a little to give them some extra memory, but I was not allowed to go beyond 16 GB. Because I knew of the availability of the bare metal functionality within Ravello’s cloud, I started to look for a way to utilize it. I found a blog that describes how to use it: https://robertverdam.nl/2017/09/21/ravello-oracle-cloud-infrastructure/. There it says that you can change the advanced settings within a VM and set PreferPhysicalHost=true to be able to use the bare metal functionality. After setting this on my three hosts, it became possible to upgrade the hardware to 8 processors and 32 GB of memory, enough to create the lab I wanted:

ravello bare metal

ravello host config

Of course, the “Allow nested virtualization” checkmark needs to be set as well.

To manage the environment, I chose to use a management server based on Windows Server 2012 R2, including DNS.

After this, the performance of the virtual environment was much better. The vCenter Server starts up automatically, and by adding an external service to the management server, I was able to use RDP to access my lab environment.

One downside of using the bare metal functionality is that starting the virtual machines takes a lot longer than without it. Sometimes it takes almost 15 minutes before the hosts start to boot, so it is not something that can be used for a quick look at something (especially since starting vCenter after this also takes a couple of minutes). But then again, if you use a physical server, it also takes a long time before the OS starts to load ;).

ravello starting

All in all, I am very pleased with my first use of the Ravello bare metal functionality.

 

 

For years and years I have been an avid consumer of blogs from the community, and I have found them very useful. I have written several blog posts, but all of them related to company activity. Now is the time to start a more personal blog, where I can write up stuff that I find worth writing about, but that is not suitable for my company blog.

I will be writing about the Software Defined DataCenter (SDDC), since that is the technology I am mostly working with.