Shutdown and Startup of a VCF Management Domain
After yet another redeploy of VCF, I decided that before I create a new workload domain (WLD) based on NSX-T (which has already failed twice), I would take snapshots of the Management Domain (MD) hosts, so that I can return to a valid state (without a WLD in an error state and/or other messy stuff).
I want to take the snapshots in a powered-down state, so that they are fully consistent rather than just crash-consistent. That means shutting down the MD, which is currently the only domain.
As they say, never miss an opportunity to describe your experiences in a blog, so I decided to write it all down. I know it is well described in the documentation, but for my own reference it is good to have the steps handy if I want to do this again, without having to sift through a lot of irrelevant stuff (we are not using Horizon, for instance).
Shutdown
So basically, what we need to do is the following:
- Shut down the virtual machines (in order)
- Put the hosts in maintenance mode
- Shut down the hosts
And that is basically all.
The order in which to shut down the virtual machines is:
- vRealize Suite Lifecycle Manager (if present and part of VCF)
- vRealize Log Insight
  - Worker Nodes
  - Master Nodes
- vRealize Operations Manager (if present and part of VCF)
  - All Remote Collectors
  - All Data Nodes
  - Replica Node
  - Master Node
- vRealize Automation (if present and part of VCF)
  - IaaS DEM
  - IaaS Proxy
  - Secondary IaaS Manager
  - Primary IaaS Manager
  - Secondary IaaS Web Server
  - Primary IaaS Web Server
  - vRA virtual appliances
  - IaaS SQL Server
- NSX Appliances
  - ESGs
  - NSX Managers for WLD (if present)
  - NSX Manager for MD
  - NSX Controller Cluster for MD
- vCenter Servers for WLD (if present)
- vCenter Server for MD
- SDDC Manager
- PSCs
The last few machines (starting with the NSX Appliances) officially need to be shut down from the shell of the virtual machine itself (although I don’t see the logic in this for the NSX stuff). The command for shutting down the SDDC Manager and the PSCs is:
# shutdown now
(at least, if you don’t want to wait a minute). Whether a VM is really down can be checked on the ESXi host where it is (or was) running.
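As a small sketch of that check: on the ESXi host, esxcli vm process list only shows VMs that are currently running, so an empty result means everything on that host is powered off. The function name running_vms is my own invention, and the parsing assumes the usual "Display Name:" field in the command output.

```shell
# Sketch: print the display names of the VMs still running on this ESXi host.
# Assumes `esxcli vm process list` prints one "Display Name: <name>" line
# per running VM; the function name is hypothetical.
running_vms() {
    esxcli vm process list | sed -n 's/^ *Display Name: *//p'
}

# Usage on the host (prints nothing once all VMs are down):
# running_vms
```

If the function still prints a name, that VM has not finished shutting down yet.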
After this, the MD hosts can be put into maintenance mode. This has to be done from the CLI of each host: vCenter is already down at this point, and the noAction vSAN mode tells the host not to evacuate any vSAN data, which keeps vSAN happy while all hosts in the cluster go into maintenance mode:
# esxcli system maintenanceMode set -e true -m noAction
When all hosts are in maintenance mode (this can be checked by entering the same command again, which then responds that the host is already in maintenance mode), they can be shut down from the CLI as well:
# poweroff
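The maintenance-mode check that comes before powering off could also be scripted from a workstation, instead of logging in to every host. This is a minimal sketch under a few assumptions: SSH is enabled on the hosts, root login works, and the host names below are placeholders I made up. It relies on esxcli system maintenanceMode get, which reports Enabled or Disabled.

```shell
# Hypothetical host names; replace them with your own MD hosts.
MD_HOSTS="esxi-1.example.local esxi-2.example.local esxi-3.example.local esxi-4.example.local"

# Returns 0 only if every host reports "Enabled" for maintenance mode.
# Assumes SSH is enabled on the hosts and root login is possible.
all_in_maintenance() {
    for host in $MD_HOSTS; do
        state=$(ssh "root@${host}" esxcli system maintenanceMode get)
        if [ "$state" != "Enabled" ]; then
            # Report the first host that is not (yet) in maintenance mode.
            echo "${host}: ${state:-no answer}" >&2
            return 1
        fi
    done
    return 0
}

# Usage: only power off when the check passes.
# all_in_maintenance && for host in $MD_HOSTS; do ssh "root@${host}" poweroff; done
```

The function stops at the first host that is not in maintenance mode, so nothing gets powered off while vSAN might still object.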
Startup
The startup process is more or less the same as the shutdown process, in reverse order. So, basically, do the following:
- Start the hosts in the MD
- Take the hosts out of Maintenance Mode
- Start the VMs (in order)
The command for taking the host out of Maintenance Mode is (unsurprisingly):
# esxcli system maintenanceMode set -e false
The order described for starting the virtual machines is almost the reverse of the shutdown order. There is one difference (and I do not know how relevant it is): the SDDC Manager needs to be started after the vCenter Server for the MD. Apart from that, the reverse order can be used. Of course, it is important to wait until a component is actually up and functional before moving on to the next one.
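To make that one exception concrete, here is a small sketch that derives the startup order from the shutdown order. It only models the top-level components from the shutdown list (the nodes within each product are left out), and the function names are my own. It is an illustration of the ordering, not an official procedure.

```shell
# Top-level shutdown order as listed in this post (sub-items omitted).
shutdown_order() {
    cat <<'EOF'
vRealize Suite Lifecycle Manager
vRealize Log Insight
vRealize Operations Manager
vRealize Automation
NSX Appliances
vCenter Servers for WLD
vCenter Server for MD
SDDC Manager
PSCs
EOF
}

# Startup order: reverse the shutdown list, then swap one pair so that
# the SDDC Manager comes after the vCenter Server for the MD.
startup_order() {
    shutdown_order | awk '
        { line[NR] = $0 }
        END {
            for (i = NR; i >= 1; i--) out[NR - i + 1] = line[i]
            for (i = 1; i < NR; i++)
                if (out[i] == "SDDC Manager" && out[i+1] == "vCenter Server for MD") {
                    t = out[i]; out[i] = out[i+1]; out[i+1] = t
                }
            for (i = 1; i <= NR; i++) print out[i]
        }'
}

# Usage: print the startup order, one component per line.
# startup_order
```

Keeping the shutdown list as the single source and deriving the startup list from it means there is only one place to edit if components are added or removed.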
