VMware, Synology, iSCSI, and Multipath I/O (MPIO)

Prerequisites

Here is a good article from VMware docs that explains this. https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.storage.doc/GUID-DD2FFAA7-796E-414C-84CE-1FCC14474D5B.html

The Gist

The host has more than one physical adapter, each connected to a separate subnet. Each subnet provides a path to the iSCSI target.

How I configured the ESXi host with the vSphere Client

I created two VMkernel adapters, one for iSCSI-A and one for iSCSI-B. You can see from the screenshot below that I have one VMkernel adapter configured on the 172.16.11.0/24 subnet and the other adapter on the 172.16.12.0/24 subnet. These are layer 2 networks as I only need to connect to the Synology NAS through a switch. I do not want the traffic passing through a router; it would add unnecessary latency.

Once the VMkernel adapters were configured, I configured the Teaming and failover policy for each one. This is where I “pin” a VMkernel adapter to a physical adapter (a vmnic). My setup uses vmnic2 and vmnic3, which are ports three and four on the back of the Dell PowerEdge. VMware numbering starts at vmnic0, which is why the numbers do not align. As a sanity check, verify the MAC addresses if you are unsure.

Dell PowerEdge Server    VMware Physical Adapter
Port 1                   vmnic0
Port 2                   vmnic1
Port 3                   vmnic2
Port 4                   vmnic3

Regardless, I configure (think “pin”, here) the vmk1 VMkernel adapter to vmnic2. I make vmnic2 the Active adapter and set vmnic3 to be an Unused adapter.

I configure the vmk2 VMkernel adapter to vmnic3. I make vmnic3 the Active adapter and vmnic2 the Unused adapter.

This configuration will allow you to bind the network ports.
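
For anyone who prefers the ESXi shell, the same pinning can be sketched with esxcli. This is not how I did it above; it is a minimal sketch that assumes a standard vSwitch and two port groups named iSCSI-A and iSCSI-B (hypothetical names, use your own). The commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: assumes a standard vSwitch with port groups named
# iSCSI-A and iSCSI-B (hypothetical names; substitute your own).
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = run them on the host

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Make one uplink the only active uplink for a port group. Uplinks that are
# in neither the active nor the standby list should show up as Unused.
pin_uplink() {
  run esxcli network vswitch standard portgroup policy failover set \
    --portgroup-name="$1" --active-uplinks="$2"
}

pin_uplink iSCSI-A vmnic2   # port group that carries vmk1
pin_uplink iSCSI-B vmnic3   # port group that carries vmk2

# Read the policy back to confirm the result.
run esxcli network vswitch standard portgroup policy failover get --portgroup-name=iSCSI-A
run esxcli network vswitch standard portgroup policy failover get --portgroup-name=iSCSI-B
```

Check the output of the two get commands (or the Teaming and failover screen) before trusting the result; a distributed switch would need a different approach.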

How I Configured the Synology NAS

Mind you, this configuration does not allow management of the device without adding a virtual network adapter, attached to the iSCSI-A subnet, to my management Windows 10 virtual machine. This is where Synology fails in my opinion. I love the device, but there should be at least three adapters: one for management only and two more to carry the actual storage traffic. In the future, I will likely just go back to running a TrueNAS Core box. (https://www.truenas.com/download-truenas-core/)

I configured the LAN 1 adapter on the Synology with an IP in the subnet I specified for the iSCSI-A VMkernel adapter on the ESXi host, in this case 172.16.11.0/24. I also configured the LAN 2 adapter with an IP in the subnet for the iSCSI-B VMkernel adapter on the ESXi host, 172.16.12.0/24. Be sure to set VLAN ID tags and MTU correctly as well!
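
A quick way to check that the MTU really matches end to end is a vmkping from the ESXi shell with the "don't fragment" bit set. The sketch below only works out the payload size and prints the commands. The MTU of 9000 and the two Synology addresses are assumptions for illustration, not values from my setup.

```shell
#!/bin/sh
# Sketch only: the MTU value and the Synology addresses are assumptions.
MTU="${MTU:-9000}"   # assumed jumbo frames; use 1500 for a default MTU

# Largest ICMP payload that fits in one frame:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
icmp_payload() {
  echo $(( $1 - 28 ))
}

SIZE=$(icmp_payload "$MTU")

# -I picks the VMkernel adapter, -d sets "don't fragment", -s sets the payload size.
echo "vmkping -I vmk1 -d -s $SIZE 172.16.11.10"   # hypothetical LAN 1 address
echo "vmkping -I vmk2 -d -s $SIZE 172.16.12.10"   # hypothetical LAN 2 address
```

If the ping fails at this size but works with a smaller one, something along the path (VMkernel adapter, vSwitch, physical switch, or the Synology) is set to a smaller MTU.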

Open the iSCSI Manager and edit the target you are going to use for your ESXi hosts. Be sure to select the All network interfaces radio button on the Network Binding tab. You will also see a list of your configured Network Interfaces from Synology.

If you have more than one ESXi host you are connecting, you will also want to check the box to Allow multiple sessions on the Advanced tab.

Configure the Storage Adapter on the ESXi Host

Go back to your ESXi host and configure the storage adapter. I only use a software adapter as this is more than sufficient for my use.

Verify the Network Port Binding that was configured earlier.
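
The port binding can also be checked, or created, from the ESXi shell. This is a minimal sketch: the adapter name vmhba64 is an assumption (software iSCSI adapter names vary by host), and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: vmhba64 is a hypothetical adapter name.
# Find the real one with: esxcli iscsi adapter list
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = run them on the host
HBA="${HBA:-vmhba64}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Bind a VMkernel adapter to the software iSCSI adapter.
bind_vmk() {
  run esxcli iscsi networkportal add --adapter="$HBA" --nic="$1"
}

bind_vmk vmk1
bind_vmk vmk2

# List the bound VMkernel adapters to verify.
run esxcli iscsi networkportal list --adapter="$HBA"
```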

On the Dynamic Discovery tab, add the iSCSI target you configured on the Synology. You only need to add one of the configured addresses.

Once the address is in the table, click Rescan Adapter. This will reach out and connect to the Synology.
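
The discovery and rescan steps have esxcli equivalents as well. In this sketch the adapter name vmhba64 and the target address 172.16.11.10 are assumptions; 3260 is the standard iSCSI port. The commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the adapter name and target address are hypothetical.
DRY_RUN="${DRY_RUN:-1}"               # 1 = print the commands, 0 = run them
HBA="${HBA:-vmhba64}"
TARGET="${TARGET:-172.16.11.10:3260}" # one Synology address is enough

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Add a dynamic discovery (SendTargets) address to the adapter.
add_target() {
  run esxcli iscsi adapter discovery sendtarget add --adapter="$HBA" --address="$1"
}

add_target "$TARGET"

# Rescan the adapter, then list the sessions it established.
run esxcli storage core adapter rescan --adapter="$HBA"
run esxcli iscsi session list --adapter="$HBA"
```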

Back on the Synology, you can verify that your ESXi hosts are connected by checking the Service Status.

Back in the vSphere Client, there should now be a Storage Device under the Devices tab.

If the target was already formatted with the VMFS file system, you should be ready to consume it now. If this is a new target, it will have to be formatted only once from any of the shared hosts. Once it is formatted, it should appear on all the other hosts. Sometimes, Rescan Storage will be necessary to query for the new device or update the name if it is not present.

Edit the Multipathing Policy

The very last thing that should be done is to adjust the multipathing policy. No point in going through all this trouble to just send and receive storage data over one path! This will have to be done from each host for each iSCSI target.

Go back to the host and click Storage Devices. Select the device you want to configure, and then choose Actions beside the Multipathing Policies section.

Change the Path Selection Policy to Round Robin (VMware) to take advantage of the multiple paths you have configured! You will see that the status of each configured path now shows Active (I/O). This indicates that the paths are now participating in the storage traffic.
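
Since this has to be repeated on each host for each target, the esxcli version can save some clicking. This is a minimal sketch: naa.EXAMPLE is a placeholder, not a real device ID, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: naa.EXAMPLE is a placeholder device identifier.
DRY_RUN="${DRY_RUN:-1}"          # 1 = print the commands, 0 = run them
DEVICE="${DEVICE:-naa.EXAMPLE}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Switch one device to the Round Robin path selection policy.
set_round_robin() {
  run esxcli storage nmp device set --device="$1" --psp=VMW_PSP_RR
}

# List the devices first to find the identifier of the Synology LUN.
run esxcli storage nmp device list

set_round_robin "$DEVICE"

# Show the paths for the device to confirm they are all in use.
run esxcli storage nmp path list --device="$DEVICE"
```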

Conclusion

I highly recommend setting up iSCSI with a Round Robin multipathing policy, especially for home labs where there are likely only 1 Gbps links. Two 1 Gbps paths provide up to 2 Gbps of aggregate bandwidth. It is a pain to set up the first time, but once you have it, you will likely see the benefits. This is also a good practice in production environments, especially with 10 Gbps, 25 Gbps, or 100 Gbps iSCSI adapters on the market.
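
As a back-of-the-envelope check on that number, here is the arithmetic for two 1 Gbps paths. It is a theoretical ceiling that ignores protocol overhead, so real throughput will be lower.

```shell
#!/bin/sh
# Theoretical ceiling only; protocol overhead is ignored.
PATHS=2          # one path per subnet
LINK_MBPS=1000   # 1 Gbps per link

AGGREGATE_MBPS=$((PATHS * LINK_MBPS))     # megabits per second
AGGREGATE_MBYTES=$((AGGREGATE_MBPS / 8))  # megabytes per second

echo "Aggregate: ${AGGREGATE_MBPS} Mbps (about ${AGGREGATE_MBYTES} MB/s)"
```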

Building a Nested ESXi Lab for VMware Cloud Foundation (VCF) (updated 2023-Dec)

Introduction

The following post is very long and will contain updates as the technology changes and I figure out better ways to accomplish these tasks. VMware Cloud Foundation, or VCF, requires at least four nodes for the Management Domain. Unfortunately, I do not have hundreds of thousands of dollars for physical hardware to test and learn VCF.

Since this is a nested lab, there are a few things that will need to be set up to make this work. These items, in particular networking, have to be configured in a way to allow the nested virtual machines to communicate. This is not applicable in production where the physical hosts are cabled into Top of Rack switches.


Reset VxRail Root and Mystic Accounts

I have been working with clients that are using VxRail for their infrastructure. While administering these VxRail deployments, sometimes the mystic or root accounts get locked, or the passwords expire or are just plain lost. Either way, it is a very frustrating situation to find yourself in.

localhost login: root
Password:
Login incorrect

localhost login: _

A Google search for “reset VxRail Manager password” turns up the following Dell KB article, Dell VxRail: VxRail Manager root password is lost. (https://www.dell.com/support/kbdoc/en-us/000064579/vxrail-how-to-reset-the-root-password-for-vxrail-manager)

When you start following this article, you realize almost immediately that it hasn’t been updated to reflect newer versions. The very first picture depicts a SUSE Linux Enterprise 12 screenshot.

More recent versions of VxRail are running on SUSE Linux Enterprise 15. The following procedure will hopefully assist you until Dell can update their documentation.

Procedure

Start by taking a snapshot of your VxRail Manager!

Open a web or remote console and then restart the virtual machine. When you see the following splash screen, press the ‘e’ key on your keyboard to interrupt the boot sequence.

This will bring you to the GNU GRUB boot menu. Look for the line starting with linux (14 lines down in my case). Press Ctrl-e to go to the end of the line and add init=/bin/bash.

Press Ctrl-x or F10 to boot.

If you are following the numerous sources out there, they will point you to use the pam_tally2 utility. As you can see below, this won’t work…yet.

Create the log directory and change to it with the following:

mkdir -p /var/log
cd /var/log

Add the tallylog file.

touch tallylog
chmod 600 tallylog

Now, you should be able to use /sbin/pam_tally2. If you are not familiar with the syntax, the below images should help. The full help is at the end of this post for more information. You can see that the two users have 0 failures currently. If you do know the password and just want to unlock the account so you can log in again, use the following syntax.

/sbin/pam_tally2 -u <user name> -r
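
To handle both accounts in one go, the check and the reset can be wrapped in a small loop. This is a minimal sketch using only the -u and -r options shown above; it prints the commands unless DRY_RUN=0 is set, so you can read them before running anything.

```shell
#!/bin/sh
# Sketch only: prints the commands by default so they can be reviewed first.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = run them

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Show the current failure count for a user, then reset it.
unlock_user() {
  run /sbin/pam_tally2 -u "$1"
  run /sbin/pam_tally2 -u "$1" -r
}

for user in root mystic; do
  unlock_user "$user"
done
```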

Reset the passwords

As long as the directory was created above, the passwd utility should now work. Ignore the message that the password was used already. I tried completely new passwords and still received this message.

Reboot the virtual machine

Unfortunately, I have not found a suitable way to reboot the virtual machine, yet. VMware Tools (more accurately, open-vm-tools) is not started since we are not booted in a full multi-user state.

Make sure you have completed your password or unlocking maintenance. When you are ready, go to the power control for the virtual machine and select the Power off option. Wait a moment before powering the virtual machine back on. At this point, the passwords you set should work and the accounts should be unlocked.

After you have verified that your accounts work, be sure to remove the snapshot you took in the beginning!

Hopefully this will help you out.

pam_tally2 Help

/sbin/pam_tally2: [-f rooted-filename] [--file rooted-filename]
   [-u username] [--user username]
   [-r] [--reset[=n]] [--quiet]

Changing the Primary Network Identifier (PNID) on vCenter

Preface

This has been happening to me a lot lately. vCenter gets deployed and configured with a hostname, and you likely do not give it much thought until later, when it’s time to upload Transport Layer Security (TLS) certificates and you receive the following warning: “Error occurred while fetching tls: Invalid input certifcate : The Subject of the provided certificate does not contain the correct CN value”

I find that this is caused by a case-sensitive comparison between the VMware vCenter Server Appliance (VCSA) host name and the Common Name (CN) in the certificate. When vCenter is deployed with a lowercase hostname (my personal preference, now), e.g. vcsa.aaronrombaut.com, the Certificate Signing Request (CSR) should also use the lowercase fully qualified domain name (FQDN). The idea is that the case needs to match: a lowercase host name needs a lowercase CN, and an uppercase host name needs an uppercase CN.
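
To make the mismatch concrete, here is a small shell sketch that compares two names the strict way (case-sensitive) and the forgiving way (case-insensitive). The function names are made up for illustration, and this is not how vCenter itself performs the check.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, for illustration.

# Lowercase a string.
lower() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Compare a host name with a certificate CN.
compare_names() {
  if [ "$1" = "$2" ]; then
    echo "match"
  elif [ "$(lower "$1")" = "$(lower "$2")" ]; then
    echo "case-mismatch"   # same name, different case: the situation described above
  else
    echo "different"
  fi
}

compare_names vcsa.aaronrombaut.com VCSA.aaronrombaut.com
```

On a real appliance, the first argument would be the PNID and the second the CN from the certificate, which `openssl x509 -noout -subject -in <certificate file>` will print.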

The Fix

It’s OK if this situation occurs; you really have two choices. If this is a brand new deployment, it may be easier to redeploy the VCSA and pay attention to the step where the host name is applied. If that isn’t an option and you just want to fix it, then follow along…

Connect to the VCSA. Ensure SSH is enabled on the VCSA. You can enable this from the Virtual Appliance Management Interface (VAMI) (fqdn:5480) or by logging in to the appliance’s console.

Check the current value of the Primary Network Identifier (PNID) with the following command.

/usr/lib/vmware-vmafd/bin/vmafd-cli get-pnid --server-name localhost

Set the pnid value to the same name, but change the case with the following command, obviously changing the case as appropriate.

/usr/lib/vmware-vmafd/bin/vmafd-cli set-pnid --server-name localhost --pnid <pnid>

Reboot the VCSA by typing the following.

reboot

Get up and take a short break…

To verify the change, log into vSphere Client > Menu > Inventory. The VCSA name should now match the case you set on the command line. At this point, it should be safe to apply the TLS certificates without receiving a warning.

I have found that on newer vCenters, one more step needs to take place. This may be a newly required step, and it may even be safe to apply on every version, but I don’t have the time to install x versions of VCSA and test.

Log on to the vSphere Client and navigate to Home > Inventory > <choose a vCenter> > Configure > Settings > General > Edit > Runtime settings.

Change the vCenter Server Name and then click Save. Reboot the appliance.

You can now definitely install the TLS certificates!