Before delving into the installation and configuration of vSAN, it’s necessary to discuss the requirements and the prerequisites. VMware vSphere is the foundation of every vSAN-based virtual infrastructure.
vSAN was first released with VMware vSphere 5.5 U1. Additional versions of vSAN were released with VMware vSphere 6.0 (vSAN 6.0), VMware vSphere 6.0 U1 (vSAN 6.1), and VMware vSphere 6.0 U2 (vSAN 6.2). Each of these releases included additional vSAN features, which will be discussed at various stages of this book and were listed in Chapter 1, “Introduction to vSAN.”
VMware vSphere consists of two major components: the vCenter Server management tool and the ESXi hypervisor. To install and configure vSAN, both vCenter Server and ESXi are required.
VMware vCenter Server provides a centralized management platform for VMware vSphere environments. It is the solution used to provision new virtual machines (VMs), configure hosts, and perform many other operational tasks associated with managing a virtualized infrastructure.
To run a fully supported vSAN environment, vCenter Server 5.5 U1 is the minimum requirement, although VMware strongly recommends using the latest version of vSphere where possible. vSAN can be managed by both the Windows version of vCenter Server and the vCenter Server Appliance (VCSA). vSAN is configured and monitored via the vSphere web client, which also requires a minimum version of 5.5 U1 for support. vSAN can also be fully configured and managed through the command-line interface (CLI) and the vSphere application programming interface (API) for those wanting to automate some (or all) aspects of vSAN configuration, monitoring, or management. Although a single cluster can contain only one vSAN datastore, a vCenter Server can manage multiple vSAN and compute clusters.
VMware ESXi is an enterprise-grade virtualization product that allows you to run multiple instances of an operating system in a fully isolated fashion on a single server. It is a bare-metal solution, meaning that it is installed directly on the physical server without requiring an underlying operating system, and it has an extremely thin footprint. ESXi is the foundation for the large majority of virtualized environments worldwide.
For standard datacenter deployments, vSAN requires a minimum of three ESXi hosts (where each host has local storage and is contributing this storage to the vSAN datastore) to form a supported vSAN cluster. This is to allow the cluster to meet the minimum availability requirements of tolerating at least one host failure.
With vSAN 6.1 (released with vSphere 6.0 U1), VMware introduced the concept of a 2-node vSAN cluster primarily for remote office/branch office deployments. There are some additional considerations around the use of a 2-node vSAN cluster, including the concept of a witness host, which will be discussed in more detail in Chapter 8, “Stretched Clusters.”
As of vSAN 6.0, a maximum of 64 ESXi hosts per cluster is supported, a significant increase from the 32 hosts supported in the initial vSAN release that was part of vSphere 5.5 (from here on referred to as vSAN 5.5). However, the ESXi hosts must be running version 6.0 at a minimum to support 64 hosts.
At a minimum, it is recommended that a host have at least 6 GB of memory. If you configure a host to contain the maximum number of disk groups, we recommend that the host be configured with a minimum of 32 GB of memory. vSAN does not consume all of this memory, but it is required for the maximum configuration. The vSAN host memory requirement is directly related to the number of physical disks in the host and the number of disk groups configured on the host. You will learn more about this in Chapter 9, “Designing a vSAN Cluster.” In all cases, we recommend going with more than 32 GB per host to ensure that your workloads, vSAN, and the hypervisor have sufficient resources for an optimal user experience.
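To make this sizing guidance concrete, the following sketch encodes the two published data points (6 GB absolute minimum, 32 GB for the maximum disk group configuration). The function name and the linear interpolation between those points are our own illustration, not a VMware formula; consult the vSAN design documentation for the exact per-disk-group memory overheads.

```python
def recommended_min_memory_gb(disk_groups: int, max_disk_groups: int = 5) -> int:
    """Illustrative minimum host memory recommendation for a vSAN host.

    Encodes the guidance above: 6 GB absolute minimum, 32 GB when the
    host is configured with the maximum number of disk groups. The
    interpolation for intermediate configurations is an assumption for
    illustration only.
    """
    if disk_groups <= 0:
        return 6   # bare minimum to install ESXi
    if disk_groups >= max_disk_groups:
        return 32  # recommended minimum at the maximum disk group count
    # Linear interpolation between the two published recommendations
    return 6 + round((32 - 6) * disk_groups / max_disk_groups)

print(recommended_min_memory_gb(0))  # 6
print(recommended_min_memory_gb(5))  # 32
```

Remember that these are minimums; as noted above, more than 32 GB per host is recommended in all cases.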
With the release of vSAN 6.0, VMware introduced the new all-flash version of vSAN. vSAN was only available as a hybrid configuration with version 5.5. A hybrid configuration is where the cache tier is made up of flash-based devices and the capacity tier is made up of magnetic disks. In the all-flash version, both the cache tier and the capacity tier are made up of flash devices. The flash devices of the cache and capacity tiers are typically different grades of flash device in terms of performance and endurance. This allows you, under certain circumstances, to create all-flash configurations at a price point comparable to SAS-based magnetic disk configurations.
When it comes to installing ESXi for a vSAN-based infrastructure, there are various options to consider regarding where to place the ESXi image. ESXi can be installed on a local magnetic disk, USB flash drive, SD card, or SATADOM device. Note that SATADOM support only appeared in vSAN 6.0; these devices were not supported in the initial vSAN release. At the time of writing (vSAN 6.2), stateless booting of ESXi (Auto Deploy) is not supported. By deploying ESXi to a USB flash drive or SD card, you have the added advantage of not consuming a magnetic disk for the image. This disk can then be consumed by vSAN to create the distributed, shared vSAN datastore used for deploying VMs. However, there are some drawbacks to this approach, such as a lack of space for storing log files and vSAN trace files.
For hosts with 512 GB or less of memory, booting the ESXi image from USB/SD devices is supported. For hosts with a memory configuration larger than 512 GB, ESXi needs to be installed on a local disk or a SATADOM device. This is discussed in more detail in Chapter 10, “Troubleshooting, Monitoring, and Performance.” When installing ESXi on USB or SD, note that you should use a device that has a minimum capacity of 8 GB.
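The boot device rules above can be summarized in a small sketch; the function name and the returned descriptions are illustrative, not a VMware tool.

```python
def supported_boot_devices(host_memory_gb: int) -> list:
    """Boot device options for an ESXi host in a vSAN cluster.

    Encodes the rules above: USB/SD boot is supported only for hosts
    with 512 GB of memory or less; larger hosts must boot from a local
    disk or SATADOM device. USB/SD devices should be at least 8 GB.
    """
    devices = ["local disk", "SATADOM"]
    if host_memory_gb <= 512:
        devices += ["USB flash drive (>= 8 GB)", "SD card (>= 8 GB)"]
    return devices

print(supported_boot_devices(768))  # ['local disk', 'SATADOM']
```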
If the host does not have USB/SD and a local disk is used to install ESXi, this disk cannot be part of a disk group and therefore cannot be used to contribute storage to the vSAN datastore. Therefore, in an environment where the number of disk slots is a constraint, it is recommended to use USB/SD or SATADOM.
Before enabling vSAN, it is highly recommended that the vSphere administrator validate that the environment meets all the prerequisites and requirements. To enhance resilience, this list also includes recommendations from an infrastructure perspective:
Minimum of three ESXi hosts for standard datacenter deployments. Minimum of two ESXi hosts and a witness host for the smallest deployment, for example, remote office/branch office.
Minimum of 6 GB memory per host to install ESXi.
VMware vCenter Server.
At least one device for the capacity tier. One hard disk for hosts contributing storage to vSAN datastore in a hybrid configuration; one flash device for hosts contributing storage to vSAN datastore in an all-flash configuration.
At least one flash device for the cache tier for hosts contributing storage to vSAN datastore, whether hybrid or all-flash.
One boot device to install ESXi.
At least one disk controller. Pass-through/JBOD mode capable disk controller preferred.
Dedicated network port for vSAN–VMkernel interface. 10 GbE preferred, but 1 GbE supported for smaller hybrid configurations. With 10 GbE, the adapter does not need to be dedicated to vSAN traffic, but can be shared with other traffic types, such as management traffic, vMotion traffic, etc.
Multicast enabled on the vSAN network (for L3 deployments, multicast traffic must also be routed between networks).
Before installing and configuring ESXi, validate that your configuration is on the official VMware compatibility guide for vSAN, which you can find on the VMware website.
With the release of vSAN 6.0, a new health check tool was made available which, among other features, examines the state of your vSAN cluster. One of the health checks validates that your vSAN cluster's disk controller hardware is supported per the VMware compatibility guide. Administrators can download a file that contains the latest supported hardware, driver, and firmware versions, and the health check verifies that the configuration is correct.
vSAN has strict requirements when it comes to disks, flash devices, and disk controllers. With all the various options, configuring the perfect vSAN host can be a complex exercise. Before reading about all the components, you will want to learn about an alternative: vSAN ready nodes.
vSAN ready nodes are a great alternative to manually selecting components, and they are the preferred way of building a vSAN configuration. Various vendors have gone through the exercise for you and created configurations called vSAN ready nodes. These nodes consist of tested and certified hardware only and, in our opinion, provide an additional guarantee. vSAN ready nodes are also listed in the compatibility guide, as shown in Figure 2.1.
Figure 2.1 - vSAN Ready Nodes configurations
For the more adventurous types, or those who prefer a particular server model or vendor that is not currently listed in the vSAN ready nodes compatibility guide, some specifics for the various components, such as storage controllers and disk drives, must be called out. The sections that follow highlight these considerations in more detail.
Each ESXi host participating in the vSAN cluster requires a disk controller. It is recommended that this disk controller is capable of running in what is commonly referred to as pass-through mode, HBA mode, or JBOD mode. In other words, the disk controller should provide the capability to pass up the underlying magnetic disks and solid-state disks (SSDs) as individual disk drives without a layer of RAID (Redundant Array of Inexpensive Disks) sitting on top. The result of this is that ESXi can perform operations directly on the disk without those operations being intercepted and interpreted by the controller. vSAN will take care of any RAID configuration when policy attributes such as availability and performance for virtual machines are defined. The vSAN compatibility guide will call out the disk controllers that have successfully passed testing.
Every server vendor has many different disk controllers that can be selected when configuring a new server. The compatibility guide lists a number of controllers for the most commonly used server brands, as well as a couple of generic LSI disk controllers that are often used with brands like SuperMicro and Quanta. There are over 70 different disk controllers on the vSAN compatibility guide, providing a vast number of options in terms of hardware configurations.
In some scenarios, hardware may have already been acquired or the disk controllers that are available do not support pass-through mode. In other words, the devices behind these controllers are not directly visible to the ESXi host. In those scenarios, administrators must place each individual drive in a RAID-0 configuration and then the devices become visible to the ESXi host. However, you must ensure that this is a valid configuration for the controller. Once again, the compatibility guide will list whether a controller is supported in pass-through mode, RAID-0 mode, or indeed both. Make sure to validate the compatibility guide before configuring your disk controller in a specific way. Also note that the compatibility guide lists both the supported firmware and the driver for each individual disk controller. Validate these versions and upgrade if needed before deploying any virtual machines.
For disk controllers that do not support pass-through/HBA/JBOD mode, vSAN supports disk drives presented via a RAID-0 configuration. Volumes can be used by vSAN if they are created using a RAID-0 configuration that contains only a single drive. This needs to be done for both the magnetic disks and the SSDs, and can be done using the disk controller software/firmware. Administrators need to understand, however, that when SSDs are exposed to vSAN via a RAID-0 configuration, in many cases the drive is not recognized as a flash device because its characteristics are masked by the RAID-0 configuration. If this occurs, you will need to mark the drive as a flash device. This can be done via the vSphere web client: simply select the device in question and click the appropriate disk services icon to toggle a device between solid-state disk (SSD) and hard disk drive (HDD), as shown in Figure 2.2.
Another common device presentation issue is how to mark a device as local. In some environments, devices can be recognized as shared volumes even though they are local to the ESXi host. This is because some SAS controllers allow devices to be accessed from more than one host. In this case, the devices, although local, are shown as shared (not local).
If you wish to mark a device as flash device, or mark a device as local, as of vSAN 6.0 this is now also possible via the vSphere web client as shown in Figure 2.2. Depending on the type of device and how it is currently marked, the menu and the icons will change accordingly. Note that it is only possible to use this functionality when the appropriate license key has been entered.
Figure 2.2 - Marking Storage Devices
When using RAID-0 instead of pass-through, you must take certain operational impacts into consideration. When pass-through is used, drives are (in most scenarios) instantly recognized, and there is no need to configure the drives as local/SSD. When a RAID-0 set is used, on the other hand, the drive is bound to that RAID-0 configuration; the RAID-0 set has a 1:1 relationship with a given drive. If this drive fails and needs to be replaced with a new drive, this relationship is broken, and a new RAID-0 set with the new drive must be manually created. The effort involved will differ per RAID controller used. With a disk controller in pass-through mode, replacing the drive is a matter of removing and inserting it, although depending on the controller, vendor-specific tools might be required to make the device “active” once more. In fact, new RAID-0 volumes may have to be created to allow vSAN to consume the replacement device.
VMware has carried out many performance tests using various types of disk controllers and RAID controllers. In most cases, the performance difference between pass-through and RAID-0 configurations was negligible. The VMware compatibility guide for vSAN lists the functionality that should be enabled or disabled per controller.
When utilizing RAID-0 configurations, you should disable the storage controller's write cache to give vSAN full control. When the storage controller cache cannot be completely disabled in a RAID-0 configuration, you should configure it for 100% read cache, effectively disabling the write cache. The main reason for this is that vSAN has its own caching mechanism, and we want to ensure that I/Os that are acknowledged back to the guest OS have actually been stored on persistent storage (the write buffer) rather than in disk controller cache, which is outside of vSAN's control.
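As a sketch, this cache recommendation could be encoded as a simple decision rule; the function and its return format are illustrative only, not a vendor API, and the actual settings are applied through the controller's own management tools.

```python
def controller_cache_policy(can_disable_cache: bool) -> dict:
    """Recommended controller cache settings for RAID-0 mode, per the
    guidance above (illustrative sketch, not a vendor API)."""
    if can_disable_cache:
        # Preferred: hand full control of caching to vSAN
        return {"write_cache": "disabled", "read_cache": "disabled"}
    # Fallback: allocate the entire cache to reads, which effectively
    # disables the write cache
    return {"cache_split": "100% read / 0% write"}
```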
Each ESXi host that is participating in a hybrid vSAN cluster and contributing storage to the vSAN datastore must have at least one capacity device. Additional capacity devices will obviously increase capacity, and may also increase performance as VM storage objects can be striped across multiple devices. A higher number of capacity devices also provides more capacity balancing options. When a disk has reached 80% of its capacity, vSAN will automatically try to move components on that disk to other disks in the host, or to disks on other hosts, to prevent that disk from running out of capacity.
For ESXi hosts that are participating in an all-flash vSAN configuration, flash devices are also used for the capacity tier.
Each capacity tier device will be part of a disk group. At most, a vSAN host can have five disk groups, each containing seven capacity devices, resulting in a maximum of 35 capacity devices, as depicted in Figure 2.3.
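A quick sanity check of these maximums (5 disk groups per host, 7 capacity devices per disk group); the function name is our own illustration:

```python
MAX_DISK_GROUPS_PER_HOST = 5
MAX_CAPACITY_DEVICES_PER_DISK_GROUP = 7

def max_capacity_devices(hosts: int) -> int:
    """Maximum number of capacity devices in a vSAN cluster of `hosts` hosts."""
    return hosts * MAX_DISK_GROUPS_PER_HOST * MAX_CAPACITY_DEVICES_PER_DISK_GROUP

print(max_capacity_devices(1))   # 35 per host, as depicted in Figure 2.3
print(max_capacity_devices(64))  # 2240 for a maximum-size vSAN 6.0 cluster
```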
vSAN supports various types of magnetic disks, ranging from SATA 7200 RPM up to SAS 15K RPM, and these are listed on the compatibility guide. A large portion of VM storage I/O performance will be met by flash devices in the cache tier, but note that any I/O that needs to come from the capacity tier will be bound by the performance characteristics of those devices. A 7200 RPM SATA magnetic disk will provide a different experience than a high-performance flash device, but it usually also comes at a different price point, depending on the hardware vendor used. Chapter 9 provides various examples that demonstrate the impact of choosing magnetic disk devices and also provides an example of an all-flash configuration.
Figure 2.3 - Maximum disks and disk group configuration
Each ESXi host, whether it is in a hybrid configuration or an all-flash configuration, must have at least one flash device when that host is contributing capacity to a vSAN cluster. This flash device, in hybrid configurations, is utilized by vSAN as both a write buffer and a read cache. In an all-flash configuration, the flash device acts as a write cache only but tends to have a much higher endurance specification than the capacity tier flash devices. There is no need for a dedicated read cache in all-flash configurations, since reading from the capacity tier (which is also flash) is extremely fast.
The flash cache device in vSAN sits in front of a group of capacity devices. Each disk group requires one flash device. Because vSAN can have a maximum of five disk groups per host, the maximum number of flash devices per host used for the cache tier is also five. The more flash capacity in a host, the greater the performance will be because more I/O can be cached/buffered.
For the best vSAN performance, choose a high specification flash device. VMware supports various types of flash, ranging from SSDs to PCIe flash devices. VMware has published a list of supported PCIe flash devices, SSDs and NVMe devices in the VMware compatibility guide. Before procuring new equipment, review the VMware compatibility guide to ensure that your configuration is a supported configuration.
The designated flash device performance classes specified within the VMware compatibility guide are as follows:
Class A: 2,500–5,000 writes per second (no longer on the VCG)
Class B: 5,000–10,000 writes per second
Class C: 10,000–20,000 writes per second
Class D: 20,000–30,000 writes per second
Class E: 30,000–100,000 writes per second
Class F: 100,000+ writes per second
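These boundaries can be encoded as a simple lookup. Note that on the published VCG the class covering 30,000–100,000 writes per second is Class E, and devices above 100,000 writes per second fall into Class F; the sketch below uses those published boundaries, and the function name is our own.

```python
# (class, minimum sustained writes per second) per the VCG; Class A
# (2,500-5,000) is no longer listed on the compatibility guide.
PERFORMANCE_CLASSES = [
    ("F", 100_000),
    ("E", 30_000),
    ("D", 20_000),
    ("C", 10_000),
    ("B", 5_000),
    ("A", 2_500),
]

def performance_class(writes_per_second: int) -> str:
    """Return the VCG performance class for a sustained write rate."""
    for cls, threshold in PERFORMANCE_CLASSES:
        if writes_per_second >= threshold:
            return cls
    return "unclassified"

print(performance_class(25_000))  # D
```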
This question often arises: “Can I use a consumer-grade SSD, and will vSAN work?” From a technical point of view, vSAN works perfectly fine with a consumer-grade SSD. However, in most cases, consumer-grade SSDs have much lower endurance guarantees, different (lower) performance characteristics, and unpredictable latency spikes ranging from milliseconds to seconds; this is the main reason why Class A devices have been removed from the compatibility guide. Although a consumer-grade SSD might be attractive from a price point of view, we want to stress that vSAN depends on flash for both buffering and caching; when your drive fails, this will impact the disk group to which the SSD is bound. When the flash device fails, the disk group is marked as unhealthy. This brings us to the second important column on the compatibility guide page: the flash device endurance class, which is as follows:
Class A: >= 365 TBW
Class B: >= 1825 TBW
Class C: >= 3650 TBW
Class D: >= 7300 TBW
The higher the class, the more reliable the device and the longer its expected lifetime. For those who are not aware, TBW stands for “terabytes written” and is the number of writes the device is guaranteed to be able to endure.
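The endurance classes can be encoded the same way; the function name is our own, and the comment shows a worked example of how a drive-writes-per-day rating translates to TBW.

```python
# (class, minimum rated TBW) per the VCG endurance classes above
ENDURANCE_CLASSES = [("D", 7300), ("C", 3650), ("B", 1825), ("A", 365)]

def endurance_class(tbw: float) -> str:
    """Return the VCG endurance class for a device's rated terabytes written."""
    for cls, minimum in ENDURANCE_CLASSES:
        if tbw >= minimum:
            return cls
    return "below Class A"

# Worked example: a 400 GB device rated for 10 full-drive writes per day
# over a 5-year warranty endures roughly 0.4 * 10 * 365 * 5 = 7300 TBW.
print(endurance_class(7300))  # D
```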
After having looked at the various SSDs and PCIe flash devices, we have concluded that it is almost impossible to recommend a brand or type of flash device. This decision should be driven by budgetary constraints, server platform vendor support, and, more importantly, the requirements of the applications that you plan to deploy in your VMs running on vSAN.
This section covers the requirements and prerequisites from a networking perspective for vSAN. vSAN is a distributed storage solution and therefore leans heavily on the network for inter-host communication. Consistency and reliability are key.
Each ESXi host must have at least one 1 GbE network interface card (NIC) dedicated to vSAN traffic in hybrid configurations. For all-flash configurations, 10 GbE NICs are required. However, as a best practice, VMware and the authors of this book recommend 10 GbE NICs for all configurations. For redundancy, you can configure a team of NICs on a per-host basis. We consider this a best practice, but it is not necessary to build a fully functional vSAN cluster.
vSAN is supported on both VMware vSphere Distributed Switches (VDS) and VMware standard switches (VSS). There are some advantages to using a Distributed Switch that will be covered in Chapter 3, “vSAN Installation and Configuration.” No other virtual switch types have been explicitly tested with vSAN. A license for the use of VDS is included with vSAN.
vSAN is supported over layer 2 (L2) (switched) or layer 3 (L3) (routed) networks. In the initial release, there was no L3 support. However, full support for L3 was introduced in version 6.0. Do note that vSAN relies on the availability of multicast traffic. This means that in both cases (L2 and L3) multicast traffic will need to be allowed and, in the case of L3, it must also be routed between the networks. We have noticed during the many conversations we have had with customers over the past 2 years that multicast traffic is usually not allowed by default, so make sure to talk to your networking team before configuring vSAN.
On each ESXi host that participates in a vSAN cluster, a VMkernel port for vSAN communication must be created. This VMkernel port, labeled vSAN traffic, was introduced in vSphere 5.5 and is used for intra-cluster node communication. It is also used for reads and writes when one of the ESXi hosts in the cluster owns a particular VM but the actual data blocks making up the VM files are located on a different ESXi host in the cluster. In this case, I/O needs to traverse the network configured between the hosts in the cluster, as depicted in Figure 2.4, where VMkernel interface vmk2 is used for vSAN traffic by all the hosts in the vSAN cluster. The VM residing on ESXi-01 does all of its reads and writes leveraging the vSAN network.
Figure 2.4 - vSAN traffic
The protocol used by vSAN is a proprietary protocol. VMware has not published a specification of the protocol. This is similar to the approach taken for other VMware products and features such as vMotion, Fault Tolerance, vSphere Replication, and other VMware proprietary protocols, where VMware deems the information proprietary. The vSAN network is used for three different traffic types. It is important to know these because they introduce a requirement for your physical network switch configuration:
Multicast heartbeats: These are used to discover all participating vSAN hosts in the cluster, as well as to determine the state of a host. Compared to other traffic types, multicast heartbeats generate very few packets.
Multicast and unicast packets from the clustering service (CMMDS): This traffic carries metadata updates, such as object placement and statistics. These packets generate more network traffic than the heartbeats, but still a very small percentage.
Storage traffic (e.g., reads, writes): This is the majority of network traffic. Any host within the cluster can talk to any other host over unicast.
To ensure that vSAN hosts can communicate properly, it is required that multicast traffic is enabled on the physical switch. If multicast communication is not possible between the ESXi hosts in the vSAN cluster over the vSAN traffic network, the vSAN cluster will not form correctly.
Although multicast is just a small percentage of the total network traffic, it is a critical part. The majority of traffic in a vSAN cluster will be storage traffic as storage read and write I/O go over the network. Ensuring optimal network bandwidth is important. VMware recommends that physical switches that support real multicast traffic are used and that lower-end switches that convert the multicast traffic into broadcast traffic be avoided, if possible.
Jumbo frames are supported on the vSAN network. It is our belief that every vSAN deployment is different, both from a server hardware perspective and from a network hardware perspective. Therefore, it is difficult to recommend for or against the use of jumbo frames. In addition, there is an operational impact to implementing jumbo frames on non-greenfield sites: when jumbo frames are not consistently configured end to end, network problems may occur. Tests have been conducted to prove the benefits of jumbo frames, but results so far have been inconclusive. In some scenarios, a performance improvement of 15% and a decrease in CPU utilization were measured; in other scenarios, no performance increase or CPU utilization decrease was observed.
In an operationally mature environment where a consistent implementation can be guaranteed, the use of jumbo frames is left to the administrator’s discretion.
Another potential way of optimizing network performance is teaming of NICs. NIC teaming in ESXi is transparent to vSAN. You can team NICs in various ways. To allow vSAN to use multiple physical NIC ports, it is possible to implement either physical teaming (LACP) or create multiple vSAN VMkernel interfaces. Chapter 3 covers the configuration details and parameters in more detail. Note, however, that there is no guarantee that vSAN network traffic will be able to utilize the full bandwidth of multiple physical NICs at the same time; various factors play a part, including the size of the cluster, the number of NICs, and the number of different IP addresses used.
Although it is recommended to use 10 GbE NICs, there is no requirement to dedicate these cards solely to the vSAN network. NICs can be shared with other traffic types; however, you might consider using network I/O control (NIOC) to ensure that vSAN traffic is guaranteed a certain amount of bandwidth in the case where network congestion arises. This is especially true if a 10 GbE NIC is shared with (for instance) vMotion traffic, which is infamous for utilizing all available bandwidth when possible. NIOC requires a VDS because NIOC is not available with VSS. Luckily, the distributed switch is included with the vSAN license.
Chapter 3 provides various examples of how NIOC can be configured for the various types of network configurations.
vSAN stretched cluster functionality was introduced with vSAN 6.1. This feature allows virtual machine components to be deployed across sites in different data centers, and if one site or data center fails, virtual machines can be restarted on the surviving site. There are a number of considerations to take into account for vSAN Stretched Cluster, including latency and bandwidth, not only between the data center sites, but also to the witness site. These will be covered in greater detail in the vSAN Stretched Cluster section, later in this book (Chapter 8), but we will list some of the basic guidelines here for your convenience:
Maximum of 5 ms RTT latency between data sites (requirement)
Maximum of 200 ms RTT between data sites and the witness site (requirement)
10 Gbps between data sites
100 Mbps from data sites to witness site
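These guidelines can be captured in a simple validation sketch. The function name and issue strings are illustrative; as noted above, only the two RTT figures are hard requirements, while the bandwidth figures are general guidelines that depend on the workload.

```python
def stretched_cluster_network_issues(data_rtt_ms: float, witness_rtt_ms: float,
                                     data_gbps: float, witness_mbps: float) -> list:
    """Check a proposed stretched-cluster network design against the
    basic guidelines above; returns a list of issues (empty if none)."""
    issues = []
    if data_rtt_ms > 5:
        issues.append("RTT between data sites exceeds 5 ms (requirement)")
    if witness_rtt_ms > 200:
        issues.append("RTT to witness site exceeds 200 ms (requirement)")
    if data_gbps < 10:
        issues.append("less than 10 Gbps between data sites (guideline)")
    if witness_mbps < 100:
        issues.append("less than 100 Mbps to witness site (guideline)")
    return issues

print(stretched_cluster_network_issues(4, 150, 10, 100))  # []
```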
In much the same way as there are specific network requirements for vSAN stretched clusters, there are also network requirements around latency and bandwidth for 2-node ROBO deployments, which were likewise introduced with vSAN 6.1. For ROBO configurations, the following general guidelines apply:
Maximum of 500 ms RTT between ROBO location and central witness (requirement)
1 Mbps from ROBO location to central witness
1 Gbps network connection between hosts on ROBO location
When you enable vSAN, a number of ESXi firewall ports are automatically opened (both inbound and outbound) on each ESXi host that participates in the vSAN cluster. The ports are used for inter-cluster host communication and for communication with the storage provider on the ESXi hosts. Table 2.1 provides a list of vSAN-specific network ports. Note that the vast majority (98% or more) of all traffic in a vSAN cluster will be RDT traffic on port 2233.
Table 2.1 - ESXi Ports and Protocols Opened by vSAN
Although configuring vSAN literally takes a couple of clicks, it is important to take the time to ensure that all requirements are met and to ensure that all prerequisites are in place. A stable storage platform starts at the foundation, the infrastructure on which it is enabled. Before moving on to Chapter 3, you should run through this checklist to confirm that all requirements have been met:
vSphere 6.0 U2 recommended.
Three hosts minimum.
6 GB memory per host minimum, 32 GB recommended minimum.
When host memory exceeds 512 GB, ESXi must be installed on a local disk or SATADOM device; USB flash drives and SD cards are not supported.
Certified disk controller.
At least one certified flash device per host for caching.
At least one certified magnetic drive or flash device per host for capacity.
Dedicated 1 GbE NIC port or shared 10 GbE NIC port for vSAN.
Multicast enabled on vSAN network (layer 2 or layer 3).
The following list identifies additional recommendations, which are not requirements for a fully functional vSAN but which might be desirable from a production standpoint:
Network switch redundancy for vSAN
Network interface card (NIC) redundancy for vSAN
Jumbo frames consistently implemented end to end
NIOC to provide QoS to vSAN traffic