
Next, we will look at another resilience feature of the VSA: surviving the loss of an ESXi host. A running virtual machine called WinXP is already deployed on the VSA storage. We will see how an outage of one of the ESXi hosts in the cluster (the one that hosts the VSA storage on which the virtual machine is deployed) has no impact on the running virtual machine. This is possible because another VSA member, which maintains the secondary mirror of the datastore, takes over the responsibility of presenting the datastore when the primary mirror fails.

This particular VSA is a three-node configuration. This means there are three ESXi hosts running three VSA appliances (one each) and therefore three NFS datastores presented by the cluster to each of the ESXi hosts in the cluster.

In this particular Datastores view of the VSA Cluster, you can see that each datastore is being exported by a unique VSA member. For example, VSADs-0 is being exported by the VSA member VSA-1.

TECHNICAL WHITE PAPER / 50

Figure 54. Host Failure

Now we will run a command that initiates an ‘uncontrolled shutdown’ of the ESXi host on which it is run, to simulate a hardware failure. The host that we are going to fail (10.20.196.26) is the host that has appliance VSA-2 running on it, so bringing down this ESXi host also brings down VSA-2 in an uncontrolled fashion. Since WinXP is running on an NFS datastore exported by this VSA member (VSADs-2), we will see how the cluster handles the outage. What you should observe is a slight degradation in performance while the VSA cluster switches from the primary to the mirror copy of the datastore; once the failover has completed, I/O should return to its previous performance.

From the ESXi shell on the host running VSA-2, run the following command:

# vsish -e set /reliability/crashMe/Panic 1

We can see that the host has faulted and that two of the datastores in our VSA have become degraded. This is because we have lost a VSA member, and since each VSA member provides both a primary and a replica, the outage will affect two datastores.
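The reason a single host failure degrades exactly two datastores can be checked with a small model. The following Python sketch is illustrative only and is not part of the VSA product: the `Datastore` class and the `degraded_after_failure` function are hypothetical names. The text states that VSADs-0 is exported by VSA-1, that VSADs-2 is exported by VSA-2, and that VSA-0 holds the replica of VSADs-2; the remaining placements are assumptions chosen so that each appliance holds one primary and one replica.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    name: str
    primary: str  # appliance currently exporting the datastore
    replica: str  # appliance holding the mirror copy


# Assumed three-node layout. Only the primaries of VSADs-0 and VSADs-2 and
# the replica of VSADs-2 come from the text; the rest is a plausible guess.
LAYOUT = [
    Datastore("VSADs-0", primary="VSA-1", replica="VSA-2"),
    Datastore("VSADs-1", primary="VSA-0", replica="VSA-1"),
    Datastore("VSADs-2", primary="VSA-2", replica="VSA-0"),
]


def degraded_after_failure(layout, failed):
    """Return the names of datastores that lose one of their two copies."""
    return sorted(d.name for d in layout if failed in (d.primary, d.replica))


print(degraded_after_failure(LAYOUT, "VSA-2"))
```

Under this layout, failing VSA-2 removes the primary copy of VSADs-2 and the replica of VSADs-0, which is why two datastores are reported as degraded while the third is untouched.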

Note the red warning symbol against the host 10.201.96.26. Because the ESXi hosts are in a vSphere HA cluster, HA has detected a configuration issue - a possible host failure. This is a very similar condition to that experienced when the front-end network fails.

Even though we have lost a VSA member, which has caused degradation on two datastores in the cluster, one of which our virtual machine was running on, our virtual machine continues to operate. This is because the VSA member which was previously hosting the replica part of that datastore has now been promoted to primary, and presents the datastore back to the ESXi hosts using the same IP address which was used by the failed cluster member. Therefore the failover is transparent to the ESXi hosts, and does not impact the virtual machines running on the datastore.
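The IP takeover described above can be sketched as follows. This is a simplified model under stated assumptions, not the VSA implementation: the `fail_over` function is a hypothetical name, the IP addresses are placeholders from the 192.0.2.0/24 documentation range, and the replica placement for VSADs-0 and VSADs-1 is assumed. The point it illustrates is that the address the ESXi hosts mount never changes; only the member answering on that address does.

```python
def fail_over(exports, replicas, failed):
    """Return a new export table after the member `failed` goes offline.

    exports  maps datastore name -> {"ip": str, "member": str}
    replicas maps datastore name -> member holding the mirror copy
    The export IP is kept; only the exporting member changes.
    """
    result = {}
    for name, export in exports.items():
        if export["member"] == failed:
            # The replica holder is promoted and takes over the same IP.
            result[name] = {"ip": export["ip"], "member": replicas[name]}
        else:
            result[name] = dict(export)
    return result


# Placeholder addresses (hypothetical); members follow the walkthrough.
exports = {
    "VSADs-0": {"ip": "192.0.2.10", "member": "VSA-1"},
    "VSADs-1": {"ip": "192.0.2.11", "member": "VSA-0"},
    "VSADs-2": {"ip": "192.0.2.12", "member": "VSA-2"},
}
# Replica placement: VSADs-2 on VSA-0 is from the text, the rest is assumed.
replicas = {"VSADs-0": "VSA-2", "VSADs-1": "VSA-1", "VSADs-2": "VSA-0"}

after = fail_over(exports, replicas, failed="VSA-2")
for name, export in sorted(after.items()):
    print(name, export["ip"], export["member"])
```

In this model VSA-0 ends up exporting both VSADs-1 and VSADs-2, while every datastore keeps its original IP address, so the NFS mounts on the ESXi hosts remain valid.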

Figure 55. Virtual Machine Resumes without Needing a Reboot

Note that VSA-2 is now offline because the ESXi host on which it resides has failed. You can also see that the appliance VSA-0 is now exporting two NFS datastores, one of which is VSADs-2, which was previously exported by VSA-2, the appliance on the failed ESXi host.

Figure 56. One Appliance Now Presents Two NFS Datastores

So even though we have had a major server failure, with one node experiencing an uncontrolled outage, the VSA is resilient enough to survive a failure of this nature and continue to export the NFS datastores using mirrored volumes. That is, one of the VSA members takes responsibility for exporting two NFS datastores when another member fails. The significant business benefit of the resilience of the VSA is that it can prevent unplanned outages.


Resilience: Replacing a Failed ESXi Host

By now you should be aware that VSA can handle failures at both the ESXi host and appliance level and continue to present the full complement of NFS datastores. This means that if the ESXi host on which the appliance is running goes down, the cluster will seamlessly present that NFS datastore from another node in the cluster. This is transparent to the ESXi hosts that have the NFS datastore mounted and is transparent to any virtual machines running on that datastore.

Let us discuss what happens if you have a hardware failure on one of your ESXi hosts and the server vendor is going to take a while to ship you the replacement part. One of the features of the VSA is that it allows you to replace an offline or failed node with a brand new ESXi host. Look at a sample two-node configuration here:

Figure 57. ESXi Host Failure also Impacts VSA Appliance on the Host

In this case, we have lost one of the nodes in a two-node cluster (and of course the appliance VSA-0 running on that node). VSA-1 takes over the presentation of the NFS datastore from VSA-0. This places both NFS datastores into a degraded state, but the datastores are still presented to the ESXi hosts, and the virtual machines on those datastores are unaffected and continue to run. The term ‘degraded’ means that the datastores have no mirror copy/replica. The only issue is that both NFS datastores are now being presented from the same appliance on the same ESXi host. The Appliances view in VSA Manager will show the failed appliance as offline:

To replace this node with a brand new node and bring the cluster out of the degraded state, select the offline appliance, right-click it and select the ‘Replace Appliance’ option:

Figure 59. Replace Appliance Initiated

Now, follow the wizard-driven steps to replace the offline appliance with a new appliance on a new ESXi host. Just like the installation process, the UI will show you all available ESXi 5 hosts in the datacenter. Two of these hosts are already used by the VSA Cluster (one of which has failed) and are not available for selection, but as shown in the example, a third host that is not in the cluster can be used:

Figure 60. Select a Replacement Host for the New Appliance

When the networking has been configured on the replacement ESXi host and the VSA appliance has been deployed (all of which is done automatically), the volume and replica are created on the new appliance and synchronized with the NFS volumes already in the VSA Cluster.
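The effect of the replacement on the two-node example can be modeled in a few lines. This Python sketch is a simplification under stated assumptions: the function names, the datastore names and the appliance name `VSA-new` are hypothetical, and the model treats synchronization as instantaneous, whereas in practice the datastores stay degraded until the new copies have finished synchronizing.

```python
def datastore_states(layout, online):
    """Map each datastore to 'optimal' (both copies online) or 'degraded'."""
    return {
        name: "optimal" if all(m in online for m in members) else "degraded"
        for name, members in layout.items()
    }


def replace_appliance(layout, failed, replacement):
    """Return a layout in which `replacement` holds every copy `failed` held."""
    return {
        name: tuple(replacement if m == failed else m for m in members)
        for name, members in layout.items()
    }


# Two-node cluster: each datastore has one copy on each appliance.
# Datastore names are assumed for illustration.
layout = {
    "VSADs-0": ("VSA-0", "VSA-1"),
    "VSADs-1": ("VSA-1", "VSA-0"),
}

# VSA-0 has failed, so only VSA-1 is online and both datastores are degraded.
before = datastore_states(layout, online={"VSA-1"})

# After 'Replace Appliance', a new appliance (hypothetical name) takes over
# the copies that VSA-0 held, and both datastores are mirrored again.
new_layout = replace_appliance(layout, failed="VSA-0", replacement="VSA-new")
after_replace = datastore_states(new_layout, online={"VSA-1", "VSA-new"})

print(before)
print(after_replace)
```

The model shows why a two-node cluster with one failed node has every datastore degraded, and why adding a healthy appliance is sufficient to return all of them to a mirrored state.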

Figure 61. Appliance is Now Replaced

The VSA Cluster is now back to its optimal state. So, even though a node in the VSA Cluster may suffer a hardware failure, procedures have been built into the VSA Cluster to help customers keep it highly available, allowing a failed node to be swapped out of the cluster for a new healthy server. This can be done while the VSA Cluster continues to present a full complement of NFS datastores.