To define a tailored, well-balanced business continuity strategy, companies must consider a wide range of parameters. In particular, they should determine their Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO), the key indicators that answer two fundamental questions:
• How much data is it acceptable to lose (RPO) per service or service chain?
• How quickly must the service or service chain be restored (RTO)?
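As a rough illustration of these two questions, the sketch below compares the data loss and downtime observed during an incident with the objectives set for a service. The `Objectives` class, the helper functions, and all timestamps and target values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Objectives:
    """Targets for one service or service chain (hypothetical values)."""
    rpo: timedelta  # how much data it is acceptable to lose
    rto: timedelta  # how long the service may stay down


def achieved_rpo(last_consistent_copy: datetime, failure: datetime) -> timedelta:
    """Everything written after the last consistent copy is lost."""
    return failure - last_consistent_copy


def achieved_rto(failure: datetime, restored: datetime) -> timedelta:
    """Time elapsed between the failure and the return of the service."""
    return restored - failure


def meets_objectives(obj: Objectives, last_consistent_copy: datetime,
                     failure: datetime, restored: datetime) -> bool:
    """True when both the observed data loss and downtime are within target."""
    return (achieved_rpo(last_consistent_copy, failure) <= obj.rpo
            and achieved_rto(failure, restored) <= obj.rto)


# Hypothetical incident: failure at noon, service back two minutes later.
failure = datetime(2024, 1, 1, 12, 0)
restored = failure + timedelta(minutes=2)

# Near-zero targets for a critical service (assumed values).
critical = Objectives(rpo=timedelta(0), rto=timedelta(minutes=5))

# With a synchronous mirror, the last consistent copy is the failure instant.
critical_ok = meets_objectives(critical, failure, failure, restored)

# Same targets, but the last copy is a replica taken 40 minutes earlier.
replica_ok = meets_objectives(critical, failure - timedelta(minutes=40),
                              failure, restored)

print(critical_ok, replica_ok)
```

In this example the synchronously mirrored copy satisfies both targets, while the 40-minute-old replica misses the RPO even though the service came back just as quickly.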
The most critical applications, those with RPO and RTO values closest to zero, pose the most demanding challenges, such as:
• Eliminating single points of failure;
• Minimizing infrastructure costs;
• Managing high availability simply;
• Managing failures and load balancing automatically;
• Providing a continuous service even with a site failure.
Do the features of VMware Metro Storage Cluster allow enterprises to achieve these levels of availability and performance? Here are our answers and recommendations.
For extended high availability, as opposed to disaster recovery, companies distribute their vSphere farm across two sites instead of the customary single site.
The technology used is called vMSC (VMware vSphere Metro Storage Cluster), also known as a stretched cluster.
VMware vMSC provides businesses the benefits of a local high availability cluster such as:
• vMotion and DRS (VM migration and dynamic allocation of VMs between hosts without interruption of service);
• High Availability (automatic restart of VMs in the event of a host failure);
• Fault Tolerance (continuous availability of applications in the event of a host failure).
The cluster is spread over two geographic sites. It is important to note that this configuration, unlike VMware SRM, uses a single vCenter.
In the case of VMware Metro Storage Cluster, only the primary relationship level exists, which allows access to data on either part of the cluster, in real time.
The most important requirements are:
• A stretched, active/active storage architecture with synchronous mirroring;
• Layer 2 network connectivity stretched between the sites;
• Latency (RTT or round-trip time) and the maximal distance between sites;
• Bandwidth;
• A quorum witness at a third site or in the cloud;
• Only one vCenter.
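The requirements above lend themselves to a simple design pre-check. The following is a minimal sketch, not a VMware tool: the `Design` and `Limits` classes are hypothetical, and the latency and bandwidth limits are placeholders that should be replaced with the values supported by your storage vendor and VMware documentation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Limits:
    """Thresholds to take from the storage vendor and VMware documentation."""
    max_rtt_ms: float
    min_bandwidth_mbps: float


@dataclass(frozen=True)
class Design:
    """Facts about a planned two-site deployment (all fields hypothetical)."""
    active_active_sync_storage: bool
    layer2_stretched: bool
    rtt_ms: float
    bandwidth_mbps: float
    witness_location: str  # e.g. "third-site", "cloud", "site-a"
    vcenter_count: int


def unmet_requirements(design: Design, limits: Limits) -> List[str]:
    """Return one message per requirement the design does not satisfy."""
    problems = []
    if not design.active_active_sync_storage:
        problems.append("storage is not active/active with synchronous mirroring")
    if not design.layer2_stretched:
        problems.append("Layer 2 network is not stretched between sites")
    if design.rtt_ms > limits.max_rtt_ms:
        problems.append(f"RTT {design.rtt_ms} ms exceeds {limits.max_rtt_ms} ms")
    if design.bandwidth_mbps < limits.min_bandwidth_mbps:
        problems.append("inter-site bandwidth is below the required minimum")
    if design.witness_location not in ("third-site", "cloud"):
        problems.append("quorum witness is not at a third site or in the cloud")
    if design.vcenter_count != 1:
        problems.append("the cluster must be managed by exactly one vCenter")
    return problems


# Placeholder limits for illustration only; use the values supported for your platform.
limits = Limits(max_rtt_ms=5.0, min_bandwidth_mbps=1000.0)
design = Design(True, True, rtt_ms=7.5, bandwidth_mbps=10000.0,
                witness_location="site-a", vcenter_count=1)

for problem in unmet_requirements(design, limits):
    print("-", problem)
```

An empty result means the design passes this checklist; it does not replace validation against the vendor's compatibility guides.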
Here are some of the benefits associated with such an approach:
• RPO and RTO values close to zero;
• The ability to migrate VMs among sites without service interruption;
• No need to change IP addresses when VMs move between sites;
• Automatic and immediate handling of storage failures;
• Transparency for users in the event of site failover.
Do you need a recovery solution or an extended high-availability solution? Which approach is compatible with your service level agreements (SLAs)?
We recommend establishing two scenarios, one based on vMSC and the other on VMware Site Recovery Manager (SRM).
• Option 1: two production data centers in active/active mode with stretched storage and networking.
• Option 2: two data centers in active/passive mode, one for production and one for testing and development. If the production site fails, SRM executes a predefined recovery plan to restart the VMs on the secondary site. There are many alternative tools, although with less orchestration, such as Veeam Backup & Replication or Zerto Virtual Replication.
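A first pass at choosing between the two options can be expressed as a small decision helper. This is a deliberately simplistic sketch: the `suggest_option` function and its one-minute "near zero" threshold are assumptions made for illustration, and a real decision would also weigh cost, distance, and operational constraints.

```python
from datetime import timedelta


def suggest_option(rpo: timedelta, rto: timedelta,
                   metro_requirements_met: bool,
                   near_zero: timedelta = timedelta(minutes=1)) -> str:
    """Map RPO/RTO targets to one of the two scenarios (illustrative heuristic)."""
    needs_near_zero = rpo <= near_zero and rto <= near_zero
    if needs_near_zero and metro_requirements_met:
        return "Option 1: active/active (vMSC)"
    if needs_near_zero:
        # Near-zero targets without the storage and network prerequisites.
        return "Review: targets require the vMSC prerequisites"
    return "Option 2: active/passive (SRM or similar)"


# Hypothetical services with assumed targets.
print(suggest_option(timedelta(0), timedelta(seconds=30), True))
print(suggest_option(timedelta(minutes=15), timedelta(hours=2), True))
print(suggest_option(timedelta(0), timedelta(seconds=30), False))
```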
Our specialists can assist you in determining the most appropriate scenarios and their respective ROIs.
In addition to the requirements above, the following points should be considered:
• As with a local cluster, the solution has only one vCenter. If it fails, both sites are affected;
• DRS and HA do not have site awareness;
• vMSC is an extended high availability solution and, as such, does not include recovery procedures for unplanned outages and cannot repair data corruption.
The testing phase is a critical step when implementing a high availability environment with VMware vMSC. The various failure scenarios must be identified and tested before going into production. Everything must be documented and executed according to the test plan.
Experience shows that this phase is often neglected or forgotten for lack of time and resources.
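One way to keep this phase from slipping is to track every failure scenario explicitly and block go-live until each one has been executed, passed, and documented. The sketch below is a hypothetical tracker; the scenario names are examples only and should be replaced with those from your own test plan.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScenarioResult:
    """Status of one failure scenario from the test plan."""
    name: str
    executed: bool = False
    passed: bool = False
    documented: bool = False


def outstanding(results: List[ScenarioResult]) -> List[str]:
    """Names of scenarios that are not yet executed, passed, and documented."""
    return [r.name for r in results
            if not (r.executed and r.passed and r.documented)]


def ready_for_production(results: List[ScenarioResult]) -> bool:
    """Go-live is allowed only when no scenario is outstanding."""
    return not outstanding(results)


# Example scenarios (illustrative; adapt to your own test plan).
plan = [
    ScenarioResult("host failure", executed=True, passed=True, documented=True),
    ScenarioResult("storage failure at one site", executed=True, passed=True),
    ScenarioResult("inter-site link failure"),
    ScenarioResult("complete site failure", executed=True, passed=True, documented=True),
    ScenarioResult("vCenter failure", executed=True, passed=False, documented=True),
]

print("Ready:", ready_for_production(plan))
for name in outstanding(plan):
    print("- still open:", name)
```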
Ask our experts to help improve your infrastructure project while still developing the skills and independence of your team.
Image: © Yabresse - Fotolia.com