VMware Disaster Recovery To Azure Step-By-Step – Part 3

Azure Disaster Recovery

In the 1st part, we saw how to deploy and configure the Azure Site Recovery Configuration Server.

In the 2nd part, we saw the Azure configuration and how to enable replication of our on-premises VMs.

Now we will test Failover.

IMPORTANT

If your on-premises VMs are set up to boot using UEFI, which is VMware's recommendation, you will have to make some changes before the Failover or you will not be able to Failback.

GPT VS MBR – What’s the Difference?

GPT (or GUID Partition Table) is a standard for the layout of the partition table on a physical hard disk. It is available on computers with UEFI/EFI installed (not BIOS). GPT disks have these features:

  • A GPT disk can have up to 128 partitions.
  • A single partition can be up to 256 TB.

MBR (or Master Boot Record) is the 512-byte boot sector that is the first sector of a partitioned data storage device. MBR disks use the standard BIOS partition table. MBR disks have these features:

  • No more than 4 primary partitions on an MBR disk
  • 2TB maximum allowed space on an MBR disk
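
The 2 TB figure for MBR can be derived with a little arithmetic. The sketch below assumes the classic 512-byte sector size and the 32-bit sector addresses used in MBR partition entries; the function name is only illustrative.

```python
# MBR partition entries store sector addresses in 32-bit fields.
# Assumption: 512-byte logical sectors (the classic disk sector size).
SECTOR_SIZE_BYTES = 512
MBR_LBA_BITS = 32
TIB = 2 ** 40


def max_addressable_bytes(lba_bits: int, sector_size: int = SECTOR_SIZE_BYTES) -> int:
    """Largest capacity that can be addressed with the given sector-address width."""
    return (2 ** lba_bits) * sector_size


mbr_limit = max_addressable_bytes(MBR_LBA_BITS)
print(f"MBR limit: {mbr_limit / TIB:g} TiB")  # prints "MBR limit: 2 TiB"
```

Disks with larger sectors would raise this ceiling, but 512-byte sectors are the case behind the commonly quoted 2 TB limit.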

The MBR partition method is not recommended for disks larger than 2 TB. So why would anyone want to convert to MBR? Here's the reason: you can boot Windows from GPT only if your computer has UEFI/EFI installed, whereas MBR disks are supported by all Windows versions.

So you will need to convert your OS disk from GPT to MBR and change your VM boot option from UEFI to BIOS.

Note: Another article will follow in the next few days on how to do this.
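
Before converting anything, it can help to confirm which partition scheme a disk actually uses. The following is a minimal sketch, not a conversion tool: it classifies a disk from its first two sectors, assuming 512-byte sectors. It relies on two well-known on-disk markers: a GPT header at the second sector starts with the ASCII signature "EFI PART", and an MBR ends with the boot signature bytes 0x55 0xAA. The function name and the sample data are illustrative.

```python
SECTOR = 512  # assumption: 512-byte logical sectors


def partition_scheme(first_two_sectors: bytes) -> str:
    """Classify a disk as 'GPT', 'MBR', or 'unknown' from its first 1024 bytes."""
    if len(first_two_sectors) < 2 * SECTOR:
        raise ValueError("need at least the first two sectors of the disk")
    # The GPT header lives in the second sector and starts with 'EFI PART'.
    if first_two_sectors[SECTOR:SECTOR + 8] == b"EFI PART":
        return "GPT"
    # A valid MBR ends with the boot signature 0x55 0xAA at offset 510.
    if first_two_sectors[510:512] == b"\x55\xaa":
        return "MBR"
    return "unknown"


# Synthetic example data standing in for a real disk image.
sample = bytearray(2 * SECTOR)
sample[510:512] = b"\x55\xaa"          # boot signature only: plain MBR
print(partition_scheme(bytes(sample)))
sample[SECTOR:SECTOR + 8] = b"EFI PART"  # add a GPT header signature
print(partition_scheme(bytes(sample)))
```

The GPT check comes first because GPT disks also carry a protective MBR with the same boot signature in their first sector.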

Set up networking

A virtual machine (VM) in Azure must have at least one network interface attached to it. It can have as many network interfaces attached to it as the VM size supports.

Select the target network

For VMware and physical machines, and for Hyper-V (without System Center Virtual Machine Manager) virtual machines, you can specify the target virtual network for individual virtual machines. For Hyper-V virtual machines managed with Virtual Machine Manager, use network mapping to map VM networks on a source Virtual Machine Manager server and target Azure networks.

  1. Under Replicated items in a Recovery Services vault, select any replicated item to access the settings for that replicated item.
  2. Select the Compute and Network tab to access the network settings for the replicated item.
  3. Under Network properties, choose a virtual network from the list of available virtual networks.

Modifying the target network affects all network interfaces for that specific virtual machine.

For Virtual Machine Manager clouds, modifying network mapping affects all virtual machines and their network interfaces.

Set up recovery plan

Create a recovery plan

  • In the Recovery Services vault, select Recovery Plans (Site Recovery) > +Recovery Plan.
  • In Create recovery plan, specify a name for the plan.
  • Choose a source and target based on the machines in the plan, and select Resource Manager for the deployment model. The source location must have machines that are enabled for failover and recovery.

Note: A recovery plan can contain machines with the same source and target. VMware and Hyper-V VMs managed by VMM can’t be in the same plan. VMware VMs and physical servers can be in the same plan, where the source is a configuration server.

  • In Select items virtual machines, select the machines (or replication group) that you want to add to the plan. Then click OK.
    • Machines are added to the default group (Group 1) in the plan. After failover, all machines in this group start at the same time.
    • You can only select machines that are in the source and target locations that you specified.
  • Click OK to create the plan.
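
The rule from the note above (VMware VMs and VMM-managed Hyper-V VMs can't share a plan, while VMware VMs and physical servers can) can be sketched as a small pre-check. This is only an illustration of the stated rule, not part of any Site Recovery API; the type names are made up for the example, and combinations the note doesn't mention are not judged here.

```python
from enum import Enum


class MachineType(Enum):
    VMWARE = "vmware"
    PHYSICAL = "physical"
    HYPERV_VMM = "hyperv_vmm"  # Hyper-V VM managed by Virtual Machine Manager


def can_share_plan(machine_types) -> bool:
    """Apply the one restriction stated in the note: no VMware + VMM Hyper-V mix."""
    kinds = set(machine_types)
    if MachineType.VMWARE in kinds and MachineType.HYPERV_VMM in kinds:
        return False
    return True


print(can_share_plan([MachineType.VMWARE, MachineType.PHYSICAL]))    # True
print(can_share_plan([MachineType.VMWARE, MachineType.HYPERV_VMM]))  # False
```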

Run a test failover

You run a test failover to validate your replication and disaster recovery strategy, without any data loss or downtime. A test failover doesn’t impact ongoing replication, or your production environment. You can run a test failover on a specific virtual machine (VM), or on a recovery plan containing multiple VMs.

  • In Site Recovery in the Azure portal, click Recovery Plans > recoveryplan_name > Test Failover.
  • Select a Recovery Point to which to fail over. You can use one of the following options:
    • Latest processed: This option fails over all VMs in the plan to the latest recovery point processed by Site Recovery. To see the latest recovery point for a specific VM, check Latest Recovery Points in the VM settings. This option provides a low RTO (Recovery Time Objective), because no time is spent processing unprocessed data.
    • Latest app-consistent: This option fails over all the VMs in the plan to the latest application-consistent recovery point processed by Site Recovery. To see the latest recovery point for a specific VM, check Latest Recovery Points in the VM settings.
    • Latest: This option first processes all the data that has been sent to Site Recovery service, to create a recovery point for each VM before failing over to it. This option provides the lowest RPO (Recovery Point Objective), because the VM created after failover will have all the data replicated to Site Recovery when the failover was triggered.
    • Latest multi-VM processed: This option is available for recovery plans with one or more VMs that have multi-VM consistency enabled. VMs with the setting enabled fail over to the latest common multi-VM consistent recovery point. Other VMs fail over to the latest processed recovery point.
    • Latest multi-VM app-consistent: This option is available for recovery plans with one or more VMs that have multi-VM consistency enabled. VMs that are part of a replication group fail over to the latest common multi-VM application-consistent recovery point. Other VMs fail over to their latest application-consistent recovery point.
    • Custom: Use this option to fail over a specific VM to a particular recovery point.
  • Select an Azure virtual network in which test VMs will be created.
    • Site Recovery attempts to create test VMs in a subnet with the same name and same IP address as that provided in the Compute and Network settings of the VM.
    • If a subnet with the same name isn’t available in the Azure virtual network used for test failover, then the test VM is created in the first subnet alphabetically.
    • If the same IP address isn’t available in the subnet, then the VM receives another available IP address in the subnet.
  • If you’re failing over to Azure and data encryption is enabled, in Encryption Key, select the certificate that was issued when you enabled encryption during Provider installation. You can ignore this step if encryption isn’t enabled.
  • Click OK.
  • Track failover progress on the Jobs tab. You should be able to see the test replica machine in the Azure portal.
  • To initiate an RDP connection to the Azure VM, you need to add a public IP address on the network interface of the failed over VM if you do not have a Site-to-Site VPN from your on-premises environment to Azure.

Note: As I have a Site-to-Site VPN, I can use the internal IP to test the connection.

  • When everything is working as expected, click Cleanup test failover. This deletes the VMs that were created during test failover.
  • In Notes, record and save any observations associated with the test failover.
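
The subnet and IP address rules described for the test network can be modeled in a few lines. This is a sketch of the documented behavior, not Site Recovery's actual implementation. It assumes plain string sorting for "first subnet alphabetically", that the caller lists every unavailable address (including the ones Azure reserves in each subnet) in `used`, and that the fallback is the lowest free address, which the documentation does not specify. All names and addresses are examples.

```python
import ipaddress


def pick_subnet(subnets: dict, preferred_name: str) -> str:
    """Return the subnet with the preferred name, else the first one alphabetically."""
    if preferred_name in subnets:
        return preferred_name
    return sorted(subnets)[0]


def pick_ip(cidr: str, preferred_ip: str, used: set) -> str:
    """Keep the preferred IP if it is free in the subnet, else take another free one."""
    network = ipaddress.ip_network(cidr)
    if ipaddress.ip_address(preferred_ip) in network and preferred_ip not in used:
        return preferred_ip
    # Assumption: fall back to the lowest free host address.
    for host in network.hosts():
        if str(host) not in used:
            return str(host)
    raise RuntimeError("no free address left in subnet")


# Example test network: the configured subnet name "dmz" does not exist here.
test_subnets = {"frontend": "10.0.1.0/24", "backend": "10.0.2.0/24"}
reserved = {"10.0.2.1", "10.0.2.2", "10.0.2.3"}

name = pick_subnet(test_subnets, "dmz")
print(name)                                               # backend
print(pick_ip(test_subnets[name], "10.0.1.15", reserved))  # 10.0.2.4
```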

When a test failover is triggered, the following occurs:

  1. Prerequisites: A prerequisites check runs to make sure that all conditions required for failover are met.
  2. Failover: The failover processes and prepares the data, so that an Azure VM can be created from it.
  3. Latest: If you have chosen the latest recovery point, a recovery point is created from the data that’s been sent to the service.
  4. Start: This step creates an Azure virtual machine using the data processed in the previous step.

Failover timing

In the following scenarios, failover requires an extra intermediate step that usually takes around 8 to 10 minutes to complete:

  • VMware VMs running a version of the Mobility service older than 9.8
  • Physical servers
  • VMware Linux VMs
  • Hyper-V VMs protected as physical servers
  • VMware VMs where the following drivers aren’t boot drivers:
    • storvsc
    • vmbus
    • storflt
    • intelide
    • atapi
  • VMware VMs that don’t have DHCP enabled, irrespective of whether they are using DHCP or static IP addresses.

In all the other cases, no intermediate step is required, and failover takes significantly less time.
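
To estimate ahead of time which machines will hit the extra 8 to 10 minute step, the conditions above can be encoded as a simple check. This is a sketch over a hand-built inventory, not something queried from Site Recovery; the `Machine` fields and the `kind` values are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

REQUIRED_BOOT_DRIVERS = {"storvsc", "vmbus", "storflt", "intelide", "atapi"}
MIN_MOBILITY_VERSION = (9, 8)


@dataclass
class Machine:
    kind: str                     # "vmware", "physical", or "hyperv_as_physical"
    os_family: str = "windows"    # "windows" or "linux"
    mobility_version: str = "9.8"
    boot_drivers: set = field(default_factory=lambda: set(REQUIRED_BOOT_DRIVERS))
    dhcp_enabled: bool = True


def parse_version(text: str) -> tuple:
    """Turn '9.10' into (9, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def needs_intermediate_step(m: Machine) -> bool:
    """True if the machine matches any scenario listed for the slower failover."""
    if m.kind in ("physical", "hyperv_as_physical"):
        return True
    if m.kind == "vmware":
        return (
            parse_version(m.mobility_version) < MIN_MOBILITY_VERSION
            or m.os_family == "linux"
            or not REQUIRED_BOOT_DRIVERS <= m.boot_drivers
            or not m.dhcp_enabled
        )
    return False


print(needs_intermediate_step(Machine(kind="vmware")))                          # False
print(needs_intermediate_step(Machine(kind="vmware", mobility_version="9.7")))  # True
```

Versions are compared as integer tuples so that, for example, 9.10 is correctly treated as newer than 9.8.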

Run a failover to Azure

Prerequisites

  1. Before you do a failover, do a test failover to ensure that everything is working as expected.
  2. Prepare the network at target location before you do a failover.

Run a failover

  • Select Recovery Plans > recoveryplan_name. Click Failover.
  • On the Failover screen, select a Recovery Point to fail over to. You can use one of the following options:
    1. Latest: This option starts the job by first processing all the data that has been sent to Site Recovery service. Processing the data creates a recovery point for each virtual machine. This recovery point is used by the virtual machine during failover. This option provides the lowest RPO (Recovery Point Objective) as the virtual machine created after failover has all the data that has been replicated to Site Recovery service when the failover was triggered.
    2. Latest processed: This option fails over all virtual machines of the recovery plan to the latest recovery point that has already been processed by Site Recovery service. When you are doing a test failover of a virtual machine, the time stamp of the latest processed recovery point is also shown. If you are failing over a recovery plan, you can go to an individual virtual machine and look at the Latest Recovery Points tile to get this information. As no time is spent processing the unprocessed data, this option provides a low RTO (Recovery Time Objective).
    3. Latest app-consistent: This option fails over all virtual machines of the recovery plan to the latest application-consistent recovery point that has already been processed by Site Recovery service. When you are doing a test failover of a virtual machine, the time stamp of the latest app-consistent recovery point is also shown. If you are failing over a recovery plan, you can go to an individual virtual machine and look at the Latest Recovery Points tile to get this information.
    4. Latest multi-VM processed: This option is only available for recovery plans that have at least one virtual machine with multi-VM consistency ON. Virtual machines that are part of a replication group fail over to the latest common multi-VM consistent recovery point. Other virtual machines fail over to their latest processed recovery point.
    5. Latest multi-VM app-consistent: This option is only available for recovery plans that have at least one virtual machine with multi-VM consistency ON. Virtual machines that are part of a replication group fail over to the latest common multi-VM application-consistent recovery point. Other virtual machines fail over to their latest application-consistent recovery point.
    6. Custom: If you are doing a test failover of a virtual machine, then you can use this option to fail over to a particular recovery point.
  • If some of the virtual machines in the recovery plan were failed over in a previous run and now the virtual machines are active on both source and target location, you can use the Change direction option to decide the direction in which the failover should happen.
  • If you’re failing over to Azure and data encryption is enabled for the cloud (applies only when you have protected Hyper-V virtual machines from a VMM Server), in Encryption Key select the certificate that was issued when you enabled data encryption during setup on the VMM server.
  • Select Shut-down machine before beginning failover if you want Site Recovery to attempt to do a shutdown of source virtual machines before triggering the failover. Failover continues even if shut-down fails.
  • You can follow the failover progress on the Jobs page. Even if errors occur during an unplanned failover, the recovery plan runs until it is complete.
  • After the failover, validate the virtual machine by logging in to it. If you want to switch to another recovery point of the virtual machine, then you can use the Change recovery point option.
  • Once you are satisfied with the failed over virtual machine, you can Commit the failover. Commit deletes all the recovery points available with the service, and the Change recovery point option is no longer available.
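
To make the difference between the single-VM recovery point options more concrete, here is a simplified model of how each one narrows down the candidates. It is a conceptual sketch only, not how Site Recovery is implemented or called: the `RecoveryPoint` class and option strings are invented for the example, and "latest" is approximated as "the newest point once pending data has been processed".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RecoveryPoint:
    created: datetime
    processed: bool               # already processed by the service?
    app_consistent: bool = False  # application-consistent snapshot?


def select_recovery_point(points, option, custom_time=None):
    """Pick a recovery point according to a simplified view of each option."""
    if option == "latest":
        # Pending data is processed first, so every point is a candidate (lowest RPO).
        candidates = list(points)
    elif option == "latest_processed":
        # No processing wait, hence the low RTO.
        candidates = [p for p in points if p.processed]
    elif option == "latest_app_consistent":
        candidates = [p for p in points if p.processed and p.app_consistent]
    elif option == "custom":
        candidates = [p for p in points if p.created == custom_time]
    else:
        raise ValueError(f"unknown option: {option}")
    if not candidates:
        raise LookupError("no recovery point matches the option")
    return max(candidates, key=lambda p: p.created)


points = [
    RecoveryPoint(datetime(2019, 5, 1, 10, 0), processed=True, app_consistent=True),
    RecoveryPoint(datetime(2019, 5, 1, 10, 5), processed=True),
    RecoveryPoint(datetime(2019, 5, 1, 10, 10), processed=False),
]
for option in ("latest", "latest_processed", "latest_app_consistent"):
    print(option, select_recovery_point(points, option).created.time())
```

With this sample data the three options land on 10:10, 10:05, and 10:00 respectively, which shows the trade-off: the fresher the data, the more processing has to happen before the VM can start.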

When a failover is triggered, it involves the following steps:

  1. Prerequisites check: This step ensures that all conditions required for failover are met.
  2. Failover: This step processes the data and makes it ready so that an Azure virtual machine can be created out of it. If you have chosen Latest recovery point, this step creates a recovery point from the data that has been sent to the service.
  3. Start: This step creates an Azure virtual machine using the data processed in the previous step.

Validation

During the Failover, my on-premises StoreFront server (LAB-CSF-02) was powered off; my primary was already powered off.

So if I try to connect, I should reach my Citrix ADC and then my StoreFront in Azure. Let’s test this.

Note: To be able to validate this, I have added my 2 Azure Storefront servers in my LB Service group on my ADC.
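
Alongside the manual test, a quick scripted reachability check can confirm that the failed-over servers answer on the expected port. The sketch below only tests that a TCP connection can be opened; it says nothing about whether StoreFront itself is healthy. The host name in the comment is a placeholder, and port 443 assumes the servers are published over HTTPS.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (placeholder host name, replace with your Azure StoreFront server):
# is_port_open("storefront-azure.example.local", 443)
```

The same helper works for the RDP check mentioned earlier by pointing it at port 3389 on the failed-over VM.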

In our next and last article we will review the Failback.

Stay tuned!