UPDATE – Have a look here for some of the new features that Windows Server 2012 R2 will bring to Hyper-V Replica (its even more amazing )
Hyper-v replica is one of the most highly anticipated features of Windows Server 2012. With it comes a whole new range of DR possibilities, something that would have not been possible or taken a large amount of money to achieve is now free and in the box!
The basic concept of Hyper-V replica is, as the name suggests, to replicate VMs from one site to another (or one live server to a backup server on the same site). Some of the possibilities that come to mind is the ability to replicate branch office VMs back to the main office location or from a main office up into the cloud to easily and very quickly recover in a DR situation.
How does replication work?
I have heard people when describing Hyper-v replica say ‘we can already do this with DFS’ – you can’t! DFS will only replicate a file when it has been closed and no longer in use (also Microsoft does not support using DFS to replicate VHDs /VHDXs for this purpose even if you turn the VM off).
Hyper-v replica is able to replicate the files even when they are in use in your production environment. Replication is achieved by an initial replica (IR) of your data being replicated from the primary server to the replica server (this can either be over the wire or a copy can be copied to physical media, taken to the backup server and then copied onto it. Once you have the initial copy in place, the primary server makes use of a change tracking module which keeps track of the write-operations that happen in the virtual machine.
Every 5 minutes (this is none configurable at present) a delta replication will take place. The log file being used is frozen, a new log file is created to continue tracking changes and the original log file is sent to the replica server (provided the last log file was acknowledged as being received). The changes can be seen by looking in the Hyper-V Replication Log (*.hrl) that is located in the same directory as it is associated to.
Types of delta replicas
There are a few options for the delta replicas. In the most simple case you will have selected ‘Store only the latest point for recovery’, in which case all of the replication log data is merged into the VHD file that was initially replicated to the Replica server.
The second possibility is that you have chosen to store multiple recovery points in which case when the log file is received every 5 minutes these are stored and every 1 hour / 12 log files (again this is none configurable) a snapshot is created to which the log files are written. The number of snap shots is determined by the number of recovery points you opted to keep when replication is configured. Once the limit is reached, a merge is initiated which merges the oldest snapshot to the base replica VHD.
The third possibility allows for an application consistent snapshot to be created. Application-consistent recovery points are created by using the Volume Shadow Copy Service in the virtual machine to create snapshots. The log file are sent every 5 minutes as with the two examples above but as the 12th log arrives the log files will create a snapshot (as above) and the snap shot will the app consistent (if you chose for an app consistent every 2 hours every other snapshot would be app consistent etc.)
If at any time on the Primary Server a new log cannot be created, changes continue to be tracked in the existing log and an error is registered. Replication will be suspended and a Warning is reflected in the Replication Health Report for the virtual machine.
Clustered Replica Servers
If your replica server is part of a cluster you may want to move the Replica VM from one node to another (or it may move automatically by use of VVM). To keep track of where the replica VM is the VMMS (Virtual Machine Manager Service) uses a new object called the Hyper-V Replica Broker Manager.
Hyper-V Replica Communications
The replica communications is achieved by the use of the ‘Hyper-V Replica Network transport layer’. This transport layer is responsible for authorizing access to a Replica server as well as authenticating the Primary and Replica servers. It also provides the ability to encrypt (if you are using a certificate), compress and throttle (with the use of QoS) data that is sent by the primary server.
The first connection to be made between the Primary Server and the Replica server is the ‘control channel’ the Hyper-V Replica Network Services checks to see if a control channel exists – if it does it will use it, if not it will create the connection and then transmits a control message which contains a list of files that will be sent from the Primary server to the Replica server (this is used if data transfer is cancelled mid-way through). Hyper-V Replica Network Services on the Replica server forwards the package to the Hyper-V Replica Replication Engine, which then sends a response back which contains information about which, if any, of the files already exist within a timeout interval of 120 seconds.
Once the control message has been acknowledged as being received by the Replica server data transfer can begin. This data transfer is done over a different connection to the control channel – called the ‘data channel’. The files to be transmitted will be either for an Initial Replication or for a Delta Replication. The Hyper-V Replica Network Service layer chunks the data into 2 Mb chunks and compresses it. Once the data chunks have been received by the Replica server they are decrypted and put back together before being saved to the save location specified for the replica virtual machine.
Hyper-V replica handle virtual machine migrations from one host to another and even storage migration during a data transfer. If migration of a virtual machine takes place while data transfer is in progress the Hyper-V Replica Network Service closes any open connections and will automatically re-establish connection with the Replica server once the migration is complete. The control message is used to do a comparison to see which files were missed due to the cancelled connection. The exact same procedure is used if a storage migration is carried out during a data transfer.
Configuring Hyper-V Replica
This is fairly simple – all you need is two servers capable of running the Hyper-V role. The replica site is completely hardware and storage agnostic.
Again there is not much to this – obviously Windows Server 2012 is required and also if you want to encrypt the data during transmission (defiantly recommended if you are replicating offsite to a DR center for example) you will need a certificate which can either be self-signed or provided by your PKI infrastructure.
There are two possibilities for the Replica server – either stand alone or a failover cluster.
To configure a standalone Replica server:
- Right click on the Hyper-V server on the Hyper-V Manager and ‘Hyper-V Settings’
- Click on ‘Enable this computer as a Replica server.’ You will need to do this on both the primary and Replica servers.
- Next you have a couple of options for authentication you can use Kerberos or an SSL certificate. To further enhance security you can select the servers you want to allow replication from. You can select any server or you can be more restrictive and specify servers by wildcard e.g. *.contoso.local or by actual server name MANHYP01.contoso.local.
To configure clustered Replica servers:
- Install and configure your failover cluster as you normally would and ensure you introduce enough nodes into the cluster to meet the demand.
- Once you have the cluster in place and configured as required right click on your cluster and go to ‘Configure Role’
High Availability Wizard
- You will need to specify a NETBIOS name for the broker service that you will use as the Client Access Point when configuring the VMs for replication. This will create a computer object in AD for you.
AD Computer Object
- Next, right click the Replica Broker you created and click on ‘Replication Settings’
- You will then see the configuration wizard as in the standalone configuration. You can select http or a certificate based authentication (this depends on if the remote cluster is part of the same domain or has a trust in place – if not you will need to use a certificate based approach) You can as before also select the servers that are allowed to replicate by server name or wildcard and you can select the security tag to use.
- Once the server is configured for replication you can then enable replication a per virtual machine basis. Instead of selecting the physical server to migrate to you need to select the Client Access Point e.g. in my case ‘HyperVReplica’.
Replication from this point on works in exactly the same way as described earlier with the log files being transmitted every 5 minutes. The newly created virtual machine on the Replica server will be made highly available.
Folder Structure for Hyper-V Replica
The standard folder structure you are used to with Hyper-V is created with the addition of a folder called ‘Hyper-V Replica’ with several subfolders as seen below (The Snapshots folder is only created if recovery history is enabled) with each of the virtual machines being identified by its GUID.
Networking with Hyper-V Replica
In a real world situation you would most likely be replicating your virtual machines off site to another office or to a partners DR facility over a WAN connection. The network addressing schema will obviously be different at this site and will cause problems for your users trying to access your servers. Microsoft has thought about this and has included the ability to configure different network settings at the Replica site.
To configure this you need to modify the virtual machine properties of each machine and each of the virtual adaptors connected to the machine. This is only available on synthetic network adaptors – you can’t set this for legacy adaptors. The only other pre-requisite for this to work is that your virtual machine must be running any of the following OS’s Windows Server 2012, Windows Server 2008 R2, Windows Server 2008, Windows Server 2003 SP2 (or higher), Windows 7, Vista SP2 (or higher), and Windows XP SP2 (or higher). The latest Windows Server 2012 Integration Services must be installed in the virtual machine.
Using Hyper-V Replica
Once you have all this in place and you are successfully replicating your VMs to another stand alone or cluster server you have a few ways to move over to your replicated VMs.
Planned Failover – A planned failover allows you to failover to your Replica VMs in a planned and controlled manor. This can be used if you have prior warning of an event that you know is going to cause potential problems to your primary datacenter such as a power outage or natural disaster etc.
In a planned failover reverse replication must be enabled (this is checked as a pre-requisite) so that when you failback your Primary VMs are up to date. The second pre-requisite to a planned failover is that the VMs must be shutdown prior to the failover taking place. Due to this a planned failover does require a small amount of downtime but no data will be lost.
Test Failover – A test failover is a good way to test your DR plan. When you initiate a test failover a new virtual machine is created on the replica server with the name <your VM name – Test> This VM is added to a different network (You can specify a test failover network on the VM properties) this is so it will not affect your live production environment. You can add a few test workstations to this test network and check everything works as required.
Test Failover Network
This type of failover does not require any downtime of your live production machines and so can safely be carried out during the working day.
The final failover is the un-unplanned failover – the one no-one wants!
Unplanned Failover – An un-planned failover is as the name suggests. This can happen if you have a hardware problem in your main datacenter or an environmental problem – failed generator during a power outage or failed air-conditioning unit (from experience!) and no redundancy. This allows you to bring up your replica VMs and get your users up and running very quickly. When you’re primary datacenter is up and running again you can simply replicate the VMs back and get everything back to how it was.
Although this is a great additional to a DR policy by no means is it a replacement to you back routine! You MUST continue to preform your backups as you are now.