Q: How many concurrent live migrations does each pair of Hyper-V nodes in a cluster support?
April 11, 2011
A: Each pair of nodes involved in a Hyper-V live migration (the source and the target) supports one live migration at a time. A 16-node Hyper-V cluster therefore supports up to eight concurrent live migrations, one for each of its eight source/target pairs.
There has been some discussion about why Hyper-V supports only one concurrent live migration per pair of nodes while other hypervisors support more. If you consider what happens during any kind of live migration, you'll understand why performing one migration at a time gives the best performance.
During a live migration, the entire memory of the virtual machine (VM) is copied to the target host while the VM is running. How long the copy takes depends on the size of the VM's memory and the speed of the network. Hyper-V saturates the network link, so the network is the bottleneck during a live migration.
While the memory is being copied, the VM is still running, so certain pages of memory change and have to be copied over again. There will be a lot less memory to copy on this second pass, but it still takes time, and while those changed pages are copied, other pages may change. These pages have to be copied on the next iteration. This repeats a few times until the amount of memory left to copy is so small that the VM can be paused. The remaining memory, CPU, and device states are then copied to the target, and the VM is started on the new host. Because the network is the bottleneck, the faster you can copy the memory, the fewer memory changes you'll get on the source, and the smaller your subsequent copy cycles will be.
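The shrinking copy cycles described above can be sketched with a few lines of Python. This is only an illustrative model, not how Hyper-V is implemented: the function name `precopy_passes`, the constant page-dirtying rate, the 16 MB pause threshold, and the pass limit are all assumptions made for the example.

```python
def precopy_passes(memory_mb, bandwidth_mb_s, dirty_mb_s,
                   pause_threshold_mb=16.0, max_passes=30):
    """Return the MB sent on each pass of a simplified iterative memory copy.

    Assumptions (not taken from Hyper-V itself): the VM dirties memory at a
    constant rate, and it is paused once the data left to copy drops to
    pause_threshold_mb or the pass limit is reached.
    """
    sent = []
    to_copy = float(memory_mb)
    while to_copy > pause_threshold_mb and len(sent) < max_passes:
        sent.append(to_copy)
        seconds = to_copy / bandwidth_mb_s
        # Memory changed while this pass ran must be sent on the next pass.
        to_copy = min(float(memory_mb), dirty_mb_s * seconds)
    # Last entry: what remains is sent after the VM is paused.
    sent.append(to_copy)
    return sent


if __name__ == "__main__":
    # Hypothetical VM: 4 GB of memory, 100 MB/s link, 20 MB/s of page changes.
    for number, megabytes in enumerate(precopy_passes(4096, 100, 20), start=1):
        print(f"pass {number}: {megabytes:.1f} MB")
```

With these made-up numbers, each pass sends one-fifth of the data the previous pass sent, because the link is five times faster than the rate at which memory changes. A faster link shrinks every pass and reaches the pause threshold in fewer passes.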
If you performed multiple live migrations concurrently, the network bandwidth would have to be shared among all the migrations. If you did two migrations simultaneously, the initial copy of each machine's memory would take twice as long. Because it took twice as long, the amount of memory that changed on the source could double, so you'd have to copy more memory on the next pass, which would take longer, and so on. You can see, therefore, that running multiple live migrations concurrently would actually take longer than queuing up the live migrations and running them sequentially. The amount of data to copy isn't static: the network link is already saturated, and the longer the copy takes, the more data changes and requires re-copying.
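To put rough numbers on that argument, here is a minimal Python sketch that compares queued and simultaneous migrations. It rests on several assumptions that are not Hyper-V specifics: identical VMs, a constant page-dirtying rate, an even split of the link among concurrent migrations, and a 16 MB pause threshold. The names `migration_seconds` and `compare` are hypothetical.

```python
def migration_seconds(memory_mb, bandwidth_mb_s, dirty_mb_s,
                      pause_threshold_mb=16.0, max_passes=30):
    """Estimate the total copy time for one VM under a simplified model."""
    to_copy = float(memory_mb)
    elapsed = 0.0
    passes = 0
    while to_copy > pause_threshold_mb and passes < max_passes:
        seconds = to_copy / bandwidth_mb_s
        elapsed += seconds
        # Memory changed during this pass has to be copied again.
        to_copy = min(float(memory_mb), dirty_mb_s * seconds)
        passes += 1
    # Final copy of whatever remains once the VM is paused.
    return elapsed + to_copy / bandwidth_mb_s


def compare(vm_count, memory_mb, link_mb_s, dirty_mb_s):
    """Return (sequential_seconds, concurrent_seconds) for identical VMs."""
    # Queued: each VM gets the whole link, one after another.
    sequential = vm_count * migration_seconds(memory_mb, link_mb_s, dirty_mb_s)
    # Simultaneous: assume the link is split evenly, so all VMs finish together.
    concurrent = migration_seconds(memory_mb, link_mb_s / vm_count, dirty_mb_s)
    return sequential, concurrent


if __name__ == "__main__":
    # Hypothetical: two 4 GB VMs, 100 MB/s link, 20 MB/s of page changes each.
    sequential, concurrent = compare(2, 4096, 100, 20)
    print(f"sequential: {sequential:.1f} s, concurrent: {concurrent:.1f} s")
```

For these illustrative figures, the model puts the two queued migrations at roughly 102 seconds in total and the two simultaneous ones at roughly 136 seconds. The gap comes entirely from the extra re-copying that the slower per-VM transfer causes.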