Hello everyone! Hope you’re all doing well.
My post today is about Hyper-V checkpoints gone bad. What do I mean by bad? Well, let me sketch the situation first, but I’ll keep it short.
A customer was running Hyper-V 2012 R2, upgraded to Hyper-V 2016, in combination with Hyper-V Replica to Azure Site Recovery (ASR). Unfortunately, the available bandwidth of the internet connection was not sufficient for the replication to go through smoothly. (The internet connection is in the process of being upgraded, so that will change shortly.)
However, combine limited bandwidth with too many Hyper-V replicas for that bandwidth and you will end up with replications being retried all the time, and even a re-synchronization running now and then. The re-synchronization is where the checkpoints come into the picture. This particular situation got so bad that even the re-syncs wouldn’t go through, and eventually some of the VMs ended up living on multiple checkpoints, without any trace of those checkpoints in Hyper-V Manager or Failover Cluster Manager. The only evidence was on the file system, where you could see multiple .avhdx files with names pointing to the original .vhdx files, and in the configuration of the VMs, where the attached disks were the .avhdx files.
Guide to the solution
After performing multiple searches on the all-knowing internet, I found some useful posts and articles on manually merging the .avhdx files into the original .vhdx files to get back to a normal situation. In this post I would like to summarize the most important things to keep in mind when you are manually merging checkpoints.
First of all: it should be clear that manual merge is not the first step you perform when dealing with this kind of problem!
Try this first!
The first step you should try, when you have a situation similar to mine (no trace of checkpoints in Hyper-V Manager, only on file level and in the VM config), is to create an additional checkpoint on the VM and then remove it. Hyper-V will try to clean up the entire checkpoint chain when you remove that last checkpoint.
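If you would rather script this step than click through the console, the Hyper-V PowerShell module has the `Checkpoint-VM` and `Remove-VMSnapshot` cmdlets. Here is a minimal Python sketch that only builds the two command lines so you can review them before running anything. The VM name, the checkpoint name and the helper names are my own placeholders, not something from the customer environment.

```python
from typing import List


def ps_quote(value: str) -> str:
    """Quote a value as a PowerShell single-quoted string literal."""
    return "'" + value.replace("'", "''") + "'"


def checkpoint_cleanup_commands(vm_name: str,
                                checkpoint_name: str = "cleanup-trigger") -> List[str]:
    """Build the create-then-remove checkpoint commands for one VM.

    The commands are returned as text only; nothing is executed here.
    """
    vm = ps_quote(vm_name)
    cp = ps_quote(checkpoint_name)
    return [
        # Create a throwaway checkpoint on the VM.
        f"Checkpoint-VM -Name {vm} -SnapshotName {cp}",
        # Remove it again, which should trigger the merge of the chain.
        f"Remove-VMSnapshot -VMName {vm} -Name {cp}",
    ]


if __name__ == "__main__":
    # "SRV01" is a hypothetical VM name.
    for line in checkpoint_cleanup_commands("SRV01"):
        print(line)
```

Run the printed lines in an elevated PowerShell session on the Hyper-V host, and give the merge some time to finish before you decide it did not work.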
Identify the chain
If this does not help, you’ll need to identify the checkpoint chain of your disks. On the right of your Hyper-V Manager you’ll have the action “Inspect Disk…”. Use this and browse to the .avhdx file you think is the last in the chain (start with the one specified in your VM config). Using the “Inspect Parent…” button, see if you can inspect your way up to the original .vhdx file.
If you are able to get to the .vhdx file without any hassle, you should be fine for the rest of the run. If not, you’re in a bit more trouble… (jump to the “Link to parent disk” part of this post).
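To make the idea of "inspecting your way up" a bit more concrete, here is a small Python sketch of the same walk. It assumes you have already noted, for every disk, which parent it reports (from “Inspect Disk…”, or from the `ParentPath` property that `Get-VHD` returns). The file names and the `walk_chain` helper are made up for illustration.

```python
from typing import Dict, List, Optional


def walk_chain(start: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Follow parent links from `start` up to the base disk.

    `parent_of` maps each disk to the parent it reports, or None for
    the base .vhdx. Returns the chain ordered from newest to oldest.
    Raises LookupError when a link points at a disk we have no record of.
    """
    chain = [start]
    current = start
    while True:
        if current not in parent_of:
            raise LookupError(f"chain is broken at {current}")
        parent = parent_of[current]
        if parent is None:
            # Reached the base disk: the chain is complete.
            return chain
        if parent in chain:
            raise ValueError(f"loop detected at {parent}")
        chain.append(parent)
        current = parent


if __name__ == "__main__":
    # Hypothetical file names; real .avhdx names usually contain a GUID.
    links = {
        "data_B.avhdx": "data_A.avhdx",
        "data_A.avhdx": "data.vhdx",
        "data.vhdx": None,
    }
    print(walk_chain("data_B.avhdx", links))
```

If the walk ends in a `LookupError`, that is the scripted equivalent of the error you get in the console when a parent cannot be found.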
Merge with parent disk
If you have the chain figured out, you can start merging the .avhdx files all the way up to the original .vhdx file. To do this, you’ll need the “Edit disk…” action from the Hyper-V Manager console.
Merge your way up the chain, starting with the most recent .avhdx file, and then update your VM configuration so that it points to the original .vhdx file.
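The order matters here, so I find it useful to write the plan down before touching anything. The Python sketch below turns a chain (newest disk first, base .vhdx last) into the list of merges to perform, and prints a matching `Merge-VHD` command for each step. `Merge-VHD` with `-Path` and `-DestinationPath` is the PowerShell counterpart of the merge option in “Edit disk…”; the file names and function names are placeholders of mine.

```python
from typing import List, Tuple


def ps_quote(value: str) -> str:
    """Quote a value as a PowerShell single-quoted string literal."""
    return "'" + value.replace("'", "''") + "'"


def merge_plan(chain: List[str]) -> List[Tuple[str, str]]:
    """Return (child, parent) pairs in the order they should be merged.

    `chain` is ordered newest first and must end with the base .vhdx.
    """
    if not chain or not chain[-1].lower().endswith(".vhdx"):
        raise ValueError("chain must end with the base .vhdx file")
    # Pair every disk with the one directly above it in the chain.
    return list(zip(chain, chain[1:]))


def merge_commands(chain: List[str]) -> List[str]:
    """Build one Merge-VHD command line per step (text only, not executed)."""
    return [
        f"Merge-VHD -Path {ps_quote(child)} -DestinationPath {ps_quote(parent)}"
        for child, parent in merge_plan(chain)
    ]


if __name__ == "__main__":
    # Hypothetical chain: B is the newest checkpoint disk.
    for line in merge_commands(["data_B.avhdx", "data_A.avhdx", "data.vhdx"]):
        print(line)
```

The plan always merges the newest disk into its direct parent first, which is the order that keeps the rest of the chain intact.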
Link to parent disk
When your chain is not complete or broken, due to original files being moved or renamed or merging in the wrong order, you will be presented with an error when you try to Inspect the disk.
Using the “Edit disk…” action, you can find out where the chain is broken. If you select an .avhdx file that has no link to its parent, you will only have the option to reconnect it.
Once you recreate the chain, you can start merging the disks again.
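As a rough illustration of finding and repairing a broken link, the Python sketch below compares the parent each disk has recorded with the files that actually exist, and builds a `Set-VHD -ParentPath` command for the reconnect (the PowerShell counterpart of the reconnect option). Which parent is the right one remains your call; the script cannot know that. All file names and helper names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def ps_quote(value: str) -> str:
    """Quote a value as a PowerShell single-quoted string literal."""
    return "'" + value.replace("'", "''") + "'"


def find_broken_links(recorded_parent: Dict[str, Optional[str]],
                      existing_files: Set[str]) -> List[str]:
    """Return the disks whose recorded parent is not among the existing files."""
    return sorted(
        child
        for child, parent in recorded_parent.items()
        if parent is not None and parent not in existing_files
    )


def reconnect_command(child: str, new_parent: str) -> str:
    """Build the Set-VHD command that points `child` at `new_parent` (text only)."""
    return f"Set-VHD -Path {ps_quote(child)} -ParentPath {ps_quote(new_parent)}"


if __name__ == "__main__":
    # Hypothetical case: the base disk was renamed from data.vhdx to data_old.vhdx.
    recorded = {"data_B.avhdx": "data_A.avhdx", "data_A.avhdx": "data.vhdx"}
    on_disk = {"data_B.avhdx", "data_A.avhdx", "data_old.vhdx"}
    for child in find_broken_links(recorded, on_disk):
        print(reconnect_command(child, "data_old.vhdx"))
```

Only reconnect a disk to a parent you are sure about, and inspect the chain again afterwards before you start merging.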
The conclusion is simple: monitor your environment, keep track of things like checkpoints, replications, backups… And make sure this does not happen.
If it does happen, however, I hope this post will help you solve your problem.
Have a nice day! ‘Till next time.