
VMware VMDK Locked Error

Submitted on May 21, 2016 – 10:27 pm

So, being a Unix admin on-call for the week, I just spent half of my Saturday fixing a dead Windows VM. Very annoying. As best I can tell, the issue occurred during a snapshot operation. Either the VM was powered down or powered back up by some genius while the snapshot process was still in progress. After that, the VM could not be started, or cloned, or snapped, or migrated to another cluster.

Any attempt to do so would result in an “invalid option” or “VMDK locked” error. The only thing I could do was to migrate the VM from one ESX host to another within the same cluster in a fruitless attempt to find the host that was holding the VMDK lock. No luck there.

Checking all the hosts in the ESX cluster via SSH showed no “*.lck” files. Trying to migrate the VM to another cluster failed with the same error: some VMDK was apparently locked. Running “lsof” or “ps” on any of the cluster nodes showed no active processes holding the VMDK. VMware KB articles, with their tiny gray-on-white fonts and full-screen width, proved unhelpful. There were a bunch of <vm_name>-00000[0-9] files in the VM directory on the datastore, and they were all “locked”.
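For what it’s worth, vmkfstools can also dump a file’s lock metadata directly, which on a cooperative day points you at the locking host. A sketch, with placeholder datastore and VM names; as far as I know, the “owner” field in the output ends in the MAC address of the host holding the lock (all zeros means no lock):

```shell
# Placeholders -- substitute your own datastore and VM names
datastore=my_awesome_datastore
vm_name=my_awesome_vm

# Dump lock metadata for the flat VMDK; the "owner" field should end in
# the MAC address of the ESXi host holding the lock
vmkfstools -D "/vmfs/volumes/${datastore}/${vm_name}/${vm_name}-flat.vmdk"

# The delta disks left behind by the broken snapshot can be checked too
for f in /vmfs/volumes/${datastore}/${vm_name}/${vm_name}-00000*.vmdk; do
    vmkfstools -D "$f"
done
```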

The solution required two things: figuring out which ESX host in the cluster the VM was registered on, and root SSH access to that host. You would need to enable root SSH (from the stupid vSphere GUI) and then, if you’re feeling particularly efficient today, add your key to “/etc/ssh/keys-root/authorized_keys”.
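Assuming root SSH is already enabled, pushing your public key over looks something like this. A sketch; the host name is a placeholder, and note that ESXi keeps root’s authorized keys under /etc/ssh/keys-root rather than the usual ~/.ssh:

```shell
# Placeholder host name -- substitute your own ESXi host
esx_host=esx01.example.com

# Append your public key to root's authorized_keys on the ESXi host
# (BatchMode avoids hanging on a password prompt if the key copy fails)
ssh -o BatchMode=yes -o ConnectTimeout=5 root@"${esx_host}" \
    'cat >> /etc/ssh/keys-root/authorized_keys' < ~/.ssh/id_rsa.pub
```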

To find out which host holds your VM hostage, just SSH to each host and run:

vim-cmd vmsvc/getallvms | grep <vm_name>
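With a list of cluster hosts in hand, the hunt can be scripted instead of done one SSH session at a time. A sketch; the host names and VM name are placeholders:

```shell
# Placeholder list of ESXi hosts in the cluster, and the VM to find
hosts="esx01 esx02 esx03 esx04"
vm_name=my_awesome_vm

# Ask each host for its registered VMs; the one that answers
# is the host holding your VM hostage
for h in ${hosts}; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 root@"${h}" \
        "vim-cmd vmsvc/getallvms" 2>/dev/null | grep -q "${vm_name}"; then
        echo "VM ${vm_name} is registered on ${h}"
    fi
done
```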

The next step is to clone the main VMDK:
datastore=my_awesome_datastore
vm_name=my_awesome_vm
mkdir /vmfs/volumes/${datastore}/${vm_name}_clone
vmkfstools -i /vmfs/volumes/${datastore}/${vm_name}/${vm_name}.vmdk \
/vmfs/volumes/${datastore}/${vm_name}_clone/${vm_name}_clone.vmdk
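One aside: “vmkfstools -i” clones in the source disk’s provisioning format by default. If the datastore is short on space, you can ask for a thin-provisioned copy instead. A sketch with the same placeholder names; “-d thin” is a standard vmkfstools option, but check the syntax on your ESXi version:

```shell
# Same placeholders as above
datastore=my_awesome_datastore
vm_name=my_awesome_vm

# Same clone operation, but thin-provisioned so the copy only
# consumes the space actually written inside the guest
vmkfstools -i "/vmfs/volumes/${datastore}/${vm_name}/${vm_name}.vmdk" \
    "/vmfs/volumes/${datastore}/${vm_name}_clone/${vm_name}_clone.vmdk" \
    -d thin
```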

Then, the epic fail: I had to use the vSphere Client GUI. Browse the datastore, drill down to “/vmfs/volumes/${datastore}/${vm_name}_clone”, right-click the folder and choose “Add to Inventory”. Also in the GUI, edit the VM settings, remove the old “hard drive” and add a new one pointing at “/vmfs/volumes/${datastore}/${vm_name}_clone/${vm_name}_clone.vmdk”.

Now, remove the original VM from the inventory and power on the clone. The stupid thing came right back up. You can move the old datastore VM folder to “something_fubar”. The last step is to rename the VM to its original name. Virtualization is for managers, GUIs are for amateurs, and Windows is for… never mind. Ugh…
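If the GUI is truly misbehaving, the inventory shuffle at the end can also be done from the same SSH session with vim-cmd. A sketch; the Vmid values are placeholders you would read from the “vim-cmd vmsvc/getallvms” output, and the folder names match the earlier placeholders:

```shell
# Vmid values are examples -- read the real ones from:
#   vim-cmd vmsvc/getallvms
old_vmid=42
new_vmid=43

# Remove the broken original from inventory (does not delete its files)
vim-cmd vmsvc/unregister ${old_vmid}

# Power on the clone
vim-cmd vmsvc/power.on ${new_vmid}

# Park the old datastore folder out of the way
datastore=my_awesome_datastore
vm_name=my_awesome_vm
mv "/vmfs/volumes/${datastore}/${vm_name}" \
   "/vmfs/volumes/${datastore}/${vm_name}_fubar"
```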
