VMware Solutions Discussions

VMDKs disappeared from vVol VMs during backups

Blissitt

We had an unusual event last week that I wanted to share.  I had seven VMs on a NetApp NFS-based vVol.  Our Cohesity backups kicked off at 3:01 pm, and by 3:04 pm the VMDK files for six of those VMs had gone missing (the .vmdk descriptor files were still there, but the .vmdk files containing the actual data were somehow inaccessible).  All software versions were supported per NetApp's IMT: VMware versions were 7.0U1, VSC was 9.7.1P1, ONTAP was 9.7P6, and the VAAI plugin was 2.0.  The one remaining VM on the same vVol was undamaged, and two other VMs on a separate vVol also remained intact.  We've never had trouble with Cohesity backing up hundreds of other VMs on "regular NFS" datastores, and this was the first time we'd seen trouble on a vVol datastore.

We were able to restore the VMs from backups taken the previous day with little or no data loss.  I currently have ticket 2008782637 open with NetApp on how to delete the suspect vVol, because vCenter won't allow it.  (vCenter thinks that VMs still reside on the vVol and it's probably right: even though the old, broken VMs have been deleted, their VMDKs likely exist *somewhere* in the vVol.)  The only recent change I made was that I moved one VM to the impacted vVol and two VMs off of it in the hours before the backup.  (vVols had been working so well for us that I wanted to put another production VM on vVols and remove two non-production ones.)

I'll continue to use vVols, but only for test VMs for the near future.


3 REPLIES

ChanceBingen

That's definitely a strange occurrence, and one I've never seen before.

From the VASA Provider Web UI you can do things like find missing vVols and delete them, so the support team should be able to help you with that.

I'd recommend collecting a support bundle from the VSC settings menu in vCenter, or even from the administrative web page, before any logs roll off, so you can get a root-cause analysis (RCA) of what happened.

Within the VASA Provider, the CXF log captures every VASA API call that is sent from vCenter, so an RCA should be possible if you can nail down exactly what time window we are looking for.
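To make the "nail down the time" step concrete, here is a minimal Python sketch that pulls only the lines inside a time window out of a log that has already been extracted from the support bundle. It is not based on the actual CXF log layout: the timestamp format (`YYYY-MM-DD HH:MM:SS` at the start of each line), the function name `lines_in_window`, and the sample lines are all assumptions for illustration, so adjust `TS_FORMAT` and `TS_LENGTH` to whatever the real log uses.

```python
from datetime import datetime

# ASSUMPTION: each log entry starts with a timestamp in this layout.
# The real CXF log may differ; change both constants to match it.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"
TS_LENGTH = 19  # number of characters in a timestamp of that layout


def lines_in_window(lines, start, end, ts_format=TS_FORMAT, ts_length=TS_LENGTH):
    """Return the lines whose leading timestamp falls within [start, end].

    Lines without a parseable timestamp (for example, continuation lines of
    a multi-line entry) are kept only if the previous timestamped line was
    inside the window.
    """
    selected = []
    inside = False
    for line in lines:
        try:
            stamp = datetime.strptime(line[:ts_length], ts_format)
        except ValueError:
            # Continuation line: inherit the state of the previous entry.
            if inside:
                selected.append(line)
            continue
        inside = start <= stamp <= end
        if inside:
            selected.append(line)
    return selected


if __name__ == "__main__":
    # Hypothetical sample lines, not real log output.
    sample = [
        "2021-01-01 15:00:59 call A",
        "2021-01-01 15:01:30 call B",
        "  continuation of B",
        "2021-01-01 15:04:01 call C",
    ]
    window = lines_in_window(
        sample,
        datetime(2021, 1, 1, 15, 1, 0),
        datetime(2021, 1, 1, 15, 4, 0),
    )
    for entry in window:
        print(entry)
```

For the incident in this thread, the window would be the three minutes between the backup start (3:01 pm) and the point where the VMDKs went missing (3:04 pm), on the actual date of the event.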

Blissitt

Thanks, Bingen.  I just got off the phone with Adam at NetApp support.  He and I verified that we can't delete the vVol with the usual commands provided by the VSC plugin and that we'll have to use some combination of ONTAP System Manager, vCenter, and the Web based CLI interface [sic.] at https://ww.xx.yy.zz:9083/ .  He's going to test this out and send me a list of tasks to do.  I supplied all requested logs with my original ticket (2008782637) in case anyone at NetApp wants to look at them.  My main focus now is just removing the broken vVol.

Most interesting is that there is a new 9.8 version of the VSC (with a totally different name) that's been out since *February*.  I didn't know about it because the product's name had been changed and there was no mention of this in the old download location.  For anyone who wants to upgrade, the product is now called "ONTAP tools for VMware vSphere," so go to the NetApp Downloads section and you'll find it under the letter "O."

Given that I was running the latest/last version of the "old" software, 9.7.1P1, and would never have known there was anything newer, I would recommend that NetApp add some kind of link from the old location to the new, renamed software.  (Especially since this last version of the old software was what I was running when my vVol broke, and others may think 9.7.1P1 is the latest and greatest...)

Blissitt

I just noticed that I forgot to come back here and share my solution for removing the vVol from VMware.  To recap, the NetApp plugin in vCenter wouldn't allow me to delete the vVol, and the steps shared by NetApp support to remove it manually didn't work.

Basically, I ended up with an orphaned vVol in vCenter with no storage behind it.  I also had a few orphaned volumes left behind on the NetApp with no associated vVols in vCenter (deleting a vVol successfully in vCenter does not delete the back-end NetApp volume).  I was able to use the Expand Storage of vVols Datastore option from the NetApp plugin to attach the orphaned volumes to the orphaned vVol, and after that a Delete Datastore from the NetApp plugin succeeded.

At some point during the above steps, I upgraded to the 9.8 version of the now-renamed VSC, hoping to get away from the bug I experienced in 9.7.1P1.  I see that NetApp has released 9.7.1P2 since I posted my previous comment, and hopefully that update fixes this same problem in the 9.7.x branch.
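For anyone trying to work out which orphans they have on each side before attempting a similar cleanup, the comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two inventories are filled in by hand (or from whatever export you trust), the function name `find_orphans` and all datastore and volume names are hypothetical, and nothing here calls vCenter, the NetApp plugin, or ONTAP.

```python
def find_orphans(datastore_backing, ontap_volumes):
    """Compare what vCenter believes with what exists on the storage side.

    datastore_backing: dict mapping each vVol datastore name to the set of
        back-end volume names it is supposed to be built on.
    ontap_volumes: set of volume names that actually exist on the array.

    Returns a tuple of two sorted lists:
      - datastores with no existing volume behind them
      - existing volumes not referenced by any datastore
    """
    referenced = set().union(*datastore_backing.values())
    datastores_without_storage = sorted(
        name
        for name, volumes in datastore_backing.items()
        if not (volumes & ontap_volumes)
    )
    volumes_without_datastore = sorted(ontap_volumes - referenced)
    return datastores_without_storage, volumes_without_datastore


if __name__ == "__main__":
    # Hypothetical inventories, typed in by hand for illustration.
    datastore_backing = {
        "vvol_ds_broken": set(),        # vCenter object with nothing behind it
        "vvol_ds_ok": {"vol_ok"},
    }
    ontap_volumes = {"vol_ok", "vol_left_1", "vol_left_2"}

    empty_datastores, stray_volumes = find_orphans(datastore_backing, ontap_volumes)
    print("Datastores with no storage:", empty_datastores)
    print("Volumes with no datastore:", stray_volumes)
```

In the situation described above, the first list would contain the broken vVol and the second the leftover NetApp volumes, which are the two sets that were joined with Expand Storage before the delete went through.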
