VMware Solutions Discussions


RDM LUNs not fully removed from VM

I have an issue where RDMs disconnected from the Guest OS (in this case, Windows 2008 R2 64-bit) using SnapDrive 6.3.1 are sometimes not fully removed from the virtual machine. The disk is not visible in the OS or in SnapDrive, so it appears to have been removed, but Edit Settings on the VM still shows the RDM as connected, and the LUN still exists and appears mapped on the filer. I've also noticed a number of 'dead' paths when viewing the storage paths on the host, which I assume are remnants of old RDM connections.

This happens on both ESXi4 and ESXi5 hosts.

This can cause problems if the LUN is then removed from the filer: it appears to no longer be connected to the Guest OS, while the VM still thinks it is connected.

Also posted at VMware community.

http://communities.vmware.com/thread/340362

17 Replies

Re: RDM LUNs not fully removed from VM

Hi,

Are you using vCenter, or ESX directly, with SnapDrive?

Valéry.


Re: RDM LUNs not fully removed from VM

Hi,

We use vCenter, and the authentication details within SnapDrive are configured correctly, as are the Transport Protocol Settings.

Thanks,

Graeme


Re: RDM LUNs not fully removed from VM

If you replace vCenter with ESX in the SnapDrive configuration, can you confirm whether things go better? That would let me see if I have had the same problem as you.


Re: RDM LUNs not fully removed from VM

Restarting the vCenter service may also do it.


Re: RDM LUNs not fully removed from VM

I will try authenticating with the ESX host rather than vCenter.

Restarting the vCenter service isn't really a solution, as this happens quite regularly.

Thanks


Re: RDM LUNs not fully removed from VM

Hi,

OK, tell me if that temporarily resolves the problem. I will take a look in the internal bug database to see if there is an explanation and a fix.



Re: RDM LUNs not fully removed from VM

In vCenter, have you tried rescanning the datastores?


Re: RDM LUNs not fully removed from VM

Hi Peter,

Rescanning the datastores will remove the dead paths; however, we have an automated process which maps/unmaps LUNs via SnapDrive every hour to update some databases. Whenever a LUN is removed, there's a chance of a dead path remaining. Eventually these accumulate and seem to cause performance issues.

Of course I could manually rescan the datastores every week or so but a root cause would be nice!
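For context, an hourly map/unmap cycle like the one described would typically use the SnapDrive for Windows CLI (sdcli). This is only an illustrative sketch: the filer path and drive letter are hypothetical placeholders, and exact flags vary between SnapDrive 6.x versions, so check `sdcli disk help` on your build:

```sh
# Connect a LUN as a dedicated disk (filer/volume/drive are placeholders)
sdcli disk connect -p filer1:/vol/db_vol/lun1 -d R -dtype dedicated

# ... refresh the databases from the mounted copy ...

# Disconnect the LUN again -- this is the step after which
# dead paths were observed to linger on the ESX host
sdcli disk disconnect -p filer1:/vol/db_vol/lun1
```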

Thanks,

Graeme


Re: RDM LUNs not fully removed from VM

This exact thing is also happening to us. The rescan drops the dead LUNs, but manually scanning isn't a good enough solution. We have also had hosts disconnect because of SnapDrive mounting the LUNs to the ESX host's local datastore when using SnapDrive 6.3.1. I've opened many tickets with NetApp and VMware but still have no concrete solution.

At first it seemed to be the version of SnapDrive we were using; things cleared up, and then it happened again out of the blue. I have now been able to recreate the host disconnection issue: it happens during a SnapManager for SQL backup. As soon as VMware scans the HBAs, the host disconnects. I haven't tried pointing SnapDrive at the host directly, because of HA; you wouldn't want to do that if the machine running SnapDrive migrated off that host, right?


Re: RDM LUNs not fully removed from VM

I'd be interested to know how you were able to recreate the issue, so I can open support cases with NetApp/VMware.

At the moment I'm thinking of scheduling a PowerShell script to rescan the HBAs every month or so to try to work around this problem.
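A scheduled rescan along those lines could be sketched with VMware's PowerCLI snap-in; the vCenter server name below is a placeholder, and you'd run this from a scheduled task with stored credentials:

```powershell
# Connect to vCenter (server name is a placeholder)
Connect-VIServer -Server vcenter.example.com

# Rescan all HBAs and VMFS volumes on every host to clear out dead paths
Get-VMHost | Get-VMHostStorage -RescanAllHba -RescanVmfs | Out-Null

Disconnect-VIServer -Confirm:$false
```

This is only a stopgap, of course; it clears the accumulated dead paths without addressing why SnapDrive leaves them behind.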


Re: RDM LUNs not fully removed from VM

After looking in our bug database, this bug is known as an APD (All Paths Down) condition.

The APD issue is discussed in VMware KB articles 1016626 and 1015084, and in others VMware has related to this problem...

...and in our bug database: 346071, 515927, and many more, some public and some not.

But please open a case, so that your customers are recorded as impacted by this problem, get associated with it, and you can confirm that it will be corrected.


Re: RDM LUNs not fully removed from VM

Interesting that http://kb.vmware.com/kb/1016626 suggests this was resolved in ESXi 4.1 Update 1 as we're seeing this on ESXi 5!

Not sure if it's related, but I have noticed that although the LUN is disconnected, the VMDK pointer file sometimes isn't removed from the datastore. All mounts/dismounts are being performed via the SnapDrive CLI, which I thought removed RDMs fully...?

I'll try raising a support case and see if they can help, but it's very hard to simulate!


Re: RDM LUNs not fully removed from VM

Maybe it still affects ESX 5? And the fact that the VMDK pointer file is still there afterwards is something that needs investigating. Maybe an interruption in the SnapDrive/vCenter execution of the RDM LUN removal?

This is documented in the SDW 6.3.1R1 release notes under Known issues:

http://now.netapp.com/NOW/knowledge/docs/snapdrive/relsnap631r1/pdfs/rnote.pdf

Title: Removing a LUN from an ESX host causes multipath software to report that all the paths to the LUN are down

Issue: When you remove a LUN from an ESX 4.0 or 4.1 host, the multipath software reports that all the paths to the LUN are down.

Corrective action: For hosts running ESX 4.0 Update 3 or ESX 4.1 Update 1, before removing the LUN, perform the following steps:

1. Enable VMFS3.FailVolumeOpenIfAPD.

2. Remove the LUN.

3. Perform a complete rescan.

4. Disable VMFS3.FailVolumeOpenIfAPD.
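On the ESX 4.x service console, the four steps above might look like the following. The adapter name vmhba1 is a placeholder for your actual HBA, and the advanced-option path should be confirmed against your build before use:

```sh
# 1. Enable the advanced option
esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD

# 2. Remove the LUN (via SnapDrive / the filer, outside this shell)

# 3. Perform a complete rescan of the HBA (vmhba1 is a placeholder)
esxcfg-rescan vmhba1

# 4. Disable the option again
esxcfg-advcfg -s 0 /VMFS3/FailVolumeOpenIfAPD
```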


Re: RDM LUNs not fully removed from VM

You should give SnapDrive for Windows (SDW) 6.4 a try:

https://now.netapp.com/NOW/download/software/snapdrive_win/6.4/

See the release notes:

https://now.netapp.com/NOW/knowledge/docs/snapdrive/relsnap64/pdfs/rnote.pdf

bug 515927 - Removing a LUN from an ESX host causes multipath software to report that all the paths to the LUN are down.


Re: RDM LUNs not fully removed from VM

Looks like this version of SnapDrive will solve the issue, thanks! I'll get this tested and rolled out soon.
