Microsoft Virtualization Discussions

set-nahostdisk fails for FC LUN

drdabbles

I'm trying to replace the functionality of SDCLI in a set of PowerShell scripts I've written but I'm running into a problem when attaching the LUN to the IGroups for the cluster nodes.

My script looks like this...

$sanname = "netapp-01.example.com"
$sancontroller = connect-nacontroller $sanname
$snapshot = get-nasnapshot -name VOLUME | where { $_.Name -like "sqlsnap*" } | sort-object AccessTimeDT -descending | Select -First 1
$clone = new-navolclone -parentvol VOLUME -clonevol VOLUME_CLONE -SpaceReserve none -parentsnapshot $snapshot.Name
$lun = get-nalun -Path ("/vol/" + $clone.Name + "/*")
$lun = Set-NaLunSignature -Path $lun.Path -GenerateRandom -Confirm:$false
$lun = $lun | set-nalun -Online -Force
$lun = $lun | add-nalunmap -InitiatorGroup "igroup1"
$lun = $lun | add-nalunmap -InitiatorGroup "igroup2"
Start-NaHostDiskRescan
$hostdisk = Wait-NaHostDisk -Timeout 10000 -ControllerLunPath $lun.Path -ControllerName $sancontroller.Name.Split(".")[0]
$hostdisk = set-nahostdisk -DiskIndex $hostdisk.Disk -Online -Force
$hostvolume = $hostdisk | get-nahostvolume
$hostvolume = $hostvolume | Mount-NaHostVolume -MountPoint "Q:\"

The set-nahostdisk line (highlighted in bold red in my original post) is where I'm hitting the problem. I get the following error:

Set-NaHostDisk : The media is write protected. (Exception from HRESULT: 0x80070013)

I get the same error if the LUN is mapped to only a single system, so it doesn't seem to be related to the fact that it's a cluster or two hosts in general. The only other thing I can imagine it being is Fibre Channel instead of iSCSI. Does this sound reasonable? The host clearly sees the attached LUN, because Wait-NaHostDisk returns the object containing the correct LUN's info.
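For reference, the HRESULT in the error above can be unpacked with plain PowerShell arithmetic. This is only a minimal sketch and not part of the original script: the variable names are illustrative, the -shr operator assumes PowerShell 3.0 or later, and it decodes the number without saying why the disk is reported as write protected.

```powershell
# HRESULT copied from the Set-NaHostDisk error message.
# PowerShell reads this hex literal as a negative Int32, which is fine for bit masks.
$hresult = 0x80070013

# Bits 16-28 hold the facility; the low 16 bits hold the error code.
$facility = ($hresult -shr 16) -band 0x1FFF
$code     = $hresult -band 0xFFFF

"Facility: $facility"   # 7 is FACILITY_WIN32
"Code:     $code"       # 19 is ERROR_WRITE_PROTECT

# On Windows, this asks the OS for the message text of that Win32 error code.
(New-Object System.ComponentModel.Win32Exception $code).Message
```

Facility 7 means the low word is an ordinary Win32 error code, so the exception is wrapping Win32 error 19 rather than anything specific to the toolkit.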

Does anybody have experience bringing a LUN online on a server this way? I'd really love to hear from anyone that does. SDCLI simply isn't cutting it anymore, so I'm willing to do the extra work in PowerShell to add the resources to the Windows cluster if it means I don't have to call SDCLI to attach the LUN to my servers.


8 REPLIES

sizemore

Can you online and mount the volume manually from Disk Management once the lun has been presented?

timothyn

That looks like you are on the right track.  And it's a great use of the new disk management cmdlets. 

I suspect the cluster service is taking control of the disk, since it's mapped to multiple nodes, and leaving it in a reserved state.  Note that manually onlining a disk isn't normally necessary (cluster or otherwise) unless it was specifically offlined.

I assume you intend to bring this online as a cluster disk?  In that case I would just change your script starting right after Wait-NaHostDisk:

Import-Module failoverclusters

$hostdisk = Wait-NaHostDisk -Timeout 10000 -ControllerLunPath $lun.Path -ControllerName $sancontroller.Name.Split(".")[0]

# Move the Available Storage group to the local node so we can operate on the cluster disk
Move-ClusterGroup "Available Storage" $env:COMPUTERNAME

# Create the new cluster resource
$res = Add-ClusterResource -Name "NewClusterDisk" -Group "Available Storage" -ResourceType "Physical Disk"

# Set the disk identifier; if you are using GPT you need to use "DiskIdGuid" $hostdisk.HostGptGuid.ToString("b")
$res | Set-ClusterParameter "DiskSignature" $hostdisk.HostMbrSignature
$res | Start-ClusterResource

# Remount the volume with a new drive letter
$hostdisk | Get-NaHostVolume | Dismount-NaHostVolume -Confirm:$false | Mount-NaHostVolume -MountPoint "Q:"

Let me know if that works for you.  Cheers!

Eric

drdabbles

This is exactly what I needed, except for one small problem. When I try to start the cluster resource ($res | Start-ClusterResource), it fails. I get an error in the event log:

Cluster physical disk resource 'TestClusterDisk' cannot be brought online because the associated disk could not be found. The expected signature of the disk was '{00000000-0000-0000-0000-000000000000}'. If the disk was replaced or restored, in the Failover Cluster Manager snap-in, you can use the Repair function (in the properties sheet for the disk) to repair the new or restored disk. If the disk will not be replaced, delete the associated disk resource.

When I look at the disk's HostVolume parameter, I get "\\?\Volume{00000000-0000-0000-0000-000000000000}\". Obviously this is where the signature error is coming from, but the HostGptGuid returns "779a06f2-f203-46d5-bf90-d784c5ee3ea4". The same information appears on both cluster nodes, so it doesn't appear to be an ownership problem. And the disk appears as "reserved" on both cluster nodes' Disk Management utilities.
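The two identifiers quoted above are different kinds of GUID: one is embedded in the volume path, the other is the disk's GPT identifier. As a rough sketch (the variable names are made up for illustration, and the values are simply the ones quoted in this post), the GUID can be pulled out of the volume path and compared with the disk GUID like this:

```powershell
# Values copied from the post above; on a live system they would come from
# the HostVolume and HostGptGuid properties of the Get-NaHostDisk output.
$hostVolume  = '\\?\Volume{00000000-0000-0000-0000-000000000000}\'
$hostGptGuid = [guid]'779a06f2-f203-46d5-bf90-d784c5ee3ea4'

# Extract the 36-character GUID between the braces of the volume path.
$volumeGuid = $null
if ($hostVolume -match '\{([0-9a-fA-F-]{36})\}') {
    $volumeGuid = [guid]$Matches[1]
}

# Compare the two identifiers; they are not expected to match.
$sameGuid = ($volumeGuid -eq $hostGptGuid)
"Volume GUID : $volumeGuid"
"Disk GUID   : $hostGptGuid"
"Same value  : $sameGuid"
```

This only shows that the two values differ; it does not establish which of them the cluster service is actually looking for.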

sizemore

If it's a GPT disk you need to use the GPT GUID... Try:

Import-Module failoverclusters

$hostdisk = Wait-NaHostDisk -Timeout 10000 -ControllerLunPath $lun.Path -ControllerName $sancontroller.Name.Split(".")[0]

Move-ClusterGroup "Available Storage" $env:COMPUTERNAME

$res = Add-ClusterResource -Name "NewClusterDisk" -Group "Available Storage" `
    -ResourceType "Physical Disk"

$res | Set-ClusterParameter "DiskSignature" $hostdisk.HostGptGuid
$res | Start-ClusterResource

$hostdisk | Get-NaHostVolume | Dismount-NaHostVolume -Confirm:$false |
    Mount-NaHostVolume -MountPoint "Q:"

drdabbles

Same problem. Also, I assume you meant DiskIdGuid and not DiskSignature. Signature will not accept a GUID as a valid input, which stands to reason if it's expecting a MBR signature.
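To keep the MBR and GPT cases apart, the choice of cluster parameter could be wrapped in a small helper. The sketch below is hypothetical: Get-ClusterDiskIdParameter is not a cmdlet from either toolkit, and it assumes that HostGptGuid is empty or all zeros for an MBR disk, which this thread does not confirm. The braced GUID formatting follows the ToString("b") hint given in the earlier reply.

```powershell
# Hypothetical helper: decide which cluster parameter identifies a host disk.
# Assumption: HostGptGuid is $null, empty, or all zeros when the disk is MBR.
function Get-ClusterDiskIdParameter {
    param(
        [Parameter(Mandatory = $true)]
        $HostDisk
    )

    $gpt = $HostDisk.HostGptGuid
    if ($gpt -and ([guid]$gpt -ne [guid]::Empty)) {
        # GPT disk: DiskIdGuid takes the GUID as a string in braces.
        return [pscustomobject]@{
            Name  = 'DiskIdGuid'
            Value = ([guid]$gpt).ToString('b')
        }
    }

    # MBR disk: DiskSignature takes the MBR signature.
    return [pscustomobject]@{
        Name  = 'DiskSignature'
        Value = $HostDisk.HostMbrSignature
    }
}

# Stand-in for a Get-NaHostDisk result, using the GPT GUID posted in this thread.
$sampleDisk = [pscustomobject]@{
    HostGptGuid      = [guid]'a435b71b-d677-445a-933d-3b925df27057'
    HostMbrSignature = $null
}

$param = Get-ClusterDiskIdParameter -HostDisk $sampleDisk
"{0} = {1}" -f $param.Name, $param.Value
```

With a real disk object the result would be passed on as $res | Set-ClusterParameter $param.Name $param.Value, in the same positional form used in the scripts above.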

If I get-clusterparameter from a working GPT disk resource, I get:

Object          Name                    Value                          Type
------          ----                    -----                          ----
WorkingVolName  DiskIdType              1                              UInt32
WorkingVolName  DiskSignature           0x0                            UInt32
WorkingVolName  DiskIdGuid              {880987f7-9954-4eb8-9100-2...  String
WorkingVolName  DiskRunChkDsk           0                              UInt32
WorkingVolName  DiskUniqueIds           {16, 0, 0, 0...}               ByteArray
WorkingVolName  DiskVolumeInfo          {1, 0, 0, 0...}                ByteArray
WorkingVolName  DiskArbInterval         3                              UInt32
WorkingVolName  DiskPath                                               String
WorkingVolName  DiskReload              0                              UInt32
WorkingVolName  MaintenanceMode         0                              UInt32
WorkingVolName  MaxIoLatency            1000                           UInt32
WorkingVolName  CsvEnforceWriteThrough  0                              UInt32
WorkingVolName  DiskPnpUpdate           {0, 0, 0, 0...}                ByteArray

If I run the same against the disk resource that won't come online, I get:

Object          Name                    Value                          Type
------          ----                    -----                          ----
NewClusterDisk  DiskIdType              1                              UInt32
NewClusterDisk  DiskSignature           0x0                            UInt32
NewClusterDisk  DiskIdGuid              a435b71b-d677-445a-933d-3b...  String
NewClusterDisk  DiskRunChkDsk           0                              UInt32
NewClusterDisk  DiskUniqueIds           {}                             ByteArray
NewClusterDisk  DiskVolumeInfo          {}                             ByteArray
NewClusterDisk  DiskArbInterval         3                              UInt32
NewClusterDisk  DiskPath                                               String
NewClusterDisk  DiskReload              0                              UInt32
NewClusterDisk  MaintenanceMode         0                              UInt32
NewClusterDisk  MaxIoLatency            1000                           UInt32
NewClusterDisk  CsvEnforceWriteThrough  0                              UInt32
NewClusterDisk  DiskPnpUpdate           {}                             ByteArray

Right away, I'm noticing the working disk resource uses proper GUID notation with "{" and "}" characters. I also notice it contains volume information, presumably because the volume is online and reachable.

If I set the GUID manually on the command line with $clusterres | Set-ClusterParameter "DiskIdGuid" "a4f3edf4-a070-48a8-a801-6f8490adf83a", I will get an error that gives the correct GUID, but the same error about repairing the volume.

This is the one thing I'm not excited about with this setup: there doesn't seem to be much rhyme or reason to failures like these. But, for kicks, I removed the disk resource from the cluster and tried to bring the LUN online as a disk in the Windows Disk Management tool, and it took right off. I assigned it a drive letter and was able to browse with no problem.

So, to me, it looks like the NetApp PowerShell Toolkit has presented the LUN to the cluster nodes successfully. Now it looks like the Microsoft clustering PowerShell module is having a problem bringing that LUN into the cluster as a working resource. When I manually add the disk resource through the Failover Cluster Manager tool, it works like a CHARM. So perhaps I'm missing some critical step with the cluster module.

My other scripts that work with the cluster module work fine, but I'm using SDCLI to bring the volumes online in that case. (I've been told that I was not the use case when NetApp initially conceived of the PS toolkit, because I use literally none of the GUI tools.) I don't have to do anything special to add dependencies or anything, so I don't think I'm missing a step here. Any other ideas?

drdabbles

In case it's helpful, here's some output of several commands...

PS C:\Users\Administrator.PROD> get-navol

Name                      State       TotalSize  Used  Available Dedupe  FilesUsed FilesTotal Aggregate

----                      -----       ---------  ----  --------- ------  --------- ---------- ---------

{snip}

testvol            online        93.8 GB   87%    12.5 GB  True         109         5M aggr0

{snip}

PS C:\Users\Administrator.PROD> get-nalun

Path                                      TotalSize   SizeUsed Protocol     Online Mapped  Thin  Comment

----                                      ---------   -------- --------     ------ ------  ----  -------

{snip}

/vol/testvol/testlun                  110.0 GB    86.2 GB windows_2008  True   True   True  Lun Comment

{snip}

PS C:\Users\Administrator.PROD> Get-NaLunMap "/vol/testvol/testlun"

Name       : IGROUP-01

Initiators : {20:01:00:24:e8:6e:4f:52, 20:02:00:24:e8:6e:4f:52}

Path       : /vol/testvol/testlun

LunId      : 13

Name       : IGROUP-02

Initiators : {20:01:00:24:e8:6e:4f:5f, 20:02:00:24:e8:6e:4f:5f}

Path       : /vol/testvol/testlun

LunId      : 13

PS C:\Users\Administrator.PROD> Get-NaHostDisk

HostDrivePath                  Disk       Size ControllerPath

-------------                  ----       ---- --------------

{snip}

\\?\Volume{a6dbbd2d-4b00-11...   17   110.0 GB netapp-01:/vol/data4_deleteme/data4

PS C:\Users\Administrator.PROD> Get-NaHostDisk 17 | select *

 

Disk                 : 17

Size                 : 118115020800

ControllerPath       : netapp-01:/vol/testvol/testlun

ClusterGroup         : Available Storage

ClusterResource      : Cluster Disk 1

ClusterNode          : node-04

HostDrivePath        : \\?\Volume{a6dbbd2d-4b00-11e1-b532-0024e86e4f5d}\

HostDiskName         : \\?\PhysicalDrive17

HostVolume           : \\?\Volume{a6dbbd2d-4b00-11e1-b532-0024e86e4f5d}\

HostVolumeIsCsv      : False

HostGptGuid          : a435b71b-d677-445a-933d-3b925df27057

HostMbrSignature     :

HostDiskIndex        : 17

DiskSerialNumber     : P3LdOZhxZCS/

DiskSize             : 118115020800

DiskWmiPath          : \\node-04\root\cimv2:Win32_DiskDrive.DeviceID="\\\\.\\PHYSICALDRIVE17"

ControllerName       : netapp-01

ControllerIgroup     : IGROUP-02

ControllerAddresses  : {xxx.xx.x.xx}

ControllerLunPath    : /vol/testvol/testlun

ControllerVolumeName : testvol

timothyn

You're definitely right, it's the "DiskIdGuid" cluster parameter and the HostGptGuid.  I think you are also correct that the missing curly braces are the problem with the disk ID.  Keep in mind that this is not the same as the volume GUID (although they are both GUIDs).

For setting the cluster parameter, try this:

$res | Set-ClusterParameter "DiskIdGuid" $hostdisk.HostGptGuid.ToString("b")
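To see what the "b" format specifier changes, the two string forms can be compared on a plain System.Guid. This is a small illustration only; it assumes HostGptGuid is a System.Guid (which the ToString("b") call implies) and reuses the GPT GUID posted earlier in the thread.

```powershell
# GPT disk GUID as shown in the Get-NaHostDisk output earlier in the thread.
$gptGuid = [guid]'a435b71b-d677-445a-933d-3b925df27057'

# Default formatting ("D"): hyphenated digits with no braces.
$plain = $gptGuid.ToString()

# "B" formatting: the same digits wrapped in curly braces.
$braced = $gptGuid.ToString('b')

"Default : $plain"
"Braced  : $braced"
```

If the value is only available as a string, casting it first, as in ([guid]$value).ToString('b'), produces the same braced form.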

drdabbles

Perfect! The GPT GUID had to be formatted as a GUID in braces, and by default the format is just a plain string. I would have figured MS would do some sanity checking on the input and attempt to convert the type, but that's not an issue for this forum.

Thank you very much Eric! I knew it would be something small and easily overlooked and you spotted it!
