Microsoft Virtualization Discussions
I'm trying to replace the functionality of SDCLI in a set of PowerShell scripts I've written but I'm running into a problem when attaching the LUN to the IGroups for the cluster nodes.
My script looks like this...
$sanname = "netapp-01.example.com"
$sancontroller = connect-nacontroller $sanname
$snapshot = get-nasnapshot -name VOLUME | where { $_.Name -like "sqlsnap*" } | sort-object AccessTimeDT -descending | Select -First 1
$clone = new-navolclone -parentvol VOLUME -clonevol VOLUME_CLONE -SpaceReserve none -parentsnapshot $snapshot.Name
$lun = get-nalun -Path ("/vol/" + $clone.Name + "/*")
$lun = Set-NaLunSignature -Path $lun.Path -GenerateRandom -Confirm:$false
$lun = $lun | set-nalun -Online -Force
$lun = $lun | add-nalunmap -InitiatorGroup "igroup1"
$lun = $lun | add-nalunmap -InitiatorGroup "igroup2"
Start-NaHostDiskRescan
$hostdisk = Wait-NaHostDisk -Timeout 10000 -ControllerLunPath $lun.Path -ControllerName $sancontroller.Name.Split(".")[0]
$hostdisk = set-nahostdisk -DiskIndex $hostdisk.Disk -Online -Force
$hostvolume = $hostdisk | get-nahostvolume
$hostvolume = $hostvolume | Mount-NaHostVolume -MountPoint "Q:\"
The Set-NaHostDisk line is where I'm hitting the problem. I get the following error:
Set-NaHostDisk : The media is write protected. (Exception from HRESULT: 0x80070013)
I get the same error if the LUN is mapped to only a single system, so it doesn't seem to be related to the cluster or to having two hosts in general. The only other cause I can imagine is that I'm using Fibre Channel instead of iSCSI. Does this sound reasonable? The host clearly sees the attached LUN, because Wait-NaHostDisk returns the object containing the correct LUN's info.
Does anybody have experience bringing a LUN online on a server this way? I'd really love to hear from anyone that does. SDCLI simply isn't cutting it anymore, so I'm willing to do the extra work in PowerShell to add the resources to the Windows cluster if it means I don't have to call SDCLI to attach the LUN to my servers.
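For anyone reading the error code above, here is a minimal sketch of how the HRESULT breaks down. It assumes PowerShell 3.0 or later (for the -shr operator) and only decodes the number; it does not explain why the disk is write protected.

```powershell
# HRESULT reported by Set-NaHostDisk in the error above.
$hresult = 0x80070013

# Bits 16-28 hold the facility, the low 16 bits hold the error code.
$facility = ($hresult -shr 16) -band 0x1FFF   # 7  = FACILITY_WIN32
$code     = $hresult -band 0xFFFF             # 19 = ERROR_WRITE_PROTECT

"Facility: $facility, Win32 error code: $code"
```

So the failure is an ordinary Win32 "write protected" error passed through by the cmdlet, which fits a disk that Windows currently treats as read-only or reserved.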
Can you online and mount the volume manually from Disk Management once the lun has been presented?
That looks like you are on the right track. And it's a great use of the new disk management cmdlets.
I suspect the cluster service is taking control of the disk, since it's mapped to multiple nodes, and leaving it in a reserved state. Note that manually onlining a disk isn't normally necessary (cluster or otherwise) unless it was specifically offlined.
I assume you intend to bring this online as a cluster disk? In that case I would just change your script starting right after Wait-NaHostDisk:
Import-Module failoverclusters
$hostdisk = Wait-NaHostDisk -Timeout 10000 -ControllerLunPath $lun.Path -ControllerName $sancontroller.Name.Split(".")[0]
#move the available storage group to the local node so we can operate on the cluster disk
Move-ClusterGroup "Available Storage" $env:COMPUTERNAME
#Create the new cluster resource
$res = Add-ClusterResource -Name "NewClusterDisk" -Group "Available Storage" -ResourceType "Physical Disk"
#Set the disk identifier, if you are using GPT you need to use "DiskIdGuid" $hostdisk.HostGptGuid.ToString("b")
$res | Set-ClusterParameter "DiskSignature" $hostdisk.HostMbrSignature
$res | Start-ClusterResource
#Remount the volume with a new drive letter
$hostdisk | Get-NaHostVolume | Dismount-NaHostVolume -Confirm:$false | Mount-NaHostVolume -MountPoint "Q:"
Let me know if that works for you. Cheers!
Eric
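The GPT versus MBR note in the script comment can be folded into a small helper. This is only a sketch: Get-ClusterDiskIdParameter is a hypothetical function name, the mock object stands in for what Wait-NaHostDisk returns, and how an MBR disk reports HostGptGuid is an assumption. It needs PowerShell 3.0 or later for [pscustomobject].

```powershell
# Hypothetical helper; not part of the Data ONTAP toolkit or the FailoverClusters module.
function Get-ClusterDiskIdParameter {
    param([Parameter(Mandatory = $true)] $HostDisk)

    $gpt = $HostDisk.HostGptGuid
    # Assumption: an MBR disk reports a null, empty, or all-zero HostGptGuid.
    if ($gpt -and ([guid]"$gpt" -ne [guid]::Empty)) {
        # GPT disk: identify it by GUID, written in brace ("B") format.
        return [pscustomobject]@{ Name = 'DiskIdGuid'; Value = ([guid]"$gpt").ToString('B') }
    }
    if ($HostDisk.HostMbrSignature) {
        # MBR disk: identify it by its signature.
        return [pscustomobject]@{ Name = 'DiskSignature'; Value = $HostDisk.HostMbrSignature }
    }
    throw 'Disk reports neither a GPT GUID nor an MBR signature.'
}

# Stand-in for the object returned by Wait-NaHostDisk (GUID value is illustrative).
$mockDisk = [pscustomobject]@{
    HostGptGuid      = [guid]'a435b71b-d677-445a-933d-3b925df27057'
    HostMbrSignature = $null
}
$idParam = Get-ClusterDiskIdParameter -HostDisk $mockDisk
"$($idParam.Name) = $($idParam.Value)"
# With a real cluster resource: $res | Set-ClusterParameter $idParam.Name $idParam.Value
```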
This is exactly what I needed, except for one small problem. When I try to start the cluster resource ($res | Start-ClusterResource), it fails. I get an error in the event log:
Cluster physical disk resource 'TestClusterDisk' cannot be brought online because the associated disk could not be found. The expected signature of the disk was '{00000000-0000-0000-0000-000000000000}'. If the disk was replaced or restored, in the Failover Cluster Manager snap-in, you can use the Repair function (in the properties sheet for the disk) to repair the new or restored disk. If the disk will not be replaced, delete the associated disk resource.
When I look at the disk's HostVolume parameter, I get "\\?\Volume{00000000-0000-0000-0000-000000000000}\". Obviously this is where the signature error is coming from, but the HostGptGuid returns "779a06f2-f203-46d5-bf90-d784c5ee3ea4". The same information appears on both cluster nodes, so it doesn't appear to be an ownership problem. And the disk appears as "reserved" on both cluster nodes' Disk Management utilities.
If it's a GPT disk you need to use the GPT GUID... Try:
Import-Module failoverclusters
$hostdisk = Wait-NaHostDisk -Timeout 10000 -ControllerLunPath $lun.Path -ControllerName $sancontroller.Name.Split(".")[0]
Move-ClusterGroup "Available Storage" $env:COMPUTERNAME
$res = Add-ClusterResource -Name "NewClusterDisk" -Group "Available Storage" `
-ResourceType "Physical Disk"
$res | Set-ClusterParameter "DiskSignature" $hostdisk.HostGptGuid
$res | Start-ClusterResource
$hostdisk | Get-NaHostVolume | Dismount-NaHostVolume -Confirm:$false |
Mount-NaHostVolume -MountPoint "Q:"
Same problem. Also, I assume you meant DiskIdGuid and not DiskSignature. DiskSignature will not accept a GUID as valid input, which stands to reason if it's expecting an MBR signature.
If I run Get-ClusterParameter against a working GPT disk resource, I get:
Object          Name                    Value                          Type
------          ----                    -----                          ----
WorkingVolName  DiskIdType              1                              UInt32
WorkingVolName  DiskSignature           0x0                            UInt32
WorkingVolName  DiskIdGuid              {880987f7-9954-4eb8-9100-2...  String
WorkingVolName  DiskRunChkDsk           0                              UInt32
WorkingVolName  DiskUniqueIds           {16, 0, 0, 0...}               ByteArray
WorkingVolName  DiskVolumeInfo          {1, 0, 0, 0...}                ByteArray
WorkingVolName  DiskArbInterval         3                              UInt32
WorkingVolName  DiskPath                                               String
WorkingVolName  DiskReload              0                              UInt32
WorkingVolName  MaintenanceMode         0                              UInt32
WorkingVolName  MaxIoLatency            1000                           UInt32
WorkingVolName  CsvEnforceWriteThrough  0                              UInt32
WorkingVolName  DiskPnpUpdate           {0, 0, 0, 0...}                ByteArray
If I run the same against the disk resource that won't come online, I get:
Object          Name                    Value                          Type
------          ----                    -----                          ----
NewClusterDisk  DiskIdType              1                              UInt32
NewClusterDisk  DiskSignature           0x0                            UInt32
NewClusterDisk  DiskIdGuid              a435b71b-d677-445a-933d-3b...  String
NewClusterDisk  DiskRunChkDsk           0                              UInt32
NewClusterDisk  DiskUniqueIds           {}                             ByteArray
NewClusterDisk  DiskVolumeInfo          {}                             ByteArray
NewClusterDisk  DiskArbInterval         3                              UInt32
NewClusterDisk  DiskPath                                               String
NewClusterDisk  DiskReload              0                              UInt32
NewClusterDisk  MaintenanceMode         0                              UInt32
NewClusterDisk  MaxIoLatency            1000                           UInt32
NewClusterDisk  CsvEnforceWriteThrough  0                              UInt32
NewClusterDisk  DiskPnpUpdate           {}                             ByteArray
Right away, I'm noticing the working disk resource uses proper GUID notation with "{" and "}" characters. I also notice it contains volume information, presumably because the volume is online and reachable.
If I set the GUID manually on the command line with $clusterres | Set-ClusterParameter "DiskIdGuid" "a4f3edf4-a070-48a8-a801-6f8490adf83a", the error now shows the correct GUID, but it is otherwise the same message about repairing the disk.
This is the one thing I'm not excited about with this setup: there doesn't seem to be much rhyme or reason to failures like these. But, for kicks, I removed the disk resource from the cluster and tried to bring the LUN online as a disk in the Windows Disk Management tool, and it took right off. I assigned it a drive letter and was able to browse with no problem.
So, to me, it looks like the NetApp PowerShell Toolkit has presented the LUN to the cluster nodes successfully. Now it looks like the Microsoft failover clustering PowerShell module is having a problem bringing that LUN into the cluster as a working resource. When I manually add the disk resource through the Failover Cluster Manager tool, it works like a CHARM. So, perhaps I'm missing some critical step with the cluster cmdlets.
My other scripts that work with the cluster cmdlets work fine, but I'm using SDCLI to bring the volumes online in that case. (I've been told that I was not the use case when NetApp initially conceived of the PS toolkit, because I use literally none of the GUI tools.) I don't have to do anything special to add dependencies or anything, so I don't think I'm missing a step here. Any other ideas?
In case it's helpful, here's some output of several commands...
PS C:\Users\Administrator.PROD> get-navol
Name     State   TotalSize  Used  Available  Dedupe  FilesUsed  FilesTotal  Aggregate
----     -----   ---------  ----  ---------  ------  ---------  ----------  ---------
{snip}
testvol  online  93.8 GB    87%   12.5 GB    True    109        5M          aggr0
{snip}
PS C:\Users\Administrator.PROD> get-nalun
Path                  TotalSize  SizeUsed  Protocol      Online  Mapped  Thin  Comment
----                  ---------  --------  --------      ------  ------  ----  -------
{snip}
/vol/testvol/testlun  110.0 GB   86.2 GB   windows_2008  True    True    True  Lun Comment
{snip}
PS C:\Users\Administrator.PROD> Get-NaLunMap "/vol/testvol/testlun"
Name       : IGROUP-01
Initiators : {20:01:00:24:e8:6e:4f:52, 20:02:00:24:e8:6e:4f:52}
Path       : /vol/testvol/testlun
LunId      : 13

Name       : IGROUP-02
Initiators : {20:01:00:24:e8:6e:4f:5f, 20:02:00:24:e8:6e:4f:5f}
Path       : /vol/testvol/testlun
LunId      : 13
PS C:\Users\Administrator.PROD> Get-NaHostDisk
HostDrivePath                   Disk  Size      ControllerPath
-------------                   ----  ----      --------------
{snip}
\\?\Volume{a6dbbd2d-4b00-11...  17    110.0 GB  netapp-01:/vol/data4_deleteme/data4
PS C:\Users\Administrator.PROD> Get-NaHostDisk 17 | select *
Disk : 17
Size : 118115020800
ControllerPath : netapp-01:/vol/testvol/testlun
ClusterGroup : Available Storage
ClusterResource : Cluster Disk 1
ClusterNode : node-04
HostDrivePath : \\?\Volume{a6dbbd2d-4b00-11e1-b532-0024e86e4f5d}\
HostDiskName : \\?\PhysicalDrive17
HostVolume : \\?\Volume{a6dbbd2d-4b00-11e1-b532-0024e86e4f5d}\
HostVolumeIsCsv : False
HostGptGuid : a435b71b-d677-445a-933d-3b925df27057
HostMbrSignature :
HostDiskIndex : 17
DiskSerialNumber : P3LdOZhxZCS/
DiskSize : 118115020800
DiskWmiPath : \\node-04\root\cimv2:Win32_DiskDrive.DeviceID="\\\\.\\PHYSICALDRIVE17"
ControllerName : netapp-01
ControllerIgroup : IGROUP-02
ControllerAddresses : {xxx.xx.x.xx}
ControllerLunPath : /vol/testvol/testlun
ControllerVolumeName : testvol
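One thing worth checking in this output is that the GUID inside the HostVolume path and the HostGptGuid are two different identifiers: the first names the volume, the second names the GPT disk. A minimal sketch comparing the two values copied from the output above (nothing here touches the cluster or the controller):

```powershell
# Values copied from the Get-NaHostDisk output above.
$hostVolume  = '\\?\Volume{a6dbbd2d-4b00-11e1-b532-0024e86e4f5d}\'
$hostGptGuid = [guid]'a435b71b-d677-445a-933d-3b925df27057'

# Pull the GUID out of the volume path so both can be compared as [guid] values.
$volumeGuid = $null
if ($hostVolume -match '\{([0-9a-fA-F-]{36})\}') {
    $volumeGuid = [guid]$Matches[1]
}

$sameId = ($volumeGuid -eq $hostGptGuid)
"Volume GUID:     $volumeGuid"
"Disk GPT GUID:   $hostGptGuid"
"Same identifier: $sameId"
```

The comparison comes out False, so the volume GUID is not a substitute for the disk GUID when filling in DiskIdGuid.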
You're definitely right, it's the "DiskIdGuid" cluster parameter and the HostGptGuid. I think you are also correct that the missing curly braces are the problem with the disk ID. Keep in mind that this is not the same as the volume GUID (although they are both GUIDs).
For setting the cluster parameter, try this:
$res | Set-ClusterParameter "DiskIdGuid" $hostdisk.HostGptGuid.ToString("b")
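To see what the format argument changes, here is a small sketch using the GUID from the Get-NaHostDisk output earlier in the thread. It uses only the .NET Guid type; the cast in the last line is an assumption for the case where a property comes back as a string rather than a [guid].

```powershell
# GUID taken from the Get-NaHostDisk output in this thread.
$guid = [guid]'a435b71b-d677-445a-933d-3b925df27057'

$plain  = $guid.ToString()      # default "D" format, no braces
$braced = $guid.ToString('B')   # "B" format, wrapped in { }

"Default: $plain"
"Braced:  $braced"

# If the value were a plain string, cast it to [guid] first so the format call is valid.
$fromString = ([guid]"$plain").ToString('B')
```

The braced form matches what the working cluster disk shows for DiskIdGuid.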
Perfect! The GPT GUID had to be formatted as a GUID, and by default the format is just a plain string. I would have figured MS would do some sanity checking on the input and attempted to convert the type, but that's not an issue for this forum.
Thank you very much Eric! I knew it would be something small and easily overlooked and you spotted it!