Re: ZapiRetryCount PowerShell parameter in WFA4?

jauling_chou · ‎2017-06-26

I attempted to use -ZapiRetryCount <INT> as an additional parameter for Connect-WfaCluster, but that doesn't seem to work anymore when used with WFA4. Is there a new equivalent undocumented parameter that will silently retry?

We have a few far-flung CDOT clusters that sometimes exhibit connectivity issues with WFA. While we could increase the timeout, I'm worried that the packets are being "lost" in transit, in which case a retry would be more appropriate.

mbeattie · ‎2017-06-26

Hi,

The ZapiRetryCount parameter is not available on every cmdlet (e.g. Connect-NcController or Connect-NaController), so unfortunately you can't set a default ZAPI retry count for the cluster connection. What you can do is create custom commands that define ZapiRetryCount as an input parameter, then set a constant in your workflow and apply that constant to the command's variable values.

Here is the source code for the command as an example:

Param(
   [Parameter(Mandatory = $True, HelpMessage = "The name or IP Address of the cluster")]
   [String]$ClusterName,
   [Parameter(Mandatory = $True, HelpMessage = "The name of the vserver")]
   [String]$VserverName,
   [Parameter(Mandatory = $True, HelpMessage = "The path of the directory to read")]
   [String]$Path,
   [Parameter(Mandatory = $False, HelpMessage = "The maximum number of ZAPI retry attempts")]
   [Int]$ZapiRetryCount
)
#'------------------------------------------------------------------------------
#'Connect to the cluster
#'------------------------------------------------------------------------------
Connect-WFACluster $ClusterName
#'------------------------------------------------------------------------------
#'Create the command to read the directory.
#'------------------------------------------------------------------------------
#'Prefix the path with /vol/ when the caller passed a bare volume path.
If(-Not($Path.Contains("/vol/"))){
   [String]$Path = "/vol/$Path"
}
[String]$command = "Read-NcDirectory -Path ""$Path"" "
If($ZapiRetryCount){
   [String]$command += "-ZapiRetryCount $ZapiRetryCount "
}
[String]$command += "-VserverContext $VserverName -ErrorAction Stop"
#'------------------------------------------------------------------------------
#'Read the directory to ensure it exists.
#'------------------------------------------------------------------------------
Try{
   Invoke-Expression -Command $command -ErrorAction Stop
   Get-WFALogger -Info -Message "Executed Command`: $command"
   Get-WFALogger -Info -Message "The directory ""$Path"" exists on vserver ""$VserverName"""
}Catch{
   Get-WFALogger -Error -Message $("Failed Executing Command`: $command. Error " + $_.Exception.Message)
   Throw "The Directory ""$Path"" does not exist on vserver ""$VserverName"""
}
#'------------------------------------------------------------------------------

Note: I'd recommend setting the ZapiRetryCount as a non-mandatory parameter within your commands.
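
The optional parameter can also be passed without building a command string for Invoke-Expression. The following is only a sketch of that idea using standard PowerShell splatting. The helper name Get-ReadDirectoryArguments is hypothetical and not part of WFA or the Data ONTAP PowerShell Toolkit; the Read-NcDirectory parameters are the ones used in the command above.

```powershell
# Hypothetical helper (not part of WFA): builds the argument table for
# Read-NcDirectory so that ZapiRetryCount is only passed when it is set.
function Get-ReadDirectoryArguments {
    Param(
        [Parameter(Mandatory = $True)]
        [String]$Path,
        [Parameter(Mandatory = $True)]
        [String]$VserverName,
        [Parameter(Mandatory = $False)]
        [Int]$ZapiRetryCount
    )
    $arguments = @{
        Path           = $Path
        VserverContext = $VserverName
        ErrorAction    = "Stop"
    }
    # Add the retry count only when a positive value was supplied.
    If($ZapiRetryCount -gt 0){
        $arguments["ZapiRetryCount"] = $ZapiRetryCount
    }
    Return $arguments
}
# Assumed usage inside the command, after Connect-WFACluster:
#   $arguments = Get-ReadDirectoryArguments -Path $Path -VserverName $VserverName -ZapiRetryCount $ZapiRetryCount
#   Read-NcDirectory @arguments
```

Because the values are passed as a hashtable rather than concatenated into a string, quotes or spaces in the path do not need escaping.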

/Matt

If this post resolved your issue, help others by selecting ACCEPT AS SOLUTION or adding a KUDO.

jauling_chou · ‎2017-06-27

While I can see you're trying to be helpful, I don't see how your example is applicable to my issue.

Connect-WfaCluster (as I now know) does not support ZapiRetryCount, so it's not possible to have this command silently retry X number of times until it succeeds. That is how I understand the parameter to be used; hopefully I have this correct. I suppose it's inherently difficult to clarify undocumented parameters, heh.

I believe what I need to do instead is put Connect-WfaCluster into a while loop and either retry forever or until a threshold is reached. I think that would help alleviate the "connection timed out" issues that we see now and then. From a scalability perspective, is the only way to implement this to identify all the WFA commands I use that execute Connect-WfaCluster and add this looping mechanism to each? That seems like a lot of work... is there a way to add this in the backend on a global scale?
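
A bounded retry loop of the kind described above could look roughly like this. It is a minimal sketch, not something WFA provides: the function name Invoke-WithRetry and its parameters are hypothetical, and it assumes the wrapped call raises a terminating error when the connection fails.

```powershell
# Hypothetical helper (not part of WFA): runs a script block and retries
# it a bounded number of times when it throws a terminating error.
function Invoke-WithRetry {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory = $True)]
        [ScriptBlock]$ScriptBlock,
        [ValidateRange(1, 100)]
        [Int]$MaxAttempts = 3,
        [ValidateRange(0, 3600)]
        [Int]$DelaySeconds = 5
    )
    For($attempt = 1; $attempt -le $MaxAttempts; $attempt++){
        Try{
            # Return the script block's output on the first success.
            Return (& $ScriptBlock)
        }Catch{
            If($attempt -eq $MaxAttempts){
                # Out of attempts: rethrow the last error to the caller.
                Throw
            }
            # Wait before the next attempt.
            Start-Sleep -Seconds $DelaySeconds
        }
    }
}
# Assumed usage inside a WFA command:
#   Invoke-WithRetry -ScriptBlock { Connect-WfaCluster $ClusterName } -MaxAttempts 3 -DelaySeconds 10
```

Only terminating errors reach the Catch block, so a cmdlet that reports failures as non-terminating errors would need -ErrorAction Stop inside the script block. The threshold keeps a permanently unreachable cluster from blocking the workflow forever.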

sinhaa · ‎2017-06-28

@jauling_chou

I believe what I need to do instead is put Connect-WfaCluster into a while loop, and either retry forever, or until a threshold is reached.

-------

I don't think this is of much use. If the connection times out on the first attempt, retrying with the same timeout is likely to fail as well. You can change the default timeout of the cluster concerned under WFA -> Credentials. In WFA 4.0 and above, you can override the default timeout for that specific cluster.

Try this.


jauling_chou · ‎2017-06-29

This is interesting, I did not know there was a timeout in the credentials parameters in WFA, thanks @sinhaa! The default of 60s seems reasonable, but I bumped it up to 100s to see if it has any effect. Network congestion and/or packet loss is a difficult situation to work around, as @mbeattie mentioned as well.

What's interesting to note is that even clusters in the same physical facility as WFA sometimes fail workflows with a "Failed to connect to cluster node: " message. One of these clusters is only 5 hops away according to traceroute. I suspect that the web services in that cluster might be slow to respond because a specific LIF is overworked. I realize this particular concern is slightly off topic now...

mbeattie · ‎2017-06-28

Hi,

I'd agree with @sinhaa on this. Increasing your connection timeout is certainly an appropriate option. Alternatively, you could place an additional WFA server at your remote site, close to the remote cluster, to reduce network latency. I know that's not ideal, as it's an additional system to manage, but it might be worth considering if this is impacting your production environment. Ultimately it sounds like an intermittent network issue.

/Matt
