I do WFA development and testing in a number of different lab and customer environments. I noticed in one customer's environment that a 5-step workflow that takes 15-20 seconds to run in my laptop VM environment takes 2-3 minutes to run on the customer's lab system. I compared log files and found:
- Each WFA command step startup took 22 seconds between the first and second lines of the log (I assume this is when PoSH is launching)
- It regularly takes 15 to 20 seconds between the lines saying "attempting to connect to controller" and the "now connected" line
Both of those steps take 1 to 2 seconds in my laptop environment. These delays are what add up to the 2-3 minutes ... the rest of the workflow flies by. I've confirmed the customer's environment is running a valid configuration (a VMware VM with 4 GB of memory and 4 cores at ~2.2 GHz).
This looks like a case where PoSH startup itself (the initial 22 seconds for each step) and possibly the first Data ONTAP PoSH toolkit cmdlet execution are what take too long. I've been advised these slowdowns could be attributed to things like:
To change the Internet Properties so that the WFA services do not check certificate revocation, I created a ‘labad\svcwfa’ service account in the domain with local Administrative permissions. I then shut down the WFA database and server services and modified the services to run as user ‘labad\svcwfa’. From the "Control Panel", I updated the Internet Options for ‘labad\svcwfa’ as described above, and then restarted the WFA services.
We re-ran the workflow and did get an improvement in the run time. It took 83 seconds (1m 23s) to run. Previously it took 171 seconds (2m 51s) to run the same workflow.
We have basically cut the run time in half, which is good progress; however, that still seems like a long time for the workflow. I will look at some of the other performance issues.
DavidSpano and I are working on the same environmental issue. We eliminated the time delays that occurred at the start of every PowerShell command, but there are still delays on the order of 20 seconds for each network connection to arrays/controllers. Some details from the environment:
11:36:59.598 INFO [Create volume] ### Command 'Create volume' ###
11:27:13.595 INFO [Create volume] Connected to controller
==== AND THIS ==== (later in the same workflow)
11:27:46.372 INFO [SWA Create export] Using cached controller connection
11:28:04.729 INFO [SWA Create export] Removing NFS export : /vol/pdc_oracle
INDICATES there is a 17 to 18 second delay in connecting to a controller ... even when it says "Using cached controller connection", since it still takes about 18 seconds before the first command executes.
Are there specific ways to improve this or suggestions for how to isolate what is causing this?
- WFA 2.2 GA
- 7-Mode controllers running 7.3.4 and 8.1.3P1
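One way to isolate where the time goes is to compute the gap between consecutive log lines and flag the large ones. The following is a minimal Python sketch, not part of WFA. It assumes the log lines follow the `HH:MM:SS.mmm LEVEL [Command] message` layout seen in the excerpts above; the function names and the 5-second threshold are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: "11:27:46.372 INFO [SWA Create export] message"
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3})\s+(\w+)\s+\[([^\]]+)\]\s+(.*)$")


def parse_line(line):
    """Return (timestamp, command, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
    return timestamp, match.group(3), match.group(4)


def find_gaps(lines, threshold_seconds=5.0):
    """List (seconds, previous message, next message) for gaps at or above the threshold.

    The log carries no date, so a gap that crosses midnight comes out
    negative and is simply not reported.
    """
    gaps = []
    previous = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip separators and anything that is not a log line
        if previous is not None:
            delta = (parsed[0] - previous[0]).total_seconds()
            if delta >= threshold_seconds:
                gaps.append((delta, previous[2], parsed[2]))
        previous = parsed
    return gaps


# The two consecutive lines from the excerpt above.
sample = [
    "11:27:46.372 INFO [SWA Create export] Using cached controller connection",
    "11:28:04.729 INFO [SWA Create export] Removing NFS export : /vol/pdc_oracle",
]

for seconds, before, after in find_gaps(sample):
    print(f"{seconds:.3f}s between '{before}' and '{after}'")
```

Running this over a whole workflow log shows whether the slow spots always follow the same message (for example the controller connection lines), which narrows down what to investigate.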
Re: Suggestions for improving workflow performance?
This is typically due to Windows certificate revocation checks. I have been battling with this on a customer site (up to 100s delays starting up commands) and finally found an approach that works consistently (the suggestions that worked on other sites failed in this instance - I think it's due to their network configuration... connections to the internet are dropped rather than rejected, so they take time to time out).
Now that I've read some more of the context of the latest issues, let me suggest checking DNS settings on the filer for slow command startup - it's possible that it's trying to do a reverse lookup on your address for logging purposes.
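As a rough way to check whether reverse DNS is slow, the lookup can be timed from a host that uses the same DNS servers as the filer. This is only a sketch in Python: it runs on that host rather than on the filer itself, so it only indicates how quickly those DNS servers answer a reverse lookup for the WFA server's address. The function name is made up for illustration.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr, clock=time.perf_counter):
    """Return (hostname or None, elapsed seconds) for a reverse lookup of ip.

    resolver and clock are parameters only so the function can be
    exercised without a network; the defaults perform a real lookup.
    """
    start = clock()
    try:
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        hostname = None
    elapsed = clock() - start
    return hostname, elapsed
```

Calling `time_reverse_lookup("<WFA server IP>")` and printing the result gives the name (or `None`) and the time taken. A lookup that fails only after many seconds would be consistent with the per-connection delay seen in the logs, while a fast answer or a fast failure points elsewhere.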