A Windows Server 2016 machine has a drive mapped to a filer, and after a week it disconnects. It happens like clockwork, about every 7 days, so I now have to do a pre-emptive reboot of the Windows client once a week. Attempting to reconnect from the Windows command line with a net use command gives the error message “System error 8 has occurred. Not enough storage is available to process this command”. A reboot always fixes it.
Searching for that error always turns up the advice to increase the IRPStackSize parameter, but I think that applies to the target machine, and only if the target is a Windows machine, not a NetApp. We did increase it on the client anyway, with no change.
This all started when ONTAP was upgraded in October 2019. I'll try to find out from which version to which; I don't have that info right now. Before that, the same setup ran for a year straight with no problems whatsoever.
The C: drive has plenty of free space. I did try to dive into desktop heap, since that came up in some Google searches, but it got too deep for me. There was a recommendation for a setting change, but since this is an important production server I am hesitant to mess with that.
I agree the problem is with the Windows system, but it was apparently triggered by some change between ONTAP versions. At one point I experimented with a different batch file to transfer files under a different user, and the system behaved quite differently, spawning many new processes that eventually required a reboot after more like three to five days. At least that does not happen currently; the number of processes stays constant at a nice low 81. So I think there is a good chance it is related to desktop heap and the environment that a Windows command file runs in.
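For anyone who wants to track the process count over the week rather than eyeball it, here is a minimal Python sketch. It assumes Python is available on the Windows client and relies on the built-in `tasklist /FO CSV /NH` command; the function names are just illustrative, and this only counts processes, it does not measure desktop heap.

```python
import csv
import io
import subprocess


def count_processes(tasklist_csv: str) -> int:
    """Count the rows in the output of `tasklist /FO CSV /NH` (one row per process)."""
    reader = csv.reader(io.StringIO(tasklist_csv))
    return sum(1 for row in reader if row)


def current_process_count() -> int:
    """Run tasklist (Windows only) and return the number of running processes."""
    # /FO CSV selects CSV output, /NH suppresses the header row.
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_processes(result.stdout)
```

Calling `current_process_count()` from a scheduled task and appending the result with a timestamp to a file would show whether the count really stays flat right up to the disconnect.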
Along with the info @Ontapforrum requested, the next time it happens I would very much like to see the Windows system/security event logs from that time, the ONTAP EMS log, and a packet trace (which I guess will be a problem to share here).
One other thing: are there any odd network devices in the path? A WAN optimizer (Riverbed?), application-aware firewall, DLP, some sort of non-standard VPN?
I have seen this issue when the filer receives too many session requests from the same user on one TCP connection. You should see the following errors in EMS:
Nblade.cifsMaxSessPerUsrConn:error]: Received too many session requests from the same user on one TCP connection
Nblade.cifsMaxSessPerUsrConnNotice:notice]: Received xxxx session requests, nearing the configured limit of
Corrective action: Inspect the application running on the client using this TCP connection. The client might be operating incorrectly due to the application running on it. Rebooting the client might also be helpful. In some cases, clients are operating as expected but require a higher threshold, which you can set using the (privilege: advanced) "cifs option modify -max-opens-same-file-per-tree" command.
The default setting is 800, which may not be enough for your requirements. If increasing it to 2000 does not fix your issue, you will need to troubleshoot the rogue client causing the problem.
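To check whether those messages actually show up around the weekly disconnect, a rough Python sketch like the one below could count them in a text export of the EMS log. It assumes the log has already been saved as plain text and that each event line contains the event name followed by `:severity]` as in the samples above; the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the two event names quoted above, e.g.
# "Nblade.cifsMaxSessPerUsrConn:error]" or "Nblade.cifsMaxSessPerUsrConnNotice:notice]".
EVENT_RE = re.compile(r"(Nblade\.cifsMaxSessPerUsrConn(?:Notice)?):(\w+)\]")


def count_session_limit_events(ems_text: str) -> Counter:
    """Return how many times each session-limit event appears in the EMS text."""
    counts = Counter()
    for line in ems_text.splitlines():
        match = EVENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

If the notice count climbs steadily over the week and the error appears right when the mapped drive drops, that would point at session buildup from the client rather than anything on the Windows storage side.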