2018-02-03 11:41 AM
Recently, one of our application groups described a situation in which one of their application servers running Windows 2012 R2 randomly loses connectivity to a NetApp filer running CDOT 8.2.3 P3. The share they use becomes unavailable by name but remains accessible by IP address. The share is available by both name and IP on other clients, so this seems to be localized to this one server. The server in question is a VM running on an ESXi 5.5 Update 3b host, using version 9 of the VMware hardware and tools.
I haven't heard of this problem from any other application team or business unit; again, it seems to be isolated to this server. As mentioned, the server is Windows 2012 R2. Our domain controllers are Windows 2012, and two of them are Windows 2012 R2. I've looked into the posts around SID compression and the KDC, but we are running a version of code on this NAS that has that particular bug (649280) patched.
I've created a script that runs several tests, including netstat, pings, and walking the file structure both from a mapped drive and via UNC by calling a PowerShell command. The issue happens infrequently, about every 12-14 days, and seems to occur from within an application or when the UNC path is accessed via Windows Explorer (although I have yet to see the latter myself). I'm wondering if anyone out there has seen this issue in their environment.
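For anyone who wants to run a similar check, here is a minimal sketch of the name-versus-IP comparison in Python. It is not the actual script (which calls PowerShell), and it only tests name resolution and TCP reachability on port 445; it does not authenticate or list a directory. The function names and status labels are made up for illustration.

```python
import socket
from datetime import datetime, timezone

SMB_PORT = 445  # standard SMB/CIFS TCP port


def resolve(name):
    """Return the sorted IPv4 addresses for name, or an empty list on failure."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host, port=SMB_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify(resolved, by_name_ok, by_ip_ok):
    """Turn the three observations into a single (hypothetical) status label."""
    if by_name_ok and by_ip_ok:
        return "ok"
    if by_ip_ok and not resolved:
        return "name-resolution-failed"
    if by_ip_ok:
        return "name-connect-failed"
    if by_name_ok:
        return "ip-connect-failed"
    return "unreachable"


def probe(name, ip):
    """Check the filer by name and by IP and return a timestamped record."""
    addresses = resolve(name)
    by_name_ok = bool(addresses) and can_connect(name)
    by_ip_ok = can_connect(ip)
    return {
        "time": datetime.now(timezone.utc).isoformat(),
        "resolved": addresses,
        "status": classify(bool(addresses), by_name_ok, by_ip_ok),
    }
```

Calling `probe()` with the filer's name and IP address from a scheduled task every minute or so, and appending the result to a log, would at least show whether the name stops resolving or the connection by name fails while the IP path still works.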
Thank you for your insights.
2018-02-04 02:39 PM
Are the server in question and the storage cluster\vserver registered in DNS with both A & PTR records? Given that the share is still available via IP, this would suggest a name resolution issue. Is the DNS configuration on your server consistent with other servers in your environment that don't have the issue?
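If it helps, a forward and reverse lookup can be compared with something like the sketch below. It assumes Python is available on the affected server, and it goes through the operating system's resolver (so hosts-file entries and cached answers are included) rather than querying a specific DNS server. The function name and the returned fields are only illustrative.

```python
import socket


def check_a_and_ptr(name, lookup_a=socket.gethostbyname,
                    lookup_ptr=socket.gethostbyaddr):
    """Resolve name to an address, reverse-resolve it, and compare the names.

    The lookup functions are parameters so they can be swapped out when
    testing; by default the system resolver is used.
    """
    result = {"name": name, "address": None, "ptr_name": None, "matches": False}
    try:
        result["address"] = lookup_a(name)
    except OSError:  # socket.gaierror and socket.herror are subclasses
        return result
    try:
        ptr_name, _aliases, _addresses = lookup_ptr(result["address"])
    except OSError:
        return result
    result["ptr_name"] = ptr_name
    # Compare case-insensitively and ignore a trailing dot.
    result["matches"] = ptr_name.rstrip(".").lower() == name.rstrip(".").lower()
    return result
```

Running this against the fully qualified vserver name on both the problem server and a healthy one, ideally while the issue is occurring, would show whether the two hosts see the same A and PTR answers.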
2018-02-05 01:05 PM
Thank you for your response, much appreciated! Yes, I've done a ping (resolves fine) and nslookups against all defined DNS servers (2x prod and 2x DR); all resolve the A record, and a reverse lookup works as well. I had suspected DNS as well; however, it works almost all the time. It simply stops working every 12-14 days. The vserver configuration does appear to be similar, if not identical, to other vservers on this ONTAP cluster.
Another thing that caught my attention was SID compression, since this is a Windows 2012 R2 server. However, we're running a version of CDOT where this particular issue has been patched (8.2.3 P3, and yes, I know this is an older version. I'm working on it...).
If I do figure this out I'll be sure to post the resolution as I'm sure others out there have experienced similar issues.