We have spent weeks tracking down the cause of a serious performance decline on our filer, and have narrowed it down to a Python script that uses the latest NMSDK.
Symptoms:
- Sustained high CPU usage that persists after the NMSDK calls have completed
- NFS RPC calls take up to 20 times longer, affecting user experience
- We have no visibility into memory usage, but a reboot alleviates the issue, as does a cluster takeover.
Method for reproduction:
Call nfs-exportfs-modify-rule-2 using an NaElement previously retrieved via a call to nfs-exportfs-list-rules-2.
To modify the export ruleset, we retrieve the current rules by calling nfs-exportfs-list-rules-2 and build on the result.
Notes on reproduction:
- Creating an nfs-exportfs-modify-rule-2 request from scratch does not reproduce the issue; an NaElement from a previous nfs-exportfs-list-rules-2 call needs to be used.
- The new ruleset being applied with the nfs-exportfs-modify-rule-2 call needs to be different from the existing ruleset.
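For readers without the attachment, here is a rough sketch of the request shape described above: the rule element from a list reply is reused, changed so that it differs from the existing ruleset, and wrapped in a modify request. It does not use the NMSDK or contact a filer; it only uses Python's standard xml.etree.ElementTree. The sample reply, the child element names (rules, exports-rule-info-2, security-rules, persistent, rule, and so on), the host names, and the helper build_modify_from_list are all assumptions made for illustration, not captured output or the attached script.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an nfs-exportfs-list-rules-2 reply. The child
# element names and values are illustrative assumptions, not real output.
SAMPLE_LIST_REPLY = """
<results status="passed">
  <rules>
    <exports-rule-info-2>
      <pathname>/vol/example</pathname>
      <security-rules>
        <security-rule-info>
          <read-write>
            <exports-hostname-info><name>host-a</name></exports-hostname-info>
          </read-write>
        </security-rule-info>
      </security-rules>
    </exports-rule-info-2>
  </rules>
</results>
"""


def build_modify_from_list(list_reply_xml, new_host):
    """Build a modify request by reusing the rule element from a list reply."""
    reply = ET.fromstring(list_reply_xml)
    # The element retrieved from the reply is reused, not rebuilt from scratch.
    rule = reply.find("./rules/exports-rule-info-2")
    # Add a host so the new ruleset differs from the existing one.
    read_write = rule.find(".//read-write")
    host = ET.SubElement(read_write, "exports-hostname-info")
    ET.SubElement(host, "name").text = new_host
    # Wrap the reused element in the modify request (assumed layout).
    request = ET.Element("nfs-exportfs-modify-rule-2")
    ET.SubElement(request, "persistent").text = "true"
    ET.SubElement(request, "rule").append(rule)
    return request


request = build_modify_from_list(SAMPLE_LIST_REPLY, "host-b")
print(ET.tostring(request, encoding="unicode"))
```

In the real script the same steps go through the NMSDK's NaElement objects; the sketch is only meant to make the two reproduction conditions (reused element, changed ruleset) concrete.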
Code for reproduction: (see attached file)
- edit netapp.py
- update line 11 to point to a filer
- update line 13 with the proper credentials
- launch ./test.py