I am running ONTAP 9.3P8. I have a 1.98TB volume with 12 million files.
I am getting the error listed below. Questions: Is this part of bug 1055262? How can I tell whether this is a sparse directory issue or a large directory issue? Would this impact performance?
If this is a sparse directory issue, would a volume move fix the problem?
wafl.readdir.expired: A READDIR file operation has expired for the directory associated with volume BBBB/@vserver:82848f81-2b18-11e5-a0ee-00a0986a1129 Snapshot copy ID 0 and inode 21505.
This message occurs when a READDIR file operation has exceeded the timeout that it is allowed to run in WAFL(R). This can be the result of very large or sparse directories, and corrective action is recommended.
You can find information specific to recent directories that have had READDIR file operations expire by using the 'diag' privilege nodeshell CLI command: "wafl readdir notice show" If a directory is indicated as sparse, it is recommended that you copy the contents of the directory to a new directory to remove the sparseness of the directory file. If a directory is not indicated as sparse and the directory is large, it is recommended that you reduce the size of the directory file by reducing the number of file entries in the directory.
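To act on the "reduce the number of file entries" advice, it helps to know which directories are the biggest. Below is a minimal sketch, not a NetApp tool, that walks the volume from a client mount and ranks directories by entry count. The function name `largest_directories` and the example mount path are my own assumptions. It also records the directory file size the client reports (`st_size`). A large size with few entries may hint at sparseness, but the "wafl readdir notice show" output is the authoritative indicator. Be aware that walking 12 million files issues many READDIR calls itself, so run it off-peak.

```python
import heapq
import os


def largest_directories(root, top_n=10):
    """Return up to top_n (entry_count, dir_file_bytes, path) tuples, largest first.

    entry_count is the number of files plus subdirectories directly inside
    the directory. dir_file_bytes is the size of the directory file itself
    as reported to the client, or -1 if it could not be read.
    """
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        entry_count = len(dirnames) + len(filenames)
        try:
            dir_bytes = os.stat(dirpath).st_size
        except OSError:
            dir_bytes = -1  # directory vanished or is not readable
        results.append((entry_count, dir_bytes, dirpath))
    # Tuples compare by entry_count first, so this ranks by entry count.
    return heapq.nlargest(top_n, results)


# Example usage (the mount point below is hypothetical):
# for count, size, path in largest_directories("/mnt/BBBB", top_n=20):
#     print(f"{count:>10} entries  {size:>12} bytes  {path}")
```

Directories at the top of this list are the ones to split up or thin out first.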
It looks like that is what is happening here. If your ONTAP version is still earlier than 9.5, why not try an update? A volume move is ineffective because it moves the data at the block level, so it may be necessary to copy the data to another volume using SnapMirror of type XDP.
From the looks of it, this seems to be a combination of a large file count on a single volume and a performance-related issue. It depends on a number of factors: the hardware you have (physical resources), the load profile of the filer, and how much inode-intensive work is going on. A software defect (bug) can also play a role here.
Have a look at this KB; it refers to multiple bugs. Ensure that you are running at least the minimum recommended ONTAP version for any of those bugs that are relevant.