ONTAP Discussions
Guys, I have a large CIFS volume (2TB+) with snapshots enabled and lots of files/directories. I'd like to estimate how many of each I have. Is there any accounting for this on the cluster? Perhaps the inode count or the like? I'd rather not run something like TreeSize against it. That would take months!
Have a look at NetApp's XCP.
It has options that let you do what you ask, and it is pretty fast.
Thanks! Looks cool. Since it runs client-side, though, won't it just be trawling the CIFS volume over the network, or does it do something clever with an API or the like?
Please have a look through the README and Docs:
https://mysupport.netapp.com/documentation/docweb/index.html?productID=63525&language=en-US
Specifically:
Core Engine Innovations (XCP SMB Features)
Supports Windows, CLI only
Extreme performance (~25x comparable tools)
Multiple layers of granularity (qtrees, subdirectories, criteria-based filtering)
Easy deployment (64-bit Windows host-based software)
Thanks. I'm still not sure whether the 25x perf increase is due to client-side multithreading or some specific integration, though. Since my last scan took ~40 hrs, a 25x perf increase would be awesome!
Since NetApp wrote both XCP and the CIFS server you are trying to query, there may be some built-in efficiencies that take advantage of the protocol.
The perf increase is due to multithreading.
I'd suggest using the NFS version for the scan, though. It's a bit more robust and, honestly, works a little better than the SMB version. Set up a Linux VM and install XCP there. In XCP 1.6, we have File Systems Analytics, which does a regular poll of the file system to report on things like file counts. This is read-only and won't change any data.
https://blog.netapp.com/xcp-data-migration-software
https://whyistheinternetbroken.wordpress.com/2020/07/28/xcp-161/
https://whyistheinternetbroken.wordpress.com/2019/08/30/using-xcp-to-delete-files-en-masse/
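To illustrate why multithreading helps with this kind of scan, here is a minimal Python sketch of a parallel directory walk that counts files and directories. It is not how XCP is implemented; it only shows the general idea of keeping many directory listings in flight so network latency overlaps. The names `scan_one`, `count_tree`, and `SNAPSHOT_DIRS` are made up for the example, and it assumes the snapshot directory appears as `.snapshot` (NFS) or `~snapshot` (SMB) so that only live data is counted.

```python
import os
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

# Assumption: snapshots are exposed under these directory names.
SNAPSHOT_DIRS = {".snapshot", "~snapshot"}


def scan_one(path):
    """List a single directory; return (file_count, subdirectory_paths)."""
    files, subdirs = 0, []
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    if entry.name not in SNAPSHOT_DIRS:
                        subdirs.append(entry.path)
                else:
                    files += 1  # regular files, symlinks, and anything else
    except OSError:
        pass  # unreadable directory: keep what was counted and move on
    return files, subdirs


def count_tree(root, workers=16):
    """Count files and directories under root, listing directories in parallel."""
    total_files = total_dirs = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {pool.submit(scan_one, root)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                files, subdirs = future.result()
                total_files += files
                total_dirs += len(subdirs)
                # Queue every newly discovered subdirectory for listing.
                pending.update(pool.submit(scan_one, d) for d in subdirs)
    return total_files, total_dirs
```

Calling `count_tree("/mnt/cifs_data_001")` (a hypothetical mount point) returns a `(files, directories)` tuple. The root itself is not counted as a directory, and the scan is read-only.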
Thanks for the information, guys. Multithreading is still a big boost, I'd imagine, so I will test and see. It's a pity we can't get at the Master File Table and use something like TreeSize.
I believe TreeSize would work, but it would likely crawl the entire filesystem over the protocol, and that would be single-threaded.
Hi,
If you just want to get a file count on the volume including snapshots...
cluster1::> volume show -vserver vserver1 -volume cifs_data_001 -fields files-used
vserver  volume        files-used
-------- ------------- ----------
vserver1 cifs_data_001 118
/Matt
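If you want to collect that number from a script, the column-style output above is easy to parse. Here is a small Python sketch that does so. `parse_fields_output` is a made-up helper, and it assumes that no field value contains whitespace and that the header row sits directly above the dashed separator, as in the output shown.

```python
def parse_fields_output(text):
    """Parse column-style `-fields` output into a list of dicts, one per row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Locate the dashed separator line; the header is the line just above it.
    sep = next(i for i, ln in enumerate(lines) if set(ln.strip()) <= {"-", " "})
    header = lines[sep - 1].split()
    rows = []
    for ln in lines[sep + 1:]:
        values = ln.split()
        if len(values) == len(header):  # ignore trailing summary or prompt lines
            rows.append(dict(zip(header, values)))
    return rows


sample = """\
cluster1::> volume show -vserver vserver1 -volume cifs_data_001 -fields files-used
vserver  volume        files-used
-------- ------------- ----------
vserver1 cifs_data_001 118
"""

rows = parse_fields_output(sample)
print(rows[0]["volume"], int(rows[0]["files-used"]))  # prints: cifs_data_001 118
```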
Thanks! Unfortunately I need to exclude snapshots, as I want the 'live' count, but thanks for the info.