My management has tasked me with searching all of the CIFS shares on all of my NAS devices for certain executable files, MP3s, iTunes files, etc. This could potentially be a very time-consuming job. I've looked at a number of utilities, both paid and free, and have found all of them to be lacking or incapable for one reason or another. I really don't like the idea of creating an index, and most of the "fast search" products try to build their index in RAM, which runs out of memory when you're searching TBs of data.
Has anyone else been tasked with this type of job? If so, what did you use for searching? Any recommendations for products/utilities/etc?
It's hard to tell without knowing your requirements, but you may want to look into NetApp's XCP. Among other things, it offers:
Scan - discovery and statistics of files and directories
Starting from any directory, XCP recursively reads all the subdirectories and can produce listings and reports in human-readable and machine-readable formats. Thanks to the matching and formatting capabilities, the reports can be highly customized to match any reporting needs. Any file attribute, such as the access time, owner, group, or size, can be used as a criterion for filtering the files in a report. Output formats include CSV, HTML, or plain text listing.
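To make the "filter on file attributes, write a CSV" idea concrete, here is a rough sketch in plain Python. This is not XCP and does not use its syntax; the function name write_report and its parameters (min_size, older_than_days) are made up for illustration. It assumes the share is reachable as an ordinary path from the machine running the script.

```python
import csv
import os
import time


def write_report(root, out, min_size=0, older_than_days=None):
    """Walk root and write one CSV row per file that passes the filters.

    out is any writable text stream. Returns the number of rows written.
    """
    writer = csv.writer(out)
    writer.writerow(["path", "size_bytes", "last_access"])

    # Files accessed more recently than the cutoff are skipped.
    cutoff = None
    if older_than_days is not None:
        cutoff = time.time() - older_than_days * 86400

    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            if st.st_size < min_size:
                continue
            if cutoff is not None and st.st_atime > cutoff:
                continue
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_atime))
            writer.writerow([path, st.st_size, stamp])
            count += 1
    return count
```

Rows are written as they are found, so nothing but the current directory listing is held in memory. To write to a file, open it with newline="" and pass the handle as out.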
ECX from Catalogic Software is your tool in that case. I am afraid you can't avoid using an index. The advantage of an index (catalog) is that you can run all your reports and file analytics offline. Indexing millions of objects can take some time, but normally you need this kind of report only once a month. So let the index policy run during the weekend and run your reports during the day without any impact on the storage controllers.
When you are dealing with that amount of data, and I am assuming millions of files, nothing will be speedy. I use PowerShell for this type of thing, and I would probably load the results into a hash table for fast lookups. For an easy, affordable third-party product, I would look at TreeSize Pro by JAM Software.
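For anyone who wants to try the scripted route, here is a minimal sketch of the scan-then-hash-table approach. It is written in Python rather than PowerShell, and the extension list and function names are placeholders I made up, not anything from a product. It streams through the tree and keeps only the matches in memory, so no full index is built.

```python
import os
from collections import defaultdict

# Hypothetical list of extensions to flag; adjust to your own policy.
TARGET_EXTENSIONS = {".exe", ".mp3", ".m4a", ".m4p"}


def find_files(root, extensions=TARGET_EXTENSIONS):
    """Yield (path, size_in_bytes) for matching files under root."""
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        if entry.is_dir(follow_symlinks=False):
                            stack.append(entry.path)
                        elif entry.is_file(follow_symlinks=False):
                            ext = os.path.splitext(entry.name)[1].lower()
                            if ext in extensions:
                                size = entry.stat(follow_symlinks=False).st_size
                                yield entry.path, size
                    except OSError:
                        continue  # entry vanished or access denied
        except OSError:
            continue  # directory could not be listed


def group_by_extension(matches):
    """Build a hash table mapping extension -> list of (path, size)."""
    table = defaultdict(list)
    for path, size in matches:
        ext = os.path.splitext(path)[1].lower()
        table[ext].append((path, size))
    return table
```

You would call it along the lines of group_by_extension(find_files(r"\\filer\share")), where the UNC path is a made-up example, and then look up table[".mp3"] to get every MP3 with its size. It will still be slow over millions of files, since every directory has to be listed over the network.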