Software Development Kit (SDK) and API Discussions

Build list of files on Infinite Volume

uzimmermann

For technical reasons in other parts of our company, we end up with "orphaned" files on an Infinite Volume. To find these files and archive or purge them, I have to build a list of files on a regular basis. We have an MD5 tree (/[0-9a-f]/[0-9a-f]/[0-9a-f]), and each bottom-level directory currently holds over 50,000 files. On a cold cache, our 2-node 8040 cluster needs at least 30 seconds to read a single directory (ls) at the best of times, but I have seen it take as long as 5 minutes. So even at 30 seconds per directory, building the list across all 4,096 directories takes 34 hours in the best case; usually it would take multiple days.
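To make the estimate concrete, here is a small back-of-the-envelope calculation in Python. It assumes the tree is fully populated (16 hex digits at each of 3 levels) and that directories are read one after another; the function names are only for illustration.

```python
# Rough estimate of a full listing pass, reading one directory at a time.
HEX_DIGITS = 16   # each level is one of 0-9a-f
TREE_DEPTH = 3    # /[0-9a-f]/[0-9a-f]/[0-9a-f]


def leaf_directories(fanout=HEX_DIGITS, depth=TREE_DEPTH):
    """Number of bottom-level directories in a fully populated tree."""
    return fanout ** depth


def sequential_hours(seconds_per_dir, dirs=None):
    """Hours needed to read every leaf directory sequentially."""
    if dirs is None:
        dirs = leaf_directories()
    return dirs * seconds_per_dir / 3600


print(leaf_directories())               # 4096
print(round(sequential_hours(30), 1))   # 34.1  (best case, 30 s per directory)
print(round(sequential_hours(300), 1))  # 341.3 (worst case seen, 5 min per directory)
```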

 

Even a Perl script that runs 16 threads, each doing an ls on a single directory, would still take a very long time.
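For reference, a minimal sketch of that kind of threaded listing, written in Python rather than Perl. This is not the actual script; the function names and the worker count default are only illustrative. It uses os.scandir, which returns entry names without issuing a stat call per file, so it collects names only. Whether it is meaningfully faster than ls on an Infinite Volume is untested.

```python
import os
from concurrent.futures import ThreadPoolExecutor

HEX = "0123456789abcdef"


def leaf_dirs(root):
    """Yield every /x/y/z bottom-level directory path under root."""
    for a in HEX:
        for b in HEX:
            for c in HEX:
                yield os.path.join(root, a, b, c)


def list_names(path):
    """Return the full path of every entry in one directory.

    Only names are read; no size or timestamp lookups are made.
    Entries are not filtered, so any subdirectories are listed too.
    """
    try:
        with os.scandir(path) as entries:
            return [os.path.join(path, entry.name) for entry in entries]
    except FileNotFoundError:
        # A leaf directory that does not exist simply contributes nothing.
        return []


def build_file_list(root, workers=16):
    """Yield entry paths from all leaf directories, read by a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for names in pool.map(list_names, leaf_dirs(root)):
            yield from names
```

Calling build_file_list on the volume's mount point and writing each yielded path to a file would produce the list. With 16 workers the 34-hour best case drops to roughly 2 hours only if the cluster can actually serve 16 directory reads in parallel at the same speed, which is an assumption.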

 

Can anyone think of a better way to extract a list of file names? We don't need any other attributes such as size, last modified etc.

 

 
