
Using S3 Select for StorageGRID-side filtering of CSV content


This isn't really a question - I just wanted to share a practical example of using S3 Select in StorageGRID 11.6 to find certain NetApp product configurations that meet my criteria.


When you have many possible configurations and want to find a few that meet your criteria and then pick one, you could put them all in Excel and use filtering, but if there are 100,000 of them, that may be slow.


S3 Select lets you do this by sending S3 Select SQL queries to the StorageGRID S3 API endpoint. It's simple and anyone can do it without installing any software (even the script itself could be on StorageGRID and that's the only thing you'd have to download).


Anyway, here's an example: find me all configs which need fewer than 25 units and contain between 120 and 130 of something.

(The units here are storage products and the numbers represent capacity.) Depending on the query you can get back just a handful of results and review them even without fancy formatting.
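
For anyone who wants to try something similar, here is a minimal sketch of what such a query could look like from Python with boto3. This is not the script from my screenshot: the column names `units` and `capacity`, the inclusive 120-130 range, and the endpoint, bucket and key arguments are all hypothetical placeholders, and it assumes the CSV has a header row.

```python
def build_query(max_units=25, cap_low=120, cap_high=130):
    """Build an S3 Select SQL expression for the example criteria.

    The column names `units` and `capacity` are hypothetical; replace them
    with the header names used in your own CSV file.
    """
    return (
        "SELECT * FROM S3Object s "
        f"WHERE CAST(s.units AS INT) < {int(max_units)} "
        f"AND CAST(s.capacity AS INT) >= {int(cap_low)} "
        f"AND CAST(s.capacity AS INT) <= {int(cap_high)}"
    )


def run_query(endpoint_url, bucket, key, expression):
    """Send the expression to an S3 endpoint and return matching rows as text.

    endpoint_url, bucket and key are placeholders for your own grid.
    """
    import boto3  # imported here so build_query() works without boto3

    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    resp = s3.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression=expression,
        # Treat the first CSV line as column names so s.units etc. resolve
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )
    chunks = []
    for event in resp["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    # Join the raw bytes first so multi-byte characters are not split
    return b"".join(chunks).decode("utf-8")


if __name__ == "__main__":
    print(build_query())
```

Calling `run_query("https://sg.example.com:10443", "configs", "configs.csv", build_query())` would return only the matching CSV rows; the URL, bucket and object names in that call are made up for illustration.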




For this purpose I usually get 10-15 results (out of a possible 12,000), so I don't sort or group the script output, although that would be possible. The green circle below marks the result I liked most: 24 "small" nodes that meet my capacity conditions.




At the very bottom you can see two important factoids:

- The entire file is 600kB, but my query returned only 4kB, saving egress bandwidth

- It took just 1.7 seconds to fetch these results. If I tried to do this in a sizing tool, it'd probably take me 5 minutes to run 2-3 scenarios based on educated guesses.
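
If you want numbers like these from your own script, here is a rough sketch of how they could be pulled out of the response. It assumes the endpoint sends the same `Records` and `Stats` events that boto3 exposes in `resp["Payload"]` for `select_object_content`; the helper name `summarize_events` is made up for this example.

```python
def summarize_events(events):
    """Collect result rows and byte counters from an S3 Select event stream.

    `events` is any iterable of event dicts shaped like the ones boto3
    yields from resp["Payload"]. Returns (text, scanned, returned, saved),
    where `saved` is the fraction of scanned bytes that was not returned.
    """
    chunks = []
    scanned = 0
    returned = 0
    for event in events:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
        elif "Stats" in event:
            details = event["Stats"]["Details"]
            scanned = details["BytesScanned"]
            returned = details["BytesReturned"]
    saved = 1 - returned / scanned if scanned else 0.0
    return b"".join(chunks).decode("utf-8"), scanned, returned, saved


if __name__ == "__main__":
    # Fake events with round numbers close to the ones mentioned above
    fake = [
        {"Records": {"Payload": b"24,small,126\n"}},
        {"Stats": {"Details": {"BytesScanned": 600000,
                               "BytesProcessed": 600000,
                               "BytesReturned": 4000}}},
    ]
    text, scanned, returned, saved = summarize_events(fake)
    print(f"scanned={scanned} returned={returned} saved={saved:.1%}")
```

To time the request, wrap the call and the loop over the events in `time.perf_counter()` readings, since the rows only arrive as the event stream is consumed.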


It would be easy to improve that script or even build a single-page, JavaScript-driven app that queries StorageGRID and creates fancy output.



Thank you very much for sharing, @elementx!




Team NetApp
