As you rightly pointed out, FabricPool works at the WAFL block level, so it does not tier a cold file or folder in its entirety; rather, it tiers the cold WAFL blocks underneath those files/folders. WAFL works in 4KB blocks, and FabricPool groups them into 4MB objects (1,024 x 4KB WAFL blocks).
Do I really need to know where my file/folder is at a granular level? Because for the user it makes no difference. If cold blocks are read and become hot, only the requested blocks in the 4MB object are fetched; neither the entire object nor the entire file is written back. Only the necessary blocks are written.
A few interesting features of FabricPool:
1) By default, tiering to the cloud tier only happens if the local tier is more than 50% full. There is little reason to tier cold data to a cloud tier if the local tier is underutilized. Starting with ONTAP 9.5, even the 50% tiering fullness threshold is adjustable.
2) If tiered cold data is read by random reads, the cold data blocks on the cloud tier become hot and are moved back to the local tier. The process is seamless, and the file stays where it is as far as its file-system location is concerned.
3) If it is read by sequential reads, such as those associated with index and antivirus scans, the cold data blocks on the cloud tier stay cold and are not written back to the local tier, which is usually the desired behavior.
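As a sketch of where the threshold from point 1 lives: the following is a hedged example, not a definitive procedure. The parameter and command names are from recent ONTAP releases, and the aggregate and object-store names (`aggr1`, `my-sg-store`) are placeholders; verify the exact syntax against your ONTAP version.

```shell
# Hedged sketch: adjust the tiering fullness threshold (ONTAP 9.5+).
# Requires advanced privilege; aggregate/object-store names are placeholders.
set -privilege advanced
storage aggregate object-store modify -aggregate aggr1 \
  -object-store-name my-sg-store -tiering-fullness-threshold 40

# Verify the attachment:
storage aggregate object-store show -aggregate aggr1
```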
I am using the tiering policy to tier older data to StorageGRID. Read performance from StorageGRID is not that good, so I am wondering whether I can know in advance if the file or directory I am restoring is located on the StorageGRID tier or the performance tier, so that I can estimate how long the restore may take.
I can guess based on the minimum cooling days, but because tiering is block-based, a file or directory may sit partially in each tier, although much older data has a better chance of being located on StorageGRID. Those are all guesses; I am wondering if there is a way I can tell more accurately.
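To make the "estimate how long the restore may take" idea concrete, here is a rough back-of-envelope sketch. All the numbers are assumptions I made up for illustration (they are not ONTAP output, and the throughput figures are placeholders you would replace with your own measurements): split a file between the two tiers by its tiered fraction and divide each share by that tier's assumed sequential-read throughput.

```python
# Back-of-envelope restore-time estimate. Assumptions (not ONTAP data):
# - local_mibps / cloud_mibps are placeholder read throughputs in MiB/s
# - tiered_fraction is your guess at how much of the file is on the cloud tier

def estimate_restore_seconds(size_gib, tiered_fraction,
                             local_mibps=800, cloud_mibps=100):
    """Estimated read time for a file of size_gib GiB of which
    tiered_fraction (0..1) of the blocks sit on the cloud tier."""
    size_mib = size_gib * 1024
    local_mib = size_mib * (1 - tiered_fraction)   # served from SSD
    cloud_mib = size_mib * tiered_fraction         # fetched from StorageGRID
    return local_mib / local_mibps + cloud_mib / cloud_mibps

# A 100 GiB file that is 70% tiered, under the assumed throughputs:
print(round(estimate_restore_seconds(100, 0.7), 1))
```

The point of the sketch is only that the tiered fraction dominates the estimate once the cloud tier is much slower than SSD, which matches the experience described above.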
I am with you on this, and I totally understand your concern; it's valid. If the user is experiencing slowness, it could be a number of things. I believe we could approach it as follows:
Will: 1) Tweaking the number of cooling days (31 days): to see if that makes any difference. In that case, at least the file stays on the local tier (SSD) long enough to service any demand. Beyond that point, maybe all or some block ranges go cold and are tiered off. Still not bad.
2) According to the FabricPool concepts (I have neither implemented it nor am I an expert, but I am getting familiar with it through the TR): if the access pattern for the requested blocks is random, they are moved back to SSD; if not, they stay where they are. This is interesting because, technically, you have no control over files/folders at the logical file-system level; it is blocks that are being moved. Whatever algorithm is used here, we cannot really get a handle on it, and it is not in our control as admins. The only thing we can do is raise a ticket with NetApp and treat it as a FabricPool performance incident. Support has tools to get to the bottom of this (based on cooling days, file-access pattern, network load, system load, size, etc.), and we go from there.
The default tiering minimum cooling period is 2 days. You can modify it with the -tiering-minimum-cooling-days parameter at the advanced privilege level of the volume create and volume modify commands. Valid values are 2 to 63 days.
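Put together as a CLI sketch (the SVM and volume names `vs1`/`vol1` are placeholders; the parameter and privilege level are as described above):

```shell
# tiering-minimum-cooling-days is an advanced-privilege volume option,
# valid range 2-63 days. SVM/volume names are placeholders.
set -privilege advanced
volume modify -vserver vs1 -volume vol1 -tiering-minimum-cooling-days 31
volume show -vserver vs1 -volume vol1 -fields tiering-minimum-cooling-days
```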
I understand it is all done by ONTAP/FabricPool and the admin has no control over it. I am just hoping there are commands/tools to indicate which files/directories, or what percentage of them, are located on which tier, instead of blindly guessing.
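One hedged starting point for the per-volume (not per-file) view: ONTAP's volume footprint output breaks a volume's data footprint down by tier, so you can at least see how much of a given volume currently sits on the cloud tier. The volume name is a placeholder, and the exact row labels vary by release:

```shell
# Per-volume footprint by tier; "vol1" is a placeholder.
volume show-footprint -volume vol1
# Look for the "Volume Data Footprint" rows split between the
# performance tier and the cloud/capacity tier.
```

This still cannot tell you which individual files are tiered, since tiering happens at the block level as discussed above.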
I experimented with some older and newer files, above and below the cooling period. Some very old data should be old enough to have been fully tiered, yet the first-time restore speed was as fast as if it were on the performance tier, which is what made me raise the question.
My StorageGRID hardware is being installed as I write this, but I've been asked if there's a way to determine how much egress/ingress we're getting and what volumes those are in.
For example, it would be helpful to know that a volume needs a policy tweak because of a weekly or monthly job that causes a large ingress to take place. Or if a user is complaining about performance, and it's because we're processing data from SG instead of SSD.
We're obviously okay with some back and forth, but don't want it to be excessive.
What tools do we have to measure this? Something like ongoing bytes in/bytes out per volume, to go along with the total volume size and percentage migrated to SG, would be helpful.
StorageGRID can give you a report of egress/ingress rates per site/node. If you are interested in analyzing the amount of data that will be tiered by FabricPool, enable inactive data reporting on the aggregates, then query:
volume show -fields performance-tier-inactive-user-data,performance-tier-inactive-user-data-percent
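Sketched end to end (the aggregate name `aggr1` is a placeholder, and the statistics object name is a hedged guess from NetApp's FabricPool material; verify it on your release):

```shell
# 1) Enable inactive data reporting on the aggregate, then read the
#    per-volume inactive-data fields shown above.
storage aggregate modify -aggregate aggr1 \
  -is-inactive-data-reporting-enabled true
volume show -fields performance-tier-inactive-user-data,performance-tier-inactive-user-data-percent

# 2) Hedged: per-node object-store GET/PUT counters for ongoing
#    egress/ingress; verify the object name on your ONTAP version.
statistics start -object object_store_client_op
statistics show -object object_store_client_op
```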