We are getting alerts from UM about volumes running low on inodes. We have volumes that hold tens of millions of documents, and we generally just increase the inode count by 20% in proportion to the volume size. Earlier this month, my co-worker called NetApp support and asked for guidance on this exact issue. He was told the following:
increasing the iNode count should be option #2, whereas option #1 would be to create a new volume and/or move data off of the volume. With that said, increasing the iNode count is supported, and it is OK to increase the iNode count 2 times, by 20% each time.
Have any of you had issues increasing the inode count more than twice? Does anyone know why option 2, increasing inodes, is not recommended?
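For reference, the arithmetic behind "2 times, by 20% each time" can be sketched as below. This is only an illustration of how the increases compound (to 44% over the starting count); the function name and the starting count are made up for the example and do not come from NetApp.

```python
def grow_inode_count(current: int, percent: int = 20, times: int = 2) -> int:
    """Apply `times` successive increases of `percent` to an inode count.

    Hypothetical helper for illustration; integer math avoids float rounding.
    """
    for _ in range(times):
        current = current * (100 + percent) // 100
    return current


# Example with an assumed starting limit of 1,000,000 inodes.
after_one = grow_inode_count(1_000_000, times=1)   # 1,200,000
after_two = grow_inode_count(1_000_000, times=2)   # 1,440,000
print(after_one, after_two)
```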
Sometimes there are guidelines (such as KBs) that give a generic recommendation because the alternative may lead to performance issues later on. Now, the question is what type of performance issue? That is generally covered in the internal KB (I guess).
Technically, you can have a maximum of 1 inode per 4 KB WAFL block (i.e. the upper limit for inodes in a single volume); however, by default the ratio is 1 inode per 32 KB of volume space.
Inodes are allocated according to the size of the volume, but if the volume holds a very large number of small files, it will run out of inodes well before it runs out of 4 KB blocks.
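To put rough numbers on those two ratios, here is a small sketch. It assumes only the figures mentioned above (one inode per 32 KB by default, one per 4 KB block at most) and ignores any other rounding or platform caps that ONTAP may apply; the function names are hypothetical.

```python
WAFL_BLOCK_BYTES = 4 * 1024          # 4 KB WAFL block
DEFAULT_BYTES_PER_INODE = 32 * 1024  # default ratio: 1 inode per 32 KB


def inode_limits(volume_bytes: int) -> tuple[int, int]:
    """Return (default inode count, theoretical maximum) for a volume size."""
    default_inodes = volume_bytes // DEFAULT_BYTES_PER_INODE
    max_inodes = volume_bytes // WAFL_BLOCK_BYTES
    return default_inodes, max_inodes


def inodes_exhausted_first(avg_file_bytes: int) -> bool:
    """True if files are, on average, smaller than the space budgeted per inode,
    so the inode limit is hit before the volume fills (rough estimate only;
    metadata and snapshot space are not modeled)."""
    return avg_file_bytes < DEFAULT_BYTES_PER_INODE


# Example: a 1 TiB volume.
default_inodes, max_inodes = inode_limits(2**40)
print(default_inodes)  # 33554432
print(max_inodes)      # 268435456
print(inodes_exhausted_first(8 * 1024))  # True: 8 KB average files
```

So under these assumptions a 1 TiB volume starts at about 33.5 million inodes, and a workload averaging less than 32 KB per file will hit that limit before the volume is full.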
My personal experience: I have increased the inode count for a couple of specific Oracle volumes many times, sometimes by 5%, sometimes by 20%, and have had no issues at all. As you may already know, as the inode count increases, so does the size of the inode file, a special metadata file that stores an entry for every allocated inode. A typical Linux-like system only loads in-core inodes into memory, which means ONLY files that are open, and I assume the same is true for ONTAP. Since not all of the 'trillion' files will be open at the same time, we can rule out memory issues (from caching the in-core inode file), and depending on the model the overhead could be trivial.
However, a performance issue could come into play if you are dealing with NDMP backups/restores, or whenever the system walks the inode file to determine which files need to be backed up or which files have changed between snapshots. This is one area where a large inode file (especially if the files are not spread well across directories) can slow your backups/restores considerably. I don't know which other areas it could impact, but apart from this I don't see any issue with increasing the inode count (unless the volume usage itself is between 80% and 100%).
We asked this of NetApp in the past and the guidance we've received is that on SSD-based volumes, there is no performance impact and it's okay to make your inode count large. We have a couple of volumes with >200M files used and another half dozen with >100M.
Breaking the data up into multiple volumes provides no benefit if your application has to search all of the volumes looking for the files. The best thing you can do to maximize your application performance is to lay out your file system so that the directories you need to search stay small. A single directory of 10M files will probably give you bad performance even though your overall inode count is low.
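One common way to keep directories small is to fan files out into subdirectories derived from a hash of the file name. The sketch below is just an illustration of that idea, not something NetApp prescribes; the root path, file name, and function name are hypothetical, and it assumes the application can always recompute the path from the name.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root: str, filename: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a file name to root/<xx>/<yy>/filename using hex digits of its SHA-256.

    With levels=2 and width=2 there are 256 * 256 = 65,536 leaf directories,
    so 10M files average roughly 150 files per directory.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, filename)


# The same name always maps to the same location, so lookups need no search.
path = sharded_path("/vol/docs", "invoice-000123.pdf")
print(path)
```

The trade-off is that the directory tree no longer reflects anything meaningful to a human, so this only fits applications that look files up by name rather than by browsing.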
It really depends on the workload. If it's more stale data, with low IOP profile and not much throughput it can be ok. But just because you *can* do something, doesn't mean you should.
If you're having to increase the inode count, this might be a good candidate for FlexGroups. You *can* upgrade to 9.7 to get FlexGroup conversion, but the volume must not be full and there needs to be enough ingestion of new data to replace existing data. It's a very specific workload and you really need to be talking to the account team about it.
Converting to a FlexGroup will result in some features no longer being available (e.g., logical space enforcement).
FlexGroup conversion is extremely limited with no ability to balance the space unless you want to rewrite. You basically can't do it at all unless you have the ability to drastically increase the space available to the user, trust the user to not use that space, rewrite the majority of the data, and then shrink your constituents. They're also not for small volumes - they fundamentally break down if the constituent sizes are too small.
FlexGroups have a use case but they're not a general-purpose tool today and the conversion process even less so.
Great point, EWILTS. This is a great reason to move to FlexGroups, but there are a few caveats, so it's best to work with the account team to set this up and ensure you get the best use out of this wonderful technology.