My question is about the different CP type triggers on a NetApp filer. I have searched a lot and found good descriptions for most of them, but some explanations are a bit general.
Here is the list of CP types (as shown by the sysstat command) along with an explanation of the ones I already know. Please help me understand the rest (and correct me if I got anything wrong):
T - Time. CP occurs every 10 seconds since the last CP if no other trigger has caused it.
F - Full NVLog. The NVRAM is divided into two sections (4 when working in an HA pair configuration - half is a mirror of the HA partner). If one section fills up, the CP occurs and the data is flushed to disks; in the meantime the other half is used for incoming writes.
B - Back to back. While a CP is being committed, the second half of the NVLog fills up and needs to flush before the first one has finished. This situation causes latency problems and means that the filer is having a hard time keeping up with the write load.
b - I need your help with this one. Everything I have read only states that this is also a back to back CP that is worse than B, but no one explains exactly what the difference is and when it is shown instead of B.
S - Snapshot. Right before the filer takes a snapshot it commits a CP so that it will be in a consistent state.
Z - I need your help with this one as well. Everything I found just says that this is a CP that happens in order to sync the machine and happens before snapshots. So what is the need for this one if we have S? What is the difference between them?
H - High water mark. I AM NOT SURE I GOT THIS ONE CORRECT BUT - When there is a lot of changed data in the memory buffers (RAM not NVRAM!) the filer commits a CP in order to flush and get the buffers clean.
L - Low water mark. I AM NOT SURE I GOT THIS ONE CORRECT BUT - When there is little space left in the memory buffers (RAM not NVRAM!) the filer commits a CP in order to flush and get the buffers clean. So the difference between an L CP and an H CP is that the H CP is about a changed-data threshold and the L CP is about data in the buffers as a whole (if I got it right).
U - flUsh. When an application using asynchronous writes asks that its data be flushed down to persistent storage.
V - low Virtual buffers. I have no idea what that one means, help?
M - low Mbufs. I have no idea what that one means, help?
D - low Datavects. I have no idea what that one means, help?
N - max entries on NVLog. What the hell is the difference between this one and F?
So, in summary I need help with:
Difference between B and b (and a real one - not that b is worse)
Difference between S and Z
Difference between F and N
Any information about V, M & D types
A validation that I got things right, specifically L, H and U, would be appreciated
T - timer. Yup - you hit that on the head - force one every 10 seconds if one hasn't happened for another reason.
F - Full NVLog - this is measured in terms of capacity, which is based on the size of the requests being captured.
B - Back to back - bad, as you describe, cuz the flushing of the NVRAM isn't fast enough before something else happens to require another flush. Could happen due to disk performance, say a bad aggregate layout where one disk is so hot that it limits total IO to a raid group, or it could be that the system is just being driven way too hard on incoming IO. Typically it's something on the back end though.
b - "Deferred" back to back - worse, cuz the NVRAM wanted to flush again but couldn't right away due to some other issue. "B" is bad enough - I just finished a flush and I'm immediately asked to start another one. "b" is more like I have multiple conditions trying to trigger a CP but I still can't go fast enough. This could be CPU or other domain bottlenecks as well. Basically if you are getting a bunch of "B's" and it gets even worse it starts throwing "b's" and if you aren't calling for help at this point you're watching certain parts of your kit start to glow with the overload.
S - snapshot. Easy one.
Z - internal sync. Basically the same thing as S but not caused explicitly by someone or the system doing a snapshot operation. I know, why have two? But this way you can call out the difference between totally general system operations and snapshot operations, especially those that are user invoked. If you are getting massive CP's due to snapshots, you might start with the tools that are requesting those 250 snapshots per second across the whole system, as opposed to looking first toward internal operations for the Z type CP.
H - High water mark
L - Low water mark - Easy to get these confused, because the high and low points are measured against different things. H is high against "modified" buffers. So if we've modified a large number of RAM buffers, we flush. L is low against "available" buffers. So if we've used a lot of buffers, but not necessarily modified a lot of them, this one flushes to at least make those we have modified available for reuse. The Low water mark lets the system keep read data alive in RAM buffers just a bit longer if some of the modified buffers can be made available before the cache has to discard data.
U - yup - something doing a flush
V - just another buffer type. Couldn't explain it better.
M - mbufs are network buffers (I think)
D - Datavecs - yet another type of buffering 😄
N - Log entries. NVRAM has two limits - a physical size maximum and the number of individual entries - consider a table of contents. F is the physical size, this one is the table of contents.
All those buffer type differentials are to help support track down specifically where they should start looking rather than have just one type saying a buffer filled.
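The F versus N distinction above (physical size versus "table of contents") can be pictured with a toy model. This is only a sketch to show the idea of two independent limits: the class name, the limit values and the order of the checks are all made up for illustration and say nothing about how Data ONTAP actually implements the NVLog.

```python
from dataclasses import dataclass


@dataclass
class ToyNVLog:
    """Toy model of one NVLog half with two independent limits.

    Both limits are invented illustration values, not real NetApp figures.
    """
    max_bytes: int = 1000   # hypothetical capacity limit (the "F" trigger)
    max_entries: int = 5    # hypothetical entry-count limit (the "N" trigger)
    used_bytes: int = 0
    entries: int = 0

    def log_write(self, size):
        """Record one write; return the CP type it would trigger, or None."""
        self.used_bytes += size
        self.entries += 1
        if self.used_bytes >= self.max_bytes:
            return "F"  # ran out of space
        if self.entries >= self.max_entries:
            return "N"  # ran out of entry slots
        return None


def first_trigger(sizes):
    """Feed writes into a fresh toy log and return the first CP type hit."""
    log = ToyNVLog()
    for size in sizes:
        trigger = log.log_write(size)
        if trigger is not None:
            return trigger
    return None
```

In this model a few large writes run out of space first (F), while many tiny writes run out of entries first (N), even though the log is nowhere near full by size.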
U is a flush operation in response to a client request through a data protocol, for example the "fsync" function on a file handle.
Z is an internal storage system generated "flush" if you will. It covers pretty much any reason that DOT wants to do a flush to disk due to logical processing needs not directly tied to the other specific external triggers. For example, let's say a data commit to a volume requires more space than is available and the volume is set to try to delete snapshots before auto-expanding the volume. If a CP is needed during the delete snapshot processing a "Z" type commit point would be used. This is different than the user requesting a specific snapshot be created which would be the "S" type.
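For anyone who wants the whole list in one place, here is the thread so far condensed into a small lookup table. The wording of each entry is just a summary of the posts above, not official NetApp text, and the names `CP_TYPES` and `describe` are made up for this sketch.

```python
# Summary of the CP type letters as described in this thread.
# The descriptions paraphrase the posts above; they are not official text.
CP_TYPES = {
    "T": "timer: 10 seconds since the last CP",
    "F": "NVLog full (physical capacity)",
    "N": "NVLog entry count limit reached",
    "B": "back to back CP",
    "b": "deferred back to back CP",
    "S": "snapshot creation",
    "Z": "internal sync generated by the storage system",
    "H": "high water mark on modified buffers",
    "L": "low water mark on available buffers",
    "U": "flush requested by a client (for example fsync)",
    "V": "low virtual buffers",
    "M": "low mbufs",
    "D": "low datavecs",
}


def describe(code):
    """Return the thread's short description for a CP type letter."""
    return CP_TYPES.get(code, "unknown CP type")
```

Note that the lookup is case sensitive on purpose, since B and b are different types.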
Need to add this disclaimer:
The underlying thing behind all CP types is that they really don't matter until you have a performance issue. Typically you'll feel/see a performance issue, then you'll analyze performance of which only one element is looking at CP rates. If CPs are going on like crazy, then the type can point you to a potential trouble spot. But, watching CP's for patterns only and using that pattern to define a "problem" is backwards. CP's could be happening more than once per second and you may not have a performance problem - just means you are well utilizing capabilities of your storage! Potential problem issues are going to manifest more readily in workload throughput and/or latencies which can then be investigated and remediated perhaps using CP frequency/type as just one of the things to check.
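If you do end up looking at CP types as one part of a performance investigation, a simple tally is usually enough to see whether B and b stand out. The sketch below assumes you have already pulled the CP type column out of your sysstat output into a list of strings; it does not parse sysstat itself. It also assumes the first letter of each value is the type and that anything not starting with a letter is not a new CP. Check both assumptions against the sysstat man page for your version.

```python
from collections import Counter

# The two "CP induced" types discussed in this thread.
BACK_TO_BACK = {"B", "b"}


def tally_cp_types(samples):
    """Count CP type letters from already-extracted CP type column values.

    Assumption: the first character of a value is the type letter, and
    values that do not start with a letter (such as "-") are not new CPs.
    """
    counts = Counter()
    for value in samples:
        value = value.strip()
        if not value or not value[0].isalpha():
            continue  # not counted as a new CP under the assumption above
        counts[value[0]] += 1
    return counts


def back_to_back_share(counts):
    """Fraction of counted CPs that were of type B or b."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[t] for t in BACK_TO_BACK) / total
```

A high share only tells you where to look next. It is not a problem definition by itself, for the reasons given above.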
Correct me if I'm wrong but it seems that I got the L and H thing right, have I? Because from reading your explanation it seems so but you still bothered explaining so I thought perhaps I am missing something.
If you do happen to know what the D and V buffers are for I would be happy to understand them.
One thing doesn't seem right to me though: what is the connection between network buffers and a CP?
Do the network modules hold RAM of their own that is used as a buffer instead of the main RAM? If not, then I really have no idea what you meant.
I have read the sysstat manual page and it is not really informative. I continued to search other sources, and when I failed to find the information (although I did find good explanations for some types that I hadn't understood before) I posted this discussion.
I also thought it would be good to have a good explanation of all the types in one place for the use of others.
Yes - on re-reading you do have the H/L items right in the original post. You see how it's easy to get confused by those two?
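Since H and L are so easy to mix up, here is a tiny toy model of the two checks. It is only meant to show that they are measured against different things. The function name, the threshold values and the order of the checks are invented for illustration and are not taken from Data ONTAP.

```python
def water_mark_trigger(total, modified, in_use,
                       high_modified=0.5, low_available=0.1):
    """Return "H", "L" or None for a toy buffer pool.

    total    - number of buffers in the pool
    modified - buffers holding changed data (a subset of in_use)
    in_use   - buffers holding any data, modified or not
    The two thresholds are made-up illustration values.
    """
    available = total - in_use
    if modified / total >= high_modified:
        return "H"  # too many modified buffers
    if available / total <= low_available:
        return "L"  # too few available buffers
    return None
```

With 100 buffers, 60 modified gives H. With only 20 modified but 95 in use (mostly read data) you get L instead, which matches the point about L being triggered by how few buffers are left rather than by how many were changed.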
For the D/V I really don't have any more details other than they are some kind of buffer. If I knew that detail at the appropriate level I probably couldn't really talk about it. I know that the "D" type buffers were added sometime in the 7.x DOT code line, perhaps early on in 7.x. The concept of a "vFiler" or "virtual Filer" was in the 7.x code line as well - perhaps the V type buffers have something to do with that? total guess on my part.
My comment on "mBufs" perhaps having something to do with Network is based in history - UNIX like systems used "mBufs" as the name of kernel IPC memory buffers, typically used in support of network and socket communications. Logically, if you are receiving a lot of data and the mBufs are filling as a result and getting low you might want to commit NVRAM early so you can dynamically adjust priority to push the data through quickly to the freed NVRAM - especially if the allocated mbufs are larger than current available space in NVRAM to accept the data. Again - just another way of identifying a very specific use case for what might be backing up in the system as opposed to well, we're full now so start a CP because we're over-committed due to not yet processed incoming network traffic as opposed to some other reason.
In practice, the only CP types I typically see besides T or F, even in high IO configurations, are Z and S. Sometimes "B" of course - those are generally a mess. I can't recall any times I've had to deal with the other detail types directly. Systems that pay close attention to best practices will not have to worry about CP types until they just don't have the actual horsepower/space to keep going. Having a whole lot of CP's going is never really a bad thing - means your system is being used! When you get to the CP induced CPs though (B and b) then you have a serious concern, and you'll likely find a best practice wasn't used in laying out the disk structure, or more typically wasn't followed after some later change/addition to the system.
The only down side with the manual pages is that they tease a little - they tend to give you just enough to whet your appetite, but in this case the detailed information isn't specifically useful unless you also have a deep understanding of the performance analysis needed to drill down into underlying issues. For instance, generate a perfstat, then read through it and understand everything that's in there. If you are comfortable with that, then the CP information is very useful and also very understandable based on what's in the sysstat manual page. But without NetApp's internal handy tools to break it all down for you - not exactly easy reading. But that's the level we're on with CP types - they guide the next level of investigation, but aren't really something to try to configure around from a day to day perspective. Definitely one of the chicken and egg type situations.
So that last is my (usually) wordy way of saying that @ekashpureff is right - the info in the sysstat man page is really all you need. He generally is on these things - he's been doing and teaching this stuff for quite a while now.