Hello all - I was wondering what you would recommend for the snap reserve setting on a volume holding LUNs that will be protected by SMVI. If you read all the TRs, they all say to set snap reserve to 0 for LUN datastores. But if you add SMVI into the mix, wouldn't you want a snap reserve?
Also, now that I think about it: what differences would there be between thick and thin LUNs?
For what it's worth, the 2 day NCIE-SAN prep class by Steve Botkin is great for this (covers thin provisioning, volume auto-grow, snapshot auto-delete, frac reserve, etc. in great detail with discussion).
And...I'd say the answer really depends on the customer....items like....
If doing thin provisioning, they're probably already trying to maximize space, so a good OM alerting setup and no snap reserve would probably make sense (you have to do thin provisioning in some form for LUNs anyway if using dedupe... part of why I like NFS)
If somewhat cautious, you can do some inspection of change rates (i.e., decide how many snapshots they want to keep, let that run for a week and see how much space is used, then set snap reserve somewhere above that)
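For that change-rate inspection, 7-mode's `snap delta` reports consumed space between snapshots, which gives you a number to size the reserve against. A rough sketch, assuming a hypothetical volume name and that your observed week of snapshots lands around 16% of the volume:

```shell
# Let a week of SMVI snapshots accumulate, then inspect the rate of change
# between them (vmfs_vol01 is a hypothetical volume name)
snap delta vmfs_vol01

# If the retained snapshots total, say, ~80 GB on a 500 GB volume (~16%),
# set snap reserve somewhat above the observed rate as a buffer
snap reserve vmfs_vol01 20
```

Verify the exact syntax against your ONTAP release; the point is just to measure before you pick a number rather than guessing.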
There are some other possibilities there, but they're not springing to mind right now. Mainly it's just a conversation to have... (higher space utilization but more admin work, versus lower space utilization but less admin watching).
Now....dunno if I shouldn't be answering this right now or not....
Andrew and Eric - Thank you both for your replies. It was a bit of thinking out loud for me. I am in the process of writing a blog article and as I was doing so, this thought popped into my head.
The more I work with all of this, the more I am of the mindset to thin provision everything and only manage the aggr space (with the proper tools of course!). Thick provisioning, while the most conservative way, seems like overkill to me for most instances.
Thank you again for your replies!
BTW Andrew, I'm jealous you went to Botkin's class! I wasn't able to make it because of a scheduling conflict. Maybe next time!
Hey Radek - Correct me if I'm wrong, but fractional reserve only kicks in on LUNs if "space reserved" is checked. What happens to frac reserve when it isn't checked? From what I've read, frac reserve is out the window at that point. I've gone over this a million times and read 10 million things on it, but I still don't have all of it in my head.
Let's take an example: I want to set up a volume to hold VMware VMs in LUNs, and I want to use SMVI and dedupe. Tell me if this is the optimal way to set it all up. Yes, you are running a risk of going offline; that is why you will be smart and use all the right tools to monitor it!
You would create the volume with no guarantee and to make it easy, let's put snap reserve at 0%. We'll go ahead and set an auto-grow/auto-delete policy on it and manage the space at the aggr level.
When creating the LUN, you would uncheck "space reserved". What happens to fractional reserve at this point? FilerView will say it is 100%, but that is misleading; the value is ignored. What if I modified it to something small, say 5%-10%, to give me a buffer? Again, will it matter in this scenario? I don't THINK so, but I don't know for sure; I haven't tested it. Again, this is all from what I've read. Let's assume it is ignored no matter what the value is.
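The two-step setup described above could look like the following from the 7-mode CLI. All names and sizes are hypothetical, and the comments reflect this thread's reading that fractional reserve only applies to space-reserved LUNs, so treat this as a sketch to verify against your ONTAP release:

```shell
# Volume: no space guarantee, snap reserve 0, autogrow + autodelete,
# space managed at the aggregate level
vol create vmfs_vol01 aggr1 500g
vol options vmfs_vol01 guarantee none
snap reserve vmfs_vol01 0
vol autosize vmfs_vol01 -m 750g -i 25g on
snap autodelete vmfs_vol01 on

# Per the discussion, fractional reserve is ignored for non-space-reserved
# LUNs; setting it to 0 explicitly avoids the misleading 100% display
vol options vmfs_vol01 fractional_reserve 0

# LUN: space reservation disabled ("space reserved" unchecked)
lun create -s 400g -t vmware -o noreserve /vol/vmfs_vol01/datastore1.lun
```

With guarantee none on the volume and noreserve on the LUN, neither consumes aggregate blocks until data is actually written, which is what pushes all the free space up to the aggr.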
Now that I've got a volume and a LUN and I'm set for maximum thin provisioning, I'm ready to set up my dedupe and snapshot schedules. Ideally, you want to run dedupe before a snapshot, because dedupe will only affect the active file system; it will not dedupe blocks locked in snapshots. Because of this, you want your dedupe and snapshot schedules to be as close together as possible. Say, once a day run dedupe, then take a snapshot.
What if the customer wants to take 3-4 snaps a day with SMVI? Can you run dedupe more than once a day? Even if you could, I certainly wouldn't recommend it because of the overhead during the actual process.
So, maybe you set up SMVI to take snapshots 3x a day but run dedupe every night just before the last one. I believe that way you are as efficient as possible, yet not overwhelming the controller by running dedupe a bunch of times.
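The nightly-dedupe-before-last-snap idea could be scheduled with the 7-mode `sis` commands; a sketch with a hypothetical volume and an assumed 23:00 run time chosen to finish before the last SMVI snapshot of the day:

```shell
# Enable dedupe on the volume and schedule it nightly at 23:00
sis on /vol/vmfs_vol01
sis config -s sun-sat@23 /vol/vmfs_vol01

# One-time scan of existing blocks after first enabling dedupe
sis start -s /vol/vmfs_vol01

# Check progress / savings
sis status /vol/vmfs_vol01
```

The exact hour is the tunable part: whatever gap you leave between the sis run and the SMVI job, blocks written after dedupe finishes get locked un-deduped in that snapshot until it expires.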
What do you think?
One more thing, after talking with Don Mann about this: snapshot autodelete will break SMVI, because you will be deleting the snaps outside of the application. But if you are out of space and you couldn't auto-grow, you already have big problems!
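If autodelete ends up enabled anyway as that last-resort valve, 7-mode does let you bias which snapshots it takes first. A hedged sketch; the `smvi` prefix is an assumption about how your SMVI snapshot names begin, and none of this makes autodelete "safe" for SMVI, it just makes its snaps the last to go:

```shell
# Delete oldest snapshots first when the volume trigger fires
snap autodelete vmfs_vol01 delete_order oldest_first

# Defer deleting snapshots whose names match the configured prefix
# (assumes SMVI-created snapshots are named with an "smvi" prefix)
snap autodelete vmfs_vol01 defer_delete prefix
snap autodelete vmfs_vol01 prefix smvi

snap autodelete vmfs_vol01 on
```

Even deferred, those snaps are still deleted if nothing else is left to free space, so SMVI's catalog can still end up out of sync exactly as described above.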
So in a nutshell, thin provisioning of both volumes & LUNs doesn't bring any negatives in my opinion (other than the need for careful aggregate free space monitoring).
Yeah, one more thing...
Some SnapManager products (not SMVI though) don't like disappearing snapshots, so setting snap autodelete at the volume level is not an option. A small fractional reserve is required (purely for free space headroom) to allow deleting snaps from within SnapManager, should the volume run out of space.
So the question is: what happens if we have this 'theoretical' 100% FR on a thinly provisioned volume with thinly provisioned LUNs?
Radek - Very good point! I will be doing an SMVI install sometime next month, so I'll be sure to test this scenario in my lab ASAP. We don't have SMVI in our lab yet, but we will be getting it soon. I am also curious what will happen if you did set an autodelete policy anyway. Sure, SMVI will freak out and probably need some repairs, but if your volume is full (and your aggr too, in this scenario) you have bigger problems. I would rather have SMVI break if my aggr is full than have everything go offline. Again, all in theory; I haven't tested all this.
In the meantime... Any smart people out there done this and have any idea what would happen?
I have been "preaching" this config for a while; we have it in prod, and we also implemented SMVI in the space of 12 days here in June, going from 0 daily snaps to 7 daily snaps. What we did: we set thin provisioning on the volume and LUN so that all free blocks get pushed into the aggr, which makes space management easier. We then added 1 SMVI snap every day to see what impact it had on the aggr (we turned off aggr snaps and the aggr snap reserve first). After 12 days and a bit of storage reshuffling we had 7 daily snaps. This enables us to back up 500+ hosts in 45 minutes, with restore times of 2 minutes for a host or a datastore. Pretty massive improvement coming from a backup solution (VCB) that never worked, with a backup window of 19 hours.
At the same time we also have vol autogrow and snap autodelete on. We have not had any issues with this since we implemented in June.
There is one caveat of course: if your volume hits its max growth size, then snaps will start to disappear. Snaps at this point should be considered backups, so when your backups disappear it's not the best situation. I recommend mirroring or vaulting your SMVI snaps offsite if you can.
Good stuff - it's priceless to see something working not in the lab but in a real-life, rather chunky environment.