Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

Hi,

I have a 600GB volume that contains all CIFS data for an environment, including home directories, roaming profiles, TS profiles and general business data. I'm planning to move all this data to an ASIS-enabled volume, but I am unsure whether there would be any benefit from a de-duplication point of view in splitting the volume into multiple volumes (i.e. one for home directories, one for profiles, one for business data).

From my understanding, there is no need to do this (surely the data would de-duplicate just as well whichever volume it is in?), but a colleague has recommended that I do.

There are obviously other considerations such as snapshot frequency, future snapmirror options, flexclones etc., but from a de-dupe point of view, is there any benefit to splitting the like-data into separate volumes?

Any suggestions welcome!

Jon

Re: Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

So....a few thoughts here....

  • You'll get the same overall space savings whether the data is in a single volume or multiple volumes.
  • Given that, reasons to split out would be....
    • you're approaching the maximum volume size for deduplicated volumes (doesn't look like an issue here -- 600 GB is well within the 7.3.x max for all models)
    • dedup is beating up your disks too much while running each night (i.e. because it has to work through a larger data set all at once)
    • dedup is running too long each night (kind of a corollary to the point above but distinct I think)
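
A toy model of the first point may help. It is only a sketch: it assumes dedupe matches identical fixed-size blocks within a single volume and never across volumes, and the block size, the sample data and the helper names (`total_blocks`, `unique_blocks`, `blocks_saved`) are all made up for illustration. Under those assumptions, splitting never increases the savings, and the savings are the same only when duplicates do not cross the split.

```python
import hashlib

BLOCK_SIZE = 4096  # assumption: fixed-size 4 KB blocks, chosen for illustration


def total_blocks(volume):
    """Count the blocks in a volume (a list of byte strings) before dedupe."""
    return sum(-(-len(data) // BLOCK_SIZE) for data in volume)


def unique_blocks(volume):
    """Count the distinct block fingerprints within a single volume."""
    seen = set()
    for data in volume:
        for offset in range(0, len(data), BLOCK_SIZE):
            seen.add(hashlib.sha256(data[offset:offset + BLOCK_SIZE]).digest())
    return len(seen)


def blocks_saved(volumes):
    """Blocks saved when every volume is deduplicated on its own."""
    return sum(total_blocks(v) - unique_blocks(v) for v in volumes)


# Hypothetical data: "common" repeats only inside the profiles,
# "shared" appears in both the profiles and the home directories.
common = b"\x01" * BLOCK_SIZE
shared = b"\x02" * BLOCK_SIZE
profiles = [common + b"\x0a" * BLOCK_SIZE, common + b"\x0b" * BLOCK_SIZE]
homes = [b"\x14" * BLOCK_SIZE, b"\x15" * BLOCK_SIZE]

# Case 1: duplicates stay within like data.
like_single = blocks_saved([profiles + homes])
like_split = blocks_saved([profiles, homes])

# Case 2: one block is duplicated across data types.
cross_single = blocks_saved([profiles + [shared] + homes + [shared]])
cross_split = blocks_saved([profiles + [shared], homes + [shared]])

print(like_single, like_split)    # 1 1
print(cross_single, cross_split)  # 2 1
```

In this model the single volume saves at least as many blocks as the split layout in both cases, which is why the reasons to split listed above are about volume size limits and run time rather than space.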

I would be curious to know your colleague's thinking though....

Re: Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

Thanks for the reply Andrew and for confirming my understanding.

Re: the points you raise - the size of the volume is well within the dedupe limit for a FAS3040 on 7.3, as you say, and I'm not concerned about the disk performance impact or how long it will take as a single volume rather than split out - we have the performance and time window to spare ...

I will probably look at moving all the data to a new volume anyway, so that I can keep the old volume with snapshots until they expire, but I don't see the need to split the data into multiple volumes.

Thanks again,

Jon

Re: Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

Hi Jon,

You don't have to move the volume to enable SIS. I am not sure why you would want to go through that exercise; is there a reason?

You can simply turn on SIS and keep it running until all your snaps have expired. Once they have rotated you'll start to see real gains; before that your gains will be smaller.

If you have got snapmirror or other snap products running, schedule them to occur AFTER SIS is done running. That will make sure you push less data across the wire and propagate your SIS gains across to your snapmirror destinations etc. Having said this, snapmirror sync is not supported with SIS, last I heard. Also, if your volume has got read allocation on, it's not supported with SIS. If you are serving CIFS from a vfiler, it's only supported as of ONTAP 7.3.x.

Cheers,

Eric

PS: The bigger the volume to dedupe the better, I think, as you can only have X number of dedupe operations running at the same time.

Message was edited by: eric.barlier

Re: Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

Hi Eric, thanks for the reply.

The main reason for moving to a new volume was because we hold 14 nights of snapshots and did not want to wait that long to see the space savings. This is more of a political reason than a technical one, I know (particularly since the net effect would be to increase the aggregate usage until we deleted the first volume), but we were keen to be able to report the space savings as soon as possible after the change.
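
The trade-off can be put into rough numbers. The following is a minimal sketch, not a sizing tool: the 30% savings figure is purely hypothetical, the function names are invented for illustration, and the model ignores data churn and the snapshot overhead on the old volume. It assumes the last snapshot taken before dedupe pins every original block until it expires after 14 nights.

```python
def deduped_size(old_gb, savings_pct):
    """Size of the data after dedupe, using integer percent arithmetic."""
    return old_gb * (100 - savings_pct) // 100


def in_place_usage(old_gb, savings_pct, retention_nights, night):
    """Usage when dedupe is enabled on the existing volume (step model)."""
    # Assumption: pre-dedupe snapshots hold all original blocks until
    # the last of them expires, so no space is released before then.
    if night < retention_nights:
        return old_gb
    return deduped_size(old_gb, savings_pct)


def migrate_usage(old_gb, savings_pct, retention_nights, night):
    """Aggregate usage when the data is copied to a new deduplicated volume."""
    new_gb = deduped_size(old_gb, savings_pct)
    # The old volume is kept until its snapshots expire, then deleted.
    if night < retention_nights:
        return old_gb + new_gb
    return new_gb


OLD_GB = 600          # from the original post
RETENTION = 14        # nights of snapshots, from the post above
SAVINGS_PCT = 30      # hypothetical dedupe savings, for illustration only

for night in (0, 13, 14):
    print(night,
          in_place_usage(OLD_GB, SAVINGS_PCT, RETENTION, night),
          migrate_usage(OLD_GB, SAVINGS_PCT, RETENTION, night))
# 0 600 1020
# 13 600 1020
# 14 420 420
```

Under these assumptions the new volume shows its savings on the first night, but the aggregate carries both copies until the old volume is deleted, and both approaches end at the same usage once the snapshots have expired.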

Thanks for your points about reallocate and snapmirror. While I understand that having reallocate running on an ASIS-enabled volume doesn't make sense, I was still considering running a one-off physical reallocation on the containing aggregate, on the basis that it had previously come very close to capacity and was therefore likely to have significant fragmentation.

I have yet to run a measure to determine how bad the fragmentation is, but assuming it is highly fragmented, would you agree that would be a sensible thing to do?

Cheers,

Jon

Re: Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

Hi Jon,

You could and probably should run reallocate once in a while. No harm running it before you enable SIS. I might have to look into this myself at my end ;-)

Eric

Re: Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

For what it's worth, the general maximum number of simultaneous SIS/dedup operations is 8 (lower with the 2020 I believe though can't remember offhand).
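
As a rough illustration of what that limit means for scheduling, here is a small sketch that groups volumes into "waves" of at most 8. It assumes every operation takes about the same time and that operations beyond the limit simply wait for the next wave; the function name and the volume names are hypothetical.

```python
def schedule_waves(volumes, max_concurrent=8):
    """Group volumes into waves of at most max_concurrent dedupe operations."""
    return [volumes[i:i + max_concurrent]
            for i in range(0, len(volumes), max_concurrent)]


# Splitting the CIFS data three ways still fits in a single wave.
split_volumes = ["vol_home", "vol_profiles", "vol_business"]
print(len(schedule_waves(split_volumes)))  # 1

# A hypothetical controller with 10 dedupe-enabled volumes needs two waves.
many_volumes = [f"vol{n}" for n in range(10)]
waves = schedule_waves(many_volumes)
print([len(w) for w in waves])  # [8, 2]
```

With only a handful of volumes the limit is not a constraint; it starts to matter once the number of dedupe-enabled volumes on a controller exceeds it.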

Re: Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

I may be missing a good thread on this elsewhere in the communities and my apologies if so but....

Do you have any good rules of thumb for running sis and reallocate on the same volume? I have one customer where reallocate greatly helped with performance, but they would like to run sis as well (VMware VMs via NFS with a good bit of empty space inside the vmdks and duplicate OS files).

Re: Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

I don't think it's supported. I can't find the official doc right now, but it's in my checklist as: not supported.

We are on 7.2.5.1 though.

Eric

Re: Whether to split CIFS volume into multiple volumes of like-data for de-dupe purposes?

Gotcha....thanks for the info. Any further details as you have them would be great....this is an item I've been meaning to track down for a while but just haven't gotten to it yet.

The customer is on 7.2.6.1P8 for what it's worth as well.