
reallocate vs. extent vs. read_realloc

Today there are at least three mechanisms for keeping a volume defragmented. I would appreciate a discussion of the relative merits and applicability of each. If anybody can share real-life experience, that would be much appreciated.

This is not about the impact of snapshots on defragmentation, because that is the same for all three methods.

1. reallocate (-p).

Pros:

  • can be scheduled at the most suitable time to avoid performance impact. It can also be scheduled immediately before sequential read activity (e.g. backup or database verification) to get the best defragmentation level.

Cons:

  • additional administrative overhead. One has to monitor and adjust the reallocation schedule against controller load and other activities
  • potential performance impact due to extra read/write overhead (seems to be low enough)

2. vol options extent (space_optimized)

There is not much information on what it actually does or how it is implemented. From what is available, I believe extent is implemented by reallocating (if necessary) on write, as opposed to read_realloc, which reallocates on read. Could somebody confirm this and, if I'm wrong, shed light on what the extent option actually does? Assuming it does what I believe ...

Pros:

  • best defragmentation level. If a write would cause fragmentation, the data is immediately laid out sequentially, so it is always in the best shape.
  • autotuning; set once and forget about it.

Cons:

  • extra overhead - both read and write - during production operation. Not sure how to measure it.
  • potentially more stress on cache

3. vol options read_realloc (space_optimized)

Pros:

  • autotuning; set once and forget about it.
  • as the data has already been read off the disks (at least partially), there is less stress on the IO subsystem; the extra read overhead is avoided

Cons:

  • data may become fragmented again before it is next needed (i.e. we read fragmented data, notice that it needs reallocation and do it; but by the time we read it again the data may be fragmented once more, so the reallocation just caused more IO to no benefit)
  • extra overhead - at least write - during production operation. Not sure how to measure it.
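
To make the trade-off in the cons above a bit more concrete, here is a toy model in Python. It is only a sketch under strong assumptions and does not describe how WAFL or read_realloc are actually implemented: every overwrite simply lands at the next free physical block, a "seek" is counted for each break in physical contiguity during a sequential read, and the read-time policy rewrites the whole file contiguously after any fragmented read. The names ToyVolume and simulate, and all parameter values, are hypothetical.

```python
import random


class ToyVolume:
    """Toy write-anywhere layout (an assumption, not real WAFL behaviour)."""

    def __init__(self, n_blocks):
        # Logical block i starts at physical address i (fully contiguous).
        self.phys = list(range(n_blocks))
        self.next_free = n_blocks
        # Blocks rewritten purely to defragment, i.e. the extra write overhead.
        self.extra_writes = 0

    def overwrite(self, logical):
        # An overwrite never updates in place; it goes to the next free block.
        self.phys[logical] = self.next_free
        self.next_free += 1

    def discontinuities(self):
        # Breaks in physical contiguity seen by one sequential read.
        return sum(1 for a, b in zip(self.phys, self.phys[1:]) if b != a + 1)

    def relayout(self):
        # Rewrite every block contiguously into fresh free space.
        n = len(self.phys)
        self.phys = list(range(self.next_free, self.next_free + n))
        self.next_free += n
        self.extra_writes += n


def simulate(policy, n_blocks=1000, rounds=5, writes_per_round=100, seed=0):
    """Alternate random overwrites with one sequential read per round.

    policy is "none" or "read_realloc" (hypothetical labels for this model).
    Returns (discontinuities seen by each read, total extra blocks written).
    """
    rng = random.Random(seed)
    vol = ToyVolume(n_blocks)
    seeks_per_read = []
    for _ in range(rounds):
        for _ in range(writes_per_round):
            vol.overwrite(rng.randrange(n_blocks))
        seeks = vol.discontinuities()
        seeks_per_read.append(seeks)
        # Read-time policy: the read that found fragmentation triggers a rewrite.
        if policy == "read_realloc" and seeks > 0:
            vol.relayout()
    return seeks_per_read, vol.extra_writes


if __name__ == "__main__":
    for policy in ("none", "read_realloc"):
        seeks, extra = simulate(policy)
        print(policy, "seeks per read:", seeks, "extra writes:", extra)
```

In this model the read-time policy never sees more than two discontinuities per overwrite since the previous read, but it pays for that by rewriting the whole file after every fragmented read. If the writes between two reads are heavy enough, most of that rewrite is already undone by the time of the next read, which is exactly the concern in the first con.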

Comments? Anyone using extent or read_realloc in production? Could you estimate overhead of these options?

Thank you!

Re: reallocate vs. extent vs. read_realloc

I'd be quite curious for thoughts on deduplication interaction as well.

Re: reallocate vs. extent vs. read_realloc

We use read_realloc on a few of our database volumes, and we really haven't seen any performance hit to speak of. It was my understanding that this was implemented as a workaround for the sequential-read-after-random-write issue that is present on NetApp systems because of how WAFL works. It's a nice engineering fix for an issue that, I think, plagued only a very small percentage of customers. I would love for someone from NetApp to break down all the options listed above in detail, though.

Re: reallocate vs. extent vs. read_realloc

I guess I should note that read_realloc only optimizes the data after non-contiguous blocks are read once.  As for vol options extent, from NetApp's documentation: "You enable logical extents for volumes that contain Microsoft Exchange data only."

http://now.netapp.com/NOW/knowledge/docs/ontap/rel7311/html/ontap/sysadmin/tuning/concept/c_oc_tun_ms-exchange-logical-extents-when.html#c_oc_tun_ms-e...