Why would NDMP/SMTAPE of a volume of LUNs be so slow while a volume of CIFS data is so fast?

pclayton99
This past weekend was spent examining why the throughput of NDMP/SMTAPE operations varied so much.

I do not have an answer yet, just more mystery.

The configuration is three LTO-5 tape drives connected over a 4Gb SAN fabric to a FAS3240 and a Dell R815 (quad processor, 12 core, 256GB memory), with the NetApp able to perform NDMP/SMTAPE operations direct to tape.

What has been found is:

  • Using NetBackup V7 with NDMP/SMTAPE, the backup goes directly from the filer to tape without involving the R815 server.
  • If the source volume (and the dozens of snapshots within it) is a CIFS share with over 8 million files, it can be written to an LTO-5 tape drive at upwards of 113MB/sec.
  • If the source volume (and the dozens of snapshots within it) contains LUNs used by our Exchange 2010 servers, writing to an LTO-5 tape drive does not get above 10MB/sec. I have seen it as low as 3MB/sec.
  • Both volumes can be on the same or different controllers. Does not make a difference.
  • Both volumes can be in the same aggregate. Does not make a difference.
  • The volumes can reside on 7.2K SATA or 15K SAS. Does not make a difference.
  • The volume sizes have ranged from hundreds of GB through 7TB. Does not make a difference.
  • This same ratio happens with multiple volumes of CIFS and Exchange data.
  • The NDMP/SMTAPE commands can originate from NetBackup or from the filer command line using 'smtape backup ...'. Does not make a difference.
  • Under the covers it looks like NDMP is using SnapMirror functionality to transport the data to tape.
  • There is no 'throttle' option for NDMP/SMTAPE operations; an error message says as much. I had thought the limit might be due to the 'options replication.*' values I had set.
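
To put those rates in perspective, here is a rough back-of-the-envelope sketch of how long a volume would take to stream to tape at each observed rate. It assumes decimal units (1 TB = 1,000,000 MB), a perfectly sustained rate, and uses the largest volume size mentioned above (7TB); the function name `backup_hours` is just an illustration, not part of any NetApp or NetBackup tooling.

```python
def backup_hours(size_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to stream size_tb terabytes at a sustained rate.

    Assumption: decimal units, so 1 TB = 1,000,000 MB.
    """
    size_mb = size_tb * 1_000_000
    seconds = size_mb / rate_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Rates taken from the observations above; 7 TB is the largest volume tested.
    observed = [
        ("CIFS volume", 113),
        ("LUN volume, best case", 10),
        ("LUN volume, worst case", 3),
    ]
    for label, rate in observed:
        hours = backup_hours(7, rate)
        print(f"{label}: {hours:.1f} hours at {rate} MB/sec")
```

Under those assumptions, a 7TB volume is roughly 17 hours of tape time at 113MB/sec but over 8 days at 10MB/sec, which is why the difference matters so much for the backup window.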

From what I can tell, the NDMP/SMTAPE operation takes a new snapshot to get a static version of the data, which is then analyzed and sent to tape.

The controllers are not 'beat'; nothing glaring (that I could think to examine) shows up on the filer.

I have opened a support case and sent them this information and perfstats output to try and solve this puzzle.

The question is: why would there be such a drastic difference in throughput depending on whether the volume holds CIFS data or LUNs?

I have no answer at present and am wondering whether others have seen the same thing, and maybe found the answer/solution to getting great throughput all the time?!

Thanks.

pdc
