ONTAP Discussions

Performance of NDMP backups

glen_eustace
8,749 Views

We have finally managed to get our NDMP backups through EMC NetWorker to an LTO-3 Ultrium jukebox doing something 🙂

I would really appreciate some idea of what we can expect in terms of throughput or performance. We have no experience with this environment and are simply trying to set expectations based on other people's experience.

With a single group running against an IBM N-Series 6040, backing up a single volume, we are seeing the tape drive writing at ~7 MB/s. This seems a trifle slow, and at this rate a 1TB volume is going to take a very long time to back up to tape.

Any comments appreciated.

13 REPLIES

glen_eustace
8,722 Views

Surely others are using NDMP to back up volumes on their filers; the complete absence of comments is disappointing.

We tried a test by backing up the same volume, first using NDMP and then using a CIFS share. In both cases the backup infrastructure was exactly the same, and there was no other activity on the filer or backup system.

NDMP -> 5-7MBytes per sec.

CIFS -> 70-80MBytes per sec.

Are we doing something wrong? Surely NDMP should perform better than this.

pradeeps
8,722 Views

Hello Glen,

Can you please share information on the version of Data ONTAP and the kind of data set being backed up (i.e. lots of small files, large files, home directory files, etc.)? Also, have you been able to try this backup with an LTO-4 tape drive?

pradeeps
8,722 Views

Glen,

Also, please let us know the version of the NetWorker software and whether the tape device is directly connected to the N Series (local backup) or to the NetWorker server (remote backup).

venkatk
8,722 Views

Backup performance is very much data-set specific.

A rate of 7 MB/s is pretty slow even for an LTO-3, but we do see these numbers on volumes with high file counts.

For example, when we back up a volume of 20 million small files adding up to 80 GB, we see about 10 MB/s throughput.

The following information would be helpful:

Version of ONTAP:

Dataset profile:

     Size - Vol Size

     File Count - Number of inodes

     Average File Size

The backup.log file from /etc/log (for the volume that is being backed up)
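If it helps, most of this can be pulled straight from the filer console. A minimal sketch, assuming 7-Mode console access; /vol/volname is just a placeholder for the volume being backed up:

    version                      # Data ONTAP release
    df -h /vol/volname           # volume size and space used
    df -i /vol/volname           # inode usage, i.e. the file count
    rdfile /etc/log/backup       # dump/NDMP history for recent backups of the volume

Average file size is then roughly the used space divided by the inode count.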

stephan_troxler
8,722 Views

Hi Glen

How is the library connected to the filer? Directly via switch or via backup server? Can you ensure that the NDMP traffic flows over FC?
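One quick way to check from the filer itself (a rough sketch, assuming a 7-Mode console):

    sysconfig -t          # tape drives the filer itself can see (direct or FC-attached)
    storage show tape     # tape devices as seen by the storage layer
    ndmpd status          # active NDMP sessions and their state

If the filer sees no tape devices, the backup is a remote (3-way) one and the NDMP data is flowing over the LAN to the backup server rather than over FC.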

5-7 MB/s is indeed very poor, but as the other posters mention, it depends on the data mix. I suggest you do some tests with big files (e.g. LUNs) for clearer results.

80-100 MB/s on LTO-3 should be normal.

Stephan

glen_eustace
8,722 Views

OK, I suppose I should have put more information into the original post.

Filer: Data ONTAP 7.3.5.1, IBM N-Series 6040, 1TB SATA drives with PAM II

NetWorker: 7.6.1.3, LTO-3 connected to the storage node.

Data Network: 10Gbit NICs on the filer -> Cisco Nexus 5010 -> Cisco 4900M -> Cisco 4948 -> 1Gbit -> storage node. We will be building a 4x1Gbit EtherChannel to the storage node shortly. (NB: the CIFS backup did max out the network and we got the expected throughput of around 70 MB/s.)

Volumes: the volumes being backed up are all read-only SnapMirror targets. We have tried both a few large files and more, smaller files, but it doesn't seem to make any difference. None of the targets contain large numbers of small files.

pradeeps
8,722 Views

Hi Glen,

Please open a support case with NetApp Global Support to get an expert’s opinion on what is happening in your backup environment, since NDMP performance can be affected by a variety of factors, including topology (local, remote, etc.), directory structure, file size, etc.

stevedegroat
8,721 Views

I have the same concerns regarding the lack of information around NDMP performance. I currently have an open case for my issue: a 3160 cluster running 8.0.2P3 that is dedicated to SnapMirror destinations. Controller A backs up strictly NFS volumes to tape (2x FC ports configured through Cisco switches to LTO-3) and she is seeing a max throughput of 145 MB/s; I am happy with that. Controller B, on the other hand, writes strictly VMware mirrors to tape but only sees 50-75 MB/s using the same FC configuration. I have 24 x 1.5TB deduplicated volumes I'm trying to send to tape. I had to break the jobs up to run 4-5 volumes per day, Wed-Sun (these are fulls only). Each volume takes 20-30 hours to complete. We have been gathering perfstats and know there is misalignment within the VMware volumes (currently being corrected).

I am looking for any counters in Performance Advisor or DFM to measure tape writes (sysstat shows this, so it must be available somewhere). I can measure the throughput from the Cisco switches, but it would be nice to have data from the filer to show my management that NDMP is not a scalable solution. I would like to justify moving to SnapVault offsite to eliminate tape.
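(For reference, the sysstat output I mean looks like this on a 7-Mode console; a minimal sketch with a 1-second interval:)

    sysstat 1        # default view; the "Tape kB/s" read/write columns show NDMP tape throughput
    sysstat -x 1     # extended view with CPU, per-protocol ops, network, disk and tape throughput together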

scottgelb
8,721 Views

Are there big differences in file counts between the volumes? That can make a big difference... as can the overall workload on each controller. With 8.0 and prior, NDMP runs out of one core, but with 8.1 people are going to see better performance since NDMP runs across multiple cores (it moved out of the kahuna domain), so if there is a CPU bottleneck, that could help down the road when 8.1 is GA and you upgrade. I'm not sure about DFM counters for tape writes... but you could also dump to null to see the maximum speed at which the controller can read the volume (bypassing tape): dump 0f null /vol/volname
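As a concrete sketch of that null-dump test (assuming a 7-Mode console; /vol/volname is just a placeholder):

    dump 0f null /vol/volname    # level-0 dump to the null device: measures how fast dump can read the volume
    sysstat 1                    # run in a second console session to watch disk read throughput while the dump runs

Roughly speaking, if the null dump is also slow, the bottleneck is on the read/dump side (file count, disk utilization, CPU) rather than in the tape or network path.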

stevedegroat
6,396 Views

The VMware volumes have very few files compared with the straight NFS volumes on the other head. The VM volumes have fewer than 1,000 files, whereas the NFS volumes range from 800-2 million.

We are using TSM v6.3.1, but have DAR and TOC disabled. 

Looking forward to some relief from the single-threaded approach - here are my CPUs with 5 NDMP sessions running for the VM volumes (along with normal mirror operations).

ANY  AVG  CPU0 CPU1 CPU2 CPU3
100%  49%   18%  39%  42%  99%
100%  49%   16%  38%  42% 100%
100%  51%   18%  42%  43% 100%
100%  48%   16%  38%  39%  99%
100%  47%   14%  37%  39% 100%
100%  47%   15%  35%  37% 100%
100%  48%   16%  38%  39%  99%
100%  48%   16%  38%  39% 100%
100%  49%   17%  39%  41% 100%
100%  49%   15%  40%  41% 100%
100%  49%   17%  40%  41% 100%
100%  49%   16%  39%  40%  99%
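(For reference, per-CPU output in this format can be captured with something like the following on a 7-Mode console; the 1-second interval is arbitrary.)

    sysstat -m 1     # per-CPU utilization (ANY, AVG, CPU0..CPUn), sampled every second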

scottgelb
6,397 Views

Makes sense... the VMs have far fewer inodes, but the load on that system might be higher as well, as might disk utilization from production running on the system. A perfstat, or even just a quick look at disk utilization in statit, would be useful to see whether that is a bottleneck as well. The first CPU being pegged is common with NDMP... 8.1 will alleviate this a bit. But disk contention could be another bottleneck to consider, depending on the VM workload.
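A minimal statit sketch for that, assuming a 7-Mode console (the sampling window is arbitrary):

    priv set advanced
    statit -b            # begin collecting counters
    # ...wait a minute or so while an NDMP backup is running...
    statit -e            # end collection; the per-disk section shows disk utilization percentages
    priv set             # return to admin privilege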

stevedegroat
6,397 Views

This controller has no production load; she is only the mirror destination. I have a perfstat uploaded to Support for review. Would you happen to know of any counters that can be activated to measure/graph tape writes in Performance Advisor? Sysstat shows the values, but I haven't been able to find anything active in PA (there is an NDMP selection under custom views, but it has no active counters).

scottgelb
6,397 Views

Deswizzling may be a factor, since the volume is a mirrored target.
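One way to check whether a deswizzle scan is active, assuming a 7-Mode console:

    priv set advanced
    wafl scan status     # lists active WAFL scanners per volume; a "volume deswizzling" entry on the mirror destination means deswizzling is still in progress
    priv set

Reads from a destination volume can be slower until deswizzling completes, which would show up in the dump phase.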

