ONTAP Discussions

Slow NDMP backups

hardy_maxa
14,217 Views

Hi All.

First, I hope I'm in the right area - I don't do this too often.

I have a FAS2240 dual-head filer, and an EMC Networker backup server with an FC-attached Quantum i80 library containing LTO-6 drives.

I run NDMP backups of the filer volumes via the Networker server and cannot get the backup rate above 150 MBytes/sec. I figure I should be getting at least that on each tape drive (there are two in the library).

To get around the LAN speed limit, I have run two cables directly from the filer to the backup server and configured the system so that the NDMP backup is sent over this aggregate (which is running in round-robin mode on the filer side - the load is sharing correctly across the two interfaces).
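For reference, the aggregate on the filer side looks roughly like this (7-Mode; the interface names, ifgrp name and address below are placeholders, not my exact config):

    ifgrp create multi ndmp_vif -b rr e0c e0d        # multimode ifgrp with round-robin load balancing
    ifconfig ndmp_vif 192.168.50.10 netmask 255.255.255.0 mtusize 9000
    ifgrp status ndmp_vif                            # check that both links are up and passing traffic

(the same lines also go into /etc/rc so the config survives a reboot)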

The FC connection from the (dedicated) Networker server to the library runs through a 4 Gbit HP/Brocade SAN switch. That device never shows utilisation above about 20%, so I figure it's not part of the problem.

Without the load-balanced network, the backups top out at about 140 MBytes/sec.

The NDMP client in Networker is configured to do smtape-type volume backups. 'dump' type also works but is a little slower.

During the backup, the NetApp CPU doesn't get above about 40-50%.

I figure that with two Ethernet lines I should be able to back up at at least about 180 MBytes/sec. MTU on this link is set to 9000.
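For anyone checking the same thing, a quick way to confirm jumbo frames really pass end to end is a don't-fragment ping from the backup server (assuming it is Linux; the address is a placeholder):

    # 8972-byte payload = 9000 MTU - 20-byte IP header - 8-byte ICMP header
    ping -M do -s 8972 -c 5 192.168.50.10

If that fails while a normal ping works, something in the path is still at MTU 1500.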

Anyone have any ideas? I think I've played with every possible setting by now.

Thanks

Hardy

14 REPLIES

scottgelb
14,159 Views

To test the fastest you can read off disk with NDMP, you can run "dump 0uf null /vol/volname" and watch the speed, since it writes to null. Clean up the snapshot afterwards and delete it... or Ctrl-C once you have a good sample of the read speed. You can also run "snapmirror store volname null" for a similar test to null with smtape (depending on your ONTAP version the snapmirror-to-tape command may be different).
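Roughly, the sequence on the filer console would be (the volume name is just an example, and the name of the snapshot the dump leaves behind varies by release):

    dump 0uf null /vol/vol_data        # level-0 dump to the null device; watch sysstat -x 1 for the read rate
    snap list vol_data                 # find the snapshot the dump created
    snap delete vol_data <snapshot_name>

    smtape backup /vol/vol_data null   # smtape equivalent on newer 7-Mode releases
                                       # (snapmirror store vol_data null on older ones)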

hardy_maxa
14,159 Views

Thanks Scott.

Testing the dump > null on one of the large volumes, the disk I/O has no problem getting up to >200 MBytes/sec (peaking at >300 MBytes/sec).

CPU during this test was about 50%.

If I could get all of that piped through to a tape drive, I'd be right.

It seems snapmirror store has been replaced now, but smtape backup /vol/data null seems to do the same thing. Disk I/O rates are similar, and CPU seems a bit lower with smtape.

So - how can I create a test path using the etherchannel aggregate > null on the backup server? Any ideas?

Thanks,

Hardy

scottgelb
14,159 Views

The backup is a single stream per volume. On gigabit, a single stream is throttled by the gig network. Even with two backups running concurrently, each individual one can't go faster than one wire. Unfortunately, the link aggregation won't spread one backup stream over both interfaces concurrently.


scottgelb
14,159 Views

Do you have a mezzanine card in the 2240? Can you add the dual-port 10GbE card for NDMP filer-to-server, or the 8 Gb FC mezzanine card and direct-attach to tape?

hardy_maxa
14,159 Views

Not sure of the exact hardware config on the 2240. Currently I have 4 GigE ports and 2 x 8 Gbit FC ports on each filer head. In the backup room where this system lives, though, the SAN switch is only a 4 Gbit model.

I guess 10 Gbit Ethernet is an option, but I'm trying not to spend more than I have to, and upgrading to 10 Gbit is not going to give me 10x performance because the LTO-6 drives themselves don't do much more than 160 MBytes/sec (depending on compression).

NDMP direct to the drive limits me to one session per drive, and I wasn't getting great backup rates with that either. Also, it ties the drives to one filer, so I have no way to back up the few volumes on the primary system that don't get snapmirrored to the backup system (where the backups are run from).

hardy_maxa
14,159 Views

BTW Scott - I'm running a test now with a single saveset/stream. Peak LAN throughput is 120 MBytes/sec, load-balancing over 2 Ethernet ports on the NetApp.

The average is just under 100 MBytes/sec, though. It almost looks like a TCP tuning pattern.

I wonder if we can override this? There shouldn't be a 1 Gb bandwidth limit on a load-balanced bond.
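If it really is TCP windowing, the usual knobs on the media server would be along these lines (the numbers are only illustrative, and this assumes the Networker host runs Linux):

    # enlarge the socket buffer limits so a single stream can keep a jumbo-frame GbE link full
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 262144 16777216
    net.ipv4.tcp_wmem = 4096 262144 16777216

(dropped into /etc/sysctl.conf and loaded with sysctl -p). That said, it would only explain the gap between ~100 and ~120 MBytes/sec on a single link, not get us past gigabit.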

scottgelb
9,988 Views

It is limited to a single interface. Aggregation won't use both ports for one backup; you need to run a second backup to use the other interface. This is the bottleneck.


scottgelb
14,159 Views

There's only one mezzanine slot, so you can't add 10G.


aborzenkov
14,159 Views

Is it a single stream, or do you have multiple backups running in parallel?

hardy_maxa
14,159 Views

I'd like to be running multiple streams, as I just mentioned. I have 2 filers; the primary has most of its volumes snapmirrored to the backup filer. The Networker system is running NDMP against the backup system.

Regards

scottgelb
9,988 Views

NDMP doesn't multiplex, so source to target is a single stream to the drive. It sounds like gigabit is the bottleneck, but it might be worth checking whether Networker can write to a null device on the media server to test throughput before tape.


hardy_maxa
9,988 Views

Yes - I think that confirms it. Using NFS and a 50 GB file to test the Ethernet speeds shows the following:

Bond interface via the switch: max 110 MB/sec (the switch will not round-robin) - roughly the same as a single 1 Gbit link.

Bond interface direct to the backup server: 160 MB/sec max.

If I run both at the same time, the filer has no trouble reaching 260 MB/sec.
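For anyone wanting to repeat the test, a read over the NFS mount along these lines will do the job (the mount point and file name are placeholders):

    # read a pre-created 50 GB file from the NFS mount and discard it; dd prints the MB/s at the end
    dd if=/mnt/filer_data/test50g of=/dev/null bs=1M iflag=direct

(iflag=direct keeps the client's page cache out of the measurement; drop it if the mount rejects O_DIRECT, and just make sure the file isn't already cached. Run one copy per filer address to see the combined rate.)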

I think the only way to improve my backup rates is to try to configure Networker to back up the same device from 2 or 3 different IPs.

Over to you EMC...

Thanks all for your time and feedback.

Hardy.

scottgelb
9,988 Views

Good to know, and thank you for posting the results. It will help others running NDMP over gigabit.


Vegahman76
9,807 Views

We're having a similar issue; what was your resolution please?
