Network and Storage Protocols

Slow NDMP Backup to Tape After Upgrading to FAS8040 Cluster and 8.3

JRGLENNIE

Hello All-

 

We have been using Symantec NetBackup with our NetApp for NDMP backups to tape for quite some time.  Recently, we migrated off of our old 7-Mode filer to a new FAS8040 cluster running clustered Data ONTAP 8.3.  I have been trying to get our tape backups running again and was able to set things up in NetBackup, but now whenever I try to run a full backup on any of the volumes, the backup only writes at 150-200 KB per second and the job eventually fails.  I am using the same FC switch as before and updated the configuration on the switch and library to allow communication with the new FC adapters on the NetApp.  I have two FC adapters connected from node 1 on the NetApp to the FC switch (although I recently disabled one of the adapters to see if that was causing the problem), and I am using the cluster management LIF as my NDMP host in NetBackup, per the following guide:

https://www.veritas.com/support/en_US/article.000025335
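In case it helps, the cluster-side NDMP and tape state can be checked with commands along these lines (a sketch assuming clustered Data ONTAP 8.3, using our node/SVM names; the cluster-mgmt LIF is the one I'm pointing NetBackup at):

::> system services ndmp show                   (node-scoped NDMP status per node)
::> vserver services ndmp show                  (SVM-scoped NDMP status)
::> storage tape show                           (tape drives/changers visible to each node)
::> network interface show -role cluster-mgmt   (the LIF used as the NDMP host)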

 

Does anyone have any ideas?  I've found some documentation from NetApp discussing how to troubleshoot poor backup performance, but I can't seem to set up a working "dump to null" command, which seems pretty integral to most of their troubleshooting steps.  Still, I don't think the performance issue is caused by an overloaded controller. 
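For anyone searching later: the "dump to null" test dumps a volume to a null output device so you can measure how fast the node can read the data with the tape path taken out of the picture.  On clustered ONTAP it has to run from the nodeshell of the node that owns the volume; what I've been attempting looks roughly like this (the exact flags come from the 7-Mode docs and may need adjusting on 8.3):

::> system node run -node NODE1
NODE1> dump 0f null /vol/volume_to_backup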

 

 

Here's some of the output I am seeing from a given backup job in the NetBackup admin console:

 

10/19/2016 10:14:55 - Info nbjm (pid=6560) starting backup job (jobid=53570) for client CLUSTERMGMT, policy POLICY1, schedule SCHEDULE1
10/19/2016 10:14:55 - Info nbjm (pid=6560) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=53570, request id:{3A3732BC-A6DD-4FB4-8C35-2C57512B093A})
10/19/2016 10:14:55 - requesting resource backup_svm-SCHEDULE1
10/19/2016 10:14:55 - requesting resource NB-HOST.NBU_CLIENT.MAXJOBS.CLUSTERMGMT
10/19/2016 10:14:55 - requesting resource NB-HOST.NBU_POLICY.MAXJOBS.POLICY1
10/19/2016 10:14:55 - granted resource  NB-HOST.NBU_CLIENT.MAXJOBS.CLUSTERMGMT
10/19/2016 10:14:55 - granted resource  NB-HOST.NBU_POLICY.MAXJOBS.POLICY1
10/19/2016 10:14:55 - granted resource  101323
10/19/2016 10:14:55 - granted resource  IBM.ULTRIUM-TD3.004
10/19/2016 10:14:55 - granted resource  NB-HOST-hcart3-robot-tld-3-CLUSTERMGMT
10/19/2016 10:14:56 - estimated 0 kbytes needed
10/19/2016 10:14:56 - Info nbjm (pid=6560) started backup (backupid=CLUSTERMGMT_1476886495) job for client CLUSTERMGMT, policy POLICY1, schedule SCHEDULE1 on storage unit NB-HOST-hcart3-robot-tld-3-CLUSTERMGMT
10/19/2016 10:14:56 - started process bpbrm (pid=11208)
10/19/2016 10:14:57 - Info bpbrm (pid=11208) CLUSTERMGMT is the host to backup data from
10/19/2016 10:14:57 - Info bpbrm (pid=11208) reading file list for client
10/19/2016 10:14:57 - connecting
10/19/2016 10:14:57 - Info bpbrm (pid=11208) starting ndmpagent on client
10/19/2016 10:14:57 - Info ndmpagent (pid=12152) Backup started
10/19/2016 10:14:57 - Info ndmpagent (pid=12152) PATH(s) found in file list = 1
10/19/2016 10:14:57 - Info ndmpagent (pid=12152) PATH[1 of 1]: /backup_svm/volume_to_backup
10/19/2016 10:14:57 - Info bptm (pid=10756) start
10/19/2016 10:14:57 - Info bptm (pid=10756) using 30 data buffers
10/19/2016 10:14:57 - Info bptm (pid=10756) using 65536 data buffer size
10/19/2016 10:14:57 - connected; connect time: 0:00:00
10/19/2016 10:14:58 - Info bptm (pid=10756) start backup
10/19/2016 10:14:58 - Info bptm (pid=10756) Waiting for mount of media id 101323 (copy 1) on server NB-HOST.
10/19/2016 10:14:58 - mounting 101323
10/19/2016 10:14:59 - Info ndmpagent (pid=12152) CLUSTERMGMT: Session identifier: 28042
10/19/2016 10:15:44 - Info bptm (pid=10756) media id 101323 mounted on drive index 8, drivepath /NODE1/nrst3a, drivename IBM.ULTRIUM-TD3.004, copy 1
10/19/2016 10:15:44 - Info ndmpagent (pid=12152) CLUSTERMGMT: SCSI: TAPE READ: short read for nrst3a
10/19/2016 10:15:44 - mounted 101323; mount time: 0:00:46
10/19/2016 10:15:44 - positioning 101323 to file 2
10/19/2016 10:15:47 - Info ndmpagent (pid=12152) NDMP 3Way - Data Affinity 13102a5c-7740-11e5-8b3a-f34bfadd9084 is not equal to Tape Affinity d1c11ad3-7740-11e5-b678-5fd0506b00a8
10/19/2016 10:15:47 - positioned 101323; position time: 0:00:03
10/19/2016 10:15:47 - begin writing
10/19/2016 10:15:49 - Info ndmpagent (pid=12152) CLUSTERMGMT: Session identifier for Mover : 28042
10/19/2016 10:15:49 - Info ndmpagent (pid=12152) CLUSTERMGMT: Session identifier for Backup : 30232
10/19/2016 10:15:49 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Using "/backup_svm/volume_to_backup/../4hours.2016-10-19_0800" snapshot.
10/19/2016 10:15:49 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Using Full Volume Dump
10/19/2016 10:15:51 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Using 4hours.2016-10-19_0800 snapshot
10/19/2016 10:15:51 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Date of this level 0 dump snapshot: Wed Oct 19 08:00:00 2016.
10/19/2016 10:15:51 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Date of last level 0 dump: the epoch.
10/19/2016 10:15:51 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Dumping /backup_svm/volume_to_backup to NDMP connection
10/19/2016 10:15:51 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: mapping (Pass I)[regular files]
10/19/2016 10:15:51 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Reference time for next incremental dump is : Wed Feb  3 09:15:02 2016.
10/19/2016 10:15:51 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: mapping (Pass II)[directories]
10/19/2016 10:15:51 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: estimated 84603127 KB.
10/19/2016 10:15:51 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: dumping (Pass III) [directories]
10/19/2016 10:19:18 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: dumping (Pass IV) [regular files]
10/19/2016 10:20:52 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Wed Oct 19 10:20:52 2016 : We have written 173211 KB.
...lines repeating as dump progresses
10/19/2016 16:33:13 - Info ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Wed Oct 19 16:33:13 2016 : We have written 11166693 KB.
10/19/2016 16:36:04 - Error nbjm (pid=6560) nbrb status: LTID reset media server resources
10/19/2016 16:36:14 - Error ndmpagent (pid=12152) terminated by parent process
10/19/2016 16:36:14 - Info ndmpagent (pid=0) done
10/19/2016 16:36:14 - Info ndmpagent (pid=12152) Received ABORT request from bptm
10/19/2016 16:36:14 - Error ndmpagent (pid=12152) NDMP backup failed, path = /backup_svm/volume_to_backup
10/19/2016 16:36:14 - Error ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Write to socket failed
10/19/2016 16:36:14 - Error ndmpagent (pid=12152) CLUSTERMGMT: DUMP: DUMP IS ABORTED
10/19/2016 16:36:14 - Warning ndmpagent (pid=12152) CLUSTERMGMT: DUMP: Total Dir to FH time spent is greater than 15 percent of phase 3 total time. Please verify the settings of backup application and the network connectivity.
10/19/2016 16:36:14 - Error ndmpagent (pid=12152) CLUSTERMGMT: DATA: Operation terminated (for /backup_svm/volume_to_backup).
10/19/2016 16:36:15 - Error ndmpagent (pid=12152) CLUSTERMGMT: BACKUP: job aborted
10/19/2016 16:36:15 - Error ndmpagent (pid=12152) CLUSTERMGMT: BACKUP: BACKUP_NET IS ABORTED
10/19/2016 16:36:15 - Info ndmpagent (pid=12152) CLUSTERMGMT: MOVER: Tape writing operation terminated
10/19/2016 16:37:45 - Info ndmpagent (pid=0) done. status: 150: termination requested by administrator
10/19/2016 16:37:45 - end writing; write time: 6:21:58
client process aborted  (50)
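Side note on one line in the log above: the "NDMP 3Way - Data Affinity ... is not equal to Tape Affinity" message means ONTAP determined that the volume being dumped and the tape drive are not local to the same node, so the data travels over the cluster network (a three-way backup) instead of a direct local path.  I don't know yet whether that's related, but which node owns what can be checked with something like this (a sketch, using our names):

::> volume show -volume volume_to_backup -fields node
::> storage tape show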

1 ACCEPTED SOLUTION

JRGLENNIE

Never mind, it looks like I figured it out.  The new cluster has 16/8/4 Gb-capable FC adapters, while the tape library is 4/2/1 Gb.  The library was connected at 4 Gb, but the NetApp FC adapters were negotiating at 8 Gb.  I couldn't find a way to force the speed down to 4 Gb on the NetApp side (the speed option only exists for FC adapters in target mode, and these are in initiator mode so they can connect to the library), but I was able to set the port speed on the switch to 4 Gb.  After doing that, NDMP jobs write at ~78,119 KB/s (roughly 268 GB per hour).  The job is now able to run to completion, and I no longer receive any of the NDMP error messages.
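In case it helps anyone else: I set the speed on the switch port rather than on the NetApp.  On a Brocade switch, for example, that looks something like the following (the port number here is made up, and other vendors' CLIs differ):

switch:admin> portcfgspeed 4 4   (lock port 4 to 4 Gb/s)
switch:admin> portshow 4         (confirm the negotiated speed)

For the throughput figure: 78,119 KB/s is about 76 MB/s, which works out to roughly 268 GB per hour.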

