I have a 48-disk 2400-2 that is not busy. It is a SnapMirror destination from production and has no activity other than incoming SnapMirror updates, which are very small, and backups, which can be busy.
Some of the volumes have many small files (20-50 million), and are deduplicated and compressed at the source.
Using NDMP on some of the volumes works pretty well. The initial phases take a while, but backups generally complete in under 12 hours for most of them. This is done through NetBackup, and the speeds are equally fast to tape, disk and deduplication pools. When they reach phase 4 (pass IV) they all send data at 30-90MB/s, which is acceptable to me.
One of them is around 50 million files, and for some reason this one takes 4 days. It takes disproportionately longer than the others to get to the fourth phase and actually start moving blocks. Once it does, it runs at around 30-60MB/s, which is OK for me. I just don't understand why it takes 8 times longer to get through twice as many files and directories in the first three phases.
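To put a number on that, here is a rough sketch of the arithmetic. It assumes the 4 days and 12 hours are dominated by the first three phases and that time grows as a power of the file count; neither is something I have measured, and since 12 hours is an upper bound for the other volumes the real ratio may be higher.

```python
import math

def scaling_exponent(size_ratio, time_ratio):
    """Return k such that time_ratio == size_ratio ** k (assumed power-law model)."""
    return math.log(time_ratio) / math.log(size_ratio)

# Figures from the post: about 4 days versus roughly 12 hours,
# for about twice as many files and directories.
time_ratio = (4 * 24) / 12
k = scaling_exponent(2, time_ratio)
print(f"time ratio: {time_ratio:.0f}x, implied exponent: {k:.1f}")
```

Under those assumptions the pre-transfer phases behave like roughly the cube of the file count rather than scaling linearly, which is the part I can't explain.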
So I am trying SMTAPE to see if that can improve the speed. I don't mind restoring the whole volume if I need to get something back, as these copies to NetBackup are just for long-term retention.
I was excited to learn about SMTAPE, because SnapMirror is very fast locally from filer to filer, where we normally see 100-200MB/s. But for some reason SMTAPE maxes out at 35MB/s, and it's not consistent. We don't have to wait to get through all the phases, but it's still very poor compared to what I would expect.
Any ideas on how to speed up the SMTAPE transfer? This is going straight to staging disk, so it's not anything in the NetBackup system. The staging disk regularly ingests 200MB/s. I know the filer can move data very quickly, as evidenced by the 120MB/s NDMP backups we get for certain jobs. It really seems to be something related to the SMTAPE process.
Filers are ONTAP 8.1.4.P3 7-Mode. I have changed the TCP window size for SnapMirror to be in the 8MB range (whatever the max is) and performance is no different. Any help or suggestions would be appreciated.