I have observed this on multiple installations, and now I have a chance to capture it.
NetWorker 7.5 and 7.6, Data ONTAP 7.2 and 7.3, directly connected tape or 3-way backup via DSA, tape drive or backup to disk - in all cases I see exactly the same problem. Backup speed is quite good; it corresponds to the backend (i.e. over 100 MB/s for a tape drive, or a 1 Gb/s network interface for 3-way backup). But recovery is miserable - approximately 8 MB/s!
In all cases recovery was done via file-based recover (nwrecover - browsing files, selecting them, starting the recover - or a direct call to recover).
As mentioned, this happens across very different combinations of versions and backup technology, so it appears to be an inherent limitation of either NetApp or NetWorker. I attach a screenshot of the recover session (sorry, I am connected to the customer via a read-only VNC session, so this is all I can do). Observe the throughput.
This is not a high-file-count environment. It is an NFS database with a relatively low number of files; total size is slightly above 1 TB. And this is really the data-reading phase, after the directory phase had already completed.
Any hints, pointers and ideas appreciated. I will probably have to (recommend to) try a save set recovery next time. But it is frustrating ...
In general, recovery is slower than backup with NetWorker. To isolate the problem, you may want to capture a perfstat on the NetApp controller; if the controller is not busy, then the problem is somewhere else.
After I had slept on it a bit, I realized that all documented cases of slow restore were using 3-way backup (with either another NetApp or a NetWorker DSA as the tape server). So I checked what was going on in the 1 TB restore session running right now. It appears there is indeed quite a bit of round-trip overhead ... caused by NetApp.
From NDMP server log:
Jul 16 12:32:26 GMT+04:00 [ndmpd:109]: Message NDMP_NOTIFY_DATA_READ sent
So it appears NetApp is issuing 10*128 KB synchronous reads. Each read results in an NDMP_NOTIFY_DATA_READ request from the NDMP server to the DMA (NetWorker) and an NDMP_MOVER_READ request from the DMA (NetWorker) to the NDMP mover. So there is indeed quite a bit of round-tripping ... there are approximately 10 such requests per second, which accounts for the ~10 MB/s restore performance.
Is it possible to change the NetApp read size? Personally I'd rather issue a single NDMP_NOTIFY_DATA_READ request for the whole dump size ... or at least one per file.
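To make the effect concrete, here is a back-of-the-envelope model of the behaviour described above: every synchronous read costs one NOTIFY_DATA_READ/MOVER_READ round trip. The ~100 ms per-request gap is taken from my packet capture; treat it as an assumption rather than a measured constant, and the 100 MB figure is purely a hypothetical larger read size.

```python
ROUND_TRIP_S = 0.100  # observed pause between requests (assumption from the trace)

def restore_throughput_mib_s(read_size_bytes: int) -> float:
    """Effective throughput if each read of read_size_bytes waits one round trip."""
    return read_size_bytes / ROUND_TRIP_S / (1024 * 1024)

# Current behaviour: 10 * 128 KB per request.
print(restore_throughput_mib_s(10 * 128 * 1024))    # 12.5 MiB/s - in line with the ~8-10 MB/s seen

# Hypothetical: if the filer requested 100 MB per round trip, the same
# latency would be amortized over far more data.
print(restore_throughput_mib_s(100 * 1024 * 1024))  # 1000.0 MiB/s ceiling from round trips alone
```

This is why larger (or fewer) synchronous reads would help: the per-request latency is fixed, so throughput scales linearly with the read size.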
I still have to test what NetApp does for a direct tape restore.
Packets 665 and 667 carry NDMP_NOTIFY_DATA_READ (for some reason the request is split by NetApp between two packets). As can be seen, NetWorker immediately issues the data mover request, which completes promptly (packets 669-673). After this there is a 100 ms pause before NetApp issues the next NOTIFY_READ (674) ...
Unfortunately, NDMP logs on NetApp do not offer sub-second granularity, making it impossible to dig into this further.
I have also seen a description of exactly the same problem using CommVault with 3-way backup (using NRS, which is the analog of NetWorker DSA): extremely slow restore without any obvious bottleneck anywhere.
The options manpage should be a logical reference point for such questions:
ndmpd.tcpnodelay.enable  Enables/Disables the TCP_NODELAY configuration parameter for the socket between the storage system and the DMA. When set to true, the Nagle algorithm is disabled and small packets are sent immediately rather than held and bundled with other small packets. This optimizes the system for response time rather than throughput.
The default value is false.
This option becomes active when the next NDMP session starts. Existing sessions are unaffected.
This option is persistent across reboots.
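For reference, the effect of this flag can be seen on any TCP socket. A minimal sketch in plain Python (nothing NetApp-specific here; it only illustrates the generic socket option that the filer-side setting toggles):

```python
import socket

# With Nagle enabled (the default), small writes such as NDMP control
# messages can be held back and coalesced with later data, adding
# latency to every request/response exchange. Disabling Nagle makes
# each small write go out on the wire immediately.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # True
s.close()
```

For a chatty request/response protocol like the NDMP control connection, trading a little bundling efficiency for lower per-message latency is usually the right call.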
Are you a NetApp employee? ... Google will help you with the general socket option and the effects of tcpnodelay ...
It's been quite some time since this problem was opened, and I would be very curious to know if you solved the issue! I have the exact same problem, and we went through the same phases... Activating tcpnodelay helped a bit, but we still have very poor single-file restore performance...