In general, 'data protection' simply means taking data that lives in one location and keeping a copy of it in a different location. That copy serves two use cases:
• Backup: The objective is to restore from the secondary to the primary, with no intention of failing over to the secondary. This implies that the primary purpose of the secondary is archival storage; therefore, you might have more data in the secondary than in the primary.
At our place:
In primary: irrespective of protocol/application, we keep 30 days' worth of data.
In vault: 3 months, and for Exchange it is longer (I believe that retention is governed by the standard data-retention requirements of the Exchange compliance policy). See the policy sketch after this list for one way such a scheme can be expressed.
• Disaster recovery (DR): An exact replica is maintained in the secondary and is used to fail over from the primary to the secondary if there is a failure at the primary site. This is the actual DR copy: a mirror image of the primary file system. If the DR systems must be brought online, the SnapMirror relationship can be broken, which makes the destination volumes read/write and ready to use. SnapMirror also allows you to synchronise the original source with the changes made at the destination and then re-establish the original SnapMirror relationship, as sketched below.
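As a rough illustration of the retention scheme mentioned above, here is how it might look with ONTAP 9 CLI commands. This is only a sketch: the vserver, volume, and policy names (svm1, svm2, vol1, vol1_vault, keep30d, vault3m) are made up, and you would adapt the schedules and snapshot counts to your own environment.

   # On the primary: keep 30 daily snapshots, labelled so the vault rule can match them
   volume snapshot policy create -vserver svm1 -policy keep30d -enabled true -schedule1 daily -count1 30 -snapmirror-label1 daily

   # On the secondary: a vault policy that retains roughly 3 months of dailies
   snapmirror policy create -vserver svm2 -policy vault3m -type vault
   snapmirror policy add-rule -vserver svm2 -policy vault3m -snapmirror-label daily -keep 90

   # Tie them together with an XDP (SnapVault) relationship and baseline it
   snapmirror create -source-path svm1:vol1 -destination-path svm2:vol1_vault -type XDP -policy vault3m -schedule daily
   snapmirror initialize -destination-path svm2:vol1_vault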
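And here is roughly what the failover/failback sequence looks like from the DR side. Again, the paths (svm2:vol1_dr, svm1:vol1) are hypothetical; check the TR linked below for the authoritative procedure before relying on this.

   # Failover: stop transfers and make the DR volume read/write
   snapmirror quiesce -destination-path svm2:vol1_dr
   snapmirror break -destination-path svm2:vol1_dr

   # Failback step 1: reverse the relationship so changes made at DR flow back to the original source
   # (depending on your version, you may need a 'snapmirror create' for the reverse relationship first)
   snapmirror resync -source-path svm2:vol1_dr -destination-path svm1:vol1

   # Failback step 2: once the source has caught up, break again and re-establish the original direction
   snapmirror break -destination-path svm1:vol1
   snapmirror resync -source-path svm1:vol1 -destination-path svm2:vol1_dr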
This is the latest TR (March 2020) that covers some useful information:
https://www.netapp.com/us/media/tr-4015.pdf
Regarding this query: "For a long-term SnapVault, if the destination cluster is at a remote site, will it take too long to recover a large amount of data?" Well, it all depends on your WAN connection/speed (as far as the pipe is concerned); especially with network compression enabled, it is not too bad. To be honest, it has been used by thousands of customers and it is a solid, dependable technology. However, testing is very important to simulate such events, and half-yearly testing is not aggressive; it is achievable with proper planning. Keep reading and keep looking, and you may find a lot of useful information and experiences on the NetApp community as well as other forums where customers have shared their views.
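Since network compression came up: on recent ONTAP versions it is a property of the SnapMirror policy, so a sketch along these lines (policy and path names again hypothetical, carried over from the sketch above) would enable it, and 'snapmirror show' lets you keep an eye on lag during your half-yearly tests:

   snapmirror policy modify -vserver svm2 -policy vault3m -is-network-compression-enabled true
   snapmirror show -destination-path svm2:vol1_vault -fields state,status,lag-time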
Thanks!