Inquiry about metadata and amount of data transfers supported by Cloud Sync

Kang

Hi team, this is Kenji Kang from the Osaka SE team in Japan.

I have a technical question from a customer who wants to use Cloud Sync.

I searched the documentation but could not find an answer.

Could you please help me with the customer's questions?

 

Background

The customer runs an in-house SMB file server on a storage service called Panzura.

They have decided to replace Panzura with Amazon FSxN.

They are now considering how to move the data from Panzura to Amazon FSxN.

Technical questions

1. Could you please share information about the metadata that Cloud Sync can copy?

2. Could you please share information on the amount of data that Cloud Sync can transfer?

For example, "Cloud Sync supports transfers of up to XX million files."

Thank you for your kind support.


1 ACCEPTED SOLUTION

elementx

1) There is no S3-style "metadata" on SMB shares; there are just permissions, and the rest is dates:

https://docs.netapp.com/us-en/cloud-manager-sync/task-copying-acls.html

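"Date created" can be preserved with Robocopy: `/mir` mirrors the directory tree, file timestamps are copied by default (`/COPY:DAT`), and `/DCOPY:T` adds directory timestamps.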

 

2) CloudSync doesn't care how much data there is; it just copies the files it sees from A to B. What probably matters to the customer is how much time they can afford for the final sync while the source is read-only (i.e. offline). That will depend on the amount of data and the changes made since the last sync. To estimate it you probably need the total number of files, the rate of change, and the average file size.
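
As a rough illustration of that estimate, here is a minimal Python sketch. It assumes a simplified model in which the final sync scans every file but transfers only the changed ones. The function name, the parameters, and every number in the usage example are hypothetical placeholders, not Cloud Sync figures; the scan and transfer rates would have to be measured in the customer's own environment.

```python
from dataclasses import dataclass


@dataclass
class SyncEstimate:
    scan_seconds: float
    transfer_seconds: float

    @property
    def total_hours(self) -> float:
        # Pessimistic: assumes scanning and transferring do not overlap.
        return (self.scan_seconds + self.transfer_seconds) / 3600


def estimate_final_sync(total_files, changed_fraction, avg_file_bytes,
                        scan_files_per_sec, transfer_bytes_per_sec):
    """Back-of-the-envelope estimate of the final sync window.

    Assumed model (not taken from any Cloud Sync documentation):
    every file is scanned, only the changed fraction is transferred.
    """
    if scan_files_per_sec <= 0 or transfer_bytes_per_sec <= 0:
        raise ValueError("rates must be positive")
    if not 0 <= changed_fraction <= 1:
        raise ValueError("changed_fraction must be between 0 and 1")

    scan_seconds = total_files / scan_files_per_sec
    changed_bytes = total_files * changed_fraction * avg_file_bytes
    transfer_seconds = changed_bytes / transfer_bytes_per_sec
    return SyncEstimate(scan_seconds, transfer_seconds)


if __name__ == "__main__":
    # All inputs below are made-up placeholders.
    est = estimate_final_sync(
        total_files=50_000_000,
        changed_fraction=0.02,          # share of files changed since last sync
        avg_file_bytes=1024 * 1024,     # 1 MiB average file size
        scan_files_per_sec=5_000,       # measured scan rate
        transfer_bytes_per_sec=100 * 1024 * 1024,  # measured throughput
    )
    print(f"scan: {est.scan_seconds:.0f} s, "
          f"transfer: {est.transfer_seconds:.0f} s, "
          f"total: {est.total_hours:.1f} h")
```

The result is only as good as the measured rates, so treat it as a planning aid for the read-only window rather than a prediction.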


elementx

The CloudSync KBs are pretty good; this one gives an example of a very high file count:

https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Cloud_Sync/What_type_of_workloads_are_not_suitable_for_data_migration_using_CloudSync%...

 

This one shows how the concurrency of scanning and transferring in Data Brokers can be increased:

https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Cloud_Sync/How_to_change_the_process_and_concurrency_limits_of_scanner_and_transferrer...

You may need more than one Data Broker in place if one Data Broker gets maxed out.

 

If you have one of those workloads, it's a good idea to do a simple PoC (e.g. scan a representative subset of the data and sync it to an on-prem ONTAP Select):

https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Cloud_Sync/What_type_of_workloads_are_not_suitable_for_data_migration_using_CloudSync%...

During that PoC you could capture diag data from the Data Brokers (it's probably best to make sure that is set up before the PoC starts).

Also use it to test date metadata sync if you need that.
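
One way to check the dates after a PoC sync is to walk both trees and compare timestamps file by file. The following is a minimal sketch, assuming both the source and the destination are mounted as paths on the machine running it. The function names and the 2-second tolerance are illustrative assumptions, and none of this is part of Cloud Sync itself.

```python
import os
from pathlib import Path


def creation_time(st):
    """Return the creation time from a stat result, or None if unavailable."""
    birth = getattr(st, "st_birthtime", None)  # macOS/BSD, Windows on Python 3.12+
    if birth is not None:
        return birth
    if os.name == "nt":
        return st.st_ctime  # on Windows, st_ctime has been the creation time
    return None  # e.g. Linux: st_ctime is the inode change time, not creation


def compare_timestamps(src_root, dst_root, tolerance_s=2.0):
    """List (relative_path, field, src_value, dst_value) for every mismatch.

    tolerance_s is an assumed allowance for differing timestamp
    granularity between the two file systems.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    mismatches = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if not dst.exists():
            mismatches.append((str(rel), "missing", None, None))
            continue
        s, d = src.stat(), dst.stat()
        if abs(s.st_mtime - d.st_mtime) > tolerance_s:
            mismatches.append((str(rel), "mtime", s.st_mtime, d.st_mtime))
        s_birth, d_birth = creation_time(s), creation_time(d)
        if (s_birth is not None and d_birth is not None
                and abs(s_birth - d_birth) > tolerance_s):
            mismatches.append((str(rel), "created", s_birth, d_birth))
    return mismatches
```

Calling `compare_timestamps(r"\\panzura\share", r"\\fsxn\share")` (both paths are placeholders) returns an empty list when every file exists on the destination with matching dates. Creation time is only compared where the platform exposes it, so running the check from a Windows host is the safer choice for SMB data.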

 

Once you do a sync to Dst, Dst should not be modified. Instead, Src should be worked on until cut-over, and the sync can run at a lower rate so that it doesn't impact production. Once they're ready to switch, Src should be taken offline (for writes at least) and another sync should be done at the maximum performance of the NAS and Data Brokers.

I think the amount of scanning should be the same (for a similar number of files), but the time required to transfer files should be much less, as not all files will have been updated.

https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Cloud_Sync/Does_CloudSync_Support_staged_cutover

Settings for the final sync can be modified according to the customer's requirements:

https://docs.netapp.com/us-en/cloud-manager-sync/task-managing-relationships.html#changing-the-settings-for-a-sync-relationship

After switch-over Src can be left online as read-only for a while, just in case.

 
