Performance data migration fails between OnCommand Performance Manager and Unified Manager 7.2

Hello,

After installing a new Windows Server 2016 with Unified Manager 7.2P1, I migrated the old Unified Manager without issues, and it works fine.

However, we also have a VM appliance for Performance Manager, upgraded to version 7.1P3.

For many days I have been trying to migrate the performance data to the new Unified Manager, but it fails every time.

At first I suspected the 390-day retention, so I reduced it to 6 months.

Then I got the "java.io.EOFException: unexpected end of stream" error (https://kb.netapp.com/app/answers/answer_view/a_id/1005077).

So I followed this KB, restarted the servers, and tried again.

But it failed again, this time with different errors in both the "maintenance_console" and the log file.

 

 

"Cluster Name : s-jrciprnacl01p

OnCommand Performance Manager IP Address or Host Name : 10.171.253.248

Migration Start Time :  06/04/2018 08:13:06

Migration End Time :  06/07/2018 09:49:57

Migration Status : FAILED

Error : Cannot connect to the OnCommand Performance Manage

Error Details : lun giu 04 08:12:49 CEST 2018 Migration Scheduled

                lun giu 04 08:13:06 CEST 2018 Data migration started.

                lun giu 04 08:13:07 CEST 2018 - Data migration completed for netapp_model database

                lun giu 04 08:13:08 CEST 2018 - Data migration completed for Optional[sample_nfsv41]

                lun giu 04 08:13:08 CEST 2018 - Data migration completed for Optional[sample_cluster]

                lun giu 04 08:13:17 CEST 2018 - Data migration completed for Optional[sample_wafl]

                lun giu 04 08:13:14 CEST 2018 - Data migration completed for Optional[sample_node]

                lun giu 04 08:13:27 CEST 2018 - Data migration completed for Optional[sample_vserver]

                lun giu 04 08:13:27 CEST 2018 - Data migration completed for Optional[sample_nic]

                lun giu 04 08:14:03 CEST 2018 - Data migration completed for Optional[sample_processor]

                lun giu 04 08:13:28 CEST 2018 - Data migration completed for Optional[sample_nfsv4]

                lun giu 04 08:14:11 CEST 2018 - Data migration completed for Optional[sample_qos_service_center_1]

                lun giu 04 08:14:42 CEST 2018 - Data migration completed for Optional[sample_volumeVserver]

                lun giu 04 08:14:52 CEST 2018 - Data migration completed for Optional[sample_fcpport]

                lun giu 04 08:16:31 CEST 2018 - Data migration completed for Optional[sample_qos_workload_detail_1]

                lun giu 04 08:18:55 CEST 2018 - Data migration completed for Optional[sample_qos_volume_workload_1]

                lun giu 04 08:19:30 CEST 2018 - Data migration completed for Optional[sample_nfs]

                lun giu 04 08:17:13 CEST 2018 - Data migration completed for Optional[sample_networklif]

                lun giu 04 08:20:47 CEST 2018 - Data migration completed for Optional[sample_aggregate_1]

                lun giu 04 08:20:08 CEST 2018 - Data migration completed for Optional[sample_networklifVserver]

                lun giu 04 08:21:09 CEST 2018 - Data migration completed for Optional[sample_nfsv3]

                lun giu 04 08:20:19 CEST 2018 - Data migration completed for Optional[sample_cifsvserver]

                lun giu 04 08:21:36 CEST 2018 - Data migration completed for Optional[sample_qos_workload_queue_nblade_1]

                lun giu 04 08:23:01 CEST 2018 - Data migration completed for Optional[sample_qos_workload_queue_dblade_1]

                lun giu 04 08:37:15 CEST 2018 - Data migration completed for Optional[sample_disk_1]

                lun giu 04 08:37:16 CEST 2018 - Data migration completed for Optional[summary_daily_qos_volume_workload_1]

                lun giu 04 08:37:16 CEST 2018 - Data migration completed for Optional[summary_cluster]

                lun giu 04 08:37:30 CEST 2018 - Data migration completed for Optional[summary_vserver]

                lun giu 04 08:37:31 CEST 2018 - Data migration completed for Optional[summary_wafl]

                lun giu 04 08:37:31 CEST 2018 - Data migration completed for Optional[summary_node]

                lun giu 04 08:37:55 CEST 2018 - Data migration completed for Optional[summary_qos_service_center_1]

                lun giu 04 08:38:22 CEST 2018 - Data migration completed for Optional[summary_processor]

                lun giu 04 08:38:25 CEST 2018 - Data migration completed for Optional[summary_nic]

                lun giu 04 08:38:38 CEST 2018 - Data migration completed for Optional[summary_networklif]

                lun giu 04 08:38:38 CEST 2018 - Data migration completed for Optional[summary_fcpport]

                lun giu 04 08:38:38 CEST 2018 - Data migration completed for Optional[summary_nfsv41]

                lun giu 04 08:39:23 CEST 2018 - Data migration completed for Optional[summary_volumeVserver]

                lun giu 04 08:39:27 CEST 2018 - Data migration completed for Optional[summary_nfsv4]

                lun giu 04 08:39:42 CEST 2018 - Data migration completed for Optional[summary_networklifVserver]

                lun giu 04 08:38:35 CEST 2018 - Data migration completed for Optional[summary_qos_workload_detail_1]

                lun giu 04 08:41:11 CEST 2018 - Data migration completed for Optional[summary_nfsv3]

                lun giu 04 08:41:23 CEST 2018 - Data migration completed for Optional[summary_aggregate_1]

                lun giu 04 08:41:27 CEST 2018 - Data migration completed for Optional[summary_cifsvserver]

                lun giu 04 08:41:27 CEST 2018 - Data migration completed for Optional[summary_qos_workload_queue_nblade_1]

                lun giu 04 08:41:29 CEST 2018 - Data migration completed for Optional[summary_qos_volume_workload_1]

                lun giu 04 08:41:39 CEST 2018 - Data migration completed for Optional[summary_nfs]

                lun giu 04 08:43:43 CEST 2018 - Data migration completed for Optional[summary_qos_workload_queue_dblade_1]

                lun giu 04 08:56:15 CEST 2018 - Data migration completed for Optional[summary_disk_1]

                lun giu 04 08:56:15 CEST 2018 - Data migration completed for netapp_performance database

Do you want to retry the data migration from cluster s-jrciprnacl01p (FAILED) Y/N ?"

 

After days of progress, the log file reports:

"2018-06-07 09:47:57,118 ERROR [main] c.n.d.u.m.i.m.BaseMigrator (BaseMigrator.java:127) - Error during data migration. java.net.SocketException: Software caused connection abort: socket write error"

"2018-06-07 09:47:57,604 INFO  [main] c.n.d.u.m.i.m.o.OpmBaseMigrator (OpmBaseMigrator.java:76) - Migrating data from table - continuous_event_participant_stats failed in 240.808 sec"

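A small sketch for pulling the failing table out of the migration log, assuming the log keeps the line format quoted above. The function name `failed_tables` and the way the sample text is assembled are hypothetical and not part of any NetApp tool.

```python
import re

# Matches lines such as:
#   ... - Migrating data from table - <name> failed in <seconds> sec
FAILED_TABLE = re.compile(
    r"Migrating data from table - (?P<table>\S+) failed in (?P<secs>[\d.]+) sec"
)


def failed_tables(log_text):
    """Return (table, seconds) pairs for every table whose migration failed."""
    return [
        (m.group("table"), float(m.group("secs")))
        for m in FAILED_TABLE.finditer(log_text)
    ]


# The INFO line quoted above, split only for readability.
sample = (
    "2018-06-07 09:47:57,604 INFO  [main] c.n.d.u.m.i.m.o.OpmBaseMigrator "
    "(OpmBaseMigrator.java:76) - Migrating data from table - "
    "continuous_event_participant_stats failed in 240.808 sec"
)
print(failed_tables(sample))
```

Running this over the whole log file would show every table that failed and how long each attempt ran before the connection dropped.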
 

I will also ask NetApp support about this, but I wanted to share the issue here as well.

I was also wondering whether there is a way to manually export and import the information needed for this migration directly with MySQL commands. Is that possible? Is there any procedure from NetApp to do this?

This might avoid many issues related to timeouts, network errors, and so on.

 

Many thanks

Mark


Re: Performance data migration fails between OnCommand Performance Manager and Unified Manager 7.2

Hi,


When I worked on this with support last year, we kept doubling net_write_timeout from the original 600 up to 4800, and eventually it worked.
Note that once it completes a table, it skips that table on the next run, so in my case each run was still making progress.
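
The doubling approach described above can be sketched as a loop. Everything here is hypothetical: `run_migration` stands in for whatever you do by hand (change net_write_timeout, restart the service, retry from the maintenance console), and the start and ceiling values are just the ones mentioned in this post.

```python
def doubling_timeouts(start=600, ceiling=4800):
    """Yield start, 2*start, 4*start, ... up to and including ceiling."""
    value = start
    while value <= ceiling:
        yield value
        value *= 2


def retry_with_doubling(run_migration, start=600, ceiling=4800):
    """Call run_migration(timeout) with doubled timeouts until it succeeds.

    run_migration is a hypothetical callable returning True on success.
    Returns the timeout that worked, or None if every attempt failed.
    """
    for timeout in doubling_timeouts(start, ceiling):
        if run_migration(timeout):
            return timeout
    return None


# Stand-in for the real migration: pretend it only succeeds from 4800 up.
attempts = []


def fake_migration(timeout):
    attempts.append(timeout)
    return timeout >= 4800


print(retry_with_doubling(fake_migration))
print(attempts)
```

Because completed tables are skipped on the next run, each retry only has to get through the tables that are still outstanding.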

Another option was to first move the VM closer to the other VM (in my case it was in a separate datacenter).

Gidi

Gidi Marcus (Linkedin) - Storage and Microsoft technologies consultant - Hydro IT LTD - UK

Re: Performance data migration fails between OnCommand Performance Manager and Unified Manager 7.2

Hi Marcus,

 

Thank you for your answer.

Actually, I had already followed the advice in the NetApp support KB and changed these timeouts:

net_write_timeout=9200
wait_timeout=31536000
interactive_timeout=31536000

 

The VMs are also already close to each other, within the same datacenter and group.

 

I looked more closely at what has been migrated and noticed that the maintenance_console output (pasted above) shows no specific errors, while the log file does.

Also, when checking the historical data in the "new" Unified Manager, I was able to see the data migrated from Performance Manager.

So something works.

But how can I be sure what has been transferred and what has not? It is likely that some information has been lost...
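
One hedged way to approach that question is to compare the tables named in the console output with the tables that exist on the Performance Manager side. The sketch below assumes the "Data migration completed for Optional[...]" line format shown earlier; the `expected` list is a made-up example and would have to come from the source database, for instance from a table listing.

```python
import re

# Table names appear as "Data migration completed for Optional[<table>]".
COMPLETED = re.compile(r"Data migration completed for Optional\[(\w+)\]")


def completed_tables(console_text):
    """Return the set of table names the console reports as migrated."""
    return set(COMPLETED.findall(console_text))


def missing_tables(console_text, expected):
    """Return expected tables that the console never reported as completed."""
    return sorted(set(expected) - completed_tables(console_text))


# Three lines copied from the console output earlier in the thread.
console = """
lun giu 04 08:13:08 CEST 2018 - Data migration completed for Optional[sample_nfsv41]
lun giu 04 08:13:08 CEST 2018 - Data migration completed for Optional[sample_cluster]
lun giu 04 08:56:15 CEST 2018 - Data migration completed for Optional[summary_disk_1]
"""

# Hypothetical list of source tables; the last one is the table the log
# file reports as failed.
expected = [
    "sample_nfsv41",
    "sample_cluster",
    "summary_disk_1",
    "continuous_event_participant_stats",
]
print(missing_tables(console, expected))
```

This only shows which tables were reported as completed, not whether every row arrived; comparing row counts per table on both sides would be a stricter check.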

 

Cheers,

 

 


Re: Performance data migration fails between OnCommand Performance Manager and Unified Manager 7.2

In my experience, after re-running it again and again, it eventually completed. The migration script skips all the tables it has already migrated successfully.

One more thing: your failure seems to hit at 200-something seconds (sorry, writing from mobile, can't check exactly). Are you sure your changes have taken effect? I think the KB includes a service restart...
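
To check whether the new timeouts are actually in effect, one option is to compare what the running MySQL server reports with what was configured. A minimal sketch, assuming you paste in tab-separated name/value output from a query such as SHOW GLOBAL VARIABLES LIKE '%timeout%' run on the server where the settings were changed; the function names and the sample output are hypothetical.

```python
# Values quoted earlier in the thread.
WANTED = {
    "net_write_timeout": 9200,
    "wait_timeout": 31536000,
    "interactive_timeout": 31536000,
}


def parse_variables(output):
    """Parse 'name<TAB>value' lines into a dict of name -> int."""
    values = {}
    for line in output.strip().splitlines():
        name, _, value = line.partition("\t")
        if value.strip().isdigit():
            values[name.strip()] = int(value)
    return values


def mismatches(output, wanted=WANTED):
    """Return {name: (wanted, actual)} for settings that did not take effect."""
    actual = parse_variables(output)
    return {
        name: (value, actual.get(name))
        for name, value in wanted.items()
        if actual.get(name) != value
    }


# Made-up output in which net_write_timeout is still at an old value.
sample = (
    "interactive_timeout\t31536000\n"
    "net_write_timeout\t600\n"
    "wait_timeout\t31536000\n"
)
print(mismatches(sample))
```

An empty result would mean the running server reports the same values that were configured.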
Gidi Marcus (Linkedin) - Storage and Microsoft technologies consultant - Hydro IT LTD - UK