OCUM Core not collecting info from VSMs

Hello,

I updated OCUM Core/DFM to 5.2R1 yesterday, and it is not showing volume and LUN data from VSMs.

It shows data for the cluster, the nodes, and even the root volumes. Is this the same bug described at http://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=814072 ?

Thanks,

Re: OCUM Core not collecting info from VSMs

Hi,

Please keep the MTU at 1500. The BURT clearly indicates that discovery has issues with a non-default MTU size.

If the MTU is anything other than 1500, set it back to 1500 and see if the issue goes away.

If not, could you check /opt/NTAPdfm/dfmmonitor.log for any discovery errors or issues, and paste them here if there are any?
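
A minimal sketch for pulling the warnings and errors out of that log, assuming each line carries a `[DFMMonitor: LEVEL]` tag. The helper name `discovery_issues` is hypothetical, and the regex reflects an assumed layout rather than a documented format.

```python
import re

# Assumed tag layout: "[DFMMonitor: WARN]" or "[DFMMonitor:ERROR]".
LEVEL_RE = re.compile(r"\[DFMMonitor:\s*(WARN|ERROR)\]")


def discovery_issues(lines):
    """Return (level, line) pairs for every WARN or ERROR line."""
    issues = []
    for line in lines:
        match = LEVEL_RE.search(line)
        if match:
            issues.append((match.group(1), line.strip()))
    return issues


if __name__ == "__main__":
    # Two lines in the assumed layout plus a blank line that should be skipped.
    sample = [
        "Jun 25 10:21:58 [DFMMonitor: WARN]: [2752:0xa80]: HDI.hdi.br: "
        "Adapter HDI-01:5a present in API but not SNMP",
        "",
        "Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: "
        "Failed to find Home node (HDI-02) details for the the Lif: "
        "HDIST03:HDI-02_fc_lif_3. Lif may be down.",
    ]
    for level, line in discovery_issues(sample):
        print(level, line)
```

To scan the real file, pass an open file object instead of the sample list, for example `discovery_issues(open("/opt/NTAPdfm/dfmmonitor.log"))`.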

Thanks,

  Arun

Re: OCUM Core not collecting info from VSMs

This bug is to blame for UM 5.x Cluster-Mode monitoring issues when any network interface on the path between the UM server and the cluster management LIF (endpoints included) has an MTU other than 1500.
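
One rough way to probe that path is a do-not-fragment ping from the UM server toward the cluster management LIF. The sketch below only builds the command. It assumes a Linux UM server with the iputils `ping` (its `-M do` option forbids fragmentation) and IPv4; the function names and the host name are hypothetical. Note that this reveals only the smallest MTU along the path and does not prove that every interface is set to 1500.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def df_ping_command(host, mtu, count=3):
    """Build a Linux iputils ping command with fragmentation forbidden."""
    payload = icmp_payload_for_mtu(mtu)
    return ["ping", "-M", "do", "-c", str(count), "-s", str(payload), host]


if __name__ == "__main__":
    lif = "cluster-mgmt.example.com"  # hypothetical name for the cluster management LIF
    # A packet sized for MTU 1500 should pass; one byte more should fail
    # if some hop on the path is limited to 1500.
    print(" ".join(df_ping_command(lif, 1500)))
    print(" ".join(df_ping_command(lif, 1501)))
```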

Thanks,

Kevin

Re: OCUM Core not collecting info from VSMs

The log states:

Jun 25 10:21:58 [DFMMonitor: WARN]: [2752:0xa80]: HDI.hdi.br: Adapter HDI-01:5a present in API but not SNMP

Jun 25 10:21:58 [DFMMonitor: WARN]: [2752:0xa80]: HDI.hdi.br: Adapter HDI-01:5b present in API but not SNMP

Jun 25 10:21:58 [DFMMonitor: WARN]: [2752:0xa80]: HDI.hdi.br: Adapter HDI-01:6a present in API but not SNMP

Jun 25 10:21:58 [DFMMonitor: WARN]: [2752:0xa80]: HDI.hdi.br: Adapter HDI-01:6b present in API but not SNMP

Jun 25 10:21:58 [DFMMonitor: WARN]: [2752:0xa80]: HDI.hdi.br: Adapter HDI-02:5a present in API but not SNMP

Jun 25 10:21:58 [DFMMonitor: WARN]: [2752:0xa80]: HDI.hdi.br: Adapter HDI-02:5b present in API but not SNMP

Jun 25 10:21:58 [DFMMonitor: WARN]: [2752:0xa80]: HDI.hdi.br: Adapter HDI-02:6a present in API but not SNMP

Jun 25 10:21:58 [DFMMonitor: WARN]: [2752:0xa80]: HDI.hdi.br: Adapter HDI-02:6b present in API but not SNMP

Jun 25 10:27:59 [DFMMonitor: WARN]: [2752:0x1324]: HDI.hdi.br: Adapter HDI-01:5a present in API but not SNMP

Jun 25 10:27:59 [DFMMonitor: WARN]: [2752:0x1324]: HDI.hdi.br: Adapter HDI-01:5b present in API but not SNMP

Jun 25 10:27:59 [DFMMonitor: WARN]: [2752:0x1324]: HDI.hdi.br: Adapter HDI-01:6a present in API but not SNMP

Jun 25 10:27:59 [DFMMonitor: WARN]: [2752:0x1324]: HDI.hdi.br: Adapter HDI-01:6b present in API but not SNMP

Jun 25 10:27:59 [DFMMonitor: WARN]: [2752:0x1324]: HDI.hdi.br: Adapter HDI-02:5a present in API but not SNMP

Jun 25 10:27:59 [DFMMonitor: WARN]: [2752:0x1324]: HDI.hdi.br: Adapter HDI-02:5b present in API but not SNMP

Jun 25 10:27:59 [DFMMonitor: WARN]: [2752:0x1324]: HDI.hdi.br: Adapter HDI-02:6a present in API but not SNMP

Jun 25 10:27:59 [DFMMonitor: WARN]: [2752:0x1324]: HDI.hdi.br: Adapter HDI-02:6b present in API but not SNMP

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-02) details for the the Lif: vs_progress:vs_progress_fc6a_lif3. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-01) details for the the Lif: vs_progress:vs_progress_fc5a_lif1. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-02) details for the the Lif: HDIST03:HDI-02_fc_lif_3. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-02) details for the the Lif: HDIST03:HDI-02_fc_lif_4. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-01) details for the the Lif: vs_sql:vs_sql_fc5a_if1. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-01) details for the the Lif: vs_sql:vs_sql_fc6b_if2. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-02) details for the the Lif: vs_vmware:HDI-02_fc5b_lif_2. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-01) details for the the Lif: HDISTD:vs_hdistd_fc5b_if1. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-02) details for the the Lif: HDISTD:vs_hdistd_fc5b_if2. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-02) details for the the Lif: vs_progress:vs_progress_fc5b_lif4. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-01) details for the the Lif: vs_progress:vs_progress_fc6b_lif2. Lif may be down.

Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-02) details for the the Lif: vs_sql:vs_sql_fc6a_if3. Lif may be down.

Jun 25 10:32:33 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-02) details for the the Lif: vs_vmware:HDI-02_fc6a_lif_3. Lif may be down.

Jun 25 10:32:33 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-01) details for the the Lif: HDIST03:HDI-01_fc_lif_3. Lif may be down.

Jun 25 10:32:33 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-01) details for the the Lif: HDIST03:HDI-01_fc_lif_4. Lif may be down.

Jun 25 10:32:33 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-02) details for the the Lif: vs_sql:vs_sql_fc5b_if4. Lif may be down.

Jun 25 10:32:33 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-01) details for the the Lif: vs_vmware:HDI-01_fc5a_lif_1. Lif may be down.

Jun 25 10:32:33 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: Failed to find Home node (HDI-01) details for the the Lif: vs_vmware:HDI-01_fc6b_lif_4. Lif may be down.
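
The ERROR lines above can be tallied per home node and per Vserver to see which LIFs are affected. This is a minimal sketch: the regex is inferred from the lines in this log, and `parse_lif_errors` is a hypothetical helper.

```python
import re
from collections import Counter

# Pattern inferred from the ERROR lines above; the doubled "the" is in the log itself.
LIF_ERROR_RE = re.compile(
    r"Failed to find Home node \((?P<node>[^)]+)\) details for (?:the )+"
    r"Lif: (?P<vserver>[^:\s]+):(?P<lif>\S+)\. Lif may be down"
)


def parse_lif_errors(lines):
    """Return (home_node, vserver, lif) for each matching ERROR line."""
    found = []
    for line in lines:
        match = LIF_ERROR_RE.search(line)
        if match:
            found.append((match["node"], match["vserver"], match["lif"]))
    return found


if __name__ == "__main__":
    prefix = "Jun 25 10:32:32 [DFMMonitor:ERROR]: [2752:0x116c]: ifmon_report_one_lif: "
    sample = [
        prefix + "Failed to find Home node (HDI-02) details for the the Lif: "
                 "vs_progress:vs_progress_fc6a_lif3. Lif may be down.",
        prefix + "Failed to find Home node (HDI-01) details for the the Lif: "
                 "vs_progress:vs_progress_fc5a_lif1. Lif may be down.",
        prefix + "Failed to find Home node (HDI-02) details for the the Lif: "
                 "HDIST03:HDI-02_fc_lif_3. Lif may be down.",
    ]
    errors = parse_lif_errors(sample)
    print(Counter(node for node, _, _ in errors))
    print(Counter(vserver for _, vserver, _ in errors))
```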

So, I believe it's the MTU size since DFM isn't finding LIF status. I'll change it and let you guys know.

Thanks!

Bruno

Re: OCUM Core not collecting info from VSMs

Hi,

Please set the MTU to 1500 and then rerun discovery. Let us know if you still see issues after that.

Thanks,

Arun

Re: OCUM Core not collecting info from VSMs

Hi there,

I double-checked all ports with the data role, including the interface groups, and all of them are set to 1500. Only the cluster ports are set to 9000.

net port show

  (network port show)

                                      Auto-Negot  Duplex     Speed (Mbps)

Node   Port   Role         Link   MTU Admin/Oper  Admin/Oper Admin/Oper

------ ------ ------------ ---- ----- ----------- ---------- ------------

HDI-01

       a11b   data         up    1500  true/-     auto/full   auto/10000

       a12b   data         up    1500  true/-     auto/full   auto/10000

       e0M    node-mgmt    up    1500  true/true  full/full   auto/100

       e0a    data         up    1500  true/true  full/full   auto/1000

       e0b    data         down  1500  true/true  full/half   auto/10

       e1a    cluster      up    9000  true/true  full/full   auto/10000

       e1b    data         up    1500  true/true  full/full   auto/10000

       e2a    cluster      up    9000  true/true  full/full   auto/10000

       e2b    data         up    1500  true/true  full/full   auto/10000

HDI-02

       a21b   data         up    1500  true/-     auto/full   auto/10000

       a22b   data         up    1500  true/-     auto/full   auto/10000

       e0M    node-mgmt    up    1500  true/true  full/full   auto/100

       e0a    data         up    1500  true/true  full/full   auto/1000

       e0b    data         down  1500  true/true  full/half   auto/10

       e1a    cluster      up    9000  true/true  full/full   auto/10000

       e1b    data         up    1500  true/true  full/full   auto/10000

       e2a    cluster      up    9000  true/true  full/full   auto/10000

       e2b    data         up    1500  true/true  full/full   auto/10000
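
The same check can be scripted against saved `network port show` output. This is a minimal sketch that assumes the column layout shown above, with port, role, link, and MTU as the first four fields of each port row. `parse_ports` and `ports_with_mtu_not` are hypothetical helpers, not part of any NetApp tool.

```python
def parse_ports(text):
    """Parse 'network port show' text into a list of port dictionaries."""
    node = None
    ports = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) == 1 and not line[0].isspace():
            # An unindented single token is taken to be a node name.
            node = fields[0]
        elif node and len(fields) >= 4 and fields[3].isdigit():
            ports.append({
                "node": node,
                "port": fields[0],
                "role": fields[1],
                "link": fields[2],
                "mtu": int(fields[3]),
            })
    return ports


def ports_with_mtu_not(ports, expected=1500):
    """Return the ports whose MTU differs from the expected value."""
    return [p for p in ports if p["mtu"] != expected]


if __name__ == "__main__":
    # A few rows copied from the output above.
    sample = """\
HDI-01
       e0M    node-mgmt    up    1500  true/true  full/full   auto/100
       e0a    data         up    1500  true/true  full/full   auto/1000
       e1a    cluster      up    9000  true/true  full/full   auto/10000
"""
    for p in ports_with_mtu_not(parse_ports(sample)):
        print(p["node"], p["port"], p["role"], p["mtu"])
```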

Re: OCUM Core not collecting info from VSMs

Please post "net int show" as well.

Thanks,

Kevin

Re: OCUM Core not collecting info from VSMs

Here it is:

net int show

  (network interface show)

            Logical    Status     Network            Current       Current Is

Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home

----------- ---------- ---------- ------------------ ------------- ------- ----

HDI

            cluster_mgmt up/up    10.7.3.66/22       HDI-02        e0a     false

HDI-01

            clus1        up/up    169.254.130.125/16 HDI-01        e1a     true

            clus2        up/up    169.254.104.141/16 HDI-01        e2a     true

            mgmt1        up/up    10.7.3.67/22       HDI-01        e0M     true

            smv_lif1     up/up    10.7.3.73/22       HDI-01        a11b    true

HDI-02

            clus1        up/up    169.254.117.169/16 HDI-02        e1a     true

            clus2        up/up    169.254.44.6/16    HDI-02        e2a     true

            mgmt1        up/up    10.7.3.68/22       HDI-02        e0M     true

            smv_lif2     up/up    10.7.3.75/22       HDI-02        a21b    true

HDIST02_NW

            HDIST02_NW_lif1

                         up/up    10.7.3.74/22       HDI-02        e0a     false

            HDIST02_NW_lif2

                         up/up    10.7.3.230/22      HDI-01        e0a     false

HDIST03

            HDI-01_fc_lif_3

                         up/down  20:0b:00:a0:98:40:39:56

                                                     HDI-01        6a      true

            HDI-01_fc_lif_4

                         up/down  20:0a:00:a0:98:40:39:56

                                                     HDI-01        5b      true

            HDI-02_fc_lif_3

                         up/down  20:0f:00:a0:98:40:39:56

                                                     HDI-02        6a      true

            HDI-02_fc_lif_4

                         up/down  20:0e:00:a0:98:40:39:56

                                                     HDI-02        5b      true

            HDIST03_data_lif1

                         up/up    10.7.3.69/22       HDI-02        a21b    false

HDIST04

            HDIST04_cifs_lif1

                         up/up    10.7.3.70/22       HDI-02        e0a     false

HDISTD

            HDISTD_cifs_nfs_lif1

                         up/up    10.7.0.40/22       HDI-01        a11b    true

            Logical    Status     Network            Current       Current Is

Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home

----------- ---------- ---------- ------------------ ------------- ------- ----

HDISTD

            vs_hdistd_fc5b_if1

                         up/up    20:07:00:a0:98:40:39:56

                                                     HDI-01        5b      true

            vs_hdistd_fc5b_if2

                         up/up    20:09:00:a0:98:40:39:56

                                                     HDI-02        5b      true

vs_progress

            vs_progress_fc5a_lif1

                         up/up    20:00:00:a0:98:40:39:56

                                                     HDI-01        5a      true

            vs_progress_fc5b_lif4

                         up/up    20:03:00:a0:98:40:39:56

                                                     HDI-02        5b      true

            vs_progress_fc6a_lif3

                         up/up    20:02:00:a0:98:40:39:56

                                                     HDI-02        6a      true

            vs_progress_fc6b_lif2

                         up/up    20:01:00:a0:98:40:39:56

                                                     HDI-01        6b      true

            vs_progress_lif1

                         up/up    10.7.3.71/22       HDI-02        a21b    false

vs_sql

            vs_sql_fc5a_if1

                         up/up    20:1e:00:a0:98:40:39:56

                                                     HDI-01        5a      true

            vs_sql_fc5b_if4

                         up/up    20:21:00:a0:98:40:39:56

                                                     HDI-02        5b      true

            vs_sql_fc6a_if3

                         up/up    20:20:00:a0:98:40:39:56

                                                     HDI-02        6a      true

            vs_sql_fc6b_if2

                         up/up    20:1f:00:a0:98:40:39:56

                                                     HDI-01        6b      true

            vs_sql_mgmt  up/up    10.7.3.76/22       HDI-02        a22b    true

            vs_sql_ndmp01

                         up/up    10.7.0.80/22       HDI-01        e0a     false

vs_vmware

            HDI-01_fc5a_lif_1

                         up/up    20:04:00:a0:98:40:39:56

                                                     HDI-01        5a      true

            HDI-01_fc6b_lif_4

                         up/up    20:0c:00:a0:98:40:39:56

                                                     HDI-01        6b      true

            Logical    Status     Network            Current       Current Is

Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home

----------- ---------- ---------- ------------------ ------------- ------- ----

vs_vmware

            HDI-02_fc5b_lif_2

                         up/up    20:10:00:a0:98:40:39:56

                                                     HDI-02        5b      true

            HDI-02_fc6a_lif_3

                         up/up    20:06:00:a0:98:40:39:56

                                                     HDI-02        6a      true

            vs_vmware_mgmt

                         up/up    10.7.3.72/22       HDI-02        a21b    true

36 entries were displayed.

Re: OCUM Core not collecting info from VSMs

Very good. The cluster_mgmt LIF is on a port with an MTU of 1500, so if the network and the UM server are also set to 1500, then 814072 is not the cause of your problem.
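
That conclusion comes from joining the two outputs: find the node and port a LIF currently sits on, then look up that port's MTU. A small sketch of the lookup, with values copied by hand from the outputs posted in this thread; the dictionary and function names are hypothetical.

```python
# (node, port) -> MTU, taken from the "net port show" output in this thread.
PORT_MTU = {
    ("HDI-02", "e0a"): 1500,
    ("HDI-02", "e1a"): 9000,
}

# (vserver, lif) -> (current node, current port), from "net int show".
LIF_LOCATION = {
    ("HDI", "cluster_mgmt"): ("HDI-02", "e0a"),
    ("HDI-02", "clus1"): ("HDI-02", "e1a"),
}


def lif_mtu(vserver, lif):
    """Return the MTU of the port the LIF is currently hosted on."""
    node, port = LIF_LOCATION[(vserver, lif)]
    return PORT_MTU[(node, port)]


if __name__ == "__main__":
    print("cluster_mgmt MTU:", lif_mtu("HDI", "cluster_mgmt"))
    print("clus1 on HDI-02 MTU:", lif_mtu("HDI-02", "clus1"))
```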

You should consider opening a support case.

Thanks,

Kevin

Re: OCUM Core not collecting info from VSMs

OK, I'm gonna open a case right now.

Thank you guys!