
What Events Does Operations Manager Generate for C-Mode (8.0/8.0.1) in 4.0/4.0.1

adaikkap

Recently there have been a couple of requests asking what we monitor and alert on in OM for C-Mode.

Below is the list of new events that we generate for C-Mode-specific objects (starting with OM 4.0):

  • [Info] Cluster Discovered (cluster-discovered): A cluster was discovered.
  • [Normal] Cluster Reachable (cluster-reachable): A cluster was reachable from DataFabric Manager network.
  • [Critical] Cluster Not Reachable (cluster-unreachable): A cluster was not reachable from DataFabric Manager network.
  • [Info] Cluster Renamed (cluster-renamed): A cluster got renamed.
  • [Info] Cluster Node Added (cluster-node-added): A node was added to a cluster.
  • [Info] Cluster Node Removed (cluster-node-removed): A node was removed from a cluster.
  • [Normal] Port Status Up (port-status-up): A cluster port status is up.
  • [Error] Port Status Down (port-status-down): A cluster port status is down.
  • [Normal] Port Status Undefined (port-status-undef): A cluster port status is undefined.
  • [Normal] Port Status Unknown (port-status-unknown): A cluster port status is unknown.
  • [Info] Port Role Changed (port-role-changed): A cluster port role has changed.
  • [Normal] Logical Interface Status Up (logical-interface-status-up): A logical interface status is up.
  • [Error] Logical Interface Status Down (logical-interface-status-down): A logical interface status is down.
  • [Normal] Logical Interface Status Unknown (logical-interface-status-unknown): A logical interface status is unknown.
  • [Warning] Logical Interface Migrated (logical-interface-migrated): A logical interface migrated to a different node.
  • [Info] Vserver Discovered (vserver-discovered): A vserver was discovered.
  • [Info] Vserver Deleted (vserver-deleted): A vserver was deleted.
  • [Info] Vserver Renamed (vserver-renamed): A vserver was renamed.

Below is the same list in table format, with the event class.

Event Name                          Severity      Class
cluster-discovered                  Information   cluster.discovered
cluster-node-added                  Information   cluster.node.added
cluster-node-removed                Information   cluster.node.removed
cluster-reachable                   Normal        ping.status
cluster-renamed                     Information   cluster.renamed
cluster-unreachable                 Critical      ping.status
port-role:changed                   Information   port.roleChange
port-status:down                    Error         port.status
port-status:undef                   Normal        port.status
port-status:unknown                 Normal        port.status
port-status:up                      Normal        port.status
logical-interface-status:down       Error         lif.status
logical-interface-status:unknown    Normal        lif.status
logical-interface-status:up         Normal        lif.status
logical-interface:migrated          Warning       lif.migration
vserver-deleted                     Information   vserver.deleted
vserver-discovered                  Information   vserver.discovered
vserver-renamed                     Information   vserver.renamed
vserver-running                     Information   vserver.running
vserver-stopped                     Information   vserver.stopped
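
If you want to decide which of these events to alert on, the table can be treated as plain data. The snippet below is only a minimal sketch in Python, not anything that ships with OM: the rows are copied from the table above, while the names CMODE_EVENTS, SEVERITY_RANK, events_at_or_above and events_by_class are made up for illustration, and the severity ordering (Normal lowest, Critical highest) is an assumption.

```python
# Rows copied from the table above: (event name, severity, event class).
CMODE_EVENTS = [
    ("cluster-discovered", "Information", "cluster.discovered"),
    ("cluster-node-added", "Information", "cluster.node.added"),
    ("cluster-node-removed", "Information", "cluster.node.removed"),
    ("cluster-reachable", "Normal", "ping.status"),
    ("cluster-renamed", "Information", "cluster.renamed"),
    ("cluster-unreachable", "Critical", "ping.status"),
    ("port-role:changed", "Information", "port.roleChange"),
    ("port-status:down", "Error", "port.status"),
    ("port-status:undef", "Normal", "port.status"),
    ("port-status:unknown", "Normal", "port.status"),
    ("port-status:up", "Normal", "port.status"),
    ("logical-interface-status:down", "Error", "lif.status"),
    ("logical-interface-status:unknown", "Normal", "lif.status"),
    ("logical-interface-status:up", "Normal", "lif.status"),
    ("logical-interface:migrated", "Warning", "lif.migration"),
    ("vserver-deleted", "Information", "vserver.deleted"),
    ("vserver-discovered", "Information", "vserver.discovered"),
    ("vserver-renamed", "Information", "vserver.renamed"),
    ("vserver-running", "Information", "vserver.running"),
    ("vserver-stopped", "Information", "vserver.stopped"),
]

# Assumed ordering of the severities used in the table, lowest to highest.
SEVERITY_RANK = {
    "Normal": 0,
    "Information": 1,
    "Warning": 2,
    "Error": 3,
    "Critical": 4,
}


def events_at_or_above(min_severity):
    """Return event names whose severity is at least min_severity, in table order."""
    threshold = SEVERITY_RANK[min_severity]
    return [
        name
        for name, severity, _event_class in CMODE_EVENTS
        if SEVERITY_RANK[severity] >= threshold
    ]


def events_by_class():
    """Group event names by event class, keeping table order within each class."""
    grouped = {}
    for name, _severity, event_class in CMODE_EVENTS:
        grouped.setdefault(event_class, []).append(name)
    return grouped


if __name__ == "__main__":
    print(events_at_or_above("Error"))
    print(events_by_class()["port.status"])
```

Grouping by class makes it easy to see that the up/down/unknown events of one object share a class (for example, port.status), so they can be handled together.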

Below is the list of common events that are generated for C-Mode objects as well as 7G/7-Mode:

Event Group       Events

volume            volume-almost-full
                  volume-clone:deleted
                  volume-clone:discovered
                  volume-full
                  volume-growth-rate:abnormal
                  volume-growth-rate:ok
                  volume-new-snapshot
                  volume-offline-or-destroyed
                  volume-online
                  volume-snapshot-deleted
                  volume-space-normal
                  inodes-almost-full
                  inodes-full
                  inodes-utilization-normal

Aggregate         aggregate-almost-full
                  aggregate-almost-overcommitted
                  aggregate-full
                  aggregate-not-overcommitted
                  aggregate-overcommitted
                  aggregate-snapshot-reserve-almost-full
                  aggregate-snapshot-reserve-full
                  aggregate-snapshot-reserve-ok
                  aggregate-space-normal
                  aggregate:deleted
                  aggregate:discovered
                  aggregate:failed
                  aggregate:offline
                  aggregate:online
                  aggregate:restricted

NVRAM             nvram-battery:discharged
                  nvram-battery:fully-charged
                  nvram-battery:low
                  nvram-battery:missing
                  nvram-battery:normal
                  nvram-battery:old
                  nvram-battery:overcharged
                  nvram-battery:replace
                  nvram-battery:unknown-status

cpu               cpu-load-normal
                  cpu-too-busy

Enclosures        enclosures-active
                  enclosures-disappeared
                  enclosures-failed
                  enclosures-found
                  enclosures-inactive
                  enclosures-ok

Fans              fans:many-failed
                  fans:normal
                  fans:one-failed

Host              host-discovered
                  host-down
                  host-login:failed
                  host-login:ok
                  host-snmp-not-responding
                  host-snmp-ok
                  host-up
                  host:identity-conflict
                  host:identity-ok
                  host:name-changed
                  host:system-id-changed

Power supplies    power-supplies:many-failed
                  power-supplies:normal
                  power-supplies:one-failed

Snapshots         snap-count:exceeded
                  snap-count:ok
                  snapshot-full
                  snapshot-space-ok
                  snapshots:disabled
                  snapshots:enabled
                  snapshots:not-too-old
                  snapshots:too-old

Environmentals    temperature-hot
                  temperature-normal

Regards

adai.


mrinal

Hi,

I have questions about some of the events listed above...

[Critical] Cluster Not Reachable (cluster-unreachable):      A cluster was not reachable from DataFabric Manager network.

>>> Does this refer to the 'cluster-mgmt' LIF? Can we set the node-mgmt LIF to be used in case the cluster-mgmt LIF is not available?

[Info] Cluster Renamed (cluster-renamed):      A cluster got renamed.

>>> If this event is raised after the node is renamed but before the reboot, then the event should be marked as pending. The action is not complete until the node is rebooted.

[Normal] Port Status Up (port-status-up):      A cluster port status is up.

[Error] Port Status Down (port-status-down):      A cluster port status is down.

[Normal] Port Status Undefined (port-status-undef):      A cluster port status is undefined.

[Normal] Port Status Unknown (port-status-unknown):      A cluster port status is unknown.

[Info] Port Role Changed (port-role-changed):      A cluster port role has changed.

>>> Do we have similar events for ports that are in other roles?

msaravan

Hi Mrinal Devadas,

Please find my answers inline, prefixed with [Saravanan]:

[Critical] Cluster Not Reachable (cluster-unreachable):      A cluster was not reachable from DataFabric Manager network.

>>> Does this refer to the 'cluster-mgmt' LIF? Can we set the node-mgmt LIF to be used in case the cluster-mgmt LIF is not available?

[Saravanan] Up to DFM 4.0.1, you can use either the 'cluster-mgmt' LIF or the node-mgmt LIF as your primary address for monitoring. If one address is not reachable, you can always switch over to the other using "dfm host set <hostid> hostprimaryaddress=<>".
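
If you need to do this switch-over from a script, the command above can be wrapped as shown below. This is only a rough sketch in Python, not an official tool: the function names, the dry-run default and the sample host id and address are made up for illustration, and it assumes the dfm CLI is on the PATH of the DFM server. The only command it builds is the "dfm host set" form quoted above.

```python
import subprocess


def build_set_primary_address(host_id, address):
    """Build the argv for: dfm host set <hostid> hostprimaryaddress=<address>."""
    if not host_id or not address:
        raise ValueError("host_id and address must be non-empty")
    return ["dfm", "host", "set", host_id, "hostprimaryaddress=" + address]


def set_primary_address(host_id, address, dry_run=True):
    """Return the command; run it only when dry_run is False.

    Assumes the dfm CLI is installed and on PATH when dry_run is False.
    """
    argv = build_set_primary_address(host_id, address)
    if not dry_run:
        # check=True raises CalledProcessError if dfm exits with a non-zero status.
        subprocess.run(argv, check=True)
    return argv


if __name__ == "__main__":
    # Hypothetical host id and address, shown as a dry run only.
    print(" ".join(set_primary_address("cluster1", "192.0.2.10")))
```

Keeping the dry run as the default lets you review the exact command before anything is changed on the DFM server.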

[Info] Cluster Renamed (cluster-renamed):      A cluster got renamed.

>>> If this event is raised after the node is renamed but before the reboot, then the event should be marked as pending. The action is not complete until the node is rebooted.

[Saravanan] I don't think a reboot is a mandatory requirement for the rename feature. If it is, please share some data. I'll verify the same in DFM and let you know.

[Normal] Port Status Up (port-status-up):      A cluster port status is up.

[Error] Port Status Down (port-status-down):      A cluster port status is down.

[Normal] Port Status Undefined (port-status-undef):      A cluster port status is undefined.

[Normal] Port Status Unknown (port-status-unknown):      A cluster port status is unknown.

[Info] Port Role Changed (port-role-changed):      A cluster port role has changed.

>>> Do we have similar events for ports that are in other roles?

[Saravanan] It's for all the ports. There are no limitations on roles.

tulsiraj

[Critical] Cluster Not Reachable (cluster-unreachable):      A cluster was not reachable from DataFabric Manager network.

>>> Does this refer to the 'cluster-mgmt' LIF? Can we set the node-mgmt LIF to be used in case the cluster-mgmt LIF is not available?

I would suggest:

1. You can configure an alternate IP address for the cluster management LIF, or

2. If your cluster-mgmt LIF is not reachable due to port-down or node reachability issues, you can always configure a failover policy for this LIF so that it can fail over to any available port within the cluster.

mrinal

Thank you for the answers.
