We recently upgraded to 5.1. For some reason the software thought that 6 of our controllers went off line. Not sure why it felt like these went off line, but I was able to clear the alerts and it has not happened since.
The problem is that the dashboard component still thinks they are down. I am not sure how to clear this or make the dashboard resync. I have rebooted, and stopped and restarted the services, with no luck.
I hope you are talking about the event dashboard. In your case you are seeing the controller-down events on the event dashboard because those events are still in the current state. OnCommand runs its monitors at a set interval, so in the next monitor cycle OnCommand will detect that all controllers are up, generate a normal controller-up event, and move the controller-down events to the history table. Once all controller-down events have been moved to the history table, the event dashboard will no longer show them.
If you want to delete an event without waiting for the next monitor cycle, follow the steps below.
1. On the event dashboard, click the event you want to delete.
2. Clicking the event takes you to the event tab. There, resolve the event by clicking the "Resolve" button.
Once you resolve the event, it will no longer be listed on the event dashboard.
Sorry, I should have been clearer: there are no events to delete that correlate to the filers that are down. However, the dashboard still "thinks" that 6 filers are down. It has been like this for more than 48 hours, so any polling or monitor cycle would have kicked in by now. I will see if I can post a good screenshot.
There are no critical events to clear or acknowledge.
Could you please click the controllers link on the availability dashboard and verify whether the storage inventory also shows the controller state as "Down"?
If the storage inventory shows the same, could you please provide the data below?
- Are all the controllers pingable from the machine where the DFM server is installed?
- The output of the dfm host diag <controller-name-or-ip> command for any one of the controllers shown as offline in the availability dashboard.
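For the ping check, a small loop run on the DFM server can save some typing. This is only a rough sketch, assuming a Linux server with a standard ping; the PROBE variable and the controller names are placeholders I made up for illustration, not anything DFM provides.

```shell
#!/bin/sh
# Sketch: report which controllers answer a single ping from this machine.
# PROBE is a hypothetical override hook (not part of DFM); by default it
# sends one ICMP echo request.
PROBE=${PROBE:-"ping -c 1"}

check_reachable() {
    # Print "<host> reachable" or "<host> UNREACHABLE" for each argument.
    for host in "$@"; do
        if $PROBE "$host" >/dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host UNREACHABLE"
        fi
    done
}

# Example call (controller1 and controller2 are placeholder names):
# check_reachable controller1 controller2
```

Any controller reported as UNREACHABLE here is a good candidate for the dfm host diag output requested above.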
Not quite the answer, but it led me down the right track. For whatever reason, after upgrading to 5.1, the software decided to change the IP it was using for the host. We use a pretty standard scheme: a DNS entry for the host, xxx.yyy.com, and one for the out-of-band connection, xxx-oob.yyy.com. It switched to the out-of-band IP address, which is not routable. No clue as to why it would do that, but maybe it is something NetApp can look into?
This also happened for our two VMware VC servers. The host services started to look for the oob address as well.
Again, nothing was changed on our side; the software did this automatically once upgraded to 5.1.
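For anyone who wants to check for the same symptom, here is a rough sketch of the comparison described above. It assumes a Linux machine with getent available; the function names and the STORED_IP variable are made up for illustration, and the xxx names are just the placeholders from this post.

```shell
#!/bin/sh
# Sketch: tell whether an address matches the primary or the out-of-band DNS name.

resolve_ipv4() {
    # First IPv4 address the system resolver returns for a name (glibc getent).
    getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }'
}

classify_addr() {
    # $1 = address the server is using, $2 = primary IP, $3 = out-of-band IP
    if [ "$1" = "$2" ]; then
        echo "primary"
    elif [ "$1" = "$3" ]; then
        echo "oob"
    else
        echo "unknown"
    fi
}

# Example (STORED_IP is a placeholder for whatever address the server
# currently shows for the host):
# classify_addr "$STORED_IP" "$(resolve_ipv4 xxx.yyy.com)" "$(resolve_ipv4 xxx-oob.yyy.com)"
```

A result of "oob" would match what we saw here: the server talking to the non-routable out-of-band address instead of the primary one.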
On upgrade, only the DB schema is changed to the appropriate version; no data in the DB tables is changed. Is your host configured for DHCP? OnCommand doesn't use the DNS names; it always tries to reach the controllers using the IP address with which they were added. The same address is shown by dfm host get.
Can you confirm that there were no network changes on the controller side that caused the IP to change?
Unfortunately we upgraded about 20+ units this past weekend, and I think all of that activity flushed out the valid audit logs from that time period. I will let you know for sure if it happens again.
Please let us know the next time you face this issue. There is also an option in OnCommand to keep the audit.log forever without rotating it.
[root@ ~]# dfm options list |grep -i audit
Change this option to auditLogForever=Yes. By default it is No.
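To confirm the current setting from a script, something like the sketch below could be piped onto the command above. It assumes the option list prints the option name and its value as two whitespace-separated columns, which you should verify against your own output; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: read option-list text on stdin and print the auditLogForever value
# in lowercase. Assumes two whitespace-separated columns (name, value).
audit_forever_value() {
    awk 'tolower($1) == "auditlogforever" { print tolower($2) }'
}

# Example, using the command from this thread:
# dfm options list | grep -i audit | audit_forever_value
```

If this prints "no", the audit.log is still being rotated and the option has not been changed yet.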
1. Re-run the SSL setup
2. Reboot the system that is hosting OnCommand 5.1
3. It should be back to normal
If it doesn't work, you need to apply the necessary patches available from NetApp.