Can anyone guide me on how to build an Ansible playbook for a cluster health check that covers:
Hardware faults on nodes, shelves, disks, ports, PSUs, fans, motherboards, connectivity, and configuration
Event logs: find critical errors and highlight them in the report, and check for repeated warnings
2 REPLIES
Ansible is not designed to be a reporting tool for hardware faults on nodes, shelves, disks, ports, PSUs, fans, motherboards, etc.; i.e., you can't manage the configuration state of a hardware fault. You could run a series of commands to enumerate hardware environment issues. See:
Active IQ is designed to monitor your storage clusters and will detect and alert on such hardware-related events.
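As a rough illustration of the "run a series of commands" approach, here is a minimal Python sketch that runs a caller-supplied set of commands over SSH and condenses the output into a short report. Everything specific in it is an assumption rather than a NetApp-provided method: the function names (`run_check`, `summarize`, `build_report`), the severity keywords, the "repeated warning" threshold, and key-based SSH access to the cluster are all hypothetical. The actual health check commands are left for you to supply and verify against your ONTAP version.

```python
import subprocess
from collections import Counter

# Assumed severity keywords; adjust to match what your commands actually print.
CRITICAL_WORDS = ("EMERGENCY", "ALERT", "ERROR", "CRITICAL")


def run_check(host, command, timeout=60):
    """Run one command on the cluster over SSH and return its stdout.

    Assumes key-based SSH login is already set up for `host`.
    """
    result = subprocess.run(
        ["ssh", host, command],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


def summarize(output, repeat_threshold=3, key=lambda line: line):
    """Return (critical_lines, repeated_warnings) for one command's output.

    `key` maps a warning line to the value used for counting repeats,
    e.g. a function that strips a leading timestamp. By default the
    whole line is used, so only identical lines count as repeats.
    """
    critical = []
    warnings = Counter()
    for line in output.splitlines():
        text = line.strip()
        if not text:
            continue
        upper = text.upper()
        if any(word in upper for word in CRITICAL_WORDS):
            critical.append(text)
        elif "WARNING" in upper:
            warnings[key(text)] += 1
    repeated = {msg: n for msg, n in warnings.items() if n >= repeat_threshold}
    return critical, repeated


def build_report(results, repeat_threshold=3):
    """Build a plain-text report from a {check_name: command_output} mapping."""
    lines = []
    for name, output in results.items():
        critical, repeated = summarize(output, repeat_threshold)
        status = "ATTENTION" if critical or repeated else "OK"
        lines.append(f"[{status}] {name}")
        lines.extend(f"  CRITICAL: {c}" for c in critical)
        lines.extend(f"  REPEATED x{n}: {m}" for m, n in repeated.items())
    return "\n".join(lines)
```

To use it, you would build a dictionary of check names to the health and event log commands you already run by hand, call `run_check` for each one, and pass the collected outputs to `build_report`. Running the script from a scheduler such as cron would cover a twice-daily report. The same list of commands could equally be driven from an Ansible playbook, with the parsing step applied to the registered output.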
Thank you for the response.
You are right. So you mean I should run a set of ad hoc commands to get the status and produce a report from their output?
I am currently spending a lot of time running health check commands every day, which I need to automate, so I thought there might be Ansible playbooks that could do this.
Also, Active IQ depends on AutoSupport (ASUP) messages, so it does not show the present status or alerts of the system. My plan is to run this script twice a day and generate a report.
I think an alternative such as a shell script could get this done. Do you have any ideas on that?