cDOT cluster switch monitoring

(How does one|does anyone bother to) monitor their cDOT cluster switches?


I'm 1500km from my filers, and the remote hands guys just sent me a note, "did you know your cluster switch is only single-powered?"  To which I hang my head and go, "no."


This isn't the first time stupid things like this have popped up.  The Upgrade Advisor output invites you to inspect your switch firmware, but since the filer doesn't have any administrative tie-ins to that switch, it's not helpful.  It feels like a complete black box that most admins aren't going to look at until they've dropped a whole switch and the filer notices the port die, by which point you're in danger territory.


Seems like this could be something smarter.  Like, pop in an ACP-style loop or somesuch.


Re: cDOT cluster switch monitoring

A few things we do on installs: set up email alerting from the switch (although that will not alert you if the switch fails outright or has its power pulled), and also set up CSHM (Cluster Switch Health Monitor) monitoring and event route alerting.  This article covers the CN1610 https://kb.netapp.com/support/index?page=content&id=8010262&locale=en_US but the same monitoring from ONTAP also works with Cisco Nexus switches.
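
For the "switch died or lost all power" case, something outside the switch has to notice. A minimal sketch of that idea, assuming the switches' management addresses answer on TCP port 22 (SSH) and that you have some monitoring host that can reach them. The function names, the port, and the example addresses are all hypothetical; this only reports an unreachable switch and would not catch a single failed PSU.

```python
import socket


def switch_reachable(host, port=22, timeout=3.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port succeeds within timeout.

    `connect` is injectable so the check can be exercised without a network.
    Port 22 is an assumption; use whatever management port your switch exposes.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
    conn.close()
    return True


def unreachable_switches(hosts, **kwargs):
    """Return the subset of hosts that did not accept a TCP connection."""
    return [h for h in hosts if not switch_reachable(h, **kwargs)]
```

Run from cron or an existing monitoring system, something like `unreachable_switches(["192.0.2.11", "192.0.2.12"])` (placeholder addresses) gives you the list to alert on.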

Re: cDOT cluster switch monitoring

"Single powered" sounds to me that there's a PSU off in your filer/shelf, are you sure he's talking about the switch? If a PSU fails your NetApp should definitely have sent you an email already.


If you want to manage your switches from (clustered) Data ONTAP, you can use the "system cluster-switch create" command. That is best practice anyway, so I would ask the partner/company who installed your filers why they didn't bother doing that... They should have known this.

Re: cDOT cluster switch monitoring

Completely agree... we audit a lot of installs where things like this should have been done and weren't.  The KB above covers cluster-switch create. With CSHM set up properly it autodiscovers the switches; when it's set up from the get-go, I very rarely have to create them manually.

Re: cDOT cluster switch monitoring

The dead PSU was definitely in the switch; I logged in to the switches to verify.


system cluster-switch didn't exist before 8.2.  We did our own upgrades (been on cDOT since 8.0), and neither the release notes nor the upgrade/revert guide mentions it as a "you might want to do this" item.


So I'll take the blame for not knowing about it, but given that it just appeared out of nowhere without much fanfare, I don't feel too bad about it.

