Matt -
All good suggestions. The single biggest constraint is time. The environment where I need this is extremely controlled and siloed. Hence deploying anything new, whether an application (such as WFA) or a feature (such as a non-customer-standard AD group for tracking, as you suggest), can take literally six to nine months from realization, mostly because of politics. Of course, whatever the customer demands must be done now.
For backstory, the short version is that there was a significant site event recently, and now the customer wants to layer High Availability on top of infrastructure and application deployments that were not designed for HA. They have DR, but now they want cross-country HA capability, push-button automation to swing entire application sets, invokable at any time by application support teams as they see fit, zero RPO/RTO targets (so not the DR setup already defined), and coverage of all parts of an application - physical and/or virtual servers, any CIFS/NFS shares, all related network components, and the overriding application services. Oh - and have it done by end of year for 50+ major applications. No sweat, right?
From a storage perspective both MetroCluster and SyncMirror (ONTAP 9.5 - yay!) are non-starters due to distance and other technical/political reasons.
Did I mention it's also the end-of-year freeze, so nothing too major is allowed to change?
I like all of your suggestions, and have already considered what OCUM and WFA could do. In the short term they fail due to the time constraint. API-S is partially available in the customer setup, just not where I need it to be. The environment is highly segmented by firewalls, so no single API-S or OCUM server can reach all the necessary clusters to be a single entry point. I think WFA is the better solution longer term for the NetApp components in the environment (thankfully I only have to worry about the NetApp side of the house). OCI is the only onsite system currently punched through the necessary firewalls, and hence is also the server from which any storage-based automation has to run at this time. Ironically, the server which will control the "HA-lite" functionality will not itself be Highly Available in any first effort.
On the plus side, very late last night, after my two posts from yesterday, I learned enough of the OCI REST "Query" API to help out in this case, though it still isn't as simple as I might have hoped. I like your AD group suggestion and will raise it as a possible long-term way to work this system. Lots of late nights with the Query API and the SDK are ahead, I suspect.
Thanks Matt!
Bob Greenwald
Senior Systems Engineer | cStor
NCSE | NCIE SAN ONTAP, Data Protection