Solved: Monitoring management of cDot clusters

DARREN_REED · ‎2016-01-11

The transition from 7-Mode to cDot has brought lots of wonderfulness with all management changes being reduced to commands - no more manual changes to files.

One aspect that has not imrproved (and in some cases has become more difficult) is monitoring the state of the system(s). Which volumes, exports, etc, are active - especially "options."

In 7-Mode, the set of options that were active in a vfiler was observed with a simple "vfiler run ... options". NFS, iSCSI, etc, all there.

In cDot, that has become more complex because the options are now spread throughout the vserver. Some options are in NFS, some are in CIFS, some are in iSCSI and so on. There is no "show me all of the options in a vserver." And some aspects have become more complicated: monitoring NFS exports requires monitoring the volume, the qtree, the export policy and export policy rules. Previously all that I needed to care about was /etc/exports. Crude, yes. Simple, yes. More exposed to error, yes (the wrong use of wrfile can kill all your exports.)

The rather long command set that I use to get a "dump" of a cluster is:

set -rows 0; set -units MB; cluster show; storage disk show -fields disk,serial-number,owner,state,aggregate,usable-size-mb; storage aggregate show -fields hybrid,raidtype,state; vserver show -fields vserver,aggregate,allowed-protocols,disallowed-protocols,qos-policy-group,quota-policy,rootvolume,comment,quota-policy,admin-state; vserver peer show; volume show -fields vserver,volume,aggregate,state,type,filesystem-size,node,nvfail,policy,qos-policy-group; volume qtree show -fields volume,security-style,oplock-mode,status,vserver,export-policy; vserver export-policy show -fields vserver,policyname; vserver export-policy rule show -fields vserver,policyname,clientmatch,superuser,rorule,rwrule; volume quota policy rule show -fields vserver,qtree,type,disk-limit,soft-file-limit,file-limit,soft-disk-limit,threshold,user-mapping,volume,target; lun show -fields vserver,path,volume,qtree,lun,size,ostype,serial,state,mapped,block-size,class,qos-policy-group; network port show -fields node,port,role,link,mtu,duplex-admin,duplex-oper,speed-oper,flowcontrol-admin,flowcontrol-oper,mac,type,ifgrp-node,ifgrp-port,vlan-node,vlan-port,vlan-tag; network interface show -fields vserver,lif,data-protocol,routing-group,address-family,address,netmask,firewall-policy,failover-policy,failover-group,role,status-admin,status-oper,home-node,curr-node,home-port,curr-port; vserver cifs domain discovered-servers show; vserver cifs show -fields vserver,cifs-server,domain-workgroup,domain; vserver nfs show -fields vserver,access,udp,tcp,v4-id-domain,v3,v4.1,v4.0-acl,v4.1-read-delegation,mount-rootonly,nfs-rootonly; vserver iscsi show -fields vserver,target-name,target-alias,status-admin; lun igroup show -fields vserver,igroup,protocol,ostype,portset,initiator; snapmirror show -fields vserver,source-path,destination-path,type,policy,schedule; security login show -fields vserver,username,application,authmethod,role; security certificate show -fields vserver,common-name,serial,ca,type,expiration,organization,protocol;

Am I missing anything important? (At the moment the output from above just gets kept in an RCS managed file until I work out what to do with it.)

Something that I'm not capturing here is options that are only visible with "diag" privilege set. This represents a real problem because if an option such as NFSv4 lease timeouts is modified, there is no clue from normal administration that this change has been set and to view it requires changing to "diag" privilege - something that I don't really want to enable a user for "audit" purposes. I'm not sure what the right approach is here but being able to have "diag-readonly" (or something like that which can be assigned to a role) as an access level would be good.

What are other people do to monitor configuration of NetApp cDot systems?

Is it possible to integrate them into puppet or something similar?

You might ask "why" something like this is needed... in larger environments there are two problems: (1) more than one person being active in administration and (2) being able to defend the storage system configuration when "it doesn't work" and being able to demonstrate that nothing changed between yesterday/today so "your problem is elsewhere." It also helps to have your own history of "well, it worked yesterday but it isn't working today, what changed?"

asulliva · ‎2016-01-12

Hi Darren,

Just to be clear, "monitoring" in this context refers to recording/monitoring configuration changes on the system(s), not performance?

Puppet and Chef are definitely options, in fact we presented at Puppet Camp Charlotte not too long ago and recently published TR-4477 (Using Puppet to Manage NetApp Storage Infrastructure). If you're interested in Ansible or Salt, we have some integrations happening there as well...let me know and I can put you in touch with the respective people.

Depending on your preferred platform, PowerShell is a great option for collecting information about your system. If you're a *nix person, the Perl Bindings functionality in NMSDK 5.4 is a bit easier to use in my opinion than the other methods of invoking ZAPI.

If there's anything in particular you're interested in please let me know. Feel free to reach out using the Community site, or email (my community username @netapp.com).

Andrew

If this post resolved your issue, please help others by selecting ACCEPT AS SOLUTION or adding a KUDO.

View solution in original post

JGPSHNTAP · ‎2016-01-12

powershell

DARREN_REED · ‎2016-01-12

> powershell

Would you like to expand on this comment and provide some more details as to how it can be used, etc?

asulliva · ‎2016-01-12

Hi Darren,

Just to be clear, "monitoring" in this context refers to recording/monitoring configuration changes on the system(s), not performance?

Puppet and Chef are definitely options, in fact we presented at Puppet Camp Charlotte not too long ago and recently published TR-4477 (Using Puppet to Manage NetApp Storage Infrastructure). If you're interested in Ansible or Salt, we have some integrations happening there as well...let me know and I can put you in touch with the respective people.

Depending on your preferred platform, PowerShell is a great option for collecting information about your system. If you're a *nix person, the Perl Bindings functionality in NMSDK 5.4 is a bit easier to use in my opinion than the other methods of invoking ZAPI.

If there's anything in particular you're interested in please let me know. Feel free to reach out using the Community site, or email (my community username @netapp.com).

Andrew

If this post resolved your issue, please help others by selecting ACCEPT AS SOLUTION or adding a KUDO.

DARREN_REED · ‎2016-01-13

I will have to read up further on this but I think that this is starting down the right path. At first I wouldn't have thought of using puppet to manage NetApp devices but it does seem to be a natural extension of what puppet does. Like!

JamesIlderton · ‎2016-01-14

What we need is for OnCommand Unified Manager to bring back the Storage System Configuration feature from DFM. Scheduled backups of the system configuration, along with the ability to both do config comparisons for changes and do restores to back out changes. I wonder if that's on the roadmap for OCUM?

joelk · ‎2016-07-26

@JamesIlderton wrote:
What we need is for OnCommand Unified Manager to bring back the Storage System Configuration feature from DFM. Scheduled backups of the system configuration, along with the ability to both do config comparisons for changes and do restores to back out changes. I wonder if that's on the roadmap for OCUM?

Absolutely agree that there should be some way of capturing the configuration settings and changes made in cdot by the administrators and by netapp updates.

We used to compare diffs of the registry and options in 7 mode to analyze changes that affected the system.

Joel

Rutger · ‎2016-10-13

Hello Andrew,

I know this is an old thread but I was wondering if you could give me some examples of NetApp-Ansible integrations?

/Rutger

asulliva · ‎2016-10-13

Hello @Rutger,

The Ansible modules should be released to our GitHub site any day now and will include some examples of how to use them. Keep an eye on https://github.com/NetApp for the modules themselves, and netapp.io for code examples and release announcements.

Andrew

If this post resolved your issue, please help others by selecting ACCEPT AS SOLUTION or adding a KUDO.