The transition from 7-Mode to cDot has brought many improvements, with all management changes now done through commands - no more manual edits to files.
One aspect that has not improved (and in some cases has become more difficult) is monitoring the state of the system(s): which volumes, exports, etc., are active - especially "options."
In 7-Mode, the set of options active in a vfiler could be seen with a simple "vfiler run ... options" - NFS, iSCSI, etc., all in one place.
In cDot, that has become more complex because the options are now spread throughout the vserver. Some options are in NFS, some are in CIFS, some are in iSCSI, and so on. There is no "show me all of the options in a vserver." Some aspects have become more complicated still: monitoring NFS exports requires monitoring the volume, the qtree, the export policy, and the export policy rules. Previously all I needed to care about was /etc/exports. Crude, yes. Simple, yes. More exposed to error, yes (the wrong use of wrfile can kill all your exports).
The rather long command set that I use to get a "dump" of a cluster is:
set -rows 0;
set -units MB;
cluster show;
storage disk show -fields disk,serial-number,owner,state,aggregate,usable-size-mb;
storage aggregate show -fields hybrid,raidtype,state;
vserver show -fields vserver,aggregate,allowed-protocols,disallowed-protocols,qos-policy-group,quota-policy,rootvolume,comment,admin-state;
vserver peer show;
volume show -fields vserver,volume,aggregate,state,type,filesystem-size,node,nvfail,policy,qos-policy-group;
volume qtree show -fields volume,security-style,oplock-mode,status,vserver,export-policy;
vserver export-policy show -fields vserver,policyname;
vserver export-policy rule show -fields vserver,policyname,clientmatch,superuser,rorule,rwrule;
volume quota policy rule show -fields vserver,qtree,type,disk-limit,soft-file-limit,file-limit,soft-disk-limit,threshold,user-mapping,volume,target;
lun show -fields vserver,path,volume,qtree,lun,size,ostype,serial,state,mapped,block-size,class,qos-policy-group;
network port show -fields node,port,role,link,mtu,duplex-admin,duplex-oper,speed-oper,flowcontrol-admin,flowcontrol-oper,mac,type,ifgrp-node,ifgrp-port,vlan-node,vlan-port,vlan-tag;
network interface show -fields vserver,lif,data-protocol,routing-group,address-family,address,netmask,firewall-policy,failover-policy,failover-group,role,status-admin,status-oper,home-node,curr-node,home-port,curr-port;
vserver cifs domain discovered-servers show;
vserver cifs show -fields vserver,cifs-server,domain-workgroup,domain;
vserver nfs show -fields vserver,access,udp,tcp,v4-id-domain,v3,v4.1,v4.0-acl,v4.1-read-delegation,mount-rootonly,nfs-rootonly;
vserver iscsi show -fields vserver,target-name,target-alias,status-admin;
lun igroup show -fields vserver,igroup,protocol,ostype,portset,initiator;
snapmirror show -fields vserver,source-path,destination-path,type,policy,schedule;
security login show -fields vserver,username,application,authmethod,role;
security certificate show -fields vserver,common-name,serial,ca,type,expiration,organization,protocol;
Am I missing anything important? (At the moment the output from the above just gets kept in an RCS-managed file until I work out what to do with it.)
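To make the "what changed between yesterday and today" question answerable without reading the whole RCS log, a small script can diff two saved dumps directly. This is only a sketch of the idea: it assumes each day's CLI output has already been captured to a plain-text file (for example via ssh to the cluster management LIF with the command set above, redirected to a dated file); the file names are illustrative and not part of any NetApp tooling.

```python
#!/usr/bin/env python3
"""Compare two saved cDot config dumps and report what changed.

Sketch only: assumes each dump is a plain-text capture of the CLI
output, e.g. one file per day. Paths are illustrative.
Usage: config_diff.py dump-yesterday.txt dump-today.txt
"""
import difflib


def config_diff(old_path: str, new_path: str) -> list[str]:
    """Return unified-diff lines between two config snapshots.

    An empty list means the two dumps are identical, i.e. no
    configuration change was captured between the snapshots.
    """
    with open(old_path) as f:
        old_lines = f.readlines()
    with open(new_path) as f:
        new_lines = f.readlines()
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile=old_path, tofile=new_path))
```

Run from cron after each capture and mail the output (or raise an alert) whenever the diff is non-empty; that gives a change record even before settling on a heavier tool like puppet.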
Something that I'm not capturing here is the options that are only visible with "diag" privilege. This is a real problem: if an option such as the NFSv4 lease timeout is modified, nothing at the normal administration level hints that the change has been made, and viewing it requires switching to "diag" privilege - something that I don't really want to enable for a user just for "audit" purposes. I'm not sure what the right approach is here, but an access level like "diag-readonly" (or something similar that could be assigned to a role) would be good.
What do other people do to monitor the configuration of NetApp cDot systems?
Is it possible to integrate them into puppet or something similar?
You might ask why something like this is needed. In larger environments there are two problems: (1) more than one person is active in administration, and (2) you need to be able to defend the storage system configuration when "it doesn't work" - to demonstrate that nothing changed between yesterday and today, so "your problem is elsewhere." It also helps to have your own history for "well, it worked yesterday but it isn't working today - what changed?"