May 2016
My name is Ezra Tingler, and I work in NetApp’s Customer-1 Storage Services organization. The Customer-1 Storage Services team is responsible for the architecture, procurement, deployment, and maintenance of the storage infrastructure that services all of our internal applications. The team is organized into service line owners, each of whom owns a particular aspect of our storage technologies. As expected, our storage infrastructure is built on the latest and greatest NetApp technologies. As a large consumer of NetApp technology, the Customer-1 Storage Services team also serves as a reference to external NetApp customers when it comes to showcasing the correct utilization of NetApp products.
On this team, I own the Storage Ecosystem Service line, which means I am responsible for storage hardware lifecycle management. My main goal is to ensure that our storage software and hardware create a stable, functioning storage ecosystem from which other service lines and applications are provided data services. Currently, our team is migrating all data hosted on NetApp siloed storage (i.e., 7-mode) to clustered storage.
Challenge
As part of this migration project, I’ve been installing additional storage nodes and clusters. When I first started this project, the average configuration time for a high-availability (HA) clustered controller pair was about four hours, spread out over two to three days. The four hours did not include the time needed to configure the cluster interconnect switches or initialize disks; that took another two to twelve hours depending on disk type. Typical office interruptions added even more time, as I had to figure out where I had left off and what I still needed to configure. This sporadic schedule resulted in missed deadlines and some configuration inconsistencies. I knew there had to be a better way to go about this project.
The Solution
I challenged myself to see if I could automate the storage configuration process to save time and reduce errors. Although I’m not a developer, I found it easy to write a configuration script using the NetApp Manageability Software Development Kit (NM SDK). I run my script once the disks are initialized, cluster setup is complete, and the cluster interconnect switches are properly configured. All in all, the script configures 23 specific items:
- Rename the cluster nodes
- Rename any existing interfaces to match the new node names
- Cluster interfaces and node management interfaces
- Rename the root aggregates to match the new node names
- Install licenses
- Configure the service processors
- Set the flow control to none on all 10g ports
- Create broadcast domains and assign the proper ports
- Create interface groups and add the correct ports
- Create VLANs
- Create failover groups and assign the correct ports
- Create backup interfaces (intercluster)
- Create user roles
- Create users
- Set the RAID scrub schedule
- Create aggregates
- Disable aggregate snapshots on all aggregates
- Create cluster peers
- Configure NTP
- Configure SNMP
- Configure CDP
- Configure web services
- Configure AutoSupport
As the script runs, it reads predefined configuration information from a file I create, and applies this configuration to the cluster nodes. The only thing I need to do before running the script is to edit the config file with unique information, such as node names, IP addresses, and password keys.
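The config-file-driven approach described above can be sketched in a few lines. This is a minimal illustration in Python, not the actual script (which was written in Perl): the INI layout, the section and key names, and the step labels are all hypothetical, and the sketch only prints a dry-run plan rather than calling the SDK.

```python
import configparser

# Hypothetical config layout; section and key names are illustrative only.
SAMPLE_CONFIG = """
[cluster]
name = cluster01
ntp_server = 10.0.0.10

[node:1]
name = cluster01-01
sp_ip = 10.0.1.11

[node:2]
name = cluster01-02
sp_ip = 10.0.1.12
"""


def load_config(text):
    """Parse the INI text into a cluster dict and a list of node dicts."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    cluster = dict(parser["cluster"])
    nodes = [dict(parser[s]) for s in parser.sections() if s.startswith("node:")]
    return cluster, nodes


def build_plan(cluster, nodes):
    """Return the ordered list of (step, target, value) tuples to apply."""
    plan = []
    for index, node in enumerate(nodes, start=1):
        plan.append(("rename-node", f"node {index}", node["name"]))
        plan.append(("configure-sp", node["name"], node["sp_ip"]))
    plan.append(("configure-ntp", cluster["name"], cluster["ntp_server"]))
    return plan


if __name__ == "__main__":
    cluster, nodes = load_config(SAMPLE_CONFIG)
    for step, target, value in build_plan(cluster, nodes):
        # Dry run: a real script would issue the matching SDK call here.
        print(f"{step:15} {target:15} {value}")
```

Keeping every site-specific value in the file means the script itself never changes between builds; only the config does.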
After using this script, I was amazed by the results. The four hours it used to take has been reduced to about five minutes. Using the configuration script, I can now install 24 storage nodes in two hours rather than 96 hours, a time savings of 94 hours, or roughly 2 1/2 work weeks. Errors caused by interruptions have also been eliminated, and automating this process has freed up my time so that I can work on other projects.
If you are a storage admin, you can easily create your own configuration script using the SDK. I used a tool included in the SDK called Z-Explorer that contains a complete list of all ZAPI calls for the cluster. With Z-Explorer, most of the development work is done for you. It took me just a few weeks to fully automate my clustered storage builds. This KnowledgeBase article is a good place to start.
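To give a feel for what a ZAPI call looks like, here is a rough Python sketch that builds the XML for a request using only the standard library. ZAPI requests are XML documents, but the exact envelope shown here (the `netapp` root element, the `version` value, and the `xmlns` URI) is an assumption for illustration and should be checked against the SDK documentation or Z-Explorer's own output. The `build_zapi_request` helper is hypothetical and is not part of the SDK, which provides its own classes for this.

```python
import xml.etree.ElementTree as ET


def build_zapi_request(api_name, params=None, version="1.31"):
    """Return a ZAPI-style XML request as a string.

    The envelope attributes are assumptions for illustration only.
    """
    root = ET.Element(
        "netapp",
        {"version": version, "xmlns": "http://www.netapp.com/filer/admin"},
    )
    # The API name becomes a child element; its parameters become grandchildren.
    call = ET.SubElement(root, api_name)
    for key, value in (params or {}).items():
        ET.SubElement(call, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # A call with no parameters, then one with a single parameter.
    print(build_zapi_request("system-get-version"))
    print(build_zapi_request("net-interface-get-iter", {"max-records": 10}))
```

In practice the SDK's language bindings build and send these requests for you, so a sketch like this is mainly useful for understanding what Z-Explorer is showing.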
It was a fun project because I could write the script without feeling like I had to be a developer. I wrote the scripts in Perl, but the SDK works with almost any language you are familiar with. I also used the SDK online forum to get advice from others. People on this forum were always quick to answer my questions.
The Future
I’m now using the SDK to automate and streamline other storage tasks to save time and reduce errors. My next project is a quality assurance (QA) script that will log in to a cluster and verify that nodes are properly configured per NetApp IT Standards and NetApp best practice guidelines. I also plan to automate the cluster interconnect switch configuration in the same way and to create an E-Series configuration script.
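The core of a QA check like this is a comparison between the expected settings and what the cluster reports. The Python sketch below shows one way that comparison might look; the setting names and values are hypothetical, and fetching the actual values from a cluster is left out entirely.

```python
def find_drift(expected, actual):
    """Return (setting, expected, actual) tuples for every mismatch."""
    drift = []
    for setting in sorted(expected):
        want = expected[setting]
        have = actual.get(setting)  # None means the setting was not reported
        if have != want:
            drift.append((setting, want, have))
    return drift


if __name__ == "__main__":
    # Hypothetical standard and a hypothetical node report.
    standard = {"flowcontrol": "none", "aggr_snapshots": "disabled"}
    reported = {"flowcontrol": "full", "aggr_snapshots": "disabled"}
    for setting, want, have in find_drift(standard, reported):
        print(f"{setting}: expected {want}, found {have}")
```

Returning the mismatches as data rather than printing them directly makes it easy to reuse the same check in a report or an alert.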
If you are interested in seeing the actual scripts, I am in the process of making them available on github.com. You can look for them here. In future Tech OnTap newsletters, I’ll make sure to post the URL links once they are available. I’ll also be appearing on an upcoming Tech OnTap podcast to speak more about my automation process with other subject-matter experts from across the industry.