Mastering a Strong, Stable IT Operation That Can Move as Fast as the Business Demands

By Matt Brown, IT Customer Engagement - NetApp on NetApp, NetApp IT

 

Remember those TV infomercials for Ginsu® knives that slice and dice but never need sharpening? Today’s IT is a bit like a Ginsu knife. It faces an increased demand to operate in two modes. The “never needs sharpening” mode runs like a utility. It is transaction-based with an emphasis on maintaining operational stability and keeping the lights on. The second mode rapidly slices and dices, responding quickly to business demands with an agile time-to-market mindset and rapid application evolution.

 

Gartner Research refers to these two speeds as bimodal IT. It defines this as a form of IT “where slowly evolving, ‘run-the-business’ IT systems (Mode 1) need to coexist and interoperate with fast-changing and innovative ‘transform the business’ initiatives (Mode 2).”

 

These two modes of IT could be viewed as counterintuitive, with conflicting goals: to provide stable operations, delivery, and support, but also to be agile, fast, and responsive. However, like the Ginsu knife, these are really compatible and complementary features. For most IT shops, becoming a true Ginsu knife takes time. It is a journey.

 

In this first part of a four-part blog series, we will discuss how NetApp IT discovered that stabilizing IT operations was the first step in our journey.

 

Stabilizing IT

To get to an agile IT operating mode, we first had to create a stable environment capable of assimilating frequent and rapid changes. These changes include capacity upgrades, new technology introductions, and new features and capabilities. The goal was to provide a predictable, steady-state mode of operations regardless of the changes being introduced into the environment. Historically, we found that the more change we introduced, the more volatile our IT environment became, and the more our teams behaved in a reactive manner.

 

To become really good at change management, we needed direct accountability. Key individual contributors were positioned into visible IT roles with accountability for end-to-end processes (or services). They were given clear responsibility for outcomes, both good and bad. With accountability, the individual assumes the responsibility to build and evolve processes and the flexibility to change things as the environment requires. By elevating key individuals with daily operational familiarity, we had both clear accountability and authority to enable the type of responsiveness we wanted from our IT operations. With this frontline empowerment, the process and service owners did not have to stop and get permission; instead, they acted quickly and delegated as necessary.

 

Growing Change Volume, Declining P1 Incidents

With a goal to stabilize operations, we saw success in a dramatic reduction in major Priority 1 (P1) incidents, a faster return to service, the ability to review and implement change quickly, and an overall shift from reactive to proactive mode. Over the past five years, our stabilization efforts have resulted in an 86 percent drop in P1 incidents and a corresponding 70 percent drop in the median P1 duration (the time to service recovery).
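
For readers who track similar metrics, here is a minimal sketch of how the two figures could be computed from an incident log. The recovery times below are hypothetical values chosen only so the arithmetic lands near the percentages quoted above; they are not NetApp IT data, and the names `percent_drop`, `baseline_p1_minutes`, and `current_p1_minutes` are illustrative assumptions.

```python
from statistics import median


def percent_drop(before: float, after: float) -> float:
    """Return the percentage decrease from `before` to `after`."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before * 100.0


# Hypothetical P1 recovery times in minutes (NOT real NetApp IT data).
baseline_p1_minutes = [240, 180, 300, 120, 200, 260, 150]  # earlier period
current_p1_minutes = [60]                                   # recent period

# Drop in the number of P1 incidents between the two periods.
incident_drop = percent_drop(len(baseline_p1_minutes), len(current_p1_minutes))

# Drop in the median time to service recovery between the two periods.
duration_drop = percent_drop(median(baseline_p1_minutes),
                             median(current_p1_minutes))

print(f"P1 incident count drop: {incident_drop:.0f}%")
print(f"Median P1 duration drop: {duration_drop:.0f}%")
```

The median is used instead of the mean so that a single unusually long outage does not dominate the duration figure.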

 

Figure: P1 Duration

 

Along the journey, we adjusted our process to review and approve changes. Previously, the team met every two weeks to review more than 100 change requests in a meeting that took hours. By improving the process, we were able to do quick change reviews and approvals daily. In fact, we have been able to assimilate over 40 percent more changes into our IT operations while keeping major P1 incidents to a bare minimum. The time formerly spent on reactive activities, like change window negotiations, incident management, root cause analysis, and management follow-ups, has been redirected to proactive, value-add activities focused on creating better IT services. (A separate post describes how our IT infrastructure stability has improved the lives of our business applications teams.)

 

In addition to improving the change process and assigning accountability to key people, we found NetApp technology contributed significantly to the stabilization of our IT operations. In the future, we will discuss how the VolMove feature of clustered Data ONTAP® ensures nondisruptive operations; how FlexClone® creates multiple, instant data-set clones in five minutes; and the importance of OnCommand® Insight and how we have integrated it with our configuration management database.

 

The NetApp-on-NetApp blog series features advice from subject matter experts from NetApp IT who share their real-world experiences using NetApp’s industry-leading storage solutions to support business goals. Want to learn more about the program? Visit www.NetAppIT.com.