Shelf firmware upgrade disruption

I wanted to check some things before we start upgrading firmware on a system.

System is a fabric metrocluster with combined ESH2 and ESH4 modules, so FC only.

I always thought ESH updates were non-disruptive, but I am confused by the documentation on this.

The upgrade guide for ONTAP 7.3.1.1 states:

During firmware updates to disk shelves controlled by ESH series modules or LRC modules, you do not need to schedule system downtime for maintenance. The data on the disk shelves remains accessible during the upgrade.

Further on it states:

By running the storage download shelf command once, you upgrade all eligible modules connected to both controllers in an active/active configuration. The command updates the modules sequentially: first all A modules, then all B modules. In addition, the process pauses I/O to all loops on the controllers (both FCP and SATA).

Now both statements can be true, as long as I do a cluster failover during the upgrade of the shelf firmware. But really I'd rather not, as this is a large environment with no guarantee that all connected servers have their timeouts set correctly.

If I cannot use the manual method because it will pause all I/O to all shelves, I would need to do a cluster failover to initiate an upgrade, I guess. Is this correct?

A related question for another situation: when we have FC-ATx modules in place with firmware 36 or lower, I read on this forum that we would have disruption not only for data on the SATA aggregate but also for FC aggregates (on separate loops, of course), because ONTAP would freeze all I/O on all loops during the upgrade. Is this true? It would be nice if the documentation stated this more clearly.

Re: Shelf firmware upgrade disruption

I think the documentation you are referring to is out of date. NetApp sometimes does a poor job of keeping its documentation current.

Upgrading shelf firmware for ESH/ESH2/ESH4 is non-disruptive. I have been doing it on my (metro)clusters for years without any issues.

Upgrading shelf firmware for AT-FC and AT-FCX modules, used in SATA shelves, is a different matter. It is a disruptive process unless all of the following conditions apply:

  1. AT-FCX shelf modules
  2. Shelf is dual attached to a single controller. A single connection to each node in a cluster does not count as dual attached!
  3. Controller running ONTAP 7.3.1.
  4. Shelf module firmware already on version 37 and you are upgrading to a higher version. Upgrading from a version lower than 37 requires downtime.
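The conditions above can be written down as a small pre-check. This is only a sketch of the rules as stated in this thread, not a NetApp tool: the `ShelfModule` class, its field names, and `is_nondisruptive` are hypothetical, and treating 7.3.1 as a minimum version is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple

# Module types the thread describes as always non-disruptive to upgrade.
ESH_TYPES = {"ESH", "ESH2", "ESH4"}


@dataclass
class ShelfModule:
    """Hypothetical record describing one shelf module (not a NetApp API)."""
    module_type: str                 # e.g. "ESH4", "AT-FC", "AT-FCX"
    dual_attached: bool              # both paths go to the SAME controller
    ontap_version: Tuple[int, ...]   # e.g. (7, 3, 1)
    current_fw: int                  # firmware version currently on the module
    target_fw: int                   # firmware version you want to install


def is_nondisruptive(m: ShelfModule) -> bool:
    """Apply the four conditions listed in the post above."""
    if m.module_type in ESH_TYPES:
        return True                  # per the thread: ESH upgrades are non-disruptive
    if m.module_type != "AT-FCX":
        return False                 # condition 1: AT-FC never qualifies
    return (
        m.dual_attached                      # condition 2
        and m.ontap_version >= (7, 3, 1)     # condition 3 (assumed to be a minimum)
        and m.current_fw >= 37               # condition 4: already on 37 or later
        and m.target_fw > m.current_fw       # condition 4: moving to a higher version
    )
```

For example, an AT-FCX module still on firmware 35 fails the check even with dual attachment and a new enough ONTAP release, which matches the "lower than 37 requires downtime" rule.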

Upgrading AT-FC/AT-FCX does not affect the data availability of ESH(x) shelves.

But don't take my word for it; check the following documentation link:

http://now.netapp.com/NOW/knowledge/docs/ontap/rel7311/html/ontap/upgrade/upgrading/concept/c_oc_upg_shelf_fw_availability.html#c_oc_upg_shelf_fw_avai...

Re: Shelf firmware upgrade disruption

OK, but when you do an upgrade, do you need a cluster takeover to initiate it, or can both heads and their respective aggregates happily stay online serving data?

Re: Shelf firmware upgrade disruption

There is no need to execute a cluster failover when upgrading the shelf firmware, because the module stays active.

Re: Shelf firmware upgrade disruption

Yes, but not always.

I think we need to know why the shelf firmware upgrade is needed.

If the firmware itself has problems, a reboot may be needed.

In a normal situation, a cf takeover is not needed.

Re: Shelf firmware upgrade disruption

Same here -- I've never had an ESH/ESH2/ESH4 firmware upgrade be disruptive. AT-FCX updates are disruptive as noted unless running AT-FCX v. 37 + MPHA + 7.3.1.1+.

Re: Shelf firmware upgrade disruption

Another interesting possibility is when a customer has mixed old and new FC-x SATA shelves and wants to do an NDU of ONTAP, or wants to upgrade only a subset of the SATA shelves.

Whenever ONTAP finds a new f/w file it will apply it to all necessary shelves, and an ONTAP upgrade (as I understand it) might put newer f/w files in the etc directory. So a theoretical NDU would become disruptive because of an unwanted f/w upgrade of the shelves. Is there any way around this?

Also, why am I not able to upgrade specific shelves? I know of the command to do this by loop, but placing the f/w file in ONTAP seems like a big risk to me. Suppose I have some idle SATA shelves I want to upgrade while some production SATA shelves should be left alone.

Thanks for the great input.

Re: Shelf firmware upgrade disruption

joostvandrenth wrote:

Whenever ONTAP finds a new f/w file it will apply it to all necessary shelves, and an ONTAP upgrade (as I understand it) might put newer f/w files in the etc directory. So a theoretical NDU would become disruptive because of an unwanted f/w upgrade of the shelves. Is there any way around this?

Remove the firmware file after installing the new ONTAP version and before you reboot.

Re: Shelf firmware upgrade disruption

I am still running into this problem.

We had an AT-FCx module fail on us; luckily it was the passive module of the two. Having twice received a replacement with firmware version 35 (IBM, shame on you! Twice?), we had no way of updating only that particular module.

A command to update only a channel, or even a channel and shelf combination, will first update the targeted module(s) and THEN run the update on ALL eligible shelves on the system, which would have brought down services to 160 TB of storage for a long time.

Even though NDU upgrades are possible, I still run into sites without the necessary cabling or software versions to support it.

Is there a way around this?