VMware Solutions Discussions

Data Ontap 8.1 upgrade - RLW_Upgrading process and other issues

davidrnexon
Hi,

We recently upgraded Data ONTAP from version 8.0.2 to 8.1. We read through the release notes, Upgrade Advisor, and the upgrade notes, then proceeded with the upgrade, which went quite smoothly, but...

Nowhere in the release notes or the 8.1 documentation does it mention that a background process starts after the upgrade and can dramatically degrade the performance of your filer. If anyone from NetApp reads this, can you please ask to have this caveat added to the release notes and Upgrade Advisor?

Right after the upgrade, a background process called rlw_upgrading begins. RLW is short for RAID Protection Against Lost Writes. It is new functionality added in Data ONTAP 8.1.

To see this process, you need to be in priv set diag and then run aggr status <aggr_name> -v
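If you save the output of that command to a text file, a small script can tell you whether the flag is still present without reading through it by hand. This is only a rough sketch: the function name and the sample strings are made up for illustration and are not real ONTAP output. The only thing taken from this thread is the flag name rlw_upgrading.

```python
import re


def still_upgrading(status_text):
    """Return True if the text contains the rlw_upgrading flag as a whole word."""
    return re.search(r"\brlw_upgrading\b", status_text) is not None


# Hypothetical sample strings (assumption: NOT the real `aggr status -v` layout).
sample_busy = "aggr1 online raid_dp, aggr rlw_upgrading"
sample_done = "aggr1 online raid_dp, aggr"

if __name__ == "__main__":
    for label, text in (("busy", sample_busy), ("done", sample_done)):
        print(label, still_upgrading(text))
```

It only reports whether the flag is present. It cannot say how long the process has left to run.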

The issue is that while this process is running and your dedupe jobs kick in, the CPU skyrockets to 99% and filer latency goes through the roof. The only way to keep the filer running acceptably is to either disable all dedupe or set all dedupe schedules to manual.

The problem is that this background process has been running for the last 3 weeks on one filer and for the last 2 weeks on another.

I have a case open with NetApp at the moment, but I was wondering whether anyone else has experience with this, or has any recommendations or commands that would let us see how long this process has left to complete. No one seems to know much about this process or function.

For the last 2-3 weeks we have not been able to run any deduplication without severely impacting filer latency.

107 REPLIES

christoph_reeg
Hello,

has anyone tested if 8.1.2 still has these problems?

vladimirzhigulin
I just upgraded a pair of non-production FAS3240 systems from 8.0.2P4 to 8.1.2.

Here are some observations that might be helpful for those of you who are looking at the 8.1.2 upgrade.

All aggrs (~16TB each) had "raid_lost_write on" prior to the upgrade. Right after the controllers came up on 8.1.2, I spotted disk activity on all aggrs. It lasted for about an hour, and during that time all disks were ~30-40% busy.

Now the aggrs have the "rlw_upgrading" flag.

It's not very clear how this workload would affect production systems in terms of latency. I'd say it's not going to improve it. I think it's doable to upgrade my systems during low-workload hours, but I assume it really depends on the particular environment.

My 2c.

Vladimir

schrie
I'm not sure if it will answer everyone's questions, but the rlw_upgrading function in ONTAP 8.1 is explained here: https://kb.netapp.com/support/index?page=content&id=3013583

This article was posted on Oct 22, 2012, so it wasn't available at the beginning of this thread, but I hope it provides some detail and background for the admins currently hitting this problem.
