Solved: Re: Noob to netapp - Disks are almost always 100% utilized with nothing running?

chrisbeach · ‎2018-08-10

I've got an FAS2240-4 sitting in a datacenter for DR purposes, along with a Dell R720 that we plan to use as a VMware host in the event of a DR.

The FAS2240 is the SnapMirror destination for another NetApp here at the office. I created an NFS volume on it to use for VMs on the Dell R720, but we're seeing 150-500 ms response times in VMware for the datastore with just one VM booting up.

I looked through various posts about statistics and ran some of the commands. All of the disks are constantly close to 100% utilized, yet SnapMirror didn't appear to be running at the time and neither was deduplication. How can I figure out what is utilizing the disks?

I ran statit several times; the run below lasted about 10 seconds. Results attached.

ANY help is appreciated, I'm absolutely new to Netapp, thanks!

chrisbeach · ‎2018-08-13

Hoping someone can point me in the right direction 🙂

naveens17 · ‎2018-08-13

Since this is a destination filer, there could be a lot of back-to-back CPs happening due to block reclamation.

So check sysstat -x c, or in diag mode sysstat -M (kahuna).

Damien_Queen · ‎2018-08-13

Run these commands and post the output:

priv set diag
sysstat -M 1

wafl scan status my_SM_volume
rdfile /etc/snapmirror.conf

chrisbeach · ‎2018-08-13

Thanks for taking the time to reply!

Attached are the outputs (edited).

Damien_Queen · ‎2018-08-13

It looks like the deswizzling scanner is running in the background all the time.

bFiler0*> wafl scan status DS06
Volume DS06:
 Scan id                   Type of scan     progress
   76839    active bitmap rearrangement     fbn 514 of 564 w/ max_chain_len 3

Deswizzling is the process of mapping newly replicated virtual blocks in WAFL to physical blocks on the destination SnapMirror/SnapVault system.

To be more precise, it maps virtual-virtual to virtual-physical 🙂

This kind of architecture allows WAFL to replicate block data from a system with one number and type of disks to another system with a different number and type of disks.

For example, you can have a primary system with 24 x SAS drives and replicate to 12 x SATA drives. This requires remapping the data on the destination system.

The deswizzling process can be disabled, but if and when a disaster occurs, your DR system will run very slowly until the remapping finishes. So it is in your interest to let deswizzling run in advance.
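
As a rough mental model of the remapping described above, here is a toy Python sketch. It is not WAFL's actual on-disk format: BlockPointer, container_map, read_block and deswizzle are hypothetical names invented for illustration. It assumes a replicated block pointer arrives knowing only its virtual block number, so a read needs an extra lookup until a background pass fills in the physical location.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class BlockPointer:
    """Toy block pointer (hypothetical, not the real WAFL layout)."""
    virtual_bn: int                     # block number inside the volume
    physical_bn: Optional[int] = None   # unknown right after replication


def read_block(ptr: BlockPointer, container_map: Dict[int, int]) -> Tuple[int, int]:
    """Return (physical block number, number of extra map lookups needed)."""
    if ptr.physical_bn is not None:
        return ptr.physical_bn, 0                 # fast path: location already known
    return container_map[ptr.virtual_bn], 1       # slow path: consult the map first


def deswizzle(pointers: List[BlockPointer], container_map: Dict[int, int]) -> int:
    """Background pass: fill in missing physical locations, return how many were fixed."""
    fixed = 0
    for ptr in pointers:
        if ptr.physical_bn is None:
            ptr.physical_bn = container_map[ptr.virtual_bn]
            fixed += 1
    return fixed


if __name__ == "__main__":
    # Made-up mapping for the destination system.
    container_map = {10: 900, 11: 512}
    replicated = [BlockPointer(10), BlockPointer(11)]
    print(read_block(replicated[0], container_map))  # slow path before the pass
    print(deswizzle(replicated, container_map))      # pointers fixed by the pass
    print(read_block(replicated[0], container_map))  # fast path afterwards
```

In this model, reads work before the pass but cost an extra lookup each, which mirrors the advice to let the scanner finish ahead of a disaster rather than during one.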

https://community.netapp.com/t5/Data-ONTAP-Discussions/Deswizzling-amp-SnapMirror/td-p/22857

https://community.netapp.com/t5/Data-ONTAP-Discussions/performance-issue-with-snapmirror-and-snapshots-on-target-aggregate/td-p/107835

https://community.netapp.com/t5/Data-ONTAP-Discussions/Deswizzle-how-to-read-status/td-p/49809

https://www.netapp.com/us/media/tr-4075.pdf

Damien_Queen · ‎2018-08-13

Just in case, can you also run:

*>  sysstat -x -m

chrisbeach · ‎2018-08-14

Thanks again for the explanation! As someone who is absolutely new to Netapp, I would have sworn you were making up words 🙂 After reading through the links you sent I think I understand.

This NetApp is acting as a DR filer for ours at the office, purely for file shares. I was going to carve out some space for VMs to run on top of it as well, but it sounds like as long as deswizzling is enabled and running, the BSAS disks are going to be constantly busy trying to keep up (unless it's never able to finish, stuck in a loop because it keeps SnapMirroring and restarting?).

The command you wanted:

bFiler0> sysstat -x -m
 CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out
  6%      0      0      0       0       0      1    6981      0       0      0     8s    96%    0%  -   100%       0      0      0       0      0       0      0
  6%      1      0      0       1       0      0    6733      0       0      0     8s    96%    0%  -   100%       0      0      0       0      0       0      0
  6%      2      0      0       2      15      1    7080      0       0      0     8s    96%    0%  -   100%       0      0      0       0      0       0      0
  6%      0      0      0       3       0      0    7261      0       0      0     8s    96%    0%  -   100%       3      0      0       0      0       0      0
  7%      0      0      0       0       0      0    8157    765       0      0     4s    96%   37%  T   100%       0      0      0       0      0       0      0
  7%      3      0      0       3       1      1    7045      0       0      0     4s    96%    0%  -   100%       0      0      0       0      0       0      0
  6%      0      0      0       3       0      0    7267      0       0      0     4s    96%    0%  -   100%       3      0      0       0      0       0      0
  6%      0      0      0      60       0      0    6811      0       0      0     4s    97%    0%  -   100%      60      0      0       0      0       0      0
  6%      0      0      0      68       0      0    6781      0       0      0     4s    96%    0%  -   100%      68      0      0       0      0       0      0
  6%      1      0      0       6       0      1    7115      0       0      0     4s    96%    0%  -   100%       5      0      0       0      0       0      0
  6%      0      0      0       3       0      0    7939      0       0      0     4s    94%    0%  -   100%       3      0      0       0      0       0      0

Damien_Queen · ‎2018-08-14

Yeah, that's what I thought. A CP ty of T means "Occurs every 10 seconds since the last CP if no other trigger has caused it.", which means the NVRAM is not busy at all.
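
To put numbers on that reading, here is a small Python sketch that parses three rows copied from the sysstat -x -m output above. The names in FIELDS are made-up labels for the 23 columns in the header, and the parser assumes every data row is whitespace-separated with exactly that many columns.

```python
# Hypothetical labels for the 23 columns of the sysstat -x header shown above.
FIELDS = [
    "cpu", "nfs", "cifs", "http", "total", "net_in", "net_out",
    "disk_read", "disk_write", "tape_read", "tape_write",
    "cache_age", "cache_hit", "cp_time", "cp_ty", "disk_util",
    "other", "fcp", "iscsi", "fcp_in", "fcp_out", "iscsi_in", "iscsi_out",
]

# Three data rows copied from the output posted in this thread.
SAMPLE_ROWS = [
    "6%  0  0  0   0  0  1  6981    0  0  0  8s  96%   0%  -  100%   0  0  0  0  0  0  0",
    "7%  0  0  0   0  0  0  8157  765  0  0  4s  96%  37%  T  100%   0  0  0  0  0  0  0",
    "6%  0  0  0  60  0  0  6811    0  0  0  4s  97%   0%  -  100%  60  0  0  0  0  0  0",
]


def parse_row(line):
    """Split one data row into a dict keyed by the labels in FIELDS."""
    tokens = line.split()
    if len(tokens) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(tokens)}")
    return dict(zip(FIELDS, tokens))


def summarize(rows):
    """Average client ops and disk reads, lowest disk util, and CP types seen."""
    parsed = [parse_row(r) for r in rows]
    count = len(parsed)
    return {
        "avg_total_ops": sum(int(p["total"]) for p in parsed) / count,
        "avg_disk_read_kbs": sum(int(p["disk_read"]) for p in parsed) / count,
        "min_disk_util_pct": min(int(p["disk_util"].rstrip("%")) for p in parsed),
        # "-" means no CP happened in that interval, so it is left out.
        "cp_types": sorted({p["cp_ty"] for p in parsed} - {"-"}),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_ROWS))
```

For these rows the summary shows a handful of client ops per second next to roughly 7 MB/s of disk reads at 100% disk utilization, with T as the only CP type, which fits background scanner work rather than client load.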

Disabling deswizzling is not the only way to reduce disk utilization.

You can try changing the SnapMirror schedule to reduce the time the deswizzling process takes.
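
To compare candidate schedules, the sketch below counts how many updates per day an /etc/snapmirror.conf entry would trigger. It assumes the 7-Mode entry layout of source, destination, arguments, then minute, hour, day-of-month and day-of-week fields, and it only understands `*`, `-`, comma lists and simple ranges. The filer name srcfiler is hypothetical; the thread never names the source system.

```python
def expand(field, low, high):
    """Expand one schedule field into the set of values it matches."""
    if field == "*":
        return set(range(low, high + 1))
    if field == "-":
        return set()  # a lone dash matches nothing
    values = set()
    for part in field.split(","):
        if "-" in part:
            start, end = part.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return values


def updates_per_day(conf_line):
    """Count scheduled updates on a day that the entry matches.

    The day-of-month and day-of-week fields are read but not evaluated here.
    """
    source, dest, args, minute, hour, day_of_month, day_of_week = conf_line.split()
    return len(expand(minute, 0, 59)) * len(expand(hour, 0, 23))


if __name__ == "__main__":
    hourly = "srcfiler:DS06 bFiler0:DS06 - 0 * * *"
    nightly = "srcfiler:DS06 bFiler0:DS06 - 0 23 * *"
    print(updates_per_day(hourly), updates_per_day(nightly))
```

If each update hands the scanner new blocks to remap, as suspected earlier in the thread, an entry that fires less often should leave longer quiet windows for deswizzling to finish.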