Bad read/write latency and SnapMirror lag

Umar · 2015-12-17

Brief Overview

NetApp Data ONTAP 8.1.4 P9

2 x FAS6020 in a flexpod setup (Configured in 2013)

Aggr 1: Fsata0 with Flash Pool (24 volumes, 124 TB used; NFS datastores + vfilercifs)

Aggr Sas0 (48 TB used)

Aggr Sata (50 TB)

Aggr sata (45 TB)

NetApp connected to Nexus 5K via 10 Gig (FlexPod setup)

Summary of problems: poor read/write performance | SnapMirror taking ages to complete | very low throughput

We have been experiencing bad read/write latency since the summer. We upgraded to 8.1.4 P9 in September, which made the problem go away for four weeks. Typical symptoms: users can't read or write small documents such as a 1 MB Word document, or it takes up to 1 minute to open them. The problem seems to be our netapp02 controller: pings from controller 1 to controller 2 show high latency, but when you ping anything else on the network from controller 2 you get sub-1 ms responses.

SnapMirror lag times are horrendous; please see the screenshot.

NetApp can't find anything wrong in the perfstats. We think it's either a bug or our config is incorrect somewhere.

Could you please help me investigate?

 sysstat -m
 ANY  AVG  CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
100%  59%   58%  61%  58%  59%  59%  61%  59%  59%
100%  53%   52%  54%  52%  51%  52%  55%  53%  52%
 99%  50%   49%  51%  49%  49%  50%  53%  50%  50%
100%  57%   56%  58%  56%  56%  57%  59%  57%  57%
 99%  51%   50%  51%  50%  50%  51%  53%  51%  51%
 99%  48%   47%  49%  47%  47%  48%  51%  48%  48%
 99%  51%   49%  52%  49%  49%  51%  54%  51%  50%
 99%  50%   48%  51%  48%  49%  50%  52%  49%  49%
 99%  50%   49%  52%  49%  48%  50%  52%  50%  49%
 99%  48%   47%  50%  47%  48%  48%  51%  48%  49%
 99%  52%   52%  53%  52%  52%  52%  54%  52%  52%
 99%  51%   49%  52%  50%  50%  51%  54%  51%  51%
100%  53%   52%  54%  51%  52%  52%  56%  52%  53%
100%  53%   52%  55%  52%  53%  53%  56%  54%  53%
 99%  49%   48%  51%  48%  49%  49%  52%  49%  49%

MACAKIGO1 · 2015-12-17

Hi,
I would say that the aggregate with SATA disks + Flash Pool is slowing down your system a bit, as it is the most heavily utilized according to the statit output you provided.
Flash Pool caches reads and random overwrites (operations smaller than 16 KB). It makes sense to use it for CIFS shares with many small files, but not for datastores.
Datastores usually need quick read response, since the operating systems of the VMs live on your volumes. For that purpose Flash Cache is a better solution (ideally with dedup enabled).
Flash Cache is PCI-based (much faster than accessing SSDs at the disk layer, as Flash Pool does).
Also, with Flash Pool, all hot blocks (the most accessed blocks) need to be written to the SSDs during a consistency point. That means you have to wait for them to be written to disk before you get the benefit of the cache.
I would recommend (if you have Flash Cache installed) creating a new datastore on the aggregate with SAS disks and migrating some of the heavily loaded VMs there.

Another thing I would say is not best practice is mixing disk types within one controller. If you have SATA disks and SAS disks on one controller, you slow down the consistency point, because it still has to wait until the SATA disks complete their writes.

That's just my opinion 🙂

Umar · 2015-12-18

We have a 512 GB Flash Cache card.

I understand your view on keeping the same disk type per controller, but wouldn't this impact N+1 redundancy on the NetApp? (That's what we were sold.)

asulliva · 2015-12-17

Do you have compression or deduplication jobs which are running at the same time as your snapmirrors? How full are your aggregates? Have you done a reallocate measure to check for noncontiguous free space?

If this post resolved your issue, please help others by selecting ACCEPT AS SOLUTION or adding a KUDO.

Umar · 2015-12-18

Hi @asulliva

We need to find out why it's taking so long to complete a SnapMirror transfer. (The WAN is not the problem.)

Do you have compression or deduplication jobs which are running at the same time as your SnapMirrors?

We have SnapMirror transfers running continuously, and dedupe runs concurrently with them, if it runs at all.

Path                           State      Status     Progress
/vol/vol_188_data188_T3_01   Enabled    Idle       Idle for 93:49:11
/vol/vol_dart_data             Disabled   Idle       Idle for 2950:24:43
/vol/vol_188_documentumsql_backup Disabled   Idle       Idle for 3102:04:50
/vol/vol_188_188cvma           Disabled   Idle       Idle for 2943:03:06
/vol/vol_188direct_pool_1_j  Disabled   Idle       Idle for 2950:23:22
/vol/vol_188direct_n         Disabled   Idle       Idle for 2950:22:11
/vol/vol_188direct_pool2_d   Disabled   Idle       Idle for 2949:41:44
/vol/vol_188direct_pool3_e   Disabled   Idle       Idle for 2950:03:09
/vol/vol_188direct_q         Disabled   Idle       Idle for 2950:21:51
/vol/vol_188direct_r         Disabled   Idle       Idle for 2950:16:06
/vol/vol_188icedblive_s        Disabled   Idle       Idle for 2942:49:07
/vol/vol_188icedblive_r        Disabled   Idle       Idle for 2949:30:52
/vol/vol_188_dss_clust         Enabled    Idle       Idle for 66:35:05
/vol/vol_188_dss_rdm_map       Enabled    Idle       Idle for 43:32:23
/vol/vol_vfiler_medical_records_images0 Disabled   Idle       Idle for 2933:17:27
/vol/vol_vfiler_188doccache_cache Disabled   Idle       Idle for 2934:03:20
/vol/vol_188_dss_clust_file188 Enabled    Idle       Idle for 14:53:40
/vol/vol_188_vfilercifs_departments_01 Enabled    Idle       Idle for 120:30:15
/vol/vol_188_vfilercifs_applications_01 Enabled    Idle       Idle for 20:47:53
/vol/vol_vfiler_records_images1 Disabled   Idle       Idle for 2944:24:43
/vol/vol_vfiler_records_images2 Disabled   Idle       Idle for 2944:24:43
/vol/vol_188_vfilercifs_backups_01 Disabled   Idle       Idle for 3120:24:42
/vol/vol_188_vfilercifs_users_01 Enabled    Idle       Idle for 20:42:47
/vol/vol_188_data188_T2_01   Enabled    Idle       Idle for 12:23:52
/vol/vol_vfiler_images0_test Disabled   Idle       Idle for 2944:24:43
/vol/vol_vfiler_images1_train Disabled   Idle       Idle for 2944:23:40
/vol/vol_vfiler_medical_records_dart_images2_live Disabled   Idle       Idle for 2944:23:43
/vol/vol_188_data188_T2_04   Disabled   Idle       Idle for 3076:43:44
/vol/vol_vfiler_retinal_image188 Disabled   Idle       Idle for 2943:28:41
/vol/vol_188_data188_T4_01   Enabled    Idle       Idle for 165:00:53
/vol/vol_188_data188_T4_03   Enabled    Idle       Idle for 148:12:06
/vol/vol_188_vfilercifs_archive_01 Disabled   Idle       Idle for 2941:37:40
/vol/vol_188_vfilercifs_backups_02 Disabled   Idle       Idle for 3123:58:09
/vol/vol_188_data188_T4_06   Disabled   Idle       Idle for 3076:43:54

How full are your aggregates?

A NetApp consultant from the neos healthcheck said we can go up to 95% utilisation on a large aggregate. We are currently at 90% for Fsata0.

Have you done a reallocate measure to check for noncontiguous free space?

No, we haven't done this.

netapp02> sysstat -x 2
 CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out
 93%   7703   7416      0   15438   24008 114883  156912  66706       0      0     1     95%   69%  :    32%       3    316      0     155   2978       0      0
 94%   5332   6664      0   12511   24394 113986  191314  70969       0      0     1     96%   21%  Hn   45%       0    515      0     663   4527       0      0
 93%   6986   8064      0   15282   27105 101081  151629 119054       0      0     0s    94%  100%  :f   32%       0    232      0     120   2354       0      0
 91%   6623   7459      0   14320   24715 133290  197996 116298       0      0     0s    92%  100%  :f   31%       4    234      0     187   2642       0      0
 95%   6043   6910      0   13296   33680 123154  181144   9527       0      0    36s    95%   29%  Hn   36%       0    321      0     212   2306       0      0
 94%   4993   8246      0   13591   27255  66921  144321 145748       0      0     0s    96%  100%  :f   38%       2    350      0     187   3107       0      0
 91%   5346  10869      0   17902   25553  71091  155638 155790       0      0     0s    95%  100%  :v   35%       0   1687      0   14114   1678       0      0
 92%   4914   8204      0   13328   42350  78161  184114 119610       0      0     1     96%   46%  Hs   44%       0    210      0     200   1706       0      0
 84%   5295   7032      0   12760   26394  73475  142920 132128       0      0     1     95%  100%  :f   33%       4    407      0     292   3103       0      0
 88%   6525   8547      0   15443   38743  90686  115278  31376       0      0     1     96%   35%  :    43%     130    241      0     411   2083       0      0
 92%   6419   8500      0   15200   47308 118509  193627  80080       0      0     1     95%   18%  Hn   37%       2    279      0     598   2113       0      0
 92%   5533   8235      0   13942   20524  86481  170934 168575       0      0     1     93%  100%  :f   41%       0    174      0     133   1710       0      0
 92%   8000   6383      0   14973   48441 133375  165988  70304       0      0    47s    95%   66%  :    32%       0    590      0     632   4876       0      0
 97%   7145   6787      0   14285   30185 101420  175611  80692       0      0     1     97%   26%  Hs   42%       2    351      0     301   2997       0      0
 91%   6967   6911      0   14224   46559  96112  152779 122749       0      0     0s    94%  100%  :f   39%       0    346      0     645   2594       0      0
 88%   7634   7319      0   15310   28505 126086  160210 101060       0      0     1     95%  100%  :f   37%       2    355      0     198   3279       0      0
 97%   7885   6401      0   14636   42104 211693  299165 134110       0      0     0s    95%   58%  Hs   50%       3    347      0    1081   2406       0      0
 98%   7764   9916      0   18045  100886  88082  220507 176332       0      0    48s    93%  100%  :f   57%       1    364      0   31110  32819       0      0
 99%   8120   7809      0   16458   58469 115699  300433 213909       0      0     1     96%   99%  Zs   63%       5    524      0   42626  45206       0      0

Umar · 2015-12-21

@asulliva

@MACAKIGO1

Any advice, guys?

Do you think we need to stagger our SnapMirror schedule? At the moment all volumes start SnapMirror operations every 15 minutes.

Thanks

Umar
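
The staggering idea raised in the post above could be sketched roughly as follows. This is only an illustration, assuming the SnapMirror schedule takes a cron-style minute field (as I understand `/etc/snapmirror.conf` does in 7-Mode) and that the volumes can be split into a few groups. The function name and the group names are hypothetical and not taken from this system.

```python
def staggered_minute_fields(groups, interval=15):
    """Return one cron-style minute list per group.

    Every group still updates once per `interval` minutes, but each
    group starts at a different offset so the transfers do not all
    kick off at the same moment.
    """
    if groups < 1 or groups > interval:
        raise ValueError("groups must be between 1 and interval")
    step = interval // groups  # minutes between the start of each group
    fields = []
    for g in range(groups):
        offset = g * step
        minutes = range(offset, 60, interval)
        fields.append(",".join(str(m) for m in minutes))
    return fields


# Hypothetical grouping of the volumes into three sets.
for name, field in zip(["group_a", "group_b", "group_c"],
                       staggered_minute_fields(3)):
    print(name, field)
```

With three groups this spreads the start times to minutes 0, 5 and 10 of each 15-minute window instead of starting every transfer at once. Whether it helps depends on how long each transfer actually runs.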

asulliva · 2015-12-21

There are too many variables to really narrow it down. The snippet of Grafana output shows some pretty high disk utilization, so I'd start with that. Try disabling some tasks. Is it OK to disable both SnapMirror and dedupe for a time and see whether performance returns to an acceptable level for the clients? If not, try disabling one or the other and see how it affects performance.

Reducing the frequency of SnapMirror jobs could help, and so could alternating dedupe and SnapMirror so that they aren't both running at the same time. You said the WAN isn't an issue, but you're averaging over 37 MB of data coming into the system every second (net kB/s in). If you're replicating all of that data, then you need a WAN pipe which can support at least that much bandwidth (> 300 Mb/s). Check the SnapMirror transfer sizes to help determine how much bandwidth each volume needs, and divide that by how much throughput is available to determine the windows.

Do a reallocate measure on the volumes to determine whether reallocation would help. The chains in your statit are OK: not great, but not terrible either. It might be worth doing a reallocate measure on the aggregates as well. Be aware that reallocate will consume some I/O, so it could impact latency if it's already bad.
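
The bandwidth estimate above can be checked against the `sysstat -x 2` sample posted earlier in the thread. The sketch below is a rough calculation only: the samples are the "Net kB/s in" column copied from that output, decimal units are assumed (1 MB = 1000 kB), and the transfer size and link speed in the last lines are made-up numbers for illustration, not measurements from this system.

```python
# "Net kB/s in" values copied from the sysstat -x 2 output in this thread.
NET_IN_KBPS = [
    24008, 24394, 27105, 24715, 33680, 27255, 25553, 42350, 26394, 38743,
    47308, 20524, 48441, 30185, 46559, 28505, 42104, 100886, 58469,
]


def average_kbps(samples):
    """Mean of the sampled inbound rates, in kilobytes per second."""
    return sum(samples) / len(samples)


def kbps_to_megabits(kbps):
    """Convert kilobytes per second to megabits per second (decimal units)."""
    return kbps * 8 / 1000


def transfer_window_seconds(transfer_mb, link_mbit):
    """Seconds needed to move transfer_mb megabytes over a link_mbit Mb/s link."""
    return transfer_mb * 8 / link_mbit


avg = average_kbps(NET_IN_KBPS)
print(f"average inbound: {avg / 1000:.1f} MB/s")
print(f"WAN needed if all of it is replicated: {kbps_to_megabits(avg):.0f} Mb/s")

# Hypothetical example: 2000 MB changed per update, 100 Mb/s of usable WAN.
print(f"window: {transfer_window_seconds(2000, 100):.0f} s")
```

Not all inbound traffic ends up as changed blocks that have to be replicated, so this is an upper bound rather than a sizing figure. The per-volume transfer sizes reported by SnapMirror are the better input for the window calculation.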

If you haven't opened a support case, I would do so. Reach out to your account team to have them escalate if needed.

Andrew
