VMware Solutions Discussions
Hello,
I hope someone can explain what is happening here on our filer (have a look at the CP ty and NFS columns). This happened while a snapshot was being removed in VMware vCenter.
Six servers with roughly 100 VMs are connected to the filer via NFS, and everything stood still for nearly one minute, which is really bad.
The version is 8.1GA on a FAS3140 MetroCluster.
> sysstat -sux 2
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk OTHER FCP iSCSI FCP kB/s iSCSI kB/s
in out read write read write age hit time ty util in out in out
9% 1048 0 0 1048 2974 15409 19904 24 0 0 1 92% 0% - 55% 0 0 0 0 0 0 0
34% 598 0 0 598 2330 6283 58466 44562 0 0 1 99% 83% Hf 94% 0 0 0 0 0 0 0
25% 530 0 0 533 3497 5174 21092 16582 0 0 1 99% 100% :v 99% 3 0 0 0 0 0 0
42% 1723 0 0 1723 52082 5675 23278 16974 0 0 1 99% 18% Hn 100% 0 0 0 0 0 0 0
35% 438 0 0 441 1310 7318 68280 77686 0 0 1 96% 100% :f 89% 3 0 0 0 0 0 0
25% 824 0 0 824 2011 11035 17782 5220 0 0 1 98% 43% : 87% 0 0 0 0 0 0 0
9% 738 0 0 738 3873 14086 19720 32 0 0 1 92% 0% - 72% 0 0 0 0 0 0 0
8% 705 0 0 710 1694 15661 20524 0 0 0 1 91% 0% - 50% 5 0 0 0 0 0 0
8% 691 0 0 691 1925 15826 21270 32 0 0 1 91% 0% - 60% 0 0 0 0 0 0 0
13% 892 0 0 895 5199 14905 22720 24 0 0 1 90% 0% - 67% 3 0 0 0 0 0 0
30% 1083 0 0 1083 5625 10959 64170 38404 0 0 1 93% 57% Tf 85% 0 0 0 0 0 0 0
24% 602 0 0 602 1956 4569 22430 27258 0 0 1 99% 100% :f 100% 0 0 0 0 0 0 0
25% 463 0 0 468 1798 6305 20898 17053 0 0 1 99% 84% : 98% 5 0 0 0 0 0 0
31% 369 0 0 369 1454 4700 13074 8 0 0 1 99% 0% - 100% 0 0 0 0 0 0 0
13% 705 0 0 708 3119 13847 20014 24 0 0 5s 97% 0% - 81% 3 0 0 0 0 0 0
12% 1155 0 0 1197 4705 17936 25348 24 0 0 1s 93% 0% - 66% 42 0 0 0 0 0 0
36% 833 0 0 963 3822 7066 58388 49152 0 0 1 98% 84% Hf 94% 130 0 0 0 0 0 0
18% 1402 0 0 1406 6026 10577 18860 15904 0 0 1 95% 69% : 82% 4 0 0 0 0 0 0
17% 1256 0 0 1256 4175 15981 23592 32 0 0 1 94% 0% - 70% 0 0 0 0 0 0 0
24% 565 0 0 568 1983 8432 45066 41308 0 0 1 92% 81% Hs 90% 3 0 0 0 0 0 0
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk OTHER FCP iSCSI FCP kB/s iSCSI kB/s
in out read write read write age hit time ty util in out in out
11% 1074 0 0 1074 5300 17035 22170 24 0 0 1 91% 100% :s 61% 0 0 0 0 0 0 0
15% 1358 0 0 1358 4361 18970 25256 32 0 0 1 92% 100% :s 54% 0 0 0 0 0 0 0
11% 1222 0 0 1225 2892 18328 24654 0 0 0 1 92% 100% :s 54% 3 0 0 0 0 0 0
10% 897 0 0 897 2104 17728 23934 32 0 0 1 91% 100% :s 65% 0 0 0 0 0 0 0
8% 719 0 0 721 1781 16394 20878 24 0 0 1 92% 100% :s 46% 2 0 0 0 0 0 0
9% 1133 0 0 1133 1808 18861 24810 0 0 0 1 93% 100% :s 58% 0 0 0 0 0 0 0
10% 802 0 0 802 2227 17639 23124 32 0 0 1 92% 100% :s 49% 0 0 0 0 0 0 0
11% 914 0 0 918 4303 17106 23486 24 0 0 1 92% 100% :s 59% 4 0 0 0 0 0 0
13% 1140 0 0 1140 12629 23893 30876 8 0 0 1 93% 100% :s 57% 0 0 0 0 0 0 0
10% 865 0 0 936 6002 7431 11088 24 0 0 1 86% 100% :s 54% 71 0 0 0 0 0 0
10% 907 0 0 907 6374 22753 21940 24 0 0 1 94% 100% :s 47% 0 0 0 0 0 0 0
16% 1468 0 0 1468 1785 68958 71042 8 0 0 1 99% 100% :s 46% 0 0 0 0 0 0 0
15% 1338 0 0 1341 1716 61911 64486 24 0 0 1 98% 100% :s 49% 3 0 0 0 0 0 0
15% 1379 0 0 1379 2372 69090 69644 32 0 0 1 99% 100% :s 48% 0 0 0 0 0 0 0
17% 1401 0 0 1404 2538 69727 76790 0 0 0 1 99% 100% :s 60% 3 0 0 0 0 0 0
9% 788 0 0 788 1731 28390 32940 24 0 0 1 96% 100% #s 40% 0 0 0 0 0 0 0
3% 155 0 0 155 339 740 1142 32 0 0 1 46% 100% #s 14% 0 0 0 0 0 0 0
4% 177 0 0 179 591 3640 4752 0 0 0 1 84% 100% #s 17% 2 0 0 0 0 0 0
2% 3 0 0 3 100 48 528 32 0 0 1 12% 100% #s 9% 0 0 0 0 0 0 0
2% 0 0 0 5 67 1 460 24 0 0 1 0% 100% #s 10% 5 0 0 0 0 0 0
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk OTHER FCP iSCSI FCP kB/s iSCSI kB/s
in out read write read write age hit time ty util in out in out
2% 0 0 0 0 47 1 420 0 0 0 1 0% 100% #s 11% 0 0 0 0 0 0 0
5% 0 0 0 0 69 1 434 32 0 0 1 0% 100% #s 9% 0 0 0 0 0 0 0
2% 0 0 0 3 38 1 444 24 0 0 1 0% 100% #s 8% 3 0 0 0 0 0 0
2% 0 0 0 0 50 1 378 8 0 0 1 0% 100% #s 8% 0 0 0 0 0 0 0
2% 0 0 0 4 30 0 426 24 0 0 1 0% 100% #s 15% 4 0 0 0 0 0 0
3% 0 0 0 6 27 1 496 24 0 0 1 0% 100% #s 10% 6 0 0 0 0 0 0
Mon Jul 23 13:00:00 CEST [rflenasb:kern.uptime.filer:info]: 1:00pm up 63 days, 4:50 3117716252 NFS ops, 11007 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops
5% 0 0 0 2 21 1 528 8 0 0 1 0% 100% #s 10% 2 0 0 0 0 0 0
2% 0 0 0 2 8 0 1230 24 0 0 1 0% 100% #s 11% 2 0 0 0 0 0 0
3% 0 0 0 0 18 1 422 32 0 0 1 0% 100% #s 9% 0 0 0 0 0 0 0
2% 0 0 0 0 8 0 392 0 0 0 1 0% 100% #s 9% 0 0 0 0 0 0 0
2% 0 0 0 0 9 0 450 24 0 0 1 0% 100% #s 10% 0 0 0 0 0 0 0
2% 0 0 0 0 7 0 460 32 0 0 1 0% 100% #s 10% 0 0 0 0 0 0 0
3% 0 0 0 0 8 0 464 0 0 0 1 0% 100% #s 10% 0 0 0 0 0 0 0
2% 0 0 0 0 9 0 446 32 0 0 1 0% 100% #s 11% 0 0 0 0 0 0 0
3% 0 0 0 0 7 0 414 24 0 0 1 0% 100% #s 9% 0 0 0 0 0 0 0
2% 0 0 0 0 9 0 352 0 0 0 1 0% 100% #s 9% 0 0 0 0 0 0 0
22% 3 0 0 3 176 5 30662 43798 0 0 1 99% 100% #v 62% 0 0 0 0 0 0 0
Mon Jul 23 13:00:20 CEST [rflenasb: NwkThd_00:warning]: NFS response to client 10.80.0.22 for volume 0x153d1d5(esx2) was slow, op was v3 setattr, 70 > 60 (in seconds)
34% 2393 0 0 2443 12116 4590 23814 18694 0 0 1 97% 100% bn 100% 50 0 0 0 0 0 0
26% 1217 0 0 1217 8639 13542 20518 8 0 0 1 97% 100% :n 100% 0 0 0 0 0 0 0
42% 1866 0 0 1866 24263 21275 71234 65884 0 0 1 96% 100% :s 100% 0 0 0 0 0 0 0
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk OTHER FCP iSCSI FCP kB/s iSCSI kB/s
in out read write read write age hit time ty util in out in out
24% 1921 0 0 1921 12677 73422 77234 24 0 0 1 97% 100% #s 71% 0 0 0 0 0 0 0
5% 29 0 0 29 256 332 960 8 0 0 1 67% 100% #s 14% 0 0 0 0 0 0 0
3% 0 0 0 0 158 2 394 24 0 0 1 0% 100% #s 10% 0 0 0 0 0 0 0
3% 0 0 0 0 15 0 444 32 0 0 1 0% 100% #s 10% 0 0 0 0 0 0 0
2% 0 0 0 0 10 1 398 0 0 0 1 0% 100% #s 10% 0 0 0 0 0 0 0
2% 0 0 0 0 9 0 426 24 0 0 1 0% 100% #s 10% 0 0 0 0 0 0 0
3% 0 0 0 0 7 0 432 32 0 0 1 0% 100% #s 9% 0 0 0 0 0 0 0
2% 0 0 0 0 7 1 404 0 0 0 1 1% 100% #s 10% 0 0 0 0 0 0 0
2% 0 0 0 0 7 0 418 32 0 0 1 0% 100% #s 10% 0 0 0 0 0 0 0
3% 0 0 0 0 8 0 422 24 0 0 1 70% 100% #s 14% 0 0 0 0 0 0 0
2% 0 0 0 70 8 0 466 0 0 0 1 0% 100% #s 10% 70 0 0 0 0 0 0
31% 1 0 0 1 75 4 70836 78572 0 0 1 99% 100% #f 61% 0 0 0 0 0 0 0
39% 1740 0 0 1747 14148 5937 42408 57304 0 0 1 97% 100% bn 100% 7 0 0 0 0 0 0
28% 1534 0 0 1535 15540 23453 44092 32448 0 0 1 95% 100% :n 99% 1 0 0 0 0 0 0
44% 1103 0 0 1103 25991 9884 60308 101382 0 0 1 97% 100% #s 100% 0 0 0 0 0 0 0
37% 1519 0 0 1522 26352 14175 43054 37259 0 0 1 98% 100% bn 100% 3 0 0 0 0 0 0
51% 1277 0 0 1277 31948 27319 94415 92132 0 0 1 98% 100% #s 100% 0 0 0 0 0 0 0
32% 251 0 0 254 739 11242 51692 56126 0 0 0s 99% 100% #v 100% 3 0 0 0 0 0 0
49% 1572 0 0 1576 20006 14113 89682 82876 0 0 1 97% 100% bs 100% 4 0 0 0 0 0 0
38% 578 0 0 578 2713 16394 46161 41123 0 0 1 99% 64% : 100% 0 0 0 0 0 0 0
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk OTHER FCP iSCSI FCP kB/s iSCSI kB/s
in out read write read write age hit time ty util in out in out
29% 800 0 0 807 4079 27154 56046 3248 0 0 1 96% 9% Hn 100% 7 0 0 0 0 0 0
43% 792 0 0 793 5975 14662 61450 70862 0 0 1 97% 100% :f 100% 1 0 0 0 0 0 0
27% 382 0 0 384 3978 3430 19794 26324 0 0 1 98% 100% :f 100% 2 0 0 0 0 0 0
27% 849 0 0 849 10722 6265 14694 128 0 0 1 98% 38% : 100% 0 0 0 0 0 0 0
43% 1555 0 0 1556 35538 9930 58596 58081 0 0 1 98% 85% Hn 100% 1 0 0 0 0 0 0
26% 936 0 0 940 2751 20755 55928 32876 0 0 1 98% 100% :f 94% 4 0 0 0 0 0 0
27% 633 0 0 633 3124 13768 23174 2060 0 0 1 99% 84% : 100% 0 0 0 0 0 0 0
26% 518 0 0 564 2123 10895 34226 32 0 0 1 98% 4% Hn 100% 46 0 0 0 0 0 0
35% 617 0 0 617 4633 11030 54556 66508 0 0 1 98% 100% :f 100% 0 0 0 0 0 0 0
29% 1307 0 0 1307 6683 23034 38128 11512 0 0 16s 98% 100% :f 100% 0 0 0 0 0 0 0
28% 669 0 0 672 3165 14640 24352 1988 0 0 11s 99% 32% : 100% 3 0 0 0 0 0 0
26% 541 0 0 541 2567 5139 23508 12504 0 0 11s 99% 17% Hn 100% 0 0 0 0 0 0 0
32% 710 0 0 713 4564 11496 51318 46780 0 0 13s 97% 100% :f 96% 3 0 0 0 0 0 0
29% 930 0 0 931 3539 14889 25052 14222 0 0 13s 99% 100% :f 95% 1 0 0 0 0 0 0
29% 1028 0 0 1028 11638 15819 25936 58 0 0 12s 99% 31% : 100% 0 0 0 0 0 0 0
44% 1043 0 0 1046 16706 12302 74558 57306 0 0 4s 97% 74% Hs 100% 3 0 0 0 0 0 0
31% 742 0 0 742 5226 3535 26398 26496 0 0 4s 99% 100% :f 100% 0 0 0 0 0 0 0
31% 870 0 0 873 24193 1799 25810 35476 0 0 5s 97% 100% Hn 100% 3 0 0 0 0 0 0
33% 558 0 0 558 8323 2071 49475 70310 0 0 1s 96% 100% :f 88% 0 0 0 0 0 0 0
33% 584 0 0 585 6386 1733 9606 7780 0 0 1s 99% 35% : 100% 1 0 0 0 0 0 0
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk OTHER FCP iSCSI FCP kB/s iSCSI kB/s
in out read write read write age hit time ty util in out in out
23% 311 0 0 381 1451 1613 9700 0 0 0 1s 99% 0% - 100% 70 0 0 0 0 0 0
46% 507 0 0 507 2685 3678 65432 87158 0 0 2 99% 95% Hf 100% 0 0 0 0 0 0 0
21% 706 0 0 709 2565 3980 9540 3926 0 0 2 98% 64% : 79% 3 0 0 0 0 0 0
10% 606 0 0 606 5845 4239 12220 8 0 0 2 92% 0% - 58% 0 0 0 0 0 0 0
13% 967 0 0 969 12382 8464 13946 24 0 0 2 91% 0% - 67% 2 0 0 0 0 0 0
51% 1822 0 0 1825 50860 6899 66208 63404 0 0 2 98% 80% Hf 99% 3 0 0 0 0 0 0
38% 594 0 0 594 9072 2694 50650 60972 0 0 2 99% 100% bn 100% 0 0 0 0 0 0 0
41% 399 0 0 401 2265 2079 60946 87910 0 0 1 99% 100% :f 100% 2 0 0 0 0 0 0
33% 364 0 0 364 2546 1906 10574 712 0 0 1 99% 39% : 100% 0 0 0 0 0 0 0
25% 416 0 0 416 2167 2314 11375 0 0 0 1 99% 0% - 100% 0 0 0 0 0 0 0
24% 343 0 0 346 3119 2457 10940 24 0 0 1 99% 0% - 100% 3 0 0 0 0 0 0
32% 438 0 0 438 1675 2632 51148 42310 0 0 1 99% 51% Hf 100% 0 0 0 0 0 0 0
33% 292 0 0 295 1811 3018 12778 18658 0 0 1 99% 67% : 97% 3 0 0 0 0 0 0
18% 533 0 0 533 3053 7998 17002 32 0 0 1 97% 0% - 82% 0 0 0 0 0 0 0
12% 773 0 0 773 5609 16628 25700 24 0 0 0s 94% 0% - 66% 0 0 0 0 0 0 0
14% 924 0 0 930 5664 29022 37660 0 0 0 1 96% 0% - 71% 6 0 0 0 0 0 0
27% 761 0 0 761 5015 8151 49150 40240 0 0 1 96% 41% Hf 80% 0 0 0 0 0 0 0
28% 383 0 0 429 3626 2528 39080 45334 0 0 1 99% 100% :f 100% 46 0 0 0 0 0 0
15% 958 0 0 958 4586 5615 13704 16926 0 0 1 94% 60% : 74% 0 0 0 0 0 0 0
10% 634 0 0 634 5092 5723 13354 24 0 0 1 92% 0% - 62% 0 0 0 0 0 0 0
...
The filer then proceeded as normal and the snapshot deletion finished a few seconds later.
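For anyone who wants to measure such a stall instead of eyeballing it, here is a minimal Python sketch that scans saved `sysstat -sux` output for runs of near-zero NFS ops and reports which CP ty values were shown during each run. The field positions (23 columns, NFS in column 2, CP ty in column 15) are taken from the listing above. The names `parse_sysstat` and `find_stalls` and the thresholds are my own assumptions, not anything NetApp ships. As far as I understand the 7-Mode sysstat man page, `#` means a CP is continuing while the NVLog for the next CP is already full, and `b` means deferred back-to-back CPs, which would match the picture here.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    nfs_ops: int
    cp_ty: str
    disk_util: int


def parse_sysstat(text):
    """Parse data rows of saved `sysstat -sux` output; skip headers and log lines."""
    samples = []
    for line in text.splitlines():
        tok = line.split()
        # Assumption: data rows have 23 fields and start with the CPU percentage,
        # as in the listing above.
        if len(tok) != 23 or not tok[0].endswith("%"):
            continue
        try:
            samples.append(Sample(int(tok[1]), tok[14], int(tok[15].rstrip("%"))))
        except ValueError:
            continue  # not a data row after all
    return samples


def find_stalls(samples, interval_s=2, max_ops=10, min_samples=3):
    """Return (start_index, duration_s, cp_types) for runs of near-zero NFS ops.

    interval_s matches the `2` in `sysstat -sux 2`; max_ops and min_samples
    are arbitrary thresholds chosen for illustration.
    """
    stalls, start = [], None
    for i, s in enumerate(samples + [None]):  # None flushes a trailing run
        idle = s is not None and s.nfs_ops <= max_ops
        if idle and start is None:
            start = i
        elif not idle and start is not None:
            if i - start >= min_samples:
                cp_types = sorted({x.cp_ty for x in samples[start:i]})
                stalls.append((start, (i - start) * interval_s, cp_types))
            start = None
    return stalls
```

Typical use would be `find_stalls(parse_sysstat(open("sysstat.log").read()))`, where `sysstat.log` is a hypothetical file holding the captured console output.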
No, same here: after upgrading to 8.1.1GA there is no difference.
I have been sending manually created core dumps to NetApp.
NetApp is still analysing my core dumps.
Similar problems here.
3240 MetroCluster with PAM modules running ONTAP 8.1P2.
It seems this is not only a VMware problem: besides the VMware case, it also affects a VCS cluster running Oracle RAC.
Aug 31 08:54:27 mxxxxx022a nfs: [ID 563706 kern.notice] NFS server mxxxxx003a ok
Aug 31 08:54:27 mxxxxx022a AgentFramework[1414]: [ID 702911 daemon.notice] VCS ERROR V-16-1-13027 Thread(44) Resource(oraxxxxxx_Netxxx) - monitor procedure did not complete within the expected time.
Aug 31 08:54:27 mxxxxx022a Had[1382]: [ID 702911 daemon.notice] VCS ERROR V-16-1-13027 (mxxxxx022a) Resource(oraxxxxxx_Netxxx) - monitor procedure did not complete within the expected time.
Aug 31 08:54:29 mxxxxx022a nfs: [ID 563706 kern.notice] NFS server mxxxxx003a ok
Aug 31 08:54:30 mxxxxx022a nfs: [ID 333984 kern.notice] NFS server mxxxxx003a not responding still trying
Aug 31 08:54:30 mxxxxx022a nfs: [ID 563706 kern.notice] NFS server mxxxxx003a ok
Aug 31 08:54:30 mxxxxx022a nfs: [ID 333984 kern.notice] NFS server mxxxxx003a not responding still trying
Aug 31 08:54:30 mxxxxx022a nfs: [ID 563706 kern.notice] NFS server mxxxxx003a ok
Aug 31 08:54:30 mxxxxx022a nfs: [ID 333984 kern.notice] NFS server mxxxxx003a not responding still trying
Aug 31 08:54:30 mxxxxx022a nfs: [ID 563706 kern.notice] NFS server mxxxxx003a ok
Aug 31 08:54:30 mxxxxx022a nfs: [ID 333984 kern.notice] NFS server mxxxxx003a not responding still trying
Aug 31 08:54:30 mxxxxx022a nfs: [ID 563706 kern.notice] NFS server mxxxxx003a ok
Any response from NetApp regarding this case?
Regards
Not really... I also opened a ticket and sent perflogs, but all they found were some misaligned VMs and that I need to run reallocate to fix fragmentation (Optimization Level 4). We had a 1.5-hour 'blackout' due to this, and I somewhat doubt that it was caused by a fragmented file system. If you experience this again, run perfstats and send them to NetApp; maybe that will help them find the problem.
The NetApp filer fails terribly here and I'm really disappointed.
A PAM module solved our problems on NetApp Release 8.1.2 7-Mode, with these options:
flexscale.enable on
flexscale.lopri_blocks off
flexscale.normal_data_blocks off
flexscale.pcs_high_res off
flexscale.pcs_size 0GB
flexscale.rewarm on
We are caching only metadata.
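If you want to check which caching mode a filer is actually in, a small Python sketch like the one below can classify the `flexscale.*` options listed above. It assumes you have saved the option listing to text yourself. The function names `parse_options` and `caching_mode` are hypothetical, and the mode names follow my reading of the Flash Cache documentation (metadata-only when both `normal_data_blocks` and `lopri_blocks` are off), so treat it as an illustration only.

```python
def parse_options(text):
    """Collect `flexscale.<name> <value>` lines into a dict."""
    opts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith("flexscale."):
            opts[parts[0]] = parts[1]
    return opts


def caching_mode(opts):
    """Classify the caching mode from the parsed options (assumed mapping)."""
    if opts.get("flexscale.enable") != "on":
        return "disabled"
    normal = opts.get("flexscale.normal_data_blocks") == "on"
    lopri = opts.get("flexscale.lopri_blocks") == "on"
    if not normal and not lopri:
        return "metadata-only"
    if normal and not lopri:
        return "normal user data"
    if normal and lopri:
        return "low-priority user data"
    return "unrecognized combination"


listing = """\
flexscale.enable on
flexscale.lopri_blocks off
flexscale.normal_data_blocks off
flexscale.pcs_high_res off
flexscale.pcs_size 0GB
flexscale.rewarm on
"""
print(caching_mode(parse_options(listing)))
```

With the settings posted above this reports the metadata-only mode, which matches what the poster describes.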
We had this same issue. It turned out to be a bug in ONTAP 8.1.2. We updated to ONTAP 8.1.2 P3 and that resolved our issue. (Non-MetroCluster, FAS3240)
http://support.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=393877&app=portal
You're right, it seems like this patch finally fixed it.
Thanks all for the information in this post. We just experienced a massive spike in latency following the completion of a Storage vMotion. We are running 8.1.2.
I'll also let everybody know if we are told to upgrade to 8.1.2P3 and if that solves our problem.
Did 8.1.2P3 fix your issues following the Storage vMotion?
Thanks for holding me accountable! We actually were going to deploy 8.1.2P3 the day of my previous post (5/9) but then we noticed that 8.1.2P4 was being released on 5/10. We ended up going with 8.1.2P4 as it contained additional fixes that seemed related to large file deletes. 8.1.2P4 did resolve the issue with Storage vMotion and large file deletes! 🙂
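To check a batch of filers against the patch level reported in this thread, here is a rough Python sketch that compares release strings in the forms used above ("8.1GA", "8.1P2", "8.1.2 P3", "8.1.2P4"). The helper names `parse_release` and `has_fix` are made up for illustration. The default `fixed_in="8.1.2P3"` only reflects what posters here reported, and the simple ordering assumes a fix is carried forward into every later release, which you should confirm with NetApp for your branch. RC and D-patch releases are not handled.

```python
import re

# Matches e.g. "8.1GA", "8.1.1GA", "8.1P2", "8.1.2", "8.1.2 P3", "8.1.2P4".
_RELEASE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?\s*(?:GA|P(\d+))?$")


def parse_release(s):
    """Turn a release string into a comparable (major, minor, maint, patch) tuple."""
    m = _RELEASE.match(s.strip())
    if not m:
        raise ValueError(f"unrecognized release string: {s!r}")
    major, minor, maint, patch = m.groups()
    # A missing maintenance or patch number is treated as 0 (assumption).
    return (int(major), int(minor), int(maint or 0), int(patch or 0))


def has_fix(running, fixed_in="8.1.2P3"):
    """True if `running` sorts at or after `fixed_in`."""
    return parse_release(running) >= parse_release(fixed_in)


for release in ["8.1GA", "8.1.1GA", "8.1P2", "8.1.2", "8.1.2 P3", "8.1.2P4"]:
    print(release, has_fix(release))
```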