
NetApp Harvest poller keeps crashing for one of the controllers

BharathSingh
We monitor cluster health with a Harvest poller, and it keeps crashing on iz1netapp3. There is no exact error in the logs, but I have shared the strace output captured when it crashed.

BharathSingh

Here is the strace output:

stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90 
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 75) = 75 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 75) = 75
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 88) = 88 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 88) = 88
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 80) = 80 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 80) = 80
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 84) = 84 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 84) = 84
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 76) = 76 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 76) = 76
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 84) = 84 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 84) = 84
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 76) = 76 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 76) = 76
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 82) = 82 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 82) = 82
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 79) = 79 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 79) = 79
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 73) = 73 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 73) = 73
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 82) = 82 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 82) = 82
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 83) = 83 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 83) = 83
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 74) = 74 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 74) = 74
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 80) = 80 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 80) = 80
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 74) = 74 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 74) = 74
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 85) = 85 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 85) = 85
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 97) = 97 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 97) = 97
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 92) = 92 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 92) = 92
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 87) = 87 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 87) = 87
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 98) = 98 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 98) = 98
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 79) = 79 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 79) = 79
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 73) = 73 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 73) = 73
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 83) = 83 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 83) = 83
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 77) = 77 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 77) = 77
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 90) = 90
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 87) = 87 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 87) = 87
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 82) = 82 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 82) = 82
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 76) = 76 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 76) = 76
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 93) = 93 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 93) = 93
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 84) = 84 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 84) = 84
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 89) = 89 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 89) = 89
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 80) = 80 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 80) = 80
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 87) = 87 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 87) = 87
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 79) = 79 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 79) = 79
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 83) = 83 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 83) = 83
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 74) = 74 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 74) = 74
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 89) = 89 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 89) = 89
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 80) = 80 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 80) = 80
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 82) = 82 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 82) = 82
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 76) = 76 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 76) = 76
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 81) = 81
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 92) = 92 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 92) = 92
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 write(1, "[2020-03-19 16:22:09] [DEBUG ] "..., 83) = 83 write(2, "[2020-03-19 16:22:09] [DEBUG ] "..., 83) = 83

write(3, "Illegal division by zero at /opt"..., 73) = 73
rt_sigaction(SIG_0, NULL, 0x7ffceeb80950, 8) = -1 EINVAL (Invalid argument)
rt_sigaction(SIGHUP, NULL, {SIG_DFL, [], SA_RESTORER, 0x7f241d3a15f0}, 8) = 0
rt_sigaction(SIGINT, NULL, {SIG_DFL, [], SA_RESTORER, 0x7f241d3a15f0}, 8) = 0
rt_sigaction(SIGQUIT, NULL, {SIG_DFL, [], SA_RESTORER, 0x7f241d3a15f0}, 8) = 0
rt_sigaction(SIGILL, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTRAP, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGABRT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGBUS, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGFPE, NULL, {SIG_IGN, [FPE], SA_RESTORER|SA_RESTART, 0x7f241cffa3b0}, 8) = 0
rt_sigaction(SIGKILL, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR1, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSEGV, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR2, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPIPE, NULL, {SIG_IGN, [], SA_RESTORER, 0x7f241d3a15f0}, 8) = 0
rt_sigaction(SIGALRM, NULL, {SIG_DFL, [], SA_RESTORER, 0x7f241d3a15f0}, 8) = 0
rt_sigaction(SIGTERM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSTKFLT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCONT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSTOP, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTSTP, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTTIN, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTTOU, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGURG, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGXCPU, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGXFSZ, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGVTALRM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPROF, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGWINCH, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGIO, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPWR, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSYS, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_2, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_3, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_4, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_5, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_6, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_7, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_8, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_9, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_10, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_11, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_12, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_13, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_14, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_15, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_16, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_17, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_18, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_19, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_20, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_21, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_22, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_23, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_24, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_25, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_26, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_27, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_28, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_29, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_30, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_31, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_32, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGABRT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGIO, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSYS, NULL, {SIG_DFL, [], 0}, 8) = 0
close(4) = 0
close(5) = 0
close(3) = 0
exit_group(255) = ?
+++ exited with 255 +++

vachagan_gratian

Hi @BharathSingh,

 

I would probably need the Harvest logs to understand what's going on. The strace output seems to point to a division-by-zero error (ZDE), which may be happening in one of the plugins. But a ZDE is usually an indication of something else going wrong.

 

Can you run the poller for this controller in verbose mode for ~20-30 minutes and share the logs with me?
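To illustrate how a ZDE can be a symptom of something else going wrong, here is a minimal sketch in Python. It is not Harvest code: the strace output only shows a truncated message, so the failing expression is unknown, and the function names `safe_rate` and `safe_average` are hypothetical. The assumption is that a counter-based collector divides a counter delta by an elapsed time or by a base counter, and that either divisor can be zero when polls are skipped or return stale data.

```python
from typing import Optional


def safe_rate(curr: float, prev: float, elapsed_s: float) -> Optional[float]:
    """Per-second rate between two counter samples.

    Returns None instead of raising when the interval is zero or negative,
    which can happen if two samples carry the same timestamp.
    """
    if elapsed_s <= 0:
        return None
    return (curr - prev) / elapsed_s


def safe_average(total_delta: float, base_delta: float) -> Optional[float]:
    """Average such as latency per operation (total time / operation count).

    Returns None when the base counter did not advance between samples.
    """
    if base_delta == 0:
        return None
    return total_delta / base_delta
```

With guards like these, an unusable sample is skipped rather than terminating the process, and the underlying cause (for example, API timeouts) remains visible in the logs.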

BharathSingh

Here is the netapp-harvest log for that controller:

 

[2020-04-20 04:45:56] [WARNING] [workload] data-list poller refresh overdue; skipped [14] poll(s) from [2020-04-20 04:33:00] to [2020-04-20 04:46:00]
[2020-04-20 04:45:56] [WARNING] [workload_detail] data-list poller refresh overdue; skipped [14] poll(s) from [2020-04-20 04:33:00] to [2020-04-20 04:46:00]
[2020-04-20 04:46:02] [WARNING] [workload_volume] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:36:00] to [2020-04-20 04:47:00]
[2020-04-20 04:46:11] [WARNING] [workload_detail_volume] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:36:00] to [2020-04-20 04:47:00]
[2020-04-20 04:46:11] [WARNING] [cifs:node] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:36:00] to [2020-04-20 04:47:00]
[2020-04-20 04:46:11] [WARNING] [cifs:vserver] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:36:00] to [2020-04-20 04:47:00]
[2020-04-20 04:46:16] [WARNING] [copy_manager] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:37:00] to [2020-04-20 04:47:00]
[2020-04-20 04:47:20] [WARNING] [disk:constituent] update of data cache failed with reason: Timeout. Could not read API response.
[2020-04-20 04:47:20] [WARNING] [disk:constituent] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:38:00] to [2020-04-20 04:48:00]
[2020-04-20 04:47:20] [WARNING] [disk:constituent] data-list update failed.
[2020-04-20 04:47:20] [WARNING] [ext_cache_obj] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:38:00] to [2020-04-20 04:48:00]
[2020-04-20 04:47:20] [WARNING] [fcp_lif] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:38:00] to [2020-04-20 04:48:00]
[2020-04-20 04:47:25] [WARNING] [hostadapter] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:39:00] to [2020-04-20 04:48:00]
[2020-04-20 04:47:25] [WARNING] [iscsi_lif] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:39:00] to [2020-04-20 04:48:00]
[2020-04-20 04:47:33] [WARNING] [lif] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:40:00] to [2020-04-20 04:48:00]
[2020-04-20 04:48:45] [WARNING] [lun] update of counter_list cache failed with reason: Timeout. Could not read API response.
[2020-04-20 04:48:45] [WARNING] [lun] counter-list update failed.
[2020-04-20 04:50:07] [WARNING] [nfsv3] update of data cache failed with reason: Timeout. Could not read API response.
[2020-04-20 04:50:07] [WARNING] [nfsv3] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:41:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:07] [WARNING] [nfsv3] data-list update failed.
[2020-04-20 04:50:20] [WARNING] [nfsv3:node] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:41:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:20] [WARNING] [nfsv4] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:41:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:20] [WARNING] [nfsv4:node] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:41:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:20] [WARNING] [nfsv4_1] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:42:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:20] [WARNING] [nfsv4_1:node] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:42:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:25] [WARNING] [nic_common] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:42:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:31] [WARNING] [object_store_client_op] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:42:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:38] [WARNING] [offbox_vscan] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:42:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:42] [WARNING] [offbox_vscan_server] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:42:00] to [2020-04-20 04:51:00]
[2020-04-20 04:50:55] [WARNING] [path] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:42:00] to [2020-04-20 04:51:00]
[2020-04-20 04:51:04] [WARNING] [processor] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:42:00] to [2020-04-20 04:52:00]
[2020-04-20 04:51:35] [WARNING] [resource_headroom_aggr] update of instance_list cache failed with reason: Server returned HTTP Error:
[2020-04-20 04:51:35] [WARNING] [resource_headroom_aggr] instance-list update failed.
[2020-04-20 04:52:19] [WARNING] [resource_headroom_cpu] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:44:00] to [2020-04-20 04:53:00]
[2020-04-20 04:52:36] [WARNING] [system:node] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:44:00] to [2020-04-20 04:53:00]
[2020-04-20 04:52:49] [WARNING] [token_manager] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:44:00] to [2020-04-20 04:53:00]
[2020-04-20 04:54:02] [WARNING] [volume] update of counter_list cache failed with reason: Timeout. Could not read API response.
[2020-04-20 04:54:02] [WARNING] [volume] counter-list update failed.
[2020-04-20 04:55:04] [WARNING] [volume:node] update of counter_list cache failed with reason: Timeout. Could not read API response.
[2020-04-20 04:55:04] [WARNING] [volume:node] counter-list update failed.
[2020-04-20 04:55:08] [WARNING] [wafl] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:48:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:08] [WARNING] [wafl_comp_aggr_vol_bin] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:48:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:08] [WARNING] [wafl_hya_per_aggr] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:48:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:08] [WARNING] [wafl_hya_sizer] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:48:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:08] [WARNING] [workload] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:48:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:08] [WARNING] [workload_detail] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:48:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:14] [WARNING] [workload_volume] data-list poller refresh overdue; skipped [8] poll(s) from [2020-04-20 04:49:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:35] [WARNING] [workload_detail_volume] data-list poller refresh overdue; skipped [8] poll(s) from [2020-04-20 04:49:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:35] [WARNING] [cifs:node] data-list poller refresh overdue; skipped [8] poll(s) from [2020-04-20 04:49:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:35] [WARNING] [cifs:vserver] data-list poller refresh overdue; skipped [8] poll(s) from [2020-04-20 04:49:00] to [2020-04-20 04:56:00]
[2020-04-20 04:55:40] [WARNING] [copy_manager] data-list poller refresh overdue; skipped [8] poll(s) from [2020-04-20 04:49:00] to [2020-04-20 04:56:00]
[2020-04-20 04:56:38] [WARNING] [disk:constituent] data-list poller refresh overdue; skipped [8] poll(s) from [2020-04-20 04:50:00] to [2020-04-20 04:57:00]
[2020-04-20 04:56:38] [WARNING] [ext_cache_obj] data-list poller refresh overdue; skipped [8] poll(s) from [2020-04-20 04:50:00] to [2020-04-20 04:57:00]
[2020-04-20 04:56:38] [WARNING] [fcp_lif] data-list poller refresh overdue; skipped [8] poll(s) from [2020-04-20 04:50:00] to [2020-04-20 04:57:00]
[2020-04-20 04:57:30] [WARNING] [hostadapter] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:50:00] to [2020-04-20 04:58:00]
[2020-04-20 04:57:30] [WARNING] [iscsi_lif] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:50:00] to [2020-04-20 04:58:00]
[2020-04-20 04:57:33] [WARNING] [lif] data-list poller refresh overdue; skipped [9] poll(s) from [2020-04-20 04:50:00] to [2020-04-20 04:58:00]
[2020-04-20 04:58:40] [WARNING] [lun] instance-list poller refresh overdue; skipped [2020-04-20 04:58:10]
[2020-04-20 04:58:40] [WARNING] [lun] data-list poller refresh overdue; skipped [49] poll(s) from [2020-04-20 04:11:00] to [2020-04-20 04:59:00]
[2020-04-20 05:01:26] [WARNING] [nfsv3] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:02:00]
[2020-04-20 05:01:54] [WARNING] [nfsv3:node] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:02:00]
[2020-04-20 05:01:59] [WARNING] [nfsv4] data-list poller refresh overdue; skipped [10] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:02:00]
[2020-04-20 05:02:06] [WARNING] [nfsv4:node] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:03:00]
[2020-04-20 05:02:10] [WARNING] [nfsv4_1] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:03:00]
[2020-04-20 05:02:26] [WARNING] [nfsv4_1:node] data-list poller refresh overdue; skipped [11] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:03:00]
[2020-04-20 05:03:24] [WARNING] [nic_common] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:04:00]
[2020-04-20 05:03:24] [WARNING] [object_store_client_op] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:04:00]
[2020-04-20 05:03:24] [WARNING] [offbox_vscan] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:04:00]
[2020-04-20 05:03:24] [WARNING] [offbox_vscan_server] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:04:00]
[2020-04-20 05:03:24] [WARNING] [path] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:53:00] to [2020-04-20 05:04:00]
[2020-04-20 05:04:35] [WARNING] [processor] update of data cache failed with reason: Timeout. Could not read API response.
[2020-04-20 05:04:35] [WARNING] [processor] data-list poller refresh overdue; skipped [12] poll(s) from [2020-04-20 04:54:00] to [2020-04-20 05:05:00]
[2020-04-20 05:04:35] [WARNING] [processor] data-list update failed.
[2020-04-20 05:07:22] [WARNING] [resource_headroom_aggr] data-list poller refresh overdue; skipped [26] poll(s) from [2020-04-20 04:43:00] to [2020-04-20 05:08:00]
[2020-04-20 05:07:38] [WARNING] [resource_headroom_cpu] data-list poller refresh overdue; skipped [14] poll(s) from [2020-04-20 04:55:00] to [2020-04-20 05:08:00]
[2020-04-20 05:08:01] [WARNING] [system:node] data-list poller refresh overdue; skipped [15] poll(s) from [2020-04-20 04:55:00] to [2020-04-20 05:09:00]
[2020-04-20 05:08:25] [WARNING] [token_manager] update of data cache failed with reason: Server returned HTTP Error:
[2020-04-20 05:08:25] [WARNING] [token_manager] data-list poller refresh overdue; skipped [15] poll(s) from [2020-04-20 04:55:00] to [2020-04-20 05:09:00]
[2020-04-20 05:08:25] [WARNING] [token_manager] data-list update failed.
[2020-04-20 05:09:32] [WARNING] [volume] update of counter_list cache failed with reason: Timeout. Could not read API response.
[2020-04-20 05:09:32] [WARNING] [volume] counter-list update failed.
[2020-04-20 05:10:34] [WARNING] [volume:node] update of counter_list cache failed with reason: Server returned HTTP Error:
[2020-04-20 05:10:34] [WARNING] [volume:node] counter-list update failed.
[2020-04-20 05:10:52] [WARNING] [wafl] data-list poller refresh overdue; skipped [14] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:11:00]
[2020-04-20 05:11:02] [WARNING] [wafl_comp_aggr_vol_bin] data-list poller refresh overdue; skipped [15] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:12:00]
[2020-04-20 05:11:13] [WARNING] [wafl_hya_per_aggr] data-list poller refresh overdue; skipped [15] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:12:00]
[2020-04-20 05:11:18] [WARNING] [wafl_hya_sizer] data-list poller refresh overdue; skipped [15] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:12:00]
[2020-04-20 05:14:36] [WARNING] [workload] update of instance_list cache failed with reason: Timeout. Could not read API response.
[2020-04-20 05:14:36] [WARNING] [workload] instance-list update failed.
[2020-04-20 05:14:36] [WARNING] [workload_detail] data-list poller refresh overdue; skipped [18] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:15:00]
[2020-04-20 05:15:19] [WARNING] [workload_volume] data-list poller refresh overdue; skipped [19] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:16:00]
[2020-04-20 05:15:24] [WARNING] [workload_detail_volume] data-list poller refresh overdue; skipped [19] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:16:00]
[2020-04-20 05:15:25] [WARNING] [cifs:node] data-list poller refresh overdue; skipped [19] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:16:00]
[2020-04-20 05:15:34] [WARNING] [cifs:vserver] data-list poller refresh overdue; skipped [19] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:16:00]
[2020-04-20 05:18:10] [WARNING] [copy_manager] data-list poller refresh overdue; skipped [22] poll(s) from [2020-04-20 04:58:00] to [2020-04-20 05:19:00]
[2020-04-20 05:19:11] [WARNING] [disk:constituent] data-list poller refresh overdue; skipped [22] poll(s) from [2020-04-20 04:59:00] to [2020-04-20 05:20:00]
[2020-04-20 05:19:20] [WARNING] [ext_cache_obj] data-list poller refresh overdue; skipped [22] poll(s) from [2020-04-20 04:59:00] to [2020-04-20 05:20:00]
[2020-04-20 05:19:26] [WARNING] [fcp_lif] data-list poller refresh overdue; skipped [22] poll(s) from [2020-04-20 04:59:00] to [2020-04-20 05:20:00]
[2020-04-20 05:20:01] [WARNING] [hostadapter] data-list poller refresh overdue; skipped [22] poll(s) from [2020-04-20 05:00:00] to [2020-04-20 05:21:00]
[2020-04-20 05:20:13] [WARNING] [iscsi_lif] data-list poller refresh overdue; skipped [22] poll(s) from [2020-04-20 05:00:00] to [2020-04-20 05:21:00]
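
A log like the one above is easier to read when summarized per object. The following is a minimal sketch in Python, assuming only the line layout visible in this excerpt (`[timestamp] [LEVEL] [object] message`); the function name `summarize` is hypothetical and this is not part of Harvest.

```python
import re
from collections import Counter

# Matches lines such as:
# [2020-04-20 04:47:20] [WARNING] [disk:constituent] update of data cache failed ...
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\s*\] \[(?P<obj>[^\]]+)\] (?P<msg>.*)$"
)
SKIP_RE = re.compile(r"skipped \[(\d+)\] poll\(s\)")


def summarize(lines):
    """Count timeouts, HTTP errors, and skipped polls per object."""
    timeouts = Counter()
    http_errors = Counter()
    skipped = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore lines that do not follow the assumed layout
        obj, msg = m.group("obj"), m.group("msg")
        if "Timeout" in msg:
            timeouts[obj] += 1
        elif "HTTP Error" in msg:
            http_errors[obj] += 1
        skip = SKIP_RE.search(msg)
        if skip:
            skipped[obj] += int(skip.group(1))
    return timeouts, http_errors, skipped
```

Calling `summarize(open(path))` on a log file (where `path` is wherever the poller log lives on your system) returns three counters. Objects with many timeouts or skipped polls are the first candidates to look at.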

vachagan_gratian

Hi, sorry for the late response; my schedule has been a bit tough ...

 

I'm not seeing anything useful in the logs you shared. I think I would need the entire log file of the session (from the time you started Harvest in verbose mode). Any chance you can share that with me? I'll send you my email address in a DM.

fwdalrymple

Similar problem on a newly added controller. Sorry to necro, but did you get it fixed? I assume you either did or abandoned Harvest?

 

strace shows the same division-by-zero error.

Harvest logs are similar in that they don't really show any useful error.

Harvest version is 1.6.

ONTAP version is 9.6P3.

Hardware is an AFF A220.

 

The issue does not reproduce consistently. Sometimes it happens in the 2nd or 3rd round of gathering metrics; sometimes the poller lasts much longer.

 

I have the exact same Harvest installation monitoring other AFF clusters running 9.6P3, so I'm not sure what gives here.

 

I'm happy to provide any logs you feel would be useful, but my problem looks pretty much identical to OP's.

 

Any help appreciated.
