Community

Subscribe
Highlighted
Accepted Solution

CFOD not initiated, but why?

Hi,

I'm just running some final test with TB 2.1.1 and OCUM 5.2, but can't figure out what went wrong. I shutdown ProblemFiler2 and after some timeouts TB says that ProblemFiler1 has been too long in degraded status and disables itself. I have another MC/TB configuration where everyting is done right on same kind of hardware fail.

2013-08-27 16:19:10,180 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:19:10,180 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:19:10,180 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:19:10,190 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status CONNECTED, Enabled: true, IC: true

2013-08-27 16:19:10,190 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:19:10,210 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler2: cf-status CONNECTED, Enabled: true, IC: true

2013-08-27 16:19:10,210 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking interconnects

2013-08-27 16:19:10,210 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:19:10,210 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:19:10,240 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 loading aggregate mirror status

2013-08-27 16:19:10,240 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sata1: unmirrored (ignored)

2013-08-27 16:19:10,240 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirrored

2013-08-27 16:19:10,240 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas1: mirrored

2013-08-27 16:19:10,240 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:19:10,240 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:19:10,270 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler2 loading aggregate mirror status

2013-08-27 16:19:10,270 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirrored

2013-08-27 16:19:10,270 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas2: mirrored

2013-08-27 16:19:10,270 [ProblemFiler1-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.10:80

2013-08-27 16:19:10,270 [ProblemFiler1-conn-3] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.10:445

2013-08-27 16:19:10,270 [ProblemFiler1-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.10:80

2013-08-27 16:19:10,270 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:19:10,290 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler2: vol0

2013-08-27 16:19:10,290 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:19:10,310 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:19:10,320 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler2: aggr0

2013-08-27 16:19:10,320 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler1

2013-08-27 16:19:10,320 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:19:10,320 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: CONNECTED

2013-08-27 16:19:10,320 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: true

2013-08-27 16:19:10,320 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: false

2013-08-27 16:19:10,330 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:19:10,330 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

2013-08-27 16:19:10,330 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:19:10,340 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler1: vol0

2013-08-27 16:19:10,340 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:19:10,360 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:19:10,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler1: aggr0

2013-08-27 16:19:10,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler2

2013-08-27 16:19:10,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:19:10,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: CONNECTED

2013-08-27 16:19:10,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: true

2013-08-27 16:19:10,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: false

2013-08-27 16:19:10,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler1

2013-08-27 16:19:10,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler2

2013-08-27 16:19:20,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:19:20,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:19:20,370 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:19:20,380 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status TAKEOVER_STARTED, Enabled: true, IC: false

2013-08-27 16:19:20,380 [MetroCluster1-monitor] WARN  com.netapp.rre.anegada.Monitor - ProblemFiler1 in a take-over state, skipping remaining checks.

2013-08-27 16:19:30,390 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:19:30,390 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:19:30,390 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:19:30,400 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status TAKEOVER_STARTED, Enabled: true, IC: false

2013-08-27 16:19:30,400 [MetroCluster1-monitor] WARN  com.netapp.rre.anegada.Monitor - ProblemFiler1 in a take-over state, skipping remaining checks.

2013-08-27 16:19:40,399 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:19:40,399 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:19:40,399 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:19:40,409 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status TAKEOVER_STARTED, Enabled: true, IC: false

2013-08-27 16:19:40,409 [MetroCluster1-monitor] WARN  com.netapp.rre.anegada.Monitor - ProblemFiler1 in a take-over state, skipping remaining checks.

2013-08-27 16:19:50,409 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:19:50,409 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:19:50,409 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:19:50,419 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status ERROR, Enabled: true, IC: false

2013-08-27 16:19:50,419 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:19:55,439 [MetroCluster1-monitor] ERROR com.netapp.rre.anegada.Filer - ProblemFiler2: ApiTimeoutException on com.netapp.ontap.api.cf.CfStatusRequest: ZAPI call cf-status to 10.10.10.11 timed out after 0:00:05.000

2013-08-27 16:19:55,439 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:19:55,439 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:19:55,489 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 loading aggregate mirror status

2013-08-27 16:19:55,489 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sata1: unmirrored (ignored)

2013-08-27 16:19:55,489 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirror degraded

2013-08-27 16:19:55,489 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas1: mirror degraded

2013-08-27 16:19:55,489 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:19:55,489 [ProblemFiler2-conn-3] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:445

2013-08-27 16:19:55,489 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

2013-08-27 16:19:55,489 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:19:55,509 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler1: vol0

2013-08-27 16:19:55,509 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:19:55,529 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:19:55,539 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler1: aggr0

2013-08-27 16:19:55,539 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler2

2013-08-27 16:19:55,539 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:19:55,539 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: ERROR

2013-08-27 16:19:55,539 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: false

2013-08-27 16:19:55,539 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: true

2013-08-27 16:19:55,539 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler1

2013-08-27 16:19:55,539 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 aggregate aggr_sas1 mirror degraded

2013-08-27 16:19:55,539 [MetroCluster1-monitor] WARN  com.netapp.rre.bautils.DfmCommand - DFM_EVENT MetroCluster-TieBreakerSmiley Very Happyegradation-Event:Mirror-Degradation, source: ProblemFiler1: Mirror degradations have been detected on filer ProblemFiler1

2013-08-27 16:19:55,539 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.DfmCommand - Executing: dfm event generate MetroCluster-TieBreakerSmiley Very Happyegradation-Event:Mirror-Degradation ProblemFiler1 Mirror degradations have been detected on filer ProblemFiler1

2013-08-27 16:20:06,558 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:20:06,558 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:20:06,558 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:20:06,568 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status ERROR, Enabled: true, IC: false

2013-08-27 16:20:06,568 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:20:11,568 [MetroCluster1-monitor] ERROR com.netapp.rre.anegada.Filer - ProblemFiler2: ApiTimeoutException on com.netapp.ontap.api.cf.CfStatusRequest: ZAPI call cf-status to 10.10.10.11 timed out after 0:00:05.000

2013-08-27 16:20:11,568 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:20:11,568 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:20:11,588 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 loading aggregate mirror status

2013-08-27 16:20:11,588 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sata1: unmirrored (ignored)

2013-08-27 16:20:11,588 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirror degraded

2013-08-27 16:20:11,588 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas1: mirror degraded

2013-08-27 16:20:11,588 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:20:11,588 [ProblemFiler2-conn-3] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:445

2013-08-27 16:20:11,598 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

2013-08-27 16:20:11,598 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:20:11,608 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler1: vol0

2013-08-27 16:20:11,608 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:20:11,628 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:20:11,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler1: aggr0

2013-08-27 16:20:11,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler2

2013-08-27 16:20:11,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:20:11,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: ERROR

2013-08-27 16:20:11,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: false

2013-08-27 16:20:11,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: true

2013-08-27 16:20:11,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler1

2013-08-27 16:20:11,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 aggregate aggr_sas1 mirror degraded

2013-08-27 16:20:21,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:20:21,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:20:21,648 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:20:21,658 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status ERROR, Enabled: true, IC: false

2013-08-27 16:20:21,658 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:20:26,667 [MetroCluster1-monitor] ERROR com.netapp.rre.anegada.Filer - ProblemFiler2: ApiTimeoutException on com.netapp.ontap.api.cf.CfStatusRequest: ZAPI call cf-status to 10.10.10.11 timed out after 0:00:05.000

2013-08-27 16:20:26,667 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:20:26,667 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:20:26,687 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 loading aggregate mirror status

2013-08-27 16:20:26,697 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sata1: unmirrored (ignored)

2013-08-27 16:20:26,697 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirror degraded

2013-08-27 16:20:26,697 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas1: mirror degraded

2013-08-27 16:20:26,697 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:20:26,697 [ProblemFiler2-conn-2] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:2049

2013-08-27 16:20:26,697 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

2013-08-27 16:20:26,697 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:20:26,717 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler1: vol0

2013-08-27 16:20:26,717 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:20:26,737 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:20:26,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler1: aggr0

2013-08-27 16:20:26,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler2

2013-08-27 16:20:26,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:20:26,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: ERROR

2013-08-27 16:20:26,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: false

2013-08-27 16:20:26,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: true

2013-08-27 16:20:26,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler1

2013-08-27 16:20:26,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 aggregate aggr_sas1 mirror degraded

2013-08-27 16:20:36,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:20:36,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:20:36,757 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:20:36,767 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status ERROR, Enabled: true, IC: false

2013-08-27 16:20:36,767 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:20:41,767 [MetroCluster1-monitor] ERROR com.netapp.rre.anegada.Filer - ProblemFiler2: ApiTimeoutException on com.netapp.ontap.api.cf.CfStatusRequest: ZAPI call cf-status to 10.10.10.11 timed out after 0:00:05.000

2013-08-27 16:20:41,767 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:20:41,767 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:20:41,787 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 loading aggregate mirror status

2013-08-27 16:20:41,787 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sata1: unmirrored (ignored)

2013-08-27 16:20:41,787 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirror degraded

2013-08-27 16:20:41,787 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas1: mirror degraded

2013-08-27 16:20:41,797 [ProblemFiler2-conn-2] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:2049

2013-08-27 16:20:41,797 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:20:41,797 [ProblemFiler2-conn-3] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:445

2013-08-27 16:20:41,797 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

2013-08-27 16:20:41,797 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:20:41,807 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler1: vol0

2013-08-27 16:20:41,807 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:20:41,827 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:20:41,847 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler1: aggr0

2013-08-27 16:20:41,847 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler2

2013-08-27 16:20:41,847 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:20:41,847 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: ERROR

2013-08-27 16:20:41,847 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: false

2013-08-27 16:20:41,847 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: true

2013-08-27 16:20:41,847 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler1

2013-08-27 16:20:41,847 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 aggregate aggr_sas1 mirror degraded

2013-08-27 16:20:51,846 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:20:51,846 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:20:51,846 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:20:51,856 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status ERROR, Enabled: true, IC: false

2013-08-27 16:20:51,856 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:20:56,866 [MetroCluster1-monitor] ERROR com.netapp.rre.anegada.Filer - ProblemFiler2: ApiTimeoutException on com.netapp.ontap.api.cf.CfStatusRequest: ZAPI call cf-status to 10.10.10.11 timed out after 0:00:05.000

2013-08-27 16:20:56,866 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:20:56,866 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:20:56,896 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 loading aggregate mirror status

2013-08-27 16:20:56,896 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sata1: unmirrored (ignored)

2013-08-27 16:20:56,896 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirror degraded

2013-08-27 16:20:56,896 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas1: mirror degraded

2013-08-27 16:20:56,896 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:20:56,896 [ProblemFiler2-conn-2] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:2049

2013-08-27 16:20:56,896 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

2013-08-27 16:20:56,896 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:20:56,916 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler1: vol0

2013-08-27 16:20:56,916 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:20:56,936 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:20:56,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler1: aggr0

2013-08-27 16:20:56,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler2

2013-08-27 16:20:56,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:20:56,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: ERROR

2013-08-27 16:20:56,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: false

2013-08-27 16:20:56,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: true

2013-08-27 16:20:56,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler1

2013-08-27 16:20:56,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 aggregate aggr_sas1 mirror degraded

2013-08-27 16:21:06,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:21:06,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:21:06,946 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:21:06,956 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status ERROR, Enabled: true, IC: false

2013-08-27 16:21:06,956 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:21:11,966 [MetroCluster1-monitor] ERROR com.netapp.rre.anegada.Filer - ProblemFiler2: ApiTimeoutException on com.netapp.ontap.api.cf.CfStatusRequest: ZAPI call cf-status to 10.10.10.11 timed out after 0:00:05.000

2013-08-27 16:21:11,966 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:21:11,966 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:21:11,986 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 loading aggregate mirror status

2013-08-27 16:21:11,996 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sata1: unmirrored (ignored)

2013-08-27 16:21:11,996 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirror degraded

2013-08-27 16:21:11,996 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas1: mirror degraded

2013-08-27 16:21:11,996 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:21:11,996 [ProblemFiler2-conn-3] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:445

2013-08-27 16:21:11,996 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

2013-08-27 16:21:11,996 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:21:12,006 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler1: vol0

2013-08-27 16:21:12,006 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:21:12,026 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:21:12,046 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler1: aggr0

2013-08-27 16:21:12,046 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler2

2013-08-27 16:21:12,046 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:21:12,046 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: ERROR

2013-08-27 16:21:12,046 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: false

2013-08-27 16:21:12,046 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: true

2013-08-27 16:21:12,046 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler1

2013-08-27 16:21:12,046 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 aggregate aggr_sas1 mirror degraded

2013-08-27 16:21:22,045 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:21:22,045 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:21:22,045 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:21:22,055 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status ERROR, Enabled: true, IC: false

2013-08-27 16:21:22,055 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:21:27,065 [MetroCluster1-monitor] ERROR com.netapp.rre.anegada.Filer - ProblemFiler2: ApiTimeoutException on com.netapp.ontap.api.cf.CfStatusRequest: ZAPI call cf-status to 10.10.10.11 timed out after 0:00:05.000

2013-08-27 16:21:27,065 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:21:27,065 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:21:27,085 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 loading aggregate mirror status

2013-08-27 16:21:27,085 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sata1: unmirrored (ignored)

2013-08-27 16:21:27,085 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirror degraded

2013-08-27 16:21:27,085 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas1: mirror degraded

2013-08-27 16:21:27,085 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:21:27,085 [ProblemFiler2-conn-2] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:2049

2013-08-27 16:21:27,095 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

2013-08-27 16:21:27,095 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:21:27,105 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler1: vol0

2013-08-27 16:21:27,105 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:21:27,125 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:21:27,145 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler1: aggr0

2013-08-27 16:21:27,145 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler2

2013-08-27 16:21:27,145 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:21:27,145 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: ERROR

2013-08-27 16:21:27,145 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: false

2013-08-27 16:21:27,145 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: true

2013-08-27 16:21:27,145 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler1

2013-08-27 16:21:27,145 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 aggregate aggr_sas1 mirror degraded

2013-08-27 16:21:27,145 [MetroCluster1-monitor] WARN  com.netapp.rre.bautils.DfmCommand - DFM_EVENT MetroCluster-TieBreakerSmiley Very Happyegradation-Event:Threshold-Exceeded, source: ProblemFiler1: Automatic CFOD disabled due to mirror degredation threshold (cfodAbortTimeout: 90000) exceeded on filer ProblemFiler1

2013-08-27 16:21:27,145 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.DfmCommand - Executing: dfm event generate MetroCluster-TieBreakerSmiley Very Happyegradation-Event:Threshold-Exceeded ProblemFiler1 Automatic CFOD disabled due to mirror degredation threshold (cfodAbortTimeout: 90000) exceeded on filer ProblemFiler1

2013-08-27 16:21:38,014 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:21:38,014 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:21:38,014 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:21:38,024 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status ERROR, Enabled: true, IC: false

2013-08-27 16:21:38,024 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:21:43,024 [MetroCluster1-monitor] ERROR com.netapp.rre.anegada.Filer - ProblemFiler2: ApiTimeoutException on com.netapp.ontap.api.cf.CfStatusRequest: ZAPI call cf-status to 10.10.10.11 timed out after 0:00:05.000

2013-08-27 16:21:43,024 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Getting aggr status for ProblemFiler1

2013-08-27 16:21:43,024 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.aggr.AggrListInfoRequest

2013-08-27 16:21:43,054 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 loading aggregate mirror status

2013-08-27 16:21:43,054 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sata1: unmirrored (ignored)

2013-08-27 16:21:43,054 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr0: mirror degraded

2013-08-27 16:21:43,054 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer -     aggr_sas1: mirror degraded

2013-08-27 16:21:43,054 [ProblemFiler2-conn-2] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:2049

2013-08-27 16:21:43,054 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:21:43,054 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

2013-08-27 16:21:43,054 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeGetRootNameRequest

2013-08-27 16:21:43,074 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root volume for ProblemFiler1: vol0

2013-08-27 16:21:43,074 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeListInfoRequest

2013-08-27 16:21:43,084 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.volume.VolumeContainerRequest

2013-08-27 16:21:43,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - Found root aggr for ProblemFiler1: aggr0

2013-08-27 16:21:43,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking CFOD for filer ProblemFiler2

2013-08-27 16:21:43,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    reachable: true

2013-08-27 16:21:43,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    survivor CF state: ERROR

2013-08-27 16:21:43,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Survivor IC: false

2013-08-27 16:21:43,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor -    Root Aggr Mirror Degraded: true

2013-08-27 16:21:43,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking mirror degradation on filer ProblemFiler1

2013-08-27 16:21:43,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1 aggregate aggr_sas1 mirror degraded

2013-08-27 16:21:53,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Checking site reachability

2013-08-27 16:21:53,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Monitor - Updating filer status

2013-08-27 16:21:53,104 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

2013-08-27 16:21:53,114 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - ProblemFiler1: cf-status ERROR, Enabled: true, IC: false

2013-08-27 16:21:53,114 [MetroCluster1-monitor] DEBUG com.netapp.rre.bautils.ZapiExecutor - Executing ZAPI: com.netapp.ontap.api.cf.CfStatusRequest

//KK

Re: CFOD not initiated, but why?

Hi KK,

Do you have a "mctb.properties" file in the MCTB installation or configuration directory?  If so, can you report what it contains?   Also, does the log file, or any past log file contain the most recent startup log entries (I'm looking for the initial connection to the controllers, which has a log entry that begins with "Connected to").

The reason MCTB never issued the CFOD was because ProblemFiler2  continued to be reachable via TCP connect all the way until cfodAbortTimeout was exceeded (see the cfodAbortTimeout configuration parameter description in the Admin Guide):

2013-08-27 16:21:27,085 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:80

2013-08-27 16:21:27,085 [ProblemFiler2-conn-2] DEBUG com.netapp.rre.bautils.NetUtils - Opening connection to: 10.10.10.11:2049

2013-08-27 16:21:27,095 [ProblemFiler2-conn-1] DEBUG com.netapp.rre.bautils.NetUtils - Successful connect to: 10.10.10.11:80

Is it possible that the two controller IP addresses are switched in the MCTB configuration file?  I've done that before and had similar behaviors.

Another log entry shows an extremely small timeout for API communications:

2013-08-27 16:21:27,065 [MetroCluster1-monitor] ERROR com.netapp.rre.anegada.Filer - ProblemFiler2: ApiTimeoutException on com.netapp.ontap.api.cf.CfStatusRequest: ZAPI call cf-status to 10.10.10.11 timed out after 0:00:05.000

The API timeout should be on the order of 90 seconds, not 5, so I'm curious about that (although this wouldn't have caused MCTB to fail to issue a CFOD).

Thanks,

Brian

Re: CFOD not initiated, but why?

Hi,

There is not "mctb.properties" file. What this file should contain?

I have double checked IP addresses and everything seems to be in order. We where wondering if there is some router, intruder dectection system or some other network equipment to fool MCTB to think that TCP port is answering. We can't be sure, but it's unlikely. 

Also I'm wondering why MCTB documentation doesn't mention anything about connection tests into TCP ports 80,445 and 2049.

I don't know how to set timeout for API calls. I try to adjust all other timeouts in my test MCTB, but without any effect. It's stays on 10 sec, not in 5 sec like in production MCTB. So I don't have any clue why production MCTB is using 5 sec timeout.

I can create new TB by myself using Python, but first I have to completly test if Netapp MCTB is usable or not

Thanks for you help,

KK

Re: CFOD not initiated, but why?

I just downloaded the 2.1.1 installation package and installed it from scratch, and it indeed had the incorrect timeout values.  I'm not sure what happened, but I think a previous revision somehow became current.   I have uploaded the correct installation packages for both Windows and Linux.   I downloaded and installed the new package and confirmed that it has the correct timeout values.   I recommend you download a fresh copy from the Community and reinstall to get the correct API timeout values.   I apologize for the inconvenience.   The values in the Connected to log message should confirm timeout values of 60000 and 60000:

2013-08-29 13:16:41,597 [MetroCluster1-monitor] DEBUG com.netapp.rre.anegada.Filer - MCController1: Connecting using 10.61.166.216, connTimeout: 60000, readTimeout: 60000

The mctb.properties file is optional, and provides a way to override some internal configuration values. including the API read and connect timeouts.

I apologize for the lack of detail in the documentation on the TCP connect test.  As you can see in the log file, the controller reachability test is performed by attempting to connect, in parallel, on the TCP ports for HTTP (80), CIFS (445) and NFS (2049).  If any one of those TCP ports responds, the controller is considered reachable (and CFOD will not be issued).

I don't know why the port on that address continued to be reachable even after the controller was halted.

Re: CFOD not initiated, but why?

Hi,

I did fresh install with fixed version and now TB is working as expected. Great!!!

But there is still one problem after server reboot. MCTB service is set for Delayed Start and I can see that service is running after reboot, but MCTB doesn't start to insert any entrys into log file. I have now been waiting about 10 minutes for automatic start but nothing happened. Documentation actually doesn't say that MCTB is able to start automaticly after server reboot. Is it planned to start automatically?

I also rebooted server two times and every time MCTB logs proper lines like "Stop called, terminating", but nothing about starting.

---edit---

I did more reboot tests and MCTB is not starting propelly. I even did script whitch run command "c:\<path to bins>\dfm.exe script job start -p start -j autostart 450" after server reboot, but only MCTB service is started, but nothing is logged. 450 is object id of TieBreaker script. Also job status appears in DFM job status list with Success status.

But... If run Stop command while service is running (but not nothing shown in log file) and after that rerun start job MCTB may start..or not

//KK

Re: CFOD not initiated, but why?

Make sure to rerun the "install" script command.  Even though the service exists, it appears to be stale if you download and install the plugin again.

Re: CFOD not initiated, but why?

Hi,

I'm testing MCTB with my test environmenr and here is steps I did:

  1. Run UNINSTALL command and removed old broken MCTB. Also I checked that all files are deleted from script-plugin directory.
  2. Installed new version. Run INSTALL and RUN commands. MCTB starts to enter entrys into log. Everything ok.
  3. 1st server reboot. Everything ok.
  4. 2nd server reboot. Service running, but not new entrys in log. Running START results "An instance of the service is already running.". After STOP and RUN commands MCTB starts to enter entrys into log.
  5. 3rd to 9th server reboot. Everyting ok.
  6. 10th server reboot. Service running, but no entrys in log. Running STATUS results "STATE RUNNING". Running START results "An instance of the service is already running.". After STOP and RUN commands MCTB starts to enter entrys into log.
  7. 10th server reboot. Everything ok.
  8. 11th to 15th server reboot. Everything ok.
  9. 15th server reboot. Service running. Restarting MCTB service starts MCTB to enter entrys into log.
  10. 16th to 18th server reboot.
  11. 19th server reboot. Same procedure like on step 4.

So I rebooted server 20 times and MCTB won't start 4 times. Not so good

Is there anyway to get more debug data from MCTB?

//KK

Re: CFOD not initiated, but why?

Hi KK,

If you are using Windows, make sure the MCTB service startup is set to "Automatic (Delayed)".  Apparently, Windows has some internal dependencies (networking, I believe), that require delayed start for some services.

Brian

Re: CFOD not initiated, but why?

Hi,

I have used delayed start all in my tests. Do you know any way get more debug information for this problem?

//KK

Re: CFOD not initiated, but why?

There should be stdin and stdout logfiles from the service wrapper that runs MCTB (mctb-stdin-yyyy-mm-dd.log, etc.).  There might be some information there.  I'm not sure if there is a way to get more detailed logging on service startup from Windows,  You might check the various Windows event logs and see if anything is being reported.

The service wrapper (mctb.exe) is the Apache Daemon wrapper that Tomcat uses (http://commons.apache.org/proper/commons-daemon/procrun.html), which is very stable. I did find this http://stackoverflow.com/questions/11632275/apache-procrun-fails-to-start-service, but the version of log4j being used in MCTB is 1.2.15, and has been that version for over a year now.  None of the other jars mentioned in that article are in use by MCTB, so I don't think that is the same issue.

You might try manually uninstalling and reinstalling the Windows service (remember to manually reset startup to "Automatic (Delayed)").  The instserv.bat script can do this from the command line:

.\instserv.bat uninstall

.\instserv.bat install

Unfortunately, if the mctb.exe process is running and has opened the log file, but isn't producing any log entries, something has probably happened during service startup.  The only time I've seen this behavior is when startup is set to Automatic without the delayed start.  

Sorry I don't have a better answer for you.  Of course, let me know if you find anything in the logs.