Re: [cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00175D0C:00000000:000000...

mauro · ‎2012-09-11

Sun Sep 9 22:49:15 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.

I got this error message, but nothing else happened.

With sysconfig -a and sysconfig -r everything looks fine.

What does this message mean?

many thanks,

Mauro.

GARDINEC_EBRD · ‎2012-09-11

From http://support.netapp.com/eservice/ems?emsAction=details&eventId=253662&software=ontap&emsId=cf.disk.inventory.mismatch&emsversion=0

"This message occurs when one of the nodes in a high-availability (HA) pair has reported this disk in its disk inventory, but the HA partner node has not. This might be due to one of following reasons: (1) One node can see the disk, but the other node cannot. (2) Ownership of the disk has changed. (3) The disk has either been failed or unfailed. (4) The disk has been inserted or removed."

What does storage show disk -p show? Any missing paths? How about disk show / disk show -n? Are all disks showing correct ownership?

mauro · ‎2012-09-11

These are all the messages that I see:

netapp1:

Sun Sep 9 21:48:16 CEST [netapp1: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk 0d.01.2 (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.

Sun Sep 9 21:48:21 CEST [netapp1: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from netapp1 (CLUSTER ERROR: DISK/SHELF COUNT MISMATCH) ERROR

netapp2:

Sun Sep 9 21:41:21 CEST [netapp2: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from netapp2 (CLIENT APP ALERT Backup Failure, Storage: SMVI SnapManager for Virtual Infrastru) CRITICAL

Sun Sep 9 21:48:16 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.

Sun Sep 9 21:48:22 CEST [netapp2: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from netapp2 (CLUSTER ERROR: DISK/SHELF COUNT MISMATCH) ERROR

Mon Sep 10 20:17:37 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00173764:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk. (no autosupport for this one!)

What does storage show disk -p show? all good

Any missing paths? no

How about disk show / disk show -n? No disks match option -n

Are all disks showing correct ownership? yes
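For anyone collecting these events from both heads, here is a minimal Python sketch that pulls the logging node, disk name, and UID out of cf.disk.inventory.mismatch lines and groups the reports by UID. The regular expression is an assumption inferred only from the log lines quoted in this thread, and parse_mismatch and group_by_uid are hypothetical helper names, not part of any NetApp tool.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not an official description of the EMS message format).
MISMATCH_RE = re.compile(
    r"^(?P<timestamp>.+?) \[(?P<node>[^:\]]+): "
    r"cf\.disk\.inventory\.mismatch:CRITICAL\]: "
    r"Status of the disk (?P<disk>\S+) \((?P<uid>[0-9A-F:]+)\)"
)


def parse_mismatch(line):
    """Return a dict of fields for a mismatch line, or None for other lines."""
    match = MISMATCH_RE.match(line.strip())
    return match.groupdict() if match else None


def group_by_uid(lines):
    """Map each disk UID to a list of (node, disk name, timestamp) reports."""
    reports = defaultdict(list)
    for line in lines:
        fields = parse_mismatch(line)
        if fields:
            reports[fields["uid"]].append(
                (fields["node"], fields["disk"], fields["timestamp"])
            )
    return dict(reports)


if __name__ == "__main__":
    uid = "500605BA:00175D0C" + ":00000000" * 8
    tail = " has recently changed or the node (netapp2) is missing the disk."
    sample = [
        "Sun Sep 9 21:48:16 CEST [netapp1: cf.disk.inventory.mismatch:CRITICAL]: "
        "Status of the disk 0d.01.2 (" + uid + ")" + tail,
        "Sun Sep 9 21:48:16 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: "
        "Status of the disk ?.? (" + uid + ")" + tail,
    ]
    for disk_uid, seen in group_by_uid(sample).items():
        print(disk_uid, seen)
```

Grouping by UID is useful here because netapp1 names the disk (0d.01.2) while netapp2 only prints ?.? for the same UID, so the UID is the only field that ties the two reports together.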

scottgelb · ‎2012-09-11

When you run disk show -v, is the count different between nodes? One node isn't seeing all the disks... definitely open a case as well since this affects failover and availability.
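The comparison suggested above can be sketched in Python as a set difference. This assumes you have already collected the disk UIDs that each node reports (for example, copied by hand from each head's output); inventory_diff is a hypothetical helper and the sample UIDs are placeholders, not real output.

```python
def inventory_diff(node_a_uids, node_b_uids):
    """Return (only_on_a, only_on_b) as sorted lists of disk UIDs."""
    a, b = set(node_a_uids), set(node_b_uids)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Hypothetical inventories with placeholder UIDs, for illustration only.
    head1 = ["UID-0001", "UID-0002", "UID-0003"]
    head2 = ["UID-0001", "UID-0003"]
    only_head1, only_head2 = inventory_diff(head1, head2)
    print("seen only by head1:", only_head1)
    print("seen only by head2:", only_head2)
```

Comparing the sets rather than just the counts also catches the case where both nodes report the same number of disks but not the same disks.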

gopinathp · ‎2012-09-11

usually happens when a disk is taken offline for maintenance or has failed.

Check the status of aggr status -f and aggr status -m

sysconfig -c may help too.

mauro · ‎2012-09-11

It is a temporary message.

I don't have any errors.

The autosupport that was generated (CLUSTER ERROR: DISK/SHELF COUNT MISMATCH) doesn't show any errors.

sysconfig -c: no errors. disk show -n: all OK.

sysconfig -a: all disks are present.

scottgelb · ‎2012-09-11

When it's temporary, you will also see "mismatch resolved" messages.


mauro · ‎2012-09-11

I don't have any messages of this kind. It's a very strange case.

VMUNOZ_NTT · ‎2012-09-11

Check with "aggr status -f" if there are failed disks.

Maybe a disk is marked "to test" because several errors were found on it. The system first reconstructs the raid group onto one of the available spares, then tests the disk, with two possible results: the disk is returned as a spare or it is marked as failed.

Run "sysconfig -r" and check the configuration of all raid groups. Also review the messages log and look for suspicious messages just before the disk count mismatch alert.

ANTON_MSSM · ‎2013-05-20

Can I bump this... I have a 6080 in HA and I get the same error as the author.

Very strange. All disks are accounted for, all shelves are accounted for.

I ran sysconfig -a, sysconfig -r, environment status.

Nothing looks suspicious.

I do notice that syslog shows one disk that has the "issue", and when I access it via storage show disk 6a.00.3 on one head I get a good response like this:

HEAD1

storage show disk 6a.00.3

Disk: 6a.00.3

Shelf: 0

Bay: 3

Serial: WD-WMAUR0407847

Vendor: NETAPP

Model: X306_WMANT02TSSM

Rev: NA01

RPM: 7200

WWN: --

UID: 500605BA:0014CDB0:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000

Downrev: no

Pri Port: B

Sec Name: 7a.00.3

Sec Port: A

Power-on Hours: N/A

Blocks read: 0

Blocks written: 0

Time interval: 00:00:00

Glist count: 0

Scrub last done: 00:00:00

Scrub count: 0

LIP count: 0

Dynamically qualified: No

Power cycle count: 0

Power cycle on error: 0

Current owner: ## - I took these out

Home owner:## I took these out

Reservation owner: ## I took these out

HEAD2

storage show disk 6a.00.3

Could not open disk "6a.00.3".

What gives? I have a support case open with NetApp. So far they haven't given me any solutions that make sense: 1) fail the disk and get a replacement, 2) reseat shelf modules.

Has anyone seen this issue before and can comment? Thanks,
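To compare what each head reports for the same disk, the "Key: Value" output above can be loaded into a dictionary. This is a minimal Python sketch, assuming the output keeps the one-field-per-line layout shown in this post; parse_storage_show_disk is a hypothetical helper, and the "Could not open disk" check is based only on the HEAD2 message quoted above.

```python
def parse_storage_show_disk(text):
    """Parse 'Key: Value' lines into a dict.

    Returns None when the head could not open the disk (message text
    taken from the HEAD2 output quoted in this post).
    """
    if "Could not open disk" in text:
        return None
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as a UID or
        # "00:00:00" keep their own colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    head1 = "Disk: 6a.00.3\nShelf: 0\nBay: 3\nPri Port: B\nSec Name: 7a.00.3\n"
    head2 = 'Could not open disk "6a.00.3".'
    print(parse_storage_show_disk(head1))
    print(parse_storage_show_disk(head2))
```

A None result on one head and a populated dictionary on the other is the same one-sided view of the disk that the mismatch message complains about.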

ANTON_MSSM · ‎2013-05-20

Also, I tried to fail the disk to see what the results would be.

On HEAD1 (the head that shows me the disk information):

disk fail 6a.00.3

get_disk_attributes: Stop reason: CR_OBJ_NOT_FOUND

disk fail: Disk 6a.00.3 not found

On HEAD2 I can pre-fail it, and it actually sees it when it's about to fail it.

aborzenkov · ‎2013-05-20

The disk obviously has issues. Why do you say that replacement does not make sense? What would you expect?

ANTON_MSSM · ‎2013-05-21

Strange, because the disk is not failed or in maintenance. I expect a resolution that makes sense, but sometimes things happen that don't make any sense. I'm going to get this drive replaced.

