Active IQ and AutoSupport Discussions

[cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk

mauro

Sun Sep  9 22:49:15 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.

 

 

I got this error message, but nothing else happened.

sysconfig -a and sysconfig -r both look fine.

What does this message mean?

Many thanks,

Mauro.


GARDINEC_EBRD

From http://support.netapp.com/eservice/ems?emsAction=details&eventId=253662&software=ontap&emsId=cf.disk.inventory.mismatch&emsversion=0

"This message occurs when one of the nodes in a high-availability (HA) pair has reported this disk in its disk inventory, but the HA partner node has not.  This might be due to one of following reasons: (1) One node can see the disk, but the other node cannot. (2) Ownership of the disk has changed. (3) The disk has either been failed or unfailed. (4) The disk has been inserted or removed."

What does storage show disk -p show?  Any missing paths? How about disk show / disk show -n?  Are all disks showing correct ownership?

mauro

These are all the messages that I see:

netapp1:

Sun Sep  9 21:48:16 CEST [netapp1: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk 0d.01.2 (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.

Sun Sep  9 21:48:21 CEST [netapp1: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from netapp1 (CLUSTER ERROR: DISK/SHELF COUNT MISMATCH) ERROR

netapp2:

Sun Sep  9 21:41:21 CEST [netapp2: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from netapp2 (CLIENT APP ALERT Backup Failure, Storage: SMVI SnapManager for Virtual Infrastru) CRITICAL

Sun Sep  9 21:48:16 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.

Sun Sep  9 21:48:22 CEST [netapp2: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from netapp2 (CLUSTER ERROR: DISK/SHELF COUNT MISMATCH) ERROR

Mon Sep 10 20:17:37 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00173764:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.  (No AutoSupport was sent for this one!)

What does storage show disk -p show? all good

Any missing paths? no

How about disk show / disk show -n? No disks match option -n

Are all disks showing correct ownership? yes
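For anyone collecting these messages from both heads, a small script can pull out which node logged the event, which disk name and UID it refers to, and which node is reported as missing the disk. The following is only a sketch in Python, assuming the message text has exactly the shape shown in the lines above; the function name parse_mismatch and the field names are made up for illustration.

```python
import re

# Pattern based only on the message text quoted in this thread;
# other ONTAP releases may word the message differently (assumption).
MISMATCH_RE = re.compile(
    r"\[(?P<reporter>[^:\]]+): cf\.disk\.inventory\.mismatch:(?P<severity>\w+)\]: "
    r"Status of the disk (?P<disk>\S+) \((?P<uid>[0-9A-Fa-f:]+)\) "
    r"has recently changed or the node \((?P<missing_node>[^)]+)\) is missing the disk"
)


def parse_mismatch(line):
    """Return the fields of a cf.disk.inventory.mismatch line, or None."""
    match = MISMATCH_RE.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "Sun Sep  9 21:48:16 CEST [netapp1: cf.disk.inventory.mismatch:CRITICAL]: "
        "Status of the disk 0d.01.2 (500605BA:00175D0C:00000000:00000000:00000000:"
        "00000000:00000000:00000000:00000000:00000000) has recently changed or "
        "the node (netapp2) is missing the disk."
    )
    print(parse_mismatch(sample))
```

In the lines above, the node that shows the disk as ?.? (netapp2) is also the node named as missing the disk, so comparing the reporter and missing_node fields across both heads shows quickly which side has lost sight of it.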

scottgelb

When you run disk show -v, is the count different between nodes?  One node isn't seeing all the disks... definitely open a case as well since this affects failover and availability.
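If the counts do differ, comparing the disk UIDs each node reports shows which disk is the odd one out. The sketch below assumes you have already collected the UIDs from each head into two lists by whatever means you prefer (for example, copied by hand from the command output). It does not parse any ONTAP output, and the function name inventory_diff is hypothetical.

```python
def inventory_diff(uids_node_a, uids_node_b):
    """Return (only_on_a, only_on_b) as sorted lists of disk UIDs."""
    # Normalise case and stray whitespace so pasted values compare cleanly.
    a = {u.strip().upper() for u in uids_node_a}
    b = {u.strip().upper() for u in uids_node_b}
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Shortened placeholder UIDs, not taken from a real system.
    head1 = ["500605BA:00000001", "500605BA:00000002", "500605BA:00000003"]
    head2 = ["500605BA:00000001", "500605BA:00000003"]
    only_1, only_2 = inventory_diff(head1, head2)
    print("seen only by head1:", only_1)
    print("seen only by head2:", only_2)
```

Two empty lists mean both heads report the same set of disks, which would match the case where the message is only transient.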

gopinathp

This usually happens when a disk is taken offline for maintenance or has failed.

Check the output of aggr status -f and aggr status -m.

sysconfig -c may help too.

mauro

It is a temporary message.

I don't have any errors.

The AutoSupport that was generated (CLUSTER ERROR: DISK/SHELF COUNT MISMATCH) doesn't contain any errors.

sysconfig -c: no errors. disk show -n: all OK.

sysconfig -a: all disks are present.

scottgelb

When it is temporary, you will also see "mismatch resolved" messages.


mauro

I don't have any messages of that kind. It's a very strange case.

VMUNOZ_NTT

Check with "aggr status -f" whether there are failed disks.

Maybe a disk has been marked "to test" because several errors were found on it. In that case the system first reconstructs the RAID group with one of the available spares and then tests the disk, with two possible results: the disk is returned as a spare or it is marked as failed.

Run "sysconfig -r" and check the configuration of all RAID groups. Also review the messages log and look for suspicious messages just before the disk count mismatch alert.
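To make the last suggestion concrete, here is a rough Python sketch that lists the log lines a node recorded in the minutes before each mismatch alert. It assumes the timestamp layout seen in the messages quoted earlier in this thread (weekday, month, day, time, time zone) and, because those timestamps carry no year, it takes the year as a parameter. The name lines_before_mismatch and the ten-minute default window are arbitrary choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Leading timestamp such as "Sun Sep  9 21:48:16 CEST [" (layout assumed
# from the messages quoted in this thread).
TS_RE = re.compile(r"^(\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \w+ \[")


def parse_time(line, year):
    """Parse the leading timestamp of a log line; None if it has none."""
    match = TS_RE.match(line)
    if not match:
        return None
    # The log carries no year, so the caller supplies one (assumption).
    # %a and %b expect English day and month names (C/English locale).
    return datetime.strptime(f"{year} {match.group(1)}", "%Y %a %b %d %H:%M:%S")


def lines_before_mismatch(lines, year, minutes=10):
    """Yield (alert_line, earlier_lines) for each mismatch alert."""
    stamped = [(parse_time(text, year), text) for text in lines]
    stamped = [(t, text) for t, text in stamped if t is not None]
    window = timedelta(minutes=minutes)
    for when, text in stamped:
        if "cf.disk.inventory.mismatch" in text:
            # Keep lines logged inside the window and strictly before the alert.
            earlier = [other for t, other in stamped
                       if when - window <= t < when]
            yield text, earlier


if __name__ == "__main__":
    log = [
        "Sun Sep  9 21:41:21 CEST [netapp2: asup.smtp.sent:notice]: Cluster Notification mail sent",
        "Sun Sep  9 21:48:16 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? has recently changed",
    ]
    for alert, earlier in lines_before_mismatch(log, 2012):
        print(alert)
        for text in earlier:
            print("   before:", text)
```

The time zone name is ignored here, which is fine as long as all lines come from the same node's log.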

ANTON_MSSM

Can I bump this? I have a 6080 in HA and I get the same error as the author.

Very strange. All disks are accounted for, and all shelves are accounted for.

I ran sysconfig -a, sysconfig -r, and environment status.

Nothing looks suspicious.

I do notice that syslog shows one disk that has the "issue". When I access it via storage show disk 6a.00.3 on one head, I get a good response like this:

HEAD1

storage show disk 6a.00.3

Disk: 6a.00.3

Shelf: 0 

Bay: 3

Serial: WD-WMAUR0407847

Vendor: NETAPP 

Model: X306_WMANT02TSSM

Rev: NA01

RPM: 7200

WWN: --

UID: 500605BA:0014CDB0:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000

Downrev: no

Pri Port: B

Sec Name: 7a.00.3                

Sec Port: A

Power-on Hours: N/A

Blocks read: 0

Blocks written: 0

Time interval: 00:00:00

Glist count: 0

Scrub last done: 00:00:00

Scrub count: 0

LIP count: 0

Dynamically qualified: No

Power cycle count: 0

Power cycle on error: 0

Current owner: ## - I  took these out

Home owner:##  I  took these out

Reservation owner: ##  I  took these out

HEAD2

storage show disk 6a.00.3

Could not open disk "6a.00.3".

What gives? I have a support case open with NetApp. So far they haven't given me any solutions that make sense: 1) fail the disk and get a replacement, 2) reseat the shelf modules.

Has anyone seen this issue before and can comment? Thanks,

ANTON_MSSM

Also, I tried to fail the disk to see what the result would be.

On HEAD1, the head that shows me the disk information:

disk fail 6a.00.3

get_disk_attributes: Stop reason: CR_OBJ_NOT_FOUND

disk fail: Disk 6a.00.3 not found

On HEAD2, I can pre-fail it, and it actually sees the disk when it's about to fail it.

aborzenkov

The disk obviously has issues. Why do you say that replacement does not make sense? What would you expect?

ANTON_MSSM

Strange, because the disk is not failed or in maintenance. I expect a resolution that makes sense, but sometimes things happen that don't make any sense. I'm going to get this drive replaced.

