Active IQ and AutoSupport Discussions
Sun Sep 9 22:49:15 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.
I got this error message, but nothing else happened.
sysconfig -a and sysconfig -r both look fine.
What does this message mean?
many thanks,
Mauro.
"This message occurs when one of the nodes in a high-availability (HA) pair has reported this disk in its disk inventory, but the HA partner node has not. This might be due to one of following reasons: (1) One node can see the disk, but the other node cannot. (2) Ownership of the disk has changed. (3) The disk has either been failed or unfailed. (4) The disk has been inserted or removed."
What does storage show disk -p show? Any missing paths? How about disk show / disk show -n? Are all disks showing correct ownership?
These are all messages that i see:
netapp1:
Sun Sep 9 21:48:16 CEST [netapp1: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk 0d.01.2 (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.
Sun Sep 9 21:48:21 CEST [netapp1: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from netapp1 (CLUSTER ERROR: DISK/SHELF COUNT MISMATCH) ERROR
netapp2:
Sun Sep 9 21:41:21 CEST [netapp2: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from netapp2 (CLIENT APP ALERT Backup Failure, Storage: SMVI SnapManager for Virtual Infrastru) CRITICAL
Sun Sep 9 21:48:16 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00175D0C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk.
Sun Sep 9 21:48:22 CEST [netapp2: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from netapp2 (CLUSTER ERROR: DISK/SHELF COUNT MISMATCH) ERROR
Mon Sep 10 20:17:37 CEST [netapp2: cf.disk.inventory.mismatch:CRITICAL]: Status of the disk ?.? (500605BA:00173764:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000) has recently changed or the node (netapp2) is missing the disk. (No AutoSupport was sent for this one!)
What does storage show disk -p show? all good
Any missing paths? no
How about disk show / disk show -n? No disks match option -n
Are all disks showing correct ownership? yes
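To see at a glance which node reported which disk, the mismatch lines can be grouped by disk UID. The following is only a rough sketch: the regular expression is derived from the message lines quoted in this thread, and the function name `group_mismatches` is made up for illustration. It is not a NetApp tool.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not an official message format specification).
MISMATCH_RE = re.compile(
    r"\[(?P<node>[^:\]]+): cf\.disk\.inventory\.mismatch:\w+\]: "
    r"Status of the disk (?P<disk>\S+) \((?P<uid>[0-9A-Fa-f:]+)\)"
)

def group_mismatches(lines):
    """Return {uid: [(reporting_node, disk_name), ...]} for mismatch events."""
    by_uid = defaultdict(list)
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            by_uid[match.group("uid")].append(
                (match.group("node"), match.group("disk"))
            )
    return dict(by_uid)
```

Fed the lines from both nodes, a UID that shows a real disk name on one node and `?.?` on the other points at the disk that only one head can currently identify.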
When you run disk show -v, is the count different between nodes? One node isn't seeing all the disks... definitely open a case as well since this affects failover and availability.
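If the counts do differ, a quick set comparison shows which disks are missing on which side. This is a minimal sketch that assumes you have already pulled the serial numbers out of each node's `disk show -v` output by hand or with your own parsing; the function name and the serial numbers in the usage note are hypothetical. Serial numbers are used rather than disk names because the name of the same disk can differ between heads (for example 6a.00.3 versus 7a.00.3).

```python
def compare_inventories(serials_node_a, serials_node_b):
    """Return (only_on_a, only_on_b) as sorted lists of serial numbers."""
    a = set(serials_node_a)
    b = set(serials_node_b)
    # Disks seen by one node but not by its HA partner.
    return sorted(a - b), sorted(b - a)
```

For example, `compare_inventories(["S1", "S2", "S3"], ["S1", "S3"])` reports `S2` as visible only on the first node.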
This usually happens when a disk is taken offline for maintenance or has failed.
Check the status of aggr status -f and aggr status -m
sysconfig -c may help too.
It is a temporary message.
I don't see any errors.
The generated AutoSupport (CLUSTER ERROR: DISK/SHELF COUNT MISMATCH) doesn't contain any errors either.
sysconfig -c shows no errors, and disk show -n is fine.
sysconfig -a shows all disks present.
If it is temporary, you should also see "mismatch resolved" messages.
I don't have any messages like that. It's a very strange case.
Check with "aggr status -f" if there are failed disks.
Maybe a disk has been marked for testing because several errors were found on it. In that case the system first reconstructs the RAID group onto one of the available spares, then tests the disk, with two possible results: the disk is returned as a spare or it is marked as failed.
Run "sysconfig -r" and check the configuration of all RAID groups. Also review the messages log and look for suspicious messages just before the disk count mismatch alert.
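Looking at what was logged just before each alert can be scripted against a saved copy of the messages log. The sketch below is an illustration only: `context_before` is a made-up helper, and the default marker string is simply the event name that appears in the log lines quoted in this thread.

```python
from collections import deque

def context_before(lines, marker="cf.disk.inventory.mismatch", n=5):
    """Return [(alert_line, [up to n preceding lines]), ...]."""
    recent = deque(maxlen=n)  # rolling window of the last n lines seen
    results = []
    for raw in lines:
        line = raw.rstrip("\n")
        if marker in line:
            results.append((line, list(recent)))
        recent.append(line)
    return results
```

Passing an open file object as `lines` works, since the function only iterates over it once.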
Can I bump this... I have a 6080 in HA and I get the same error as the author.
Very strange. All disks are accounted for, and all shelves are accounted for.
I ran sysconfig -a, sysconfig -r, and environment status.
Nothing looks suspicious.
I do notice that syslog shows one disk that has the "issue". When I query it with storage show disk 6a.00.3 on one head, I get a good response like this:
HEAD1
storage show disk 6a.00.3
Disk: 6a.00.3
Shelf: 0
Bay: 3
Serial: WD-WMAUR0407847
Vendor: NETAPP
Model: X306_WMANT02TSSM
Rev: NA01
RPM: 7200
WWN: --
UID: 500605BA:0014CDB0:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000
Downrev: no
Pri Port: B
Sec Name: 7a.00.3
Sec Port: A
Power-on Hours: N/A
Blocks read: 0
Blocks written: 0
Time interval: 00:00:00
Glist count: 0
Scrub last done: 00:00:00
Scrub count: 0
LIP count: 0
Dynamically qualified: No
Power cycle count: 0
Power cycle on error: 0
Current owner: ## - I took these out
Home owner:## I took these out
Reservation owner: ## I took these out
HEAD2
storage show disk 6a.00.3
Could not open disk "6a.00.3".
What gives? I have a support case open with NetApp. So far they haven't given me any solutions that make sense: 1) fail the disk and get a replacement, 2) reseat the shelf modules.
Has anyone seen this issue before and can comment? Thanks,
Also, I tried to fail the disk to see what would happen:
on HEAD1 -- the head that shows me disk information
disk fail 6a.00.3
get_disk_attributes: Stop reason: CR_OBJ_NOT_FOUND
disk fail: Disk 6a.00.3 not found
On HEAD2 I can pre-fail it, and that head actually sees the disk when it is about to fail it.
The disk obviously has issues. Why do you say that replacement does not make sense? What would you expect?
It is strange because the disk is not failed or in maintenance. I expected a resolution that makes sense, but sometimes things happen that don't make any sense. I'm going to get this drive replaced.