ONTAP Discussions

node not booting. probably disk ownership issue

jeddgo123

Hello all

I need some assistance. I was handed an old 4-node cluster; it is a 6020 system. node-1 is down and cannot boot back online because of the errors below:

Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1a.00.23 (S/N XXVER7AA) is supposed to be owned by th
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1c.11.19 (S/N S410EAVG) is supposed to be owned by th
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1c.12.0 (S/N KPVJGY4F) is supposed to be owned by thi
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1c.11.0 (S/N KPH2869F) is supposed to be owned by thi
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1c.10.0 (S/N KPVKRXHF) is supposed to be owned by thi
Jun 13 13:22:59 [b125c1fm1-01:wafl.memory.status:info]: 34689MB of memory is currently available for the WAFL file system.
WARNING: 0 disks found!
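
For anyone who wants to collect the affected disks from console output like the above, here is a minimal Python sketch. It assumes only the line shape visible in this post (`[node:event:severity]: disk <name> (S/N <serial>)`); the function name `find_mismatched_disks` is made up for illustration, and the truncated end of each warning is left alone.

```python
import re

# Matches the warning shape seen in the console output above:
#   [node:diskown.ownerReservationMismatch:severity]: disk <name> (S/N <serial>) ...
MISMATCH_RE = re.compile(
    r"\[(?P<node>[^:\]]+):diskown\.ownerReservationMismatch:(?P<severity>\w+)\]:\s+"
    r"disk\s+(?P<disk>\S+)\s+\(S/N\s+(?P<serial>[^)]+)\)"
)


def find_mismatched_disks(console_text):
    """Return (node, disk, serial) for each ownerReservationMismatch line."""
    results = []
    for line in console_text.splitlines():
        match = MISMATCH_RE.search(line)
        if match:
            results.append((match["node"], match["disk"], match["serial"].strip()))
    return results


# Lines copied from the console output in this post (truncation kept as-is).
sample = """\
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1a.00.23 (S/N XXVER7AA) is supposed to be owned by th
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1c.12.0 (S/N KPVJGY4F) is supposed to be owned by thi
Jun 13 13:22:59 [b125c1fm1-01:wafl.memory.status:info]: 34689MB of memory is currently available for the WAFL file system.
"""

for node, disk, serial in find_mismatched_disks(sample):
    print(f"{node}: {disk} (S/N {serial})")
```

The `wafl.memory.status` line is skipped because it is not a disk ownership warning. The resulting list is only a convenience for cross-checking against the disk ownership output on the partner node; it does not change anything on the system.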

 

I tried the following:

1. Halted node-1 and also halted node-2. The cluster is a test cluster.

2. Rebooted node-1 in the hope that it would take over its own disks. This did not work.

3. Brought both nodes back online.

4. Tried assigning disk ownership from node-2, but got the output below:

b125c1fm1-02(takeover)> disk assign 1c.12.0 -s 1874201084
Assign request failed for disk 1b.12.0. Reason:Disk is a file system disk and part of an online aggregate. Changing its owner may cause aggregate or filer outage. Disk assign request failed.
b125c1fm1-02(takeover)> disk assign 1c.11.0 -s 1874201084
Assign request failed for disk 1c.11.0. Reason:Disk is a file system disk and part of an online aggregate. Changing its owner may cause aggregate or filer outage. Disk assign request failed.
b125c1fm1-02(takeover)> disk assign 1c.10.0 -s 1874201084
Assign request failed for disk 1c.10.0. Reason:Disk is a file system disk and part of an online aggregate. Changing its owner may cause aggregate or filer outage. Disk assign request failed.
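
When there are many disks, it can help to pair each `disk assign` attempt with the failure that followed it. The sketch below does that in Python for a transcript shaped like the one above. The function name `summarize_assign_attempts` and the dictionary keys are hypothetical, and the parsing assumes only the prompt and message text shown in this post.

```python
import re

# Command and failure shapes taken from the transcript above.
COMMAND_RE = re.compile(r">\s*disk assign\s+(?P<disk>\S+)\s+-s\s+(?P<sysid>\d+)")
FAILURE_RE = re.compile(r"Assign request failed for disk\s+(?P<disk>\S+?)\.\s+Reason:")


def summarize_assign_attempts(transcript):
    """Pair each 'disk assign' command with the failure line that follows it, if any."""
    attempts = []
    for line in transcript.splitlines():
        command = COMMAND_RE.search(line)
        if command:
            attempts.append(
                {"requested": command["disk"], "sysid": command["sysid"], "failed_disk": None}
            )
            continue
        failure = FAILURE_RE.search(line)
        if failure and attempts:
            attempts[-1]["failed_disk"] = failure["disk"]
    return attempts


# Lines copied from the transcript in this post.
sample = """\
b125c1fm1-02(takeover)> disk assign 1c.12.0 -s 1874201084
Assign request failed for disk 1b.12.0. Reason:Disk is a file system disk and part of an online aggregate. Changing its owner may cause aggregate or filer outage. Disk assign request failed.
b125c1fm1-02(takeover)> disk assign 1c.11.0 -s 1874201084
Assign request failed for disk 1c.11.0. Reason:Disk is a file system disk and part of an online aggregate. Changing its owner may cause aggregate or filer outage. Disk assign request failed.
"""

for attempt in summarize_assign_attempts(sample):
    status = "failed" if attempt["failed_disk"] else "no failure reported"
    note = ""
    if attempt["failed_disk"] and attempt["failed_disk"] != attempt["requested"]:
        note = f" (failure names {attempt['failed_disk']})"
    print(f"{attempt['requested']} -> sysid {attempt['sysid']}: {status}{note}")
```

Note that the first failure names 1b.12.0 although 1c.12.0 was requested. That may simply be the same disk reported over a different path, but the transcript alone does not say, so the sketch only flags the difference.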

 

I'm out of ideas. Are there any other suggestions we can try? Once I get node-1 online, I have to reconfigure the whole cluster in line with best practice. Like I said, I just inherited this mess.

 

Thanks everyone. Appreciate the help.

4 REPLIES

jeddgo123

Hello all

I recently inherited a 4-node cluster.

While doing a health check on the cluster, I found that node-1 is down at the boot prompt. I attempted to boot it and got the error below:

Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1a.00.23 (S/N XXVER7AA) is supposed to be owned by th
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1c.11.19 (S/N S410EAVG) is supposed to be owned by th
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1c.12.0 (S/N KPVJGY4F) is supposed to be owned by thi
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1c.11.0 (S/N KPH2869F) is supposed to be owned by thi
Jun 13 13:22:56 [b125c1fm1-01:diskown.ownerReservationMismatch:warning]: disk 1c.10.0 (S/N KPVKRXHF) is supposed to be owned by thi
Jun 13 13:22:59 [b125c1fm1-01:wafl.memory.status:info]: 34689MB of memory is currently available for the WAFL file system.
WARNING: 0 disks found!

 

 

 

What I tried:

1. Shut down node-2 to see whether a reboot of node-1 would take over its own disks. This did not work.

2. Restarted node-2 and attempted to re-assign the disks to node-1. This did not work. node-2 is in takeover mode. (Should I inhibit takeover before re-assigning disks?)

3. While re-assigning the disks to node-1, I received the error below:

b125c1fm1-02(takeover)> disk assign 1c.11.19 -s 1874201084
b125c1fm1-02(takeover)> disk assign 1c.12.0 -s 1874201084
Assign request failed for disk 1b.12.0. Reason:Disk is a file system disk and part of an online aggregate. Changing its owner may cause aggregate or filer outage. Disk assign request failed.
b125c1fm1-02(takeover)> disk assign 1c.11.0 -s 1874201084
Assign request failed for disk 1c.11.0. Reason:Disk is a file system disk and part of an online aggregate. Changing its owner may cause aggregate or filer outage. Disk assign request failed.
b125c1fm1-02(takeover)> disk assign 1c.10.0 -s 1874201084
Assign request failed for disk 1c.10.0. Reason:Disk is a file system disk and part of an online aggregate. Changing its owner may cause aggregate or filer outage. Disk assign request failed.

 

Any thoughts on how node-1 can take back the disks would be welcome. I have not tried any commands at the priv setting. Not sure whether this would make a difference.

 

Thanks All

GidonMarcus

Hi

 

By the look of it, your 02 has taken over (01, I assume). You need to issue a storage failover giveback command from 02, and it will release the disks to the other node.

I'll just make it clear: I have a very limited view of your environment, and most of the messages I see here are expected during a boot in takeover mode. Any advice you receive on the community is provided as-is, without any guarantee of success, and it may even make things worse. You should evaluate advice received over the platform independently.

 

Gidi

Gidi Marcus (Linkedin) - Storage and Microsoft technologies consultant - Hydro IT LTD - UK

jeddgo123

Hi Gidi

Thanks. I do understand that anything suggested here is as-is and unsupported.

Will try that. My other question is:

Are there logs on the partner node that I can look at to see why the takeover took place to begin with?

Thanks again

GidonMarcus

hi

 

If you access the filer via CIFS, you have the \\file\etc$\messages files. Note that they tend to be truncated frequently. You may have older ones in the volume snapshots for vol0.
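
Once you have a copy of a messages file, a small script can pull out the takeover-related lines. This is only a sketch in Python: it assumes the messages file uses the same `Mon DD HH:MM:SS [node:event:severity]: text` shape as the console lines earlier in this thread, and `filter_events` is a made-up helper name. The takeover line in the sample is invented purely to show the filter working; it is not a real ONTAP event name.

```python
import re

# Line shape assumed from the console output earlier in the thread:
#   Jun 13 13:22:59 [node:event.name:severity]: message text
EVENT_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"\[(?P<node>[^:\]]+):(?P<event>[^:\]]+):(?P<severity>\w+)\]:\s*(?P<message>.*)$"
)


def filter_events(lines, keyword="takeover"):
    """Return (timestamp, event, severity, message) for lines mentioning the keyword."""
    keyword = keyword.lower()
    hits = []
    for line in lines:
        match = EVENT_RE.match(line.rstrip("\n"))
        if not match:
            continue  # skip lines that do not follow the assumed shape
        if keyword in match["event"].lower() or keyword in match["message"].lower():
            hits.append(
                (match["timestamp"], match["event"], match["severity"], match["message"])
            )
    return hits


sample_lines = [
    # Real line from this thread:
    "Jun 13 13:22:59 [b125c1fm1-01:wafl.memory.status:info]: 34689MB of memory is currently available for the WAFL file system.",
    # Invented line for illustration only (hypothetical event name and text):
    "Jun 13 13:20:01 [b125c1fm1-02:example.takeover.event:notice]: takeover of partner started",
]

for timestamp, event, severity, message in filter_events(sample_lines):
    print(f"{timestamp} {severity:8} {event}: {message}")
```

To run it against a real file, pass the open file object as `lines`; the function reads line by line, so older copies recovered from snapshots can be scanned the same way.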

 

Gidi
