Unequal number of spares

lukasz_borek

Hi,

What can cause this situation on a FAS3160 (A/A + SyncMirror)?

filerA> vol status -s

Pool1 spare disks

RAID Disk      Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------      ------  ------------- ---- ---- ---- -------------------    --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare          2b.45   2b    2   13  FC:A  1  FCAL 15000 418000/856064000  420156/860480768

Pool0 spare disks

RAID Disk      Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------      ------  ------------- ---- ---- ---- -------------------    --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare          2a.61   2a    3   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768

filerB> vol status -s

Pool1 spare disks

RAID Disk      Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------      ------  ------------- ---- ---- ---- -------------------    --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare          1c.29   1c    1   13  FC:A  1  FCAL 15000 418000/856064000  420156/860480768
spare          1c.45   1c    2   13  FC:A  1  FCAL 15000 418000/856064000  420156/860480768
spare          1c.58   1c    3   10  FC:A  1  FCAL 15000 418000/856064000  420156/860480768
spare          2c.28   2c    1   12  FC:B  1  FCAL 15000 418000/856064000  420156/860480768
spare          2c.37   2c    2   5   FC:B  1  FCAL 15000 418000/856064000  420156/860480768

Pool0 spare disks

RAID Disk      Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------      ------  ------------- ---- ---- ---- -------------------    --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare          1d.29   1d    1   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768
spare          1d.45   1d    2   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768
spare          1d.61   1d    3   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768
spare          1d.77   1d    4   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768
spare          2d.76   2d    4   12  FC:B  0  FCAL 15000 418000/856064000  420156/860480768

filerA> vol status -f

Broken disks (empty)

filerB> vol status -f

Broken disks (empty)

filerA> disk show -v | grep -i fail

filerB> disk show -v | grep -i fail

1c.58       filerB(151707439)   FAILED 3QQ11KNX00009940GUDH [wtf?]

But:

filerA> disk show -v 1c.58

  DISK      OWNER                 POOL   SERIAL NUMBER
------------ -------------          ----- -------------
1c.58       filerB (151707439)   FAILED 3QQ11KNX00009940GUDH


filerB> disk show -v 1c.58

  DISK      OWNER                 POOL   SERIAL NUMBER
------------ -------------          ----- -------------
1c.58       filerB(151707439)   Pool1  3QQ11KNX00009940GUDH

My understanding was that both nodes should see the same spares. Why does one filer show disk 1c.58 as failed while the other shows it as healthy?

Darkstar

I can only answer part of your question:

Each filer has its own spare disks. So it's perfectly normal to see different spare counts for each filer.
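To make the per-filer counts easy to compare, here is a rough Python sketch that tallies spares per pool from saved "vol status -s" output. The sample text is trimmed from the output posted above; the function name count_spares and the idea of capturing the output to a string are assumptions for illustration, not anything Data ONTAP provides.

```python
import re
from collections import Counter

# Rows trimmed from the "vol status -s" output posted in this thread.
FILER_A = """\
Pool1 spare disks
spare          2b.45   2b    2   13  FC:A  1  FCAL 15000 418000/856064000  420156/860480768
Pool0 spare disks
spare          2a.61   2a    3   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768
"""

FILER_B = """\
Pool1 spare disks
spare          1c.29   1c    1   13  FC:A  1  FCAL 15000 418000/856064000  420156/860480768
spare          1c.45   1c    2   13  FC:A  1  FCAL 15000 418000/856064000  420156/860480768
spare          1c.58   1c    3   10  FC:A  1  FCAL 15000 418000/856064000  420156/860480768
spare          2c.28   2c    1   12  FC:B  1  FCAL 15000 418000/856064000  420156/860480768
spare          2c.37   2c    2   5   FC:B  1  FCAL 15000 418000/856064000  420156/860480768
Pool0 spare disks
spare          1d.29   1d    1   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768
spare          1d.45   1d    2   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768
spare          1d.61   1d    3   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768
spare          1d.77   1d    4   13  FC:A  0  FCAL 15000 418000/856064000  420156/860480768
spare          2d.76   2d    4   12  FC:B  0  FCAL 15000 418000/856064000  420156/860480768
"""


def count_spares(text):
    """Return a Counter mapping pool name -> number of 'spare' rows under it."""
    counts = Counter()
    pool = None
    for line in text.splitlines():
        # Section header, e.g. "Pool1 spare disks" (tolerates a missing space).
        header = re.match(r"\s*(Pool\d+)\s+spare\s*disks", line)
        if header:
            pool = header.group(1)
        elif pool is not None and line.split()[:1] == ["spare"]:
            counts[pool] += 1
    return counts


print("filerA", dict(count_spares(FILER_A)))  # filerA {'Pool1': 1, 'Pool0': 1}
print("filerB", dict(count_spares(FILER_B)))  # filerB {'Pool1': 5, 'Pool0': 5}
```

The counts differ only because each node reports the spares it owns, which matches the point above.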

However, why the partner filer still thinks that 1c.58 is FAILED although the owning filer (filerB) sees the disk as working is beyond me. Maybe a simple "disk fail" followed by "disk unfail" (in diag mode) on filerB can resolve that issue. Otherwise, I would open a case with NetApp if it bothers you too much. But as long as the owning filer sees the disk as usable (and not the other way round), it's normally not a problem.
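If you want to check whether other disks show the same disagreement before trying anything, a small script along these lines could compare the POOL column of "disk show -v" as seen from each node. This is only a sketch: it assumes you have saved the output from both filers as text, the row format is inferred from the two rows posted above, and the names parse_rows and pool_mismatches are made up for this example.

```python
import re

# Rows copied from the "disk show -v 1c.58" output posted in this thread.
FILER_A_VIEW = "1c.58       filerB (151707439)   FAILED 3QQ11KNX00009940GUDH"
FILER_B_VIEW = "1c.58       filerB(151707439)   Pool1  3QQ11KNX00009940GUDH"

# disk, owner, (system id), pool, serial; the space before "(" is optional.
ROW = re.compile(
    r"^\s*(?P<disk>\S+)\s+(?P<owner>\S+)\s*\((?P<sysid>\d+)\)\s+"
    r"(?P<pool>\S+)\s+(?P<serial>\S+)"
)


def parse_rows(text):
    """Map disk name -> (owner, pool, serial) for every line that looks like a row."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[m["disk"]] = (m["owner"], m["pool"], m["serial"])
    return rows


def pool_mismatches(view_a, view_b):
    """Return {disk: (pool_seen_by_a, pool_seen_by_b)} where the two views differ."""
    a, b = parse_rows(view_a), parse_rows(view_b)
    return {
        disk: (a[disk][1], b[disk][1])
        for disk in sorted(a.keys() & b.keys())
        if a[disk][1] != b[disk][1]
    }


print(pool_mismatches(FILER_A_VIEW, FILER_B_VIEW))
# {'1c.58': ('FAILED', 'Pool1')}
```

Header and separator lines are skipped because they contain no parenthesised system ID, so the full "disk show -v" output from each node can be passed in unchanged.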

-Michael
