
CRC error messages seen since ONTAP upgrade to 9.3P7

klmi

Dear Community,

 

Since we updated some of our systems (FAS82xx, AFF A300) to ONTAP 9.3P7, we see the following errors in the messages log for our (UTA2) 10 Gbit LAN ports. They did not appear before the update.

10/16/2018 11:04:14 <Filer>-<node>   ALERT         vifmgr.cluscheck.crcerrors: Port e0g on node <Filer>-<Node> is reporting a high number of observed hardware errors, possibly CRC errors.

 

ifstat shows Total errors (increasing) and Errors/minute, but no CRC errors:

-- interface  e0g  (1 hour, 42 minutes, 32 seconds) --

RECEIVE
 Total frames:    56798k | Frames/second:    9233  | Total bytes:       178g
 Bytes/second:    28949k | Total errors:     1337  | Errors/minute:      13
 Total discards:      2  | Discards/minute:     0  | Multi/broadcast: 31503
 Non-primary u/c:     0  | CRC errors:          0  | Runt frames:        18
 Fragment:            0  | Long frames:      1319  | Alignment errors:    0
 No buffer:           2  | Pause:               0  | Jumbo:               0
 Noproto:           105  | Bus overruns:        0  | LRO segments:    50798k
 LRO bytes:         174g | LRO6 segments:       0  | LRO6 bytes:          0
 Bad UDP cksum:       0  | Bad UDP6 cksum:      0  | Bad TCP cksum:       0
 Bad TCP6 cksum:      0  | Mcast v6 solicit:    0
TRANSMIT
 Total frames:    16298k | Frames/second:    2649  | Total bytes:     11749m
 Bytes/second:     1909k | Total errors:        0  | Errors/minute:       0
 Multi/broadcast:   605  | Pause:               0  | Jumbo:            6655k
 Cfg Up to Downs:     0  | TSO non-TCP drop:    0  | Split hdr drop:      0
 Timeout:             0  | TSO segments:      840k | TSO bytes:        9910m
 TSO6 segments:       0  | TSO6 bytes:          0  | HW UDP cksums:       0
 HW UDP6 cksums:      0  | HW TCP cksums:       0  | HW TCP6 cksums:      0
 Mcast v6 solicit:    0
DEVICE
 Mcast addresses:     4  | Rx MBuf Sz:       4096
LINK INFO
 Speed:           10000m | Duplex:            full | Flowcontrol:       none
 Media state:     active | Up to downs:          2

 

My impression is that this is a bug in ONTAP 9.3P7 (for example, an error in the port statistics), as we do not find any matching errors in our network infrastructure. We also see no impact on the systems.

I have already opened a support case, but so far they cannot match this to an existing bug, as 9.3P7 should have fixed all known issues regarding this problem.

 

So time-consuming debugging at the customer site will be needed to find the root cause 😞

 

So the question to the community: has anybody seen these errors on ONTAP 9.3P7?

 

Best Regards,

Klaus

1 ACCEPTED SOLUTION

klmi

Hello,

 

A short update on this topic:

It seems the issue really is caused by the ports receiving packets larger than 1500 bytes while the port MTU is set to 1500.

Starting with ONTAP 9.3, this is reported as "long frames" in ifstat and also in the events and alerts.
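
As a rough cross-check, the RECEIVE counters from the ifstat output posted earlier in this thread can be tallied with a short Python sketch. The helper name parse_counters and the parsing approach are illustrative assumptions, not a NetApp tool, and whether ONTAP computes "Total errors" exactly as the sum shown here is also an assumption.

```python
# RECEIVE counters copied from the ifstat output earlier in this thread.
IFSTAT_RECEIVE = """\
 Total frames:    56798k | Frames/second:    9233  | Total bytes:       178g
 Bytes/second:    28949k | Total errors:     1337  | Errors/minute:      13
 Total discards:      2  | Discards/minute:     0  | Multi/broadcast: 31503
 Non-primary u/c:     0  | CRC errors:          0  | Runt frames:        18
 Fragment:            0  | Long frames:      1319  | Alignment errors:    0
"""


def parse_counters(text):
    """Return {counter name: raw value string} for 'Name: value' cells split by '|'."""
    counters = {}
    for line in text.splitlines():
        for cell in line.split("|"):
            # Split on the last colon so the name keeps characters like '/'.
            name, sep, value = cell.rpartition(":")
            if sep:
                counters[name.strip()] = value.strip()
    return counters


rx = parse_counters(IFSTAT_RECEIVE)

# Frame-size errors: frames longer or shorter than the port accepts.
size_errors = int(rx["Long frames"]) + int(rx["Runt frames"])

print("Total errors:      ", rx["Total errors"])
print("Long + runt frames:", size_errors)
print("CRC errors:        ", rx["CRC errors"])
```

In this sample the long and runt frame counters add up to the Total errors value (1319 + 18 = 1337) while CRC errors stays at 0, which fits the explanation that the alert is driven by frame-size errors rather than real CRC errors.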

 

Our permanent fix is to set the MTU to 9000 on the filer's LAN ports that are connected to a switch with jumbo frames enabled.
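
To illustrate why the mismatch produces long frames, here is a minimal Python sketch based on standard Ethernet framing (14-byte header, 4-byte FCS, optional 4-byte VLAN tag). The function names are hypothetical, and the exact threshold the NIC or ONTAP uses to count a frame as "long" is an assumption.

```python
ETH_HEADER = 14  # destination MAC + source MAC + EtherType
ETH_FCS = 4      # frame check sequence
VLAN_TAG = 4     # optional 802.1Q tag


def max_frame_bytes(mtu, vlan_tagged=False):
    """Largest on-wire frame (without preamble) that fits the given MTU."""
    return mtu + ETH_HEADER + ETH_FCS + (VLAN_TAG if vlan_tagged else 0)


def is_long_frame(frame_bytes, port_mtu, vlan_tagged=False):
    """True if the frame is larger than the receiving port's MTU allows."""
    return frame_bytes > max_frame_bytes(port_mtu, vlan_tagged)


# A full-size jumbo frame as a switch with MTU 9000 could send it.
jumbo_frame = max_frame_bytes(9000)

print(is_long_frame(jumbo_frame, port_mtu=1500))  # port still at MTU 1500
print(is_long_frame(jumbo_frame, port_mtu=9000))  # port raised to MTU 9000
```

With the port at MTU 1500 the jumbo frame is oversized and would be counted as an error; with the port at MTU 9000 the same frame is accepted.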

 

Thanks, Gidi, for your feedback, which helped a lot in solving the issue.

 

Best Regards,

Klaus
