Active IQ Unified Manager Discussions

Brocade performance failing

sunilyadav
6,592 Views

Hi,

For the past few weeks, many of the Brocade switches across several of my environments have been failing performance collection. The error shown in the data source and in the log file is:

"Failed to get performance for switch x.x.x.x VF 120 Error indication in response: A no access error occurred.
Errindex: 0"

"Access Control List on SNMP may be configured and Acquisition Unit server's ip is not permitted by it, so it is not allowed to make SNMP requests."

Any quick help on this would be much appreciated.

5 REPLIES

ostiguy
6,585 Views

Have virtual fabrics been enabled for the first time on these switches?

The error message is basically stating that OCI is being denied when it tries to obtain performance statistics on ports in VF 120.

You will need to be using SNMPv3 to collect performance from these switches. It is impossible to obtain statistics on ports in non-default VFs via SNMP v2.

If you are using SNMPv3, I would recommend looking at which username the datasource uses and how that user is configured on the switch. It is possible that someone built an RBAC-style, least-privilege user account, and that VF 120 was introduced afterwards without anyone adding it to the list of VFs that your datasource's user account is permitted to access.

sunilyadav
6,561 Views

I checked with the customer, and they say that nothing has been changed on the switches.

BTW, on another switch this is the error. I am sure it has nothing to do with the SNMP version or user:

2017-07-10 13:18:09,614 ERROR [com.netapp.oci.platform.common.interfaces.session.SessionCache] Session Cache - Failed to communicate with the server ( PerformanceApiRemote) - unrecoverable error: Failed to store performance samples for dataSource: #181, type: port, key: 20:00:00:0D:EC:3A:99:C0cause: class com.datastax.driver.core.exceptions.WriteTimeoutException:Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)

 

java.lang.RuntimeException: Failed to store performance samples for dataSource: #181, type: port, key: 20:00:00:0D:EC:3A:99:C0cause: class com.datastax.driver.core.exceptions.WriteTimeoutException:Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)

ostiguy
6,556 Views

Those "Failed to store..." messages correlate strongly with Cassandra problems. They basically indicate that Acquisition did all of its work to obtain and process the data, but the Server told Acq that it was unable to store the data. The performance poll therefore functionally failed, as the data had nowhere to go.

There are some Cassandra problems that only impact 1-n datasources, whereas there can also be systemic Cassandra problems where all datasources' performance packages fail with similar insertion messages.
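
One rough way to tell the two cases apart is to count the store failures per data source in the server log. The following is a minimal Python sketch, not an OCI tool: the regular expression is based only on the log lines quoted earlier in this thread, and the function name and the sample text are hypothetical.

```python
import re
from collections import Counter

# Pattern taken from the log lines quoted in this thread.
# Assumption: other OCI versions may word this message differently.
STORE_FAILURE = re.compile(
    r"Failed to store performance samples for dataSource: #(\d+), type: (\w+)"
)


def count_store_failures(log_text):
    """Count log lines reporting a failed store, keyed by data source ID."""
    counts = Counter()
    for line in log_text.splitlines():
        match = STORE_FAILURE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample abbreviated from the two log lines quoted in this thread.
SAMPLE_LOG = (
    "2017-07-10 13:18:09,614 ERROR "
    "[com.netapp.oci.platform.common.interfaces.session.SessionCache] "
    "Session Cache - Failed to communicate with the server "
    "( PerformanceApiRemote) - unrecoverable error: "
    "Failed to store performance samples for dataSource: #181, "
    "type: port, key: 20:00:00:0D:EC:3A:99:C0\n"
    "java.lang.RuntimeException: Failed to store performance samples "
    "for dataSource: #181, type: port, key: 20:00:00:0D:EC:3A:99:C0\n"
)

if __name__ == "__main__":
    print(count_store_failures(SAMPLE_LOG))
```

If only one or a few data source IDs appear in the result, that leans toward the 1-n case; if most data sources appear, a systemic problem seems more likely.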

 

Systemic Cassandra problems tend to occur on systems that are undersized or that have virtual memory problems due to small, fixed-size Windows paging files. We strongly recommend that OCI servers have their paging files set to Windows-managed sizing.

It sounds like you may be experiencing one of the 1-n problems, but I'd like to see a bit more.

If this OCI instance is sending OCI ASUP, could you PM me the site name?

Failing that, the directory

C:\Program Files\SANscreen\jboss\server\onaro\log

has some cassandra-client.log files. If you zip them up and email them to me, I can take a look. However, there is a chance that the root cause, or the point at which the problem started, won't be captured, as these logs can roll over on large systems.
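
Collecting the cassandra-client.log files into a single archive can be scripted. This is a minimal Python sketch under stated assumptions: the directory is the one quoted above, the function name is hypothetical, and the glob pattern assumes that rolled-over logs keep the `cassandra-client.log` prefix, which may not match every installation.

```python
import zipfile
from pathlib import Path

# Directory quoted in this post; adjust if OCI is installed elsewhere.
LOG_DIR = Path(r"C:\Program Files\SANscreen\jboss\server\onaro\log")


def zip_cassandra_logs(log_dir, archive_path):
    """Zip files named cassandra-client.log* and return the names added.

    Assumption: rolled-over logs share the 'cassandra-client.log' prefix.
    """
    added = []
    with zipfile.ZipFile(
        archive_path, "w", compression=zipfile.ZIP_DEFLATED
    ) as archive:
        for log_file in sorted(Path(log_dir).glob("cassandra-client.log*")):
            if log_file.is_file():
                # Store only the file name, not the full Windows path.
                archive.write(log_file, arcname=log_file.name)
                added.append(log_file.name)
    return added
```

On the OCI server, calling `zip_cassandra_logs(LOG_DIR, "cassandra-client-logs.zip")` would write the archive to the current working directory and return the list of file names it included.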

 

Matt

sunilyadav
6,503 Views

Hi Matt,

Yes, ASUP is enabled. I will PM you the site name.

Thanks.

pippen23
6,355 Views

I am having the same issue, but only on one side of my Brocade fabric. Very odd, as no changes were made.
