Solved: Re: hw_assist error

SVHO · ‎2017-10-04

I am seeing several instances of the error below.

Error: bind failed to port 4444 on IP address xx.xx.xx.xx. Error 49. The IP was not on our list 192.0.x.x. I called the original installer of the NAS unit and he made some adjustments, but we are still getting a timeout error.

Do you guys have any ideas?

nas-01
              Partner : nas-02
     Hwassist Enabled : true
          Hwassist IP : 134.xxx.xxx.97
        Hwassist Port : 4444
       Monitor Status : active
      Inactive Reason : -
    Corrective Action : -
    Keep Alive Status : Error: did not receive hwassist keep alive alerts from partner.
nas-02
              Partner : nas-01
     Hwassist Enabled : true
          Hwassist IP : 134.xxx.xxx.98
        Hwassist Port : 4444
       Monitor Status : active
      Inactive Reason : -
    Corrective Action : -
    Keep Alive Status : Error: did not receive hwassist keep alive alerts from partner.
2 entries were displayed.

nas::*> storage failover hwassist test -node nas-01

Info: No response from partner(nas-02).Timed out.
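
For anyone who wants to check this output across several nodes, here is a minimal Python sketch that groups the "Key : Value" lines under each node name and lists the nodes whose Keep Alive Status reports an error. The function names `parse_hwassist_show` and `keepalive_errors` are made up for illustration, and the parsing assumes the layout shown above (unindented node name, indented fields); it is not an official tool or API.

```python
# Sample text copied from the hwassist status output in this thread
# (addresses stay masked exactly as posted).
SAMPLE = """\
nas-01
    Partner : nas-02
    Hwassist Enabled : true
    Hwassist IP : 134.xxx.xxx.97
    Hwassist Port : 4444
    Monitor Status : active
    Keep Alive Status : Error: did not receive hwassist keep alive alerts from partner.
nas-02
    Partner : nas-01
    Hwassist Enabled : true
    Hwassist IP : 134.xxx.xxx.98
    Hwassist Port : 4444
    Monitor Status : active
    Keep Alive Status : Error: did not receive hwassist keep alive alerts from partner.
2 entries were displayed.
"""


def parse_hwassist_show(text):
    """Group indented 'Key : Value' lines under the unindented node name above them."""
    nodes = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and the trailing "N entries were displayed." summary.
        if not stripped or stripped.endswith("entries were displayed."):
            continue
        if line[0].isspace() and ":" in line and current is not None:
            # Split at the first colon only, so "Error: ..." stays in the value.
            key, _, value = line.partition(":")
            nodes[current][key.strip()] = value.strip()
        else:
            current = stripped
            nodes[current] = {}
    return nodes


def keepalive_errors(nodes):
    """Return the node names whose Keep Alive Status starts with 'Error'."""
    return [
        name
        for name, fields in nodes.items()
        if fields.get("Keep Alive Status", "").startswith("Error")
    ]


parsed = parse_hwassist_show(SAMPLE)
for node in keepalive_errors(parsed):
    print(f"{node}: {parsed[node]['Keep Alive Status']}")
```

With the output posted above, both nodes are flagged, which matches the symptom that neither side receives keep-alive alerts from its partner.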

sgrant · ‎2017-10-05

Hello,

For hw_assist to work, the configured IP address needs to be up and available on the partner. Ideally this should be across the management network and the SP port. Therefore the SP needs to be configured and available; check this using the system service-processor show command. Assuming all is well, then:

For nas-01 ensure 134.xxx.xxx.98 is UP and reachable from nas-02.
Likewise for nas-02 ensure 134.xxx.xxx.97 is UP and reachable from nas-01.

If they are not, then you need to fix any issues with the SP first. If you wish to use a different IP address of a different port, then change the -hwassist-partner-ip parameter of the storage failover modify command.

Hope this helps,

Cheers,

Grant.

SVHO · ‎2017-10-05

Still no luck: the test with storage failover hwassist test -node xxx still times out. I read in some older posts that RLM has to be installed. How do I know whether it is installed and configured?

nas::> system service-processor show
                               IP           Firmware
Node          Type Status      Configured   Version   IP Address
------------- ---- ----------- ------------ --------- -------------------------
nas-01        SP   online      true         5.1       10.xxx.xxx.249
nas-02        SP   online      true         5.1       10.xxx.xxx.250
2 entries were displayed.

nas::> network ping -node nas-02 -destination 134.xxx.xxx.97
134.xxx.xxx.97 is alive

nas::> network ping -node nas-01 -destination 134.xxx.xxx.98
134.xxx.xxx.98 is alive

sgrant · ‎2017-10-06

The RLM and SP are both lights-out management ports. Older FAS systems used RLM, but all new models use the SP (Service Processor). From your output, your SP looks correctly configured.

We now need hw_assist on each node to talk to its partner across the SP IP addresses, so you need to run the following commands:

On nas-01 run:

storage failover modify -hwassist-partner-ip 10.xxx.xxx.250

On nas-02 run:

storage failover modify -hwassist-partner-ip 10.xxx.xxx.249

Hopefully this resolves the communication issue.

Cheers,

Grant.

SVHO · ‎2017-10-06

I am getting this error:

The IP address specified with "-hwassist-partner-ip" must be partner's
node management IP address. Press the TAB key after the parameter to
list the valid IP addresses.

Our management IPs are 134.xxx.xxx.97 and 134.xxx.xxx.99.

In general, I wonder whether the management IPs have to be able to communicate with the SP IPs.

A newbie question: when I ssh into nas-01 using its IP address, the prompt still shows "nas::>". Does this mean I am still at the cluster level?

I then ran this command:

system node run -node nas-01, which brings me to "nas-01>" (notice that at this level the commands are not the same as at the cluster level).

Thanks for helping...

andris · ‎2017-10-07

A few points...

1. You should have several management IP addresses: cluster_mgmt, node1_mgmt and node2_mgmt.

Q: What are the IPs for node1_mgmt and node2_mgmt?

2. HW Assist requires a communication path between each node's SP and its partner's node_mgmt IP address, i.e.

node 1 SP <-> node 2 node_mgmt LIF

node 2 SP <-> node 1 node_mgmt LIF

NOTE: This means the service processor's 10.x.x.x subnet and configured default gateway must be able to reach the 134.x.x.x subnet where your node_mgmt LIFs are. Ping the 10.x.x.x SP addresses to help confirm this.

3. The storage failover modify commands need to specify -node so that each node's partner node_mgmt LIF is configured correctly. This is a cluster shell command (i.e. run it at nas::>).

storage failover modify -node nas-01 -hwassist-partner-ip <node-02's node_mgmt IP>

storage failover modify -node nas-02 -hwassist-partner-ip <node-01's node_mgmt IP>

4. The cluster shell command system node run -node nas-01 places you into the node's "node shell". That's not necessary for this configuration exercise.
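
To make points 2 and 3 concrete, here is a small Python sketch. It lists which path each SP needs (SP to the partner's node_mgmt LIF), checks whether that path crosses subnets and therefore depends on the SP's default gateway, and prints the matching storage failover modify commands. The addresses below are hypothetical stand-ins (the real ones are masked in this thread), the /24 prefix length is an assumption, and the helper names `required_paths` and `modify_commands` are invented for illustration.

```python
import ipaddress

# Hypothetical example values: the thread masks the real addresses, and the
# /24 prefix length for the SP subnet is an assumption, not a stated fact.
NODES = {
    "nas-01": {"sp": "10.0.0.249", "mgmt": "198.51.100.97", "partner": "nas-02"},
    "nas-02": {"sp": "10.0.0.250", "mgmt": "198.51.100.98", "partner": "nas-01"},
}
SP_PREFIX = 24


def required_paths(nodes):
    """Return (node, sp_ip, partner, partner_mgmt_ip, needs_routing) per node."""
    paths = []
    for name, info in nodes.items():
        partner_mgmt = nodes[info["partner"]]["mgmt"]
        # strict=False lets us build the network from a host address.
        sp_net = ipaddress.ip_network(f"{info['sp']}/{SP_PREFIX}", strict=False)
        # If the partner's node_mgmt IP is outside the SP subnet, traffic has
        # to go through the SP's default gateway.
        needs_routing = ipaddress.ip_address(partner_mgmt) not in sp_net
        paths.append((name, info["sp"], info["partner"], partner_mgmt, needs_routing))
    return paths


def modify_commands(nodes):
    """Build one cluster shell command per node, pointing at the partner's node_mgmt IP."""
    commands = []
    for name, info in nodes.items():
        partner_mgmt = nodes[info["partner"]]["mgmt"]
        commands.append(
            f"storage failover modify -node {name} -hwassist-partner-ip {partner_mgmt}"
        )
    return commands


for node, sp_ip, partner, mgmt_ip, routed in required_paths(NODES):
    how = "via the SP default gateway" if routed else "on the same subnet"
    print(f"{node} SP {sp_ip} -> {partner} node_mgmt {mgmt_ip} ({how})")

for command in modify_commands(NODES):
    print(command)
```

With the SPs and node_mgmt LIFs on different subnets, as in this thread, every path is reported as routed, so a missing or wrong gateway on the SP side is the first thing to check.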

SVHO · ‎2017-10-13

2. HW Assist requires a communication path between each node's SP and its partner's node_mgmt IP address, i.e.

node 1 SP <-> node 2 node_mgmt LIF

node 2 SP <-> node 1 node_mgmt LIF

NOTE: This means the service processor's 10.x.x.x subnet and configured default gateway must be able to reach the 134.x.x.x subnet where your node_mgmt LIFs are. Ping the 10.x.x.x SP addresses to help confirm this.

This one resolved the issue.

Thanks!

Minh-Truong · ‎2021-11-08

Thank you for the detailed insight!

I observed a similar problem right after an ONTAP update. It seems to resolve itself within 10 to 30 minutes after the update finishes.

Abhijeet0007 · ‎2018-12-19

1. A newbie question: when I ssh into nas-01 using the IP address, it still shows "nas::>". Does this mean I am still at the cluster level?

A. Yes, you are at the cluster level. You can confirm this by running node show at the nas::> prompt, which lists the nodes in that cluster.

2. system node run -node nas-01 which brings me to "nas-01>" (notice at this level the commands are not the same at the cluster level).

A. Running nas::> system node run -node nas-01 takes you from cluster mode to the 7-Mode command line, where only a limited set of 7-Mode commands work.
