Problem with hw_assist settings

bosko_radivojevic

Hi all,

I have a new 3240EA system (two controllers in two different enclosures) and am having trouble configuring the hw_assist feature. It runs Data ONTAP 8.0.2 in 7-Mode. The problem shows up as follows:

NAS-THQ-1A> cf hw_assist test

cf hw_assist Error: No response from partner(NAS-THQ-1B), timed out.

NAS-THQ-1B> cf hw_assist test

cf hw_assist Error: No response from partner(NAS-THQ-1A), timed out.

NAS-THQ-1A> cf hw_assist status

Local Node(NAS-THQ-1A) Status:

        Active: NAS-THQ-1A monitoring alerts from partner(NAS-THQ-1B)

        port 4444 IP address x.y.z.121

        Missed keep alive alert from partner(NAS-THQ-1B).

                 No keep alive alert received from partner since last boot.

Partner Node(NAS-THQ-1B) Status:

        Active: NAS-THQ-1B monitoring alerts from partner(NAS-THQ-1A)

        port 4444 IP address x.y.z.122

NAS-THQ-1B> cf hw_assist status

Local Node(NAS-THQ-1B) Status:

        Active: NAS-THQ-1B monitoring alerts from partner(NAS-THQ-1A)

        port 4444 IP address x.y.z.122

        Missed keep alive alert from partner(NAS-THQ-1A).

                 No keep alive alert received from partner since last boot.

Partner Node(NAS-THQ-1A) Status:

        Active: NAS-THQ-1A monitoring alerts from partner(NAS-THQ-1B)

        port 4444 IP address x.y.z.121

But everything looks OK:

NAS-THQ-1A> ping x.y.z.122

x.y.z.122 is alive

NAS-THQ-1B> ping x.y.z.121

x.y.z.121 is alive

NAS-THQ-1A> ifconfig e0M

e0M: flags=0x2b4c867<UP,BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500

        inet x.y.z.121 netmask 0xfffffff8 broadcast x.y.z.127

        ether 00:a0:98:16:bb:84 (auto-100tx-fd-up) flowcontrol full

NAS-THQ-1B> ifconfig e0M

e0M: flags=0x2b4c867<UP,BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500

        inet x.y.z.122 netmask 0xfffffff8 broadcast x.y.z.127

        ether 00:a0:98:15:d5:6c (auto-100tx-fd-up) flowcontrol full
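As a side check on the two `ifconfig e0M` outputs above, here is a minimal Python sketch that confirms both addresses fall in the same subnet for the hex netmask shown. The `192.0.2.x` addresses are hypothetical stand-ins for the elided `x.y.z` prefix, and `same_subnet` is an illustrative helper, not an ONTAP command.

```python
import ipaddress

def same_subnet(addr_a, addr_b, hex_netmask):
    """Return (match, network) for two hosts that share one hex netmask."""
    # ifconfig prints the mask in hex, e.g. 0xfffffff8 (a /29).
    mask = ipaddress.IPv4Address(int(hex_netmask, 16))
    net_a = ipaddress.ip_network(f"{addr_a}/{mask}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{mask}", strict=False)
    return net_a == net_b, net_a

# Hypothetical addresses: 192.0.2.x stands in for the elided x.y.z prefix.
ok, net = same_subnet("192.0.2.121", "192.0.2.122", "0xfffffff8")
print(ok, net, net.broadcast_address)
```

With these stand-in values the two hosts share one /29 whose broadcast address ends in .127, which agrees with the broadcast field in the output above. This only shows that the e0M settings are consistent with each other; it says nothing about whether hw_assist alerts are actually being delivered.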

NAS-THQ-1A> options cf.hw_assist

cf.hw_assist.enable          on

cf.hw_assist.partner.address x.y.z.122

cf.hw_assist.partner.port    4444

NAS-THQ-1B> options cf.hw_assis

cf.hw_assist.enable          on

cf.hw_assist.partner.address x.y.z.121

cf.hw_assist.partner.port    4444

NAS-THQ-1A> cf status

Cluster enabled, NAS-THQ-1B is up.

Interconnect status: up.

NAS-THQ-1B> cf status

Cluster enabled, NAS-THQ-1A is up.

Interconnect status: up.

What is wrong? Thanks.

1 ACCEPTED SOLUTION

peter_lehmann

For cf hw_assist to work you need the RLM to be configured... Is it?

*snip*

Requirements for hardware-assisted takeover

The hardware-assisted takeover feature is available only on systems that support Remote LAN Modules (RLMs) and have the RLMs installed and set up. The remote management card provides remote platform management capabilities, including remote access, monitoring, troubleshooting, logging, and alerting features.

*snip*

Peter

bosko_radivojevic

Yup, that sounds reasonable. Thanks.

P.S. The instructions for setting up the RLM can be found at http://now.netapp.com/NOW/knowledge/docs/ontap/rel801/html/ontap/sysadmin/GUID-10633784-DEFF-47A9-891A-AA57C2A8FC68.html

RichardTodd

I thought I would add to this post to save people some time resolving hw_assist timeout issues with a Service Processor (SP):

First, check the SP speed and duplex: type 'sp status' and confirm that the SP has negotiated 100Mb / full duplex. If it has not, reconfigure the network switch ports that the SP connects to as auto / auto (speed / duplex).

Once this has been done, type 'sp status' again to confirm 100Mb / full duplex. If the output still shows 100Mb / half duplex, type 'sp reboot', then use 'sp status' to confirm that the reboot has completed and that speed / duplex is set correctly.

Another cause of timeout messages is an SP that has not been configured properly. One sign of this is an SP prompt without a hostname, i.e. 'SP>'. The SP prompt should be 'SP hostname>'.

To fix this issue, use the following commands:

sp status
options sp.setup off
sp setup (using info from sp status)
cf hw_assist test
cf hw_assist status
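
After rerunning `cf hw_assist status`, the output can be scanned for the keep-alive complaints seen in the transcripts at the top of this thread. The following is a minimal Python sketch that assumes the output has been captured as text; `keep_alive_problems` is a hypothetical helper name, and the marker strings are taken from the status output quoted in the original question.

```python
def keep_alive_problems(status_output):
    """Return the lines of `cf hw_assist status` output that report keep-alive trouble."""
    # Marker strings copied from the status output quoted earlier in the thread.
    markers = ("Missed keep alive alert", "No keep alive alert received")
    return [
        line.strip()
        for line in status_output.splitlines()
        if any(marker in line for marker in markers)
    ]

# Sample text copied from the NAS-THQ-1A transcript in the question.
sample = """Local Node(NAS-THQ-1A) Status:
        Active: NAS-THQ-1A monitoring alerts from partner(NAS-THQ-1B)
        port 4444 IP address x.y.z.121
        Missed keep alive alert from partner(NAS-THQ-1B).
                 No keep alive alert received from partner since last boot.
"""

problems = keep_alive_problems(sample)
for line in problems:
    print(line)
```

An empty list suggests the partner's keep-alive alerts are arriving; any returned lines mean the RLM or SP setup still needs attention.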

 

I hope this helps.
