Tech ONTAP Blogs

IBM MQ in AWS built on Amazon FSx for NetApp ONTAP

wstowe
NetApp

Overview

Every organization relies on a healthy stream of data to shape decisions about how best to serve its own needs and those of its customers. From the non-profit to the largest Fortune 500 enterprise, the ability to gather insights and react quickly hinges on the services and systems that deliver that information. Many customers rely on IBM MQ as a reliable and secure messaging platform that can handle the demands of the largest enterprise.

 

As more enterprise applications migrate to Amazon Web Services, crafting a solution from AWS infrastructure building blocks that meets the requirements and scale of IBM MQ can be a daunting task. Fortunately, IBM has provided wide-ranging guidance on how to architect your solution. In this document we’ll cover the joint efforts between IBM and NetApp on the certification of Amazon FSx for NetApp ONTAP, a fully managed AWS storage service that powers some of today's most mission-critical workloads.

 

 

AWS Blog - MQ on AWS FSx for NetApp ONTAP


 

AWS has published a blog showcasing IBM MQ running on EC2 compute and FSx for NetApp ONTAP storage. The article can be found here.

In the article, the solution uses FSx ONTAP in a multi-AZ configuration, with IBM MQ deployed on EC2 instances in an active/passive topology. This provides redundancy because the MQ service can be moved between nodes across both AZs as needed. The backing storage, FSx for NetApp ONTAP, is in turn available across both AZs and accessible in a primary/standby manner.

 

 

Certification Process

IBM’s documentation for testing shared file systems for IBM MQ can be found here. The guide puts the storage through tests to validate compatibility with IBM MQ multi-instance queue managers.

 

IBM publishes its shared storage certifications.

 

NetApp has successfully completed all testing outlined in the certification process. The architecture was built on Linux-based EC2 instances, with FSx ONTAP volumes accessed over the NFSv4.1 protocol.
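
As a rough illustration of the storage side of that setup, the sketch below prints an NFSv4.1 mount command for review rather than running it. The SVM endpoint name, the junction path, and the `hard` option are assumptions made for illustration; the mount options used during certification are not listed in this post, so check IBM's and NetApp's guidance before relying on them.

```shell
# Hypothetical values: replace with your SVM's NFS DNS name and volume junction path.
SVM_NFS_ENDPOINT="svm-EXAMPLE.fs-EXAMPLE.fsx.us-east-1.amazonaws.com"
JUNCTION_PATH="/ibmmq_data"   # assumed junction path of the FSx ONTAP volume
MOUNT_POINT="/ibmmq_data"     # directory used in the tests in this post

# Print (rather than run) the mount command so it can be reviewed first.
# $1 = NFS endpoint, $2 = junction path, $3 = local mount point
build_mount_cmd() {
    printf 'mount -t nfs -o nfsvers=4.1,hard %s:%s %s\n' "$1" "$2" "$3"
}

build_mount_cmd "$SVM_NFS_ENDPOINT" "$JUNCTION_PATH" "$MOUNT_POINT"
```

The same mount would need to be present on both EC2 instances so that each queue manager instance sees the shared directory at the same path.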

 

 

 

 

Certification Results

Test | Test Result
Run amqmfsck with each command-line option | Success
Create a multi-instance queue manager | Success
Run the amqsfhac integrity checker sample program during failovers | Success
Delete the test multi-instance queue manager | Success

 

 

Run amqmfsck with each command-line option

Step 1: Testing basic file system behavior on Linux and UNIX

Host 1:

Command: amqmfsck /ibmmq_data

Output: The tests on the directory completed successfully.

Outcome: Success

Host 2:

Command: amqmfsck /ibmmq_data

Output: The tests on the directory completed successfully.

Outcome: Success

 

Step 2: Testing concurrent writes on Linux and UNIX

Host 1:

Command: amqmfsck -c /ibmmq_data

Output: Start a second copy of this program with the same parameters on another server.

Writing to test file. This will normally complete within about 60 seconds.

..................................................

The tests on the directory completed successfully.

 

Outcome: Success

Host 2:

Command: amqmfsck -c /ibmmq_data

Output: Writing to test file. This will normally complete within about 60 seconds.

...........................................................

The tests on the directory completed successfully.

 

 

Outcome: Success

 

Step 3: Test waiting for and releasing file locks (the -w option). On both machines at the same time run:

Host 1:

Command: amqmfsck -w /ibmmq_data

Output: File lock acquired.

Start a second copy of this program with the same parameters on another server.

Press Enter or terminate this process to release the lock.

File lock released.

The tests on the directory completed successfully.

 

Outcome: Success

Host 2:

Command: amqmfsck -w /ibmmq_data

Output: Waiting for the file lock.

Waiting for the file lock.

File lock acquired.

Press Enter or terminate this process to release the lock.

File lock released.

The tests on the directory completed successfully.

 

Outcome: Success
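
The three amqmfsck runs above all end with the same success line, so a small wrapper can turn each run into a pass/fail result. This is only a sketch: `amqmfsck_passed` and `run_check` are hypothetical helper names, and the match string is simply the success message recorded in the outputs above.

```shell
# Succeeds if amqmfsck output (read from stdin) contains the success line.
amqmfsck_passed() {
    grep -q 'The tests on the directory completed successfully\.'
}

# Hypothetical helper: run one amqmfsck variant and report the outcome.
# $1 = option ("" for the basic test, -c, or -w), $2 = shared directory
run_check() {
    opt=$1
    dir=$2
    # $opt is intentionally unquoted so an empty option expands to nothing.
    if amqmfsck $opt "$dir" 2>&1 | amqmfsck_passed; then
        echo "amqmfsck ${opt:-basic}: Success"
    else
        echo "amqmfsck ${opt:-basic}: Failed"
        return 1
    fi
}
```

For example, `run_check "" /ibmmq_data` covers the basic test, while `run_check -c /ibmmq_data` and `run_check -w /ibmmq_data` still need to be started on both hosts at the same time, as in Steps 2 and 3.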

 

Create a multi-instance queue manager

Step 1: Create the multi-instance queue manager

Host 1:

Command: crtmqm -ld /ibmmq_data/shared/test1/logs -md /ibmmq_data/shared/test1/data TESTQM

Output: IBM MQ queue manager 'TESTQM' created.

Directory '/ibmmq_data/shared/test1/data/TESTQM' created.

The queue manager is associated with installation 'Installation1'.

Creating or replacing default objects for queue manager 'TESTQM'.

Default objects statistics : 83 created. 0 replaced. 0 failed.

Completing setup.

Setup completed.

 

Outcome: Success

Step 2: Add the queue manager to the second host

Command: addmqinf -s QueueManager -v Name=TESTQM -v Directory=TESTQM -v Prefix=/var/mqm -v DataPath=/ibmmq_data/shared/test1/data/TESTQM

Output: IBM MQ configuration information added.

 

Outcome: Success

Step 1: On both machines, start a queue manager instance. On IBM i use the STRMQM command:

Command: strmqm -x TESTQM

Host 1 Output: IBM MQ queue manager 'TESTQM' starting.

The queue manager is associated with installation 'Installation1'.

6 log records accessed on queue manager 'TESTQM' during the log replay phase.

Log replay for queue manager 'TESTQM' complete.

Transaction manager state recovered for queue manager 'TESTQM'.

Plain text communication is enabled.

IBM MQ queue manager 'TESTQM' started using V9.3.0.0.

 

Host 2 Output: IBM MQ queue manager 'TESTQM' starting.

The queue manager is associated with installation 'Installation1'.

Plain text communication is enabled.

A standby instance of queue manager 'TESTQM' has been started. The active

instance is running elsewhere.

 

Outcome: Success

 

Step 2: On both machines, display the queue manager status to see which is the active instance and which is the standby. On IBM i use the WRKMQM command:

Command: dspmq -x

Host 1 Output: QMNAME(TESTQM)    STATUS(Running)

Host 2 Output: QMNAME(TESTQM)   STATUS(Running as standby)

 

Outcome: Success
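
When this check is repeated many times, it can help to pull the STATUS value out of the dspmq output instead of reading it by eye. The sketch below assumes the output format shown above; `qm_status` and `instance_role` are hypothetical helper names, not part of IBM MQ.

```shell
# Print the STATUS(...) value from dspmq output read on stdin.
qm_status() {
    sed -n 's/.*STATUS(\([^)]*\)).*/\1/p'
}

# Map a status string to a short role name for reporting.
instance_role() {
    case "$1" in
        "Running")            echo "active" ;;
        "Running as standby") echo "standby" ;;
        *)                    echo "other" ;;
    esac
}
```

On each host, something like `instance_role "$(dspmq -x | qm_status)"` would then report the role of that host's instance, assuming a single queue manager is defined there.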

 

Step 3: On the machine with the active instance, define a listener which the queue manager can automatically start after fail over. On IBM i use the CRTMQMLSR command:

Command: echo "DEFINE LISTENER(PORT.1414) TRPTYPE(TCP) PORT(1414) CONTROL(QMGR)" | runmqsc TESTQM

Output:      1 : DEFINE LISTENER(PORT.1414) TRPTYPE(TCP) PORT(1414) CONTROL(QMGR)

AMQ8626I: IBM MQ listener created.

One MQSC command read.

No commands have a syntax error.

All valid MQSC commands were processed.

 

Outcome:  Success

Step 4: On the machine with the active instance, end the queue manager so it fails over to the standby instance. On IBM i use the ENDMQM command:

Command: endmqm -is TESTQM

Output: IBM MQ queue manager 'TESTQM' ending.

IBM MQ queue manager 'TESTQM' ended, permitting switchover to a standby

instance.

 

Outcome: Success

 

 

 

Step 5: On both machines, display the queue manager status to confirm that the active instance shut down cleanly and the standby instance became active:

Command: dspmq -x

Host 1 Output: QMNAME(TESTQM)  STATUS(Running elsewhere) INSTANCE(ip-10-0-9-216) MODE(Active)

Host 2 Output: QMNAME(TESTQM) STATUS(Running) INSTANCE(ip-10-0-9-216) MODE(Active)

 

Outcome: Success

 

Step 6: Restart the queue manager as a standby instance on the machine where you ended it:

Command: strmqm -x TESTQM

Output: IBM MQ queue manager 'TESTQM' starting.

The queue manager is associated with installation 'Installation1'.

Plain text communication is enabled.

A standby instance of queue manager 'TESTQM' has been started. The active

instance is running elsewhere.

 

Outcome: Success

 

 

Test: On the machine with the active instance, make sure the listener is running. On IBM i use the WRKMQMLSR command:

Command: echo "DISPLAY LSSTATUS(PORT.1414)" | runmqsc TESTQM

Output:      1 : DISPLAY LSSTATUS(PORT.1414)

AMQ8631I: Display listener status details.

   LISTENER(PORT.1414)                     STATUS(RUNNING)

   PID(18853)                              STARTDA(2023-06-22)

   STARTTI(21.46.54)                       DESCR( )

   TRPTYPE(TCP)                            CONTROL(QMGR)

   IPADDR(*)                               PORT(1414)

   BACKLOG(100)

One MQSC command read.

No commands have a syntax error.

All valid MQSC commands were processed.

 

Outcome: Success
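
The listener check can be scripted in the same spirit. The sketch below assumes that LISTENER(...) and STATUS(...) appear on the same output line, as they do in the output above; `listener_running` is a hypothetical helper name.

```shell
# Succeeds if runmqsc output on stdin shows the named listener as RUNNING.
# $1 = listener name, for example PORT.1414
listener_running() {
    awk -v name="$1" '
        index($0, "LISTENER(" name ")") && index($0, "STATUS(RUNNING)") { found = 1 }
        END { exit !found }
    '
}
```

A possible use is `echo "DISPLAY LSSTATUS(PORT.1414)" | runmqsc TESTQM | listener_running PORT.1414`, which returns a non-zero exit status if the listener is not reported as running.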

 

 

On the machine with the active instance, create the local queues used by the amqsfhac integrity checker program, using any names you wish. On IBM i use the CRTMQMQ command.

Step 1: Create the local queues on Linux and UNIX

Command: echo "DEFINE QLOCAL(MQMI.TEST) MAXDEPTH(10000)" | runmqsc LOCALQM

Output:       1 : DEFINE QLOCAL(MQMI.TEST) MAXDEPTH(10000)

AMQ8006I: IBM MQ queue created.

One MQSC command read.

No commands have a syntax error.

All valid MQSC commands were processed.

 

 

Command:  echo "DEFINE QLOCAL(MQMI.SIDE)" | runmqsc LOCALQM

Output:      1 : DEFINE QLOCAL(MQMI.SIDE)

AMQ8006I: IBM MQ queue created.

One MQSC command read.

No commands have a syntax error.

All valid MQSC commands were processed.

 

 

Outcome: Success

 

Run the amqsfhac integrity checker sample program during failovers

 

Test failover (by issuing the endmqm command to the active queue manager): While the amqsfhac program is running, make the queue manager fail over. The first time through these steps, fail the queue manager over normally by running the endmqm command shown below on the active machine:

Command from Admin Host: ./amqsfhac TESTQM MQMI.TEST MQMI.SIDE 1000 200 1

Output from Admin Host:

 

MQGET msglen = 5000 strlen(buffer) = 5000

MQGET side tranid=397

MQPUT side tranid=398

MQGET side tranid=398

Sample AMQSFHAC end

 

Command from active queue manager: endmqm -is TESTQM

Output from Admin Host: The admin host stopped on “Put Message 612” when the active queue manager was stopped, then continued after a brief pause when the standby queue manager took over.

 

 

Outcome: Success
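
If the amqsfhac output is captured to a file, two small checks can confirm that the run reached its end marker and show the last transaction seen. This is a sketch based only on the output lines recorded above; `amqsfhac_completed` and `last_tranid` are hypothetical helper names.

```shell
# Succeeds if amqsfhac output on stdin contains the sample's completion line.
amqsfhac_completed() {
    grep -q '^Sample AMQSFHAC end'
}

# Print the highest "tranid=N" value seen in amqsfhac output on stdin.
last_tranid() {
    sed -n 's/.*tranid=\([0-9][0-9]*\).*/\1/p' | sort -n | tail -n 1
}
```

For instance, after `./amqsfhac TESTQM MQMI.TEST MQMI.SIDE 1000 200 1 > run.log`, running `amqsfhac_completed < run.log` indicates whether the program finished despite the failover.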

 

Step 3: After the fail over is done and the amqsfhac program has completed, check the status of the queue manager on both machines to confirm it is active on only one:

Command: dspmq -x

Output on host with queue manager 1: QMNAME(TESTQM)  STATUS(Running)

Output on host with queue manager 2: QMNAME(TESTQM)  STATUS(Running elsewhere)

Outcome: Success
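
The "active on only one" condition can also be checked mechanically by counting how many hosts report STATUS(Running). The sketch below assumes the dspmq output from both hosts has been collected into one stream; `count_active` is a hypothetical helper name.

```shell
# Count instances reporting exactly STATUS(Running) in dspmq output on stdin.
# "Running elsewhere" and "Running as standby" do not match this pattern.
count_active() {
    grep -c 'STATUS(Running)'
}
```

A result of 1 matches the expected outcome of this step; 0 or 2 would point to a failed or split failover.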

 

 

Step 4: Display the queues used by the amqsfhac program on the active machine to confirm they are both empty. On IBM i use the WRKMQMQ command.

Command: echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM

Output:          1 : DISPLAY QLOCAL(MQMI.*) CURDEPTH

AMQ8409I: Display Queue details.

   QUEUE(MQMI.SIDE)                        TYPE(QLOCAL)

   CURDEPTH(0)

AMQ8409I: Display Queue details.

   QUEUE(MQMI.SITE)                        TYPE(QLOCAL)

   CURDEPTH(0)

AMQ8409I: Display Queue details.

   QUEUE(MQMI.TEST)                        TYPE(QLOCAL)

   CURDEPTH(0)

One MQSC command read.

No commands have a syntax error.

 

Outcome: Success
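
The empty-queue check comes down to confirming that every CURDEPTH value in the output is 0. A minimal sketch, assuming the runmqsc output format shown above (`queues_empty` is a hypothetical helper name):

```shell
# Succeeds only if at least one CURDEPTH(n) line is present on stdin
# and every reported depth is 0.
queues_empty() {
    awk '
        match($0, /CURDEPTH\([0-9]+\)/) {
            seen = 1
            # "CURDEPTH(" is 9 characters; strip it and the closing parenthesis.
            depth = substr($0, RSTART + 9, RLENGTH - 10)
            if (depth + 0 != 0) nonempty = 1
        }
        END {
            if (seen && !nonempty) exit 0
            exit 1
        }
    '
}
```

It could be used as `echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM | queues_empty`, which fails if any matching queue still holds messages.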

 

Step 5: Go to the queue manager errors directory and confirm that there are three error logs, and that all have the correct permissions, owner and group. Review the recent error log messages from the failover and look for any unexpected issues. Depending on why the queue manager failed over, some errors are to be expected.

Command: ls -l /shared/data/TESTQM/errors/AMQ*.LOG

Output: no unexpected errors found

 

Outcome: Success
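
The "three error logs" part of this step is easy to verify with a short helper. The sketch below reuses the AMQ*.LOG pattern from the command above; `count_error_logs` and `file_owner` are hypothetical helper names, and the expected owner and group (typically mqm on Linux) are left for the reader to compare.

```shell
# Count the AMQ*.LOG files in an errors directory.
# $1 = queue manager errors directory
count_error_logs() {
    n=0
    for f in "$1"/AMQ*.LOG; do
        if [ -f "$f" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

# Print "owner:group" for a file, taken from the ls -l listing.
file_owner() {
    ls -ld "$1" | awk '{ print $3 ":" $4 }'
}
```

For example, `count_error_logs /shared/data/TESTQM/errors` should print 3 for a healthy queue manager, and `file_owner` can be run on each log to compare its owner and group across both hosts.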

 

Step 6: Look at the recent FFSTs (AMQ*.FDC files) in the /var/mqm/errors directory (/QIBM/UserData/mqm/errors on IBM i) on both systems. On the active system, FFSTs showing Probe Id AO074001 and KN673000 are normal when a standby instance takes over the queue manager. Depending on why the queue manager failed over, other FFSTs might be expected.

Output: no unexpected errors found

Outcome: Success
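
Reviewing FFSTs can be narrowed down by listing only the Probe Ids that are not among the two expected ones. The sketch below assumes each FDC file carries a header line of the form "Probe Id :- XXnnnnnn", which is an assumption about the file layout and should be checked against real FDC files; `unexpected_probe_ids` is a hypothetical helper name.

```shell
# List Probe Ids found in AMQ*.FDC files under $1, excluding the two that
# are described above as normal during a standby takeover.
unexpected_probe_ids() {
    cat "$1"/AMQ*.FDC 2>/dev/null |
        sed -n 's/.*Probe Id *:- *\([A-Z0-9]*\).*/\1/p' |
        grep -v -e '^AO074001$' -e '^KN673000$' |
        sort -u
}
```

Running `unexpected_probe_ids /var/mqm/errors` on both systems prints nothing when only the expected FFSTs are present; anything it does print still needs a human review, since other FFSTs might be expected depending on how the failover was triggered.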

 

 

Step 7: Restart the queue manager as a standby instance on the machine where it is not running:

Command:  strmqm -x TESTQM

Output: A standby instance of queue manager 'TESTQM' has been started. The active

instance is running elsewhere.

 

Outcome: Success

 

 

 

Test failover (by restarting the EC2 instance hosting the active queue manager): While the amqsfhac program is running, make the queue manager fail over, this time by restarting the EC2 instance that hosts the active instance:

Command from Admin Host: ./amqsfhac TESTQM MQMI.TEST MQMI.SIDE 1000 200 1

Output from Admin Host: 

MQGET msglen = 5000 strlen(buffer) = 5000

MQGET side tranid=397

MQPUT side tranid=398

MQGET side tranid=398

Sample AMQSFHAC end

 

Command from active queue manager: issued a restart of the VM hosting the active queue manager from the AWS console

Output from Admin Host: Paused on put message 512, then continued after a brief pause when the standby instance took over.

 

Outcome: Success

 

Step 3: After the fail over is done and the amqsfhac program has completed, check the status of the queue manager on both machines to confirm it is active on only one:

Command: dspmq -x

Output on host with queue manager 1: QMNAME(TESTQM)      STATUS(Running elsewhere)

 

Output on host with queue manager 2: QMNAME(TESTQM)      STATUS(Running)

Outcome: Success

 

 

Step 4: Display the queues used by the amqsfhac program on the active machine to confirm they are both empty. On IBM i use the WRKMQMQ command.

Command: echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM

Output:         1 : DISPLAY QLOCAL(MQMI.*) CURDEPTH

AMQ8409I: Display Queue details.

   QUEUE(MQMI.SIDE)                        TYPE(QLOCAL)

   CURDEPTH(0)

AMQ8409I: Display Queue details.

   QUEUE(MQMI.SITE)                        TYPE(QLOCAL)

   CURDEPTH(0)

AMQ8409I: Display Queue details.

   QUEUE(MQMI.TEST)                        TYPE(QLOCAL)

   CURDEPTH(0)

One MQSC command read.

No commands have a syntax error.

All valid MQSC commands were processed.

 

 

Outcome: Success

 

 

 

Step 5: Go to the queue manager errors directory and confirm that there are three error logs, and that all have the correct permissions, owner and group. Review the recent error log messages from the failover and look for any unexpected issues. Depending on why the queue manager failed over, some errors are to be expected.

Command: ls -l /shared/data/TESTQM/errors/AMQ*.LOG

Output: no unexpected errors were found

 

Outcome: Success

 

Step 6: Look at the recent FFSTs (AMQ*.FDC files) in the /var/mqm/errors directory (/QIBM/UserData/mqm/errors on IBM i) on both systems. On the active system, FFSTs showing Probe Id AO074001 and KN673000 are normal when a standby instance takes over the queue manager. Depending on why the queue manager failed over, other FFSTs might be expected.

Output: no unexpected errors were found

Outcome: Success

 

 

Step 7: Restart the queue manager as a standby instance on the machine where it is not running:

Command:  strmqm -x TESTQM

Output: IBM MQ queue manager 'TESTQM' starting.

The queue manager is associated with installation 'Installation1'.

Plain text communication is enabled.

A standby instance of queue manager 'TESTQM' has been started. The active

instance is running elsewhere.

 

Outcome: Success

 

Test failover (by issuing halt in the Ubuntu OS): While the amqsfhac program is running, make the queue manager fail over, this time by halting the operating system on the machine that hosts the active instance:

Command from Admin Host: ./amqsfhac TESTQM MQMI.TEST MQMI.SIDE 1000 200 1

Output from Admin Host: MQGET msglen = 5000 strlen(buffer) = 5000

MQGET side tranid=397

MQPUT side tranid=398

MQGET side tranid=398

Sample AMQSFHAC end

 

Command from Ubuntu CLI on VM hosting active queue manager: halt --force

Output from VM hosting active queue manager:

 

Output from Admin Host: Paused on put message 416, then continued after a brief pause when the standby instance took over.

 

Outcome: Success

Step 3: After the fail over is done and the amqsfhac program has completed, check the status of the queue manager on both machines to confirm it is active on only one:

Command: dspmq -x

Output on host with queue manager 1: QMNAME(TESTQM)  STATUS(Running)

 

Output on host with queue manager 2: QMNAME(TESTQM)  STATUS(Running elsewhere)

Outcome: Success

 

 

Step 4: Display the queues used by the amqsfhac program on the active machine to confirm they are both empty. On IBM i use the WRKMQMQ command.

Command: echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM

Output:     5724-H72 (C) Copyright IBM Corp. 1994, 2022.

Starting MQSC for queue manager TESTQM.

 

 

     1 : DISPLAY QLOCAL(MQMI.*) CURDEPTH

AMQ8409I: Display Queue details.

   QUEUE(MQMI.SIDE)                        TYPE(QLOCAL)

   CURDEPTH(0)

AMQ8409I: Display Queue details.

   QUEUE(MQMI.SITE)                        TYPE(QLOCAL)

   CURDEPTH(0)

AMQ8409I: Display Queue details.

   QUEUE(MQMI.TEST)                        TYPE(QLOCAL)

   CURDEPTH(0)

One MQSC command read.

No commands have a syntax error.

All valid MQSC commands were processed.

 

 

Outcome: Success

 

 

Step 5: Go to the queue manager errors directory and confirm that there are three error logs, and that all have the correct permissions, owner and group. Review the recent error log messages from the failover and look for any unexpected issues. Depending on why the queue manager failed over, some errors are to be expected.

Command: ls -l /shared/data/TESTQM/errors/AMQ*.LOG

Output: No unexpected errors were found

 

Outcome: Success

 

Step 6: Look at the recent FFSTs (AMQ*.FDC files) in the /var/mqm/errors directory (/QIBM/UserData/mqm/errors on IBM i) on both systems. On the active system, FFSTs showing Probe Id AO074001 and KN673000 are normal when a standby instance takes over the queue manager. Depending on why the queue manager failed over, other FFSTs might be expected.

Output: No unexpected errors were found

Outcome: Success

 

 

 

Step 7: Restart the queue manager as a standby instance on the machine where it is not running:

Command:  strmqm -x TESTQM

Output: IBM MQ queue manager 'TESTQM' starting.

The queue manager is associated with installation 'Installation1'.

Plain text communication is enabled.

A standby instance of queue manager 'TESTQM' has been started. The active

instance is running elsewhere.

 

 

Outcome: Success

 

Test failover (by bringing down the network interface on the EC2 instance of the active queue manager): While the amqsfhac program is running, make the queue manager fail over, this time by bringing down the network interface on the machine that hosts the active instance:

Command from Admin Host: ./amqsfhac TESTQM MQMI.TEST MQMI.SIDE 1000 200 1

Output from Admin Host:

 

MQGET msglen = 5000 strlen(buffer) = 5000

MQGET side tranid=397

MQPUT side tranid=398

MQGET side tranid=398

Sample AMQSFHAC end

 

Command from the CLI on the Ubuntu host: ifconfig ens5 down --force

Output from Admin Host: Paused on put message 614, then continued after a brief pause when the standby instance took over.

 

Outcome: Success

 

 

Step 3: After the fail over is done and the amqsfhac program has completed, check the status of the queue manager on both machines to confirm it is active on only one:

Command: dspmq -x

Output on host with queue manager 1: QMNAME(TESTQM)  STATUS(Running elsewhere)

Output on host with queue manager 2: QMNAME(TESTQM)  STATUS(Running)

Outcome: Success

 

 

 

 

 

 

 

Step 4: Display the queues used by the amqsfhac program on the active machine to confirm they are both empty. On IBM i use the WRKMQMQ command.

Command: echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM

Output:   

 

5724-H72 (C) Copyright IBM Corp. 1994, 2022.

Starting MQSC for queue manager TESTQM.

 

 

     1 : DISPLAY QLOCAL(MQMI.*) CURDEPTH

AMQ8409I: Display Queue details.

   QUEUE(MQMI.SIDE)                        TYPE(QLOCAL)

   CURDEPTH(0)

AMQ8409I: Display Queue details.

   QUEUE(MQMI.SITE)                        TYPE(QLOCAL)

   CURDEPTH(0)

AMQ8409I: Display Queue details.

   QUEUE(MQMI.TEST)                        TYPE(QLOCAL)

   CURDEPTH(0)

One MQSC command read.

No commands have a syntax error.

All valid MQSC commands were processed.

 

Outcome: Success

 

Step 5: Go to the queue manager errors directory and confirm that there are three error logs, and that all have the correct permissions, owner and group. Review the recent error log messages from the failover and look for any unexpected issues. Depending on why the queue manager failed over, some errors are to be expected.

Command: ls -l /shared/data/TESTQM/errors/AMQ*.LOG

Output: No unexpected errors were found

 

Outcome: Success

 

Step 6: Look at the recent FFSTs (AMQ*.FDC files) in the /var/mqm/errors directory (/QIBM/UserData/mqm/errors on IBM i) on both systems. On the active system, FFSTs showing Probe Id AO074001 and KN673000 are normal when a standby instance takes over the queue manager. Depending on why the queue manager failed over, other FFSTs might be expected.

Output: No unexpected errors were found

Outcome: Success

 

 

Step 7: Restart the queue manager as a standby instance on the machine where it is not running:

Command:  strmqm -x TESTQM

Output: IBM MQ queue manager 'TESTQM' starting.

The queue manager is associated with installation 'Installation1'.

Plain text communication is enabled.

A standby instance of queue manager 'TESTQM' has been started. The active

instance is running elsewhere.

Outcome: Success

 

 

 

Closing

With Amazon FSx for NetApp ONTAP validated as a shared file system for IBM MQ, customers can deploy MQ with confidence, knowing that FSx for NetApp ONTAP meets MQ's requirements while offering the convenience of a fully managed storage platform. In addition, FSx ONTAP provides the flexibility to dynamically scale performance to meet application needs in real time, along with feature-rich data protection that makes it possible to back up, restore, and replicate MQ data rapidly.
