Tech ONTAP Blogs
All organizations rely on a healthy stream of data to shape decisions on how best to serve their own needs and those of their customers. From the non-profit to the largest Fortune 500 enterprise, the ability to gather insights and react quickly hinges on the services and systems that deliver the information. For that reason, many customers rely on IBM MQ as a reliable and secure messaging platform that can handle the demands of the largest enterprise.
As more enterprise applications migrate to Amazon Web Services, crafting a solution from AWS infrastructure building blocks that can meet the requirements and scale needed to power IBM MQ can be a daunting task. Fortunately, IBM provides wide-ranging guidance on how to architect your solution. In this document we’ll cover the joint efforts between IBM and NetApp on the certification of Amazon FSx for NetApp ONTAP, a fully managed AWS storage solution that powers some of the most mission-critical workloads today.
AWS has published a blog showcasing IBM MQ running on EC2 compute and FSx for NetApp ONTAP storage. The article can be found here.
In the article, the solution leverages FSx ONTAP in a multi-AZ configuration with IBM MQ deployed on EC2 instances in an active/passive topology. This provides redundancy in that the MQ service can be moved between nodes across both AZs as needed. The backing storage, FSx for NetApp ONTAP, is in turn available across both sites and accessible in a primary/standby manner.
IBM’s documentation for testing shared filesystems for IBM MQ can be found here. The guide puts the storage through tests to validate compatibility with IBM MQ multi-instance queue managers.
IBM also publishes its shared storage certifications.
NetApp has successfully completed all testing outlined in the certification process. The architecture was built on Linux-based EC2 instances and FSx ONTAP volumes accessed over the NFSv4.1 protocol.
Test | Test Result |
| Success |
| Success |
Run the amqsfhac integrity checker sample program during failovers | Success |
| Success |
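For readers reproducing the setup, the NFSv4.1 access described above can be sketched as a mount command. This is only an illustration: the endpoint name `svm.example.com`, the junction path, and the `hard` mount option are assumptions, not values taken from the certification run. The helper only prints the command so it can be reviewed before it is run with root privileges.

```shell
# Build (but do not run) an NFSv4.1 mount command for the shared MQ directory.
# Arguments: <nfs-endpoint> <junction-path> <mount-point>
# The endpoint and junction path are placeholders, not values from the tests.
build_mount_cmd() {
    endpoint=$1
    junction=$2
    mountpoint=$3
    # nfsvers=4.1 matches the protocol used in the tests; "hard" is an assumed option.
    printf 'mount -t nfs -o nfsvers=4.1,hard %s:%s %s\n' "$endpoint" "$junction" "$mountpoint"
}
```

For example, `build_mount_cmd svm.example.com /ibmmq_data /ibmmq_data` prints a command that mounts the volume at the `/ibmmq_data` path used throughout the tests.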
Test: Run amqmfsck with each command-line option
Step 1: Testing basic file system behavior on Linux and UNIX
Host 1:
Command: amqmfsck /ibmmq_data
Output: The tests on the directory completed successfully.
Outcome: Success
Host 2:
Command: amqmfsck /ibmmq_data
Output: The tests on the directory completed successfully.
Outcome: Success
Step 2: Testing concurrent writes on Linux and UNIX
Host 1:
Command: amqmfsck -c /ibmmq_data
Output: Start a second copy of this program with the same parameters on another server.
Writing to test file. This will normally complete within about 60 seconds.
..................................................
The tests on the directory completed successfully.
Outcome: Success
Host 2:
Command: amqmfsck -c /ibmmq_data
Output: Writing to test file. This will normally complete within about 60 seconds.
...........................................................
The tests on the directory completed successfully.
Outcome: Success
Step 3: Test waiting for and releasing file locks (the -w option). On both machines at the same time run:
Host 1:
Command: amqmfsck -w /ibmmq_data
Output: File lock acquired.
Start a second copy of this program with the same parameters on another server.
Press Enter or terminate this process to release the lock.
File lock released.
The tests on the directory completed successfully.
Outcome: Success
Host 2:
Command: amqmfsck -w /ibmmq_data
Output: Waiting for the file lock.
Waiting for the file lock.
File lock acquired.
Press Enter or terminate this process to release the lock.
File lock released.
The tests on the directory completed successfully.
Outcome: Success
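The basic amqmfsck check above lends itself to a small wrapper that looks for the success line in the output. The following is a minimal sketch, assuming amqmfsck is on the PATH; the function names are made up for illustration, and the -c and -w options are left out because they need a second copy running on another host.

```shell
# Succeeds (exit 0) when amqmfsck output on stdin contains the success line.
amqmfsck_passed() {
    grep -q 'The tests on the directory completed successfully\.'
}

# Run the basic check against a directory and report the outcome.
# Assumes amqmfsck is on PATH; -c and -w need a second host and are not covered.
check_shared_dir() {
    dir=$1
    if amqmfsck "$dir" 2>&1 | amqmfsck_passed; then
        echo "Outcome: Success"
    else
        echo "Outcome: Failure"
        return 1
    fi
}
```

Running `check_shared_dir /ibmmq_data` on each host would give the same pass/fail record as Step 1.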
Step 1: Create a multi-instance queue manager
Host 1:
Command: crtmqm -ld /ibmmq_data/shared/test1/logs -md /ibmmq_data/shared/test1/data TESTQM
Output: IBM MQ queue manager 'TESTQM' created.
Directory '/ibmmq_data/shared/test1/data/TESTQM' created.
The queue manager is associated with installation 'Installation1'.
Creating or replacing default objects for queue manager 'TESTQM'.
Default objects statistics : 83 created. 0 replaced. 0 failed.
Completing setup.
Setup completed.
Outcome: Success
Step 2: Add the queue manager to the second host
Command: addmqinf -s QueueManager -v Name=TESTQM -v Directory=TESTQM -v Prefix=/var/mqm -v DataPath=/ibmmq_data/shared/test1/data/TESTQM
Output: IBM MQ configuration information added.
Outcome: Success
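The addmqinf arguments in Step 2 follow directly from the queue manager name and the shared data path, so they can be generated rather than typed by hand. The helper below is a hypothetical sketch that prints the command for review. It assumes the directory name equals the queue manager name, which holds for a simple name like TESTQM. On the first host, `dspmqinf -o command TESTQM` should print the equivalent command as well.

```shell
# Print the addmqinf command for the second host.
# Arguments: <queue-manager> <shared-data-root>
# Assumes Directory equals the queue manager name (true for simple names)
# and Prefix=/var/mqm, as in the step above.
build_addmqinf_cmd() {
    qm=$1
    data_root=$2
    printf 'addmqinf -s QueueManager -v Name=%s -v Directory=%s -v Prefix=/var/mqm -v DataPath=%s/%s\n' \
        "$qm" "$qm" "$data_root" "$qm"
}
```

Calling `build_addmqinf_cmd TESTQM /ibmmq_data/shared/test1/data` reproduces the command used in this test.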
Step 1: On both machines, start a queue manager instance. On IBM i use the STRMQM command:
Command: strmqm -x TESTQM
Host 1 Output: IBM MQ queue manager 'TESTQM' starting.
The queue manager is associated with installation 'Installation1'.
6 log records accessed on queue manager 'TESTQM' during the log replay phase.
Log replay for queue manager 'TESTQM' complete.
Transaction manager state recovered for queue manager 'TESTQM'.
Plain text communication is enabled.
IBM MQ queue manager 'TESTQM' started using V9.3.0.0.
Host 2 Output: IBM MQ queue manager 'TESTQM' starting.
The queue manager is associated with installation 'Installation1'.
Plain text communication is enabled.
A standby instance of queue manager 'TESTQM' has been started. The active
instance is running elsewhere.
Outcome: Success
Step 2: On both machines, display the queue manager status to see which is the active instance and which is the standby. On IBM i use the WRKMQM command:
Command: dspmq -x
Host 1 Output: QMNAME(TESTQM) STATUS(Running)
Host 2 Output: QMNAME(TESTQM) STATUS(Running as standby)
Outcome: Success
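When the status check is scripted, the STATUS value can be pulled out of the dspmq output instead of being read by eye. A minimal sketch follows; the function name is made up, and it assumes a simple queue manager name with no characters that are special to sed.

```shell
# Print the STATUS(...) value for one queue manager from `dspmq -x` output on stdin.
# Assumes the queue manager name has no characters that are special to sed.
qm_status() {
    qm=$1
    sed -n "s/^QMNAME($qm) *STATUS(\([^)]*\)).*/\1/p"
}
```

For example, `dspmq -x | qm_status TESTQM` would print `Running` on the active host and `Running as standby` on the other.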
Step 3: On the machine with the active instance, define a listener which the queue manager can automatically start after fail over. On IBM i use the CRTMQMLSR command:
Command: echo "DEFINE LISTENER(PORT.1414) TRPTYPE(TCP) PORT(1414) CONTROL(QMGR)" | runmqsc TESTQM
Output: 1 : DEFINE LISTENER(PORT.1414) TRPTYPE(TCP) PORT(1414) CONTROL(QMGR)
AMQ8626I: IBM MQ listener created.
One MQSC command read.
No commands have a syntax error.
All valid MQSC commands were processed.
Outcome: Success
Step 4: On the machine with the active instance, end the queue manager so it fails over to the standby instance. On IBM i use the ENDMQM command:
Command: endmqm -is TESTQM
Output: IBM MQ queue manager 'TESTQM' ending.
IBM MQ queue manager 'TESTQM' ended, permitting switchover to a standby
instance.
Outcome: Success
Step 5: On both machines, display the queue manager status to confirm that the active instance shut down cleanly and the standby instance became active:
Command: dspmq -x
Host 1 Output: QMNAME(TESTQM) STATUS(Running elsewhere) INSTANCE(ip-10-0-9-216) MODE(Active)
Host 2 Output: QMNAME(TESTQM) STATUS(Running) INSTANCE(ip-10-0-9-216) MODE(Active)
Outcome: Success
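Because a switchover takes a moment, a scripted version of this check would normally poll rather than look once. The sketch below is one way to do that on the standby host. The function name, retry count, and interval are illustrative assumptions, not part of the IBM procedure.

```shell
# Poll `dspmq -x -m <qm>` until the local instance reports STATUS(Running).
# Arguments: <queue-manager> [max-attempts] [seconds-between-attempts]
# Returns 0 once the instance is active, 1 if it never becomes active.
wait_for_active() {
    qm=$1
    attempts=${2:-30}
    interval=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        status=$(dspmq -x -m "$qm" | sed -n 's/.*STATUS(\([^)]*\)).*/\1/p' | head -n 1)
        if [ "$status" = "Running" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}
```

Run on the former standby host after the endmqm command, `wait_for_active TESTQM` returns once that instance has taken over, or fails after the retries are exhausted.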
Step 6: Restart the queue manager as a standby instance on the machine where you ended it:
Command: strmqm -x TESTQM
Output: IBM MQ queue manager 'TESTQM' starting.
The queue manager is associated with installation 'Installation1'.
Plain text communication is enabled.
A standby instance of queue manager 'TESTQM' has been started. The active
instance is running elsewhere.
Outcome: Success
Test: On the machine with the active instance, make sure the listener is running. On IBM i use the WRKMQMLSR command:
Command: echo "DISPLAY LSSTATUS(PORT.1414)" | runmqsc TESTQM
Output: 1 : DISPLAY LSSTATUS(PORT.1414)
AMQ8631I: Display listener status details.
LISTENER(PORT.1414) STATUS(RUNNING)
PID(18853) STARTDA(2023-06-22)
STARTTI(21.46.54) DESCR( )
TRPTYPE(TCP) CONTROL(QMGR)
IPADDR(*) PORT(1414)
BACKLOG(100)
One MQSC command read.
No commands have a syntax error.
All valid MQSC commands were processed.
Outcome: Success
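The listener check can be automated in the same spirit by extracting individual attributes from the MQSC output. This is a minimal sketch with a made-up helper name. It requires the attribute name to sit at the start of a line or after whitespace, so that the echoed `DISPLAY LSSTATUS(...)` command line is not mistaken for the `STATUS(...)` attribute.

```shell
# Print the value of one attribute (for example STATUS or PORT) from MQSC
# display output on stdin. Only the first occurrence is printed.
mqsc_attr() {
    attr=$1
    sed -n \
        -e "s/.*[[:space:]]$attr(\([^)]*\)).*/\1/p" \
        -e "s/^$attr(\([^)]*\)).*/\1/p" | head -n 1
}
```

For example, `echo "DISPLAY LSSTATUS(PORT.1414)" | runmqsc TESTQM | mqsc_attr STATUS` should print `RUNNING` when the listener is up.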
On the machine with the active instance, create the local queues used by the amqsfhac integrity checker program, using any names you wish. On IBM i use the CRTMQMQ command.
Step 1: Create the local queues on Linux and UNIX
Command: echo "DEFINE QLOCAL(MQMI.TEST) MAXDEPTH(10000)" | runmqsc LOCALQM
Output: 1 : DEFINE QLOCAL(MQMI.TEST) MAXDEPTH(10000)
AMQ8006I: IBM MQ queue created.
One MQSC command read.
No commands have a syntax error.
All valid MQSC commands were processed.
Command: echo "DEFINE QLOCAL(MQMI.SIDE)" | runmqsc LOCALQM
Output: 1 : DEFINE QLOCAL(MQMI.SIDE)
AMQ8006I: IBM MQ queue created.
One MQSC command read.
No commands have a syntax error.
All valid MQSC commands were processed.
Outcome: Success
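The two DEFINE commands above can also be sent to runmqsc in a single pass. The sketch below keeps the definitions in one hypothetical helper so the same text can be replayed on another queue manager; the queue names are the ones used in this test.

```shell
# Emit the MQSC definitions for the two amqsfhac test queues.
amqsfhac_queue_defs() {
    cat <<'EOF'
DEFINE QLOCAL(MQMI.TEST) MAXDEPTH(10000)
DEFINE QLOCAL(MQMI.SIDE)
EOF
}
```

Piping the output into runmqsc for the queue manager under test, for example `amqsfhac_queue_defs | runmqsc TESTQM`, creates both queues in one MQSC session.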
TEST FAILOVER (by issuing the end command to the active queue manager): While the amqsfhac program is running, make the queue manager fail over. The first time through these steps, fail the queue manager over normally by running this command on the active machine:
Command from Admin Host: ./amqsfhac TESTQM MQMI.TEST MQMI.SIDE 1000 200 1
Output from Admin Host:
MQGET msglen = 5000 strlen(buffer) = 5000
MQGET side tranid=397
MQPUT side tranid=398
MQGET side tranid=398
Sample AMQSFHAC end
Command from active queue manager: endmqm -is TESTQM
Output from Admin Host: The admin host stopped on “Put Message 612” when the active queue manager was stopped, then continued after a brief pause when the standby queue manager took over.
Outcome: Success
Step 3: After the fail over is done and the amqsfhac program has completed, check the status of the queue manager on both machines to confirm it is active on only one:
Command: dspmq -x
Output on host with queue manager 1: QMNAME(TESTQM) STATUS(Running)
Output on host with queue manager 2: QMNAME(TESTQM) STATUS(Running elsewhere)
Outcome: Success
Step 4: Display the queues used by the amqsfhac program on the active machine to confirm they are both empty. On IBM i use the WRKMQMQ command
Command: echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM
Output: 1 : DISPLAY QLOCAL(MQMI.*) CURDEPTH
AMQ8409I: Display Queue details.
QUEUE(MQMI.SIDE) TYPE(QLOCAL)
CURDEPTH(0)
AMQ8409I: Display Queue details.
QUEUE(MQMI.SITE) TYPE(QLOCAL)
CURDEPTH(0)
AMQ8409I: Display Queue details.
QUEUE(MQMI.TEST) TYPE(QLOCAL)
CURDEPTH(0)
One MQSC command read.
No commands have a syntax error.
Outcome: Success
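The "all queues are empty" condition is easy to check mechanically from the CURDEPTH values. The following is a small sketch using awk; the function name is made up. It treats output with no CURDEPTH attribute at all as a failure, so that an error from runmqsc is not mistaken for empty queues.

```shell
# Exit 0 only if the MQSC output on stdin contains at least one CURDEPTH(n)
# attribute and every one of them is 0.
all_queues_empty() {
    awk '
        match($0, /CURDEPTH\([0-9]+\)/) {
            seen++
            # "CURDEPTH(" is 9 characters, so the digits start at RSTART + 9.
            depth = substr($0, RSTART + 9, RLENGTH - 10)
            if (depth + 0 != 0) nonempty++
        }
        END { exit ((seen > 0 && nonempty == 0) ? 0 : 1) }
    '
}
```

For example, `echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM | all_queues_empty && echo "Outcome: Success"` prints the outcome only when every matching queue has a depth of zero.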
Step 5: Go to the queue manager errors directory and confirm that there are three error logs, and that all have the correct permissions, owner and group. Review the recent error log messages from the failover and look for any unexpected issues. Depending on why the queue manager failed over, some errors are to be expected.
Command: ls -l /shared/data/TESTQM/errors/AMQ*.LOG
Output: No unexpected errors were found
Outcome: Success
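The "three error logs" part of this step can be counted with a short helper. The sketch below only counts files matching the same AMQ*.LOG pattern as the ls command above; the function name is made up, and permissions, owner, and group still need to be reviewed from the `ls -l` listing.

```shell
# Print how many AMQ*.LOG files exist in a queue manager errors directory.
count_error_logs() {
    dir=$1
    count=0
    for f in "$dir"/AMQ*.LOG; do
        # When nothing matches, the pattern stays literal and is skipped here.
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

In this environment, `count_error_logs /shared/data/TESTQM/errors` would be expected to print 3.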
Step 6: Look at the recent FFSTs (AMQ*.FDC files) in the /var/mqm/errors directory (/QIBM/UserData/mqm/errors on IBM i) on both systems. On the active system, FFSTs showing Probe Id AO074001 and KN673000 are normal when a standby instance takes over the queue manager. Depending on why the queue manager failed over, other FFSTs might be expected.
Output: No unexpected errors were found
Outcome: Success
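Reviewing the FFSTs can be narrowed down by listing only the probe IDs that are not in the expected set named above. This is a rough sketch. It assumes each FDC file header carries a line of the form `Probe Id :- XXnnnnnn`, and the function name is made up for illustration.

```shell
# Print the probe IDs found in the given FDC files that are NOT in the
# expected takeover set (AO074001, KN673000). Prints nothing if all are expected.
# Assumes the FDC header has a line of the form "Probe Id :- XXnnnnnn".
unexpected_probe_ids() {
    grep -h 'Probe Id' "$@" 2>/dev/null |
        sed 's/.*:- *\([A-Z0-9]*\).*/\1/' |
        sort -u |
        grep -v -e '^AO074001$' -e '^KN673000$' || true
}
```

For example, `unexpected_probe_ids /var/mqm/errors/AMQ*.FDC` lists any probe IDs that deserve a closer look. Empty output corresponds to the "no unexpected errors" result recorded here.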
Step 7: Restart the queue manager as a standby instance on the machine where it is not running:
Command: strmqm -x TESTQM
Output: A standby instance of queue manager 'TESTQM' has been started. The active
instance is running elsewhere.
Outcome: Success
TEST FAILOVER (by restarting the EC2 instance hosting the active queue manager): While the amqsfhac program is running, make the queue manager fail over. In this pass, the failover is forced by restarting the EC2 instance that hosts the active queue manager:
Command from Admin Host: ./amqsfhac TESTQM MQMI.TEST MQMI.SIDE 1000 200 1
Output from Admin Host:
MQGET msglen = 5000 strlen(buffer) = 5000
MQGET side tranid=397
MQPUT side tranid=398
MQGET side tranid=398
Sample AMQSFHAC end
Command from active queue manager: Issued a restart of the VM hosting the active queue manager from the AWS console
Output from Admin Host: Paused on put message 512, then continued after a brief pause when the standby instance took over.
Outcome: Success
Step 3: After the fail over is done and the amqsfhac program has completed, check the status of the queue manager on both machines to confirm it is active on only one:
Command: dspmq -x
Output on host with queue manager 1: QMNAME(TESTQM) STATUS(Running elsewhere)
Output on host with queue manager 2: QMNAME(TESTQM) STATUS(Running)
Outcome: Success
Step 4: Display the queues used by the amqsfhac program on the active machine to confirm they are both empty. On IBM i use the WRKMQMQ command
Command: echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM
Output: 1 : DISPLAY QLOCAL(MQMI.*) CURDEPTH
AMQ8409I: Display Queue details.
QUEUE(MQMI.SIDE) TYPE(QLOCAL)
CURDEPTH(0)
AMQ8409I: Display Queue details.
QUEUE(MQMI.SITE) TYPE(QLOCAL)
CURDEPTH(0)
AMQ8409I: Display Queue details.
QUEUE(MQMI.TEST) TYPE(QLOCAL)
CURDEPTH(0)
One MQSC command read.
No commands have a syntax error.
All valid MQSC commands were processed.
Outcome: Success
Step 5: Go to the queue manager errors directory and confirm that there are three error logs, and that all have the correct permissions, owner and group. Review the recent error log messages from the failover and look for any unexpected issues. Depending on why the queue manager failed over, some errors are to be expected.
Command: ls -l /shared/data/TESTQM/errors/AMQ*.LOG
Output: No unexpected errors were found
Outcome: Success
Step 6: Look at the recent FFSTs (AMQ*.FDC files) in the /var/mqm/errors directory (/QIBM/UserData/mqm/errors on IBM i) on both systems. On the active system, FFSTs showing Probe Id AO074001 and KN673000 are normal when a standby instance takes over the queue manager. Depending on why the queue manager failed over, other FFSTs might be expected.
Output: No unexpected errors were found
Outcome: Success
Step 7: Restart the queue manager as a standby instance on the machine where it is not running:
Command: strmqm -x TESTQM
Output: IBM MQ queue manager 'TESTQM' starting.
The queue manager is associated with installation 'Installation1'.
Plain text communication is enabled.
A standby instance of queue manager 'TESTQM' has been started. The active
instance is running elsewhere.
Outcome: Success
TEST FAILOVER (by issuing halt in the Ubuntu OS): While the amqsfhac program is running, make the queue manager fail over. In this pass, the failover is forced by halting the Ubuntu OS on the VM that hosts the active queue manager:
Command from Admin Host: ./amqsfhac TESTQM MQMI.TEST MQMI.SIDE 1000 200 1
Output from Admin Host: MQGET msglen = 5000 strlen(buffer) = 5000
MQGET side tranid=397
MQPUT side tranid=398
MQGET side tranid=398
Sample AMQSFHAC end
Command from Ubuntu CLI on VM hosting active queue manager: halt --force
Output from VM hosting active queue manager:
Output from Admin Host: Paused on put message 416, then continued after a brief pause when the standby instance took over.
Outcome: Success
Step 3: After the fail over is done and the amqsfhac program has completed, check the status of the queue manager on both machines to confirm it is active on only one:
Command: dspmq -x
Output on host with queue manager 1: QMNAME(TESTQM) STATUS(Running)
Output on host with queue manager 2: QMNAME(TESTQM) STATUS(Running elsewhere)
Outcome: Success
Step 4: Display the queues used by the amqsfhac program on the active machine to confirm they are both empty. On IBM i use the WRKMQMQ command
Command: echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM
Output: 5724-H72 (C) Copyright IBM Corp. 1994, 2022.
Starting MQSC for queue manager TESTQM.
1 : DISPLAY QLOCAL(MQMI.*) CURDEPTH
AMQ8409I: Display Queue details.
QUEUE(MQMI.SIDE) TYPE(QLOCAL)
CURDEPTH(0)
AMQ8409I: Display Queue details.
QUEUE(MQMI.SITE) TYPE(QLOCAL)
CURDEPTH(0)
AMQ8409I: Display Queue details.
QUEUE(MQMI.TEST) TYPE(QLOCAL)
CURDEPTH(0)
One MQSC command read.
No commands have a syntax error.
All valid MQSC commands were processed.
Outcome: Success
Step 5: Go to the queue manager errors directory and confirm that there are three error logs, and that all have the correct permissions, owner and group. Review the recent error log messages from the failover and look for any unexpected issues. Depending on why the queue manager failed over, some errors are to be expected.
Command: ls -l /shared/data/TESTQM/errors/AMQ*.LOG
Output: No unexpected errors were found
Outcome: Success
Step 6: Look at the recent FFSTs (AMQ*.FDC files) in the /var/mqm/errors directory (/QIBM/UserData/mqm/errors on IBM i) on both systems. On the active system, FFSTs showing Probe Id AO074001 and KN673000 are normal when a standby instance takes over the queue manager. Depending on why the queue manager failed over, other FFSTs might be expected.
Output: No unexpected errors were found
Outcome: Success
Step 7: Restart the queue manager as a standby instance on the machine where it is not running:
Command: strmqm -x TESTQM
Output: IBM MQ queue manager 'TESTQM' starting.
The queue manager is associated with installation 'Installation1'.
Plain text communication is enabled.
A standby instance of queue manager 'TESTQM' has been started. The active
instance is running elsewhere.
Outcome: Success
TEST FAILOVER (by bringing down the network interface on the EC2 instance of the active queue manager): While the amqsfhac program is running, make the queue manager fail over. In this pass, the failover is forced by bringing down the network interface on the EC2 instance that hosts the active queue manager:
Command from Admin Host: ./amqsfhac TESTQM MQMI.TEST MQMI.SIDE 1000 200 1
Output from Admin Host:
MQGET msglen = 5000 strlen(buffer) = 5000
MQGET side tranid=397
MQPUT side tranid=398
MQGET side tranid=398
Sample AMQSFHAC end
Command from the CLI on the Ubuntu host: ifconfig ens5 down --force
Output from Admin Host: Paused on put message 614, then continued after a brief pause when the standby instance took over.
Outcome: Success
Step 3: After the fail over is done and the amqsfhac program has completed, check the status of the queue manager on both machines to confirm it is active on only one:
Command: dspmq -x
Output on host with queue manager 1: QMNAME(TESTQM) STATUS(Running elsewhere)
Output on host with queue manager 2: QMNAME(TESTQM) STATUS(Running)
Outcome: Success
Step 4: Display the queues used by the amqsfhac program on the active machine to confirm they are both empty. On IBM i use the WRKMQMQ command
Command: echo "DISPLAY QLOCAL(MQMI.*) CURDEPTH" | runmqsc TESTQM
Output:
5724-H72 (C) Copyright IBM Corp. 1994, 2022.
Starting MQSC for queue manager TESTQM.
1 : DISPLAY QLOCAL(MQMI.*) CURDEPTH
AMQ8409I: Display Queue details.
QUEUE(MQMI.SIDE) TYPE(QLOCAL)
CURDEPTH(0)
AMQ8409I: Display Queue details.
QUEUE(MQMI.SITE) TYPE(QLOCAL)
CURDEPTH(0)
AMQ8409I: Display Queue details.
QUEUE(MQMI.TEST) TYPE(QLOCAL)
CURDEPTH(0)
One MQSC command read.
No commands have a syntax error.
All valid MQSC commands were processed.
Outcome: Success
Step 5: Go to the queue manager errors directory and confirm that there are three error logs, and that all have the correct permissions, owner and group. Review the recent error log messages from the failover and look for any unexpected issues. Depending on why the queue manager failed over, some errors are to be expected.
Command: ls -l /shared/data/TESTQM/errors/AMQ*.LOG
Output: No unexpected errors were found
Outcome: Success
Step 6: Look at the recent FFSTs (AMQ*.FDC files) in the /var/mqm/errors directory (/QIBM/UserData/mqm/errors on IBM i) on both systems. On the active system, FFSTs showing Probe Id AO074001 and KN673000 are normal when a standby instance takes over the queue manager. Depending on why the queue manager failed over, other FFSTs might be expected.
Output: No unexpected errors were found
Outcome: Success
Step 7: Restart the queue manager as a standby instance on the machine where it is not running:
Command: strmqm -x TESTQM
Output: IBM MQ queue manager 'TESTQM' starting.
The queue manager is associated with installation 'Installation1'.
Plain text communication is enabled.
A standby instance of queue manager 'TESTQM' has been started. The active
instance is running elsewhere.
Outcome: Success
With the validation of Amazon FSx for NetApp ONTAP as a shared file system for IBM MQ, customers can now deploy MQ with confidence, knowing that FSx for NetApp ONTAP meets the requirements for MQ with the convenience of a fully managed storage platform. In addition, FSx ONTAP offers the flexibility to dynamically scale performance to meet application needs in real time, along with feature-rich data protection that makes it possible to back up, restore, and replicate MQ data quickly.