Cloud Volumes ONTAP

ONTAP Cloud for AWS - High Availability Instance fails to bootstrap

raghavbijjula

Hello,

 

I was trying to set up ONTAP Cloud in the AWS Mumbai region (ap-south-1). The instance keeps rebooting because it is unable to fetch the EC2 instance data during bootstrap.

 

The following is the system log from the AWS Console.

http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.36059 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.49781 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.59794 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.52927 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.21739 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.18924 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.45564 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.23992 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.12557 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.48417 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.47910 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.11409 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.58067 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.52471 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.25558 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.53427 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.21719 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.14468 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.17155 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.41693 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.10622 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.40449 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.29401 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.55538 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.38954 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.48825 169.254.169.254.http TIME_WAIT -1 ANY Active UNIX domain sockets Address Type Recv-Q Send-Q Inode Conn Refs Nextref Addr fffffe0063238630 stream 0 0 fffffe00630613c0 0 0 0 /var/run/rpcbind.sock fffffe0063238528 dgram 0 0 0 fffffe00630e4528 0 0 fffffe00630e0108 dgram 0 0 0 fffffe00630e0210 0 0 fffffe00630e9000 dgram 0 0 
fffffe006303f780 0 0 0 /var/run/control fffffe00630e5948 dgram 0 0 fffffe0063008000 0 0 0 /var/run/log fffffe00630e4528 dgram 0 0 fffffe0008e68780 0 fffffe0063238528 0 /var/run/logpriv fffffe0063092b58 dgram 0 0 fffffe00630391e0 0 0 0 /var/run/mlog_ems fffffe00630e0210 dgram 0 0 fffffe0063087960 0 fffffe00630e0108 0 /var/run/mlog
cat /var/db/dhclient.leases.e0a
lease {
  interface "e0a";
  fixed-address 10.62.48.42;
  option subnet-mask 255.255.254.0;
  option routers 10.62.48.1;
  option domain-name-servers 10.62.48.2;
  option host-name "ip-10-62-48-42";
  option domain-name "tr-tax-nonprod.aws-int.thomsonreuters.com";
  option broadcast-address 10.62.49.255;
  option dhcp-lease-time 3600;
  option dhcp-message-type 5;
  option dhcp-server-identifier 10.62.48.1;
  renew 2 2017/11/21 04:49:33;
  rebind 2 2017/11/21 05:12:03;
  expire 2 2017/11/21 05:19:33;
}
lease {
  interface "e0a";
  fixed-address 10.62.48.42;
  option subnet-mask 255.255.254.0;
  option routers 10.62.48.1;
  option domain-name-servers 10.62.48.2;
  option host-name "ip-10-62-48-42";
  option domain-name "tr-tax-nonprod.aws-int.thomsonreuters.com";
  option broadcast-address 10.62.49.255;
  option dhcp-lease-time 3600;
  option dhcp-message-type 5;
  option dhcp-server-identifier 10.62.48.1;
  renew 2 2017/11/21 04:50:11;
  rebind 2 2017/11/21 05:12:41;
  expire 2 2017/11/21 05:20:11;
}
ERROR: Failure detected while running pre-ONTAP initialization script.
ERROR: Data ONTAP startup failed - initiating reboot.
[Nov 21 04:20:13]: 0x801c04200: 0: NOTICE: RpcConnectionCache: markFailure: markFailure: New Quarantine for localhost:536873474(spmd)/1[tcp]{fffffffd|ffffffff} with timeout(20.000s) Reason:: RPC: Port mapper failure - RPC: Unable to send

[Nov 21 04:20:13]: 0x801c04200: 0: NOTICE: RpcConnectionCache: markFailure: markFailure: New Quarantine for localhost:536873474(spmd)/1[tcp]{fffffffd|ffffffff} with timeout(20.000s) Reason:: RPC: Port mapper failure - RPC: Unable to send

[Nov 21 04:20:13]: 0x801c04200: 0: NOTICE: RpcConnectionCache: markFailure: markFailure: New Quarantine for localhost:536873474(spmd)/1[tcp]{fffffffd|ffffffff} with timeout(20.000s) Reason:: RPC: Port mapper failure - RPC: Unable to send

[Nov 21 04:20:13]: 0x801c04200: 0: NOTICE: RpcConnectionCache: markFailure: markFailure: New Quarantine for localhost:536873474(spmd)/1[tcp]{fffffffd|ffffffff} with timeout(20.000s) Reason:: RPC: Port mapper failure - RPC: Unable to send

[Nov 21 04:20:13]: 0x801c04200: 0: NOTICE: RpcConnectionCache: markFailure: markFailure: New Quarantine for localhost:536873474(spmd)/1[tcp]{fffffffd|ffffffff} with timeout(20.000s) Reason:: RPC: Port mapper failure - RPC: Unable to send

[Nov 21 04:20:13]: 0x801c04200: 0: NOTICE: RpcConnectionCache: markFailure: markFailure: New Quarantine for localhost:536873474(spmd)/1[tcp]{fffffffd|ffffffff} with timeout(20.000s) Reason:: RPC: Port mapper failure - RPC: Unable to send

[Nov 21 04:20:13]: 0x801c04200: 0: NOTICE: RpcConnectionCache: markFailure: markFailure: New Quarantine for localhost:536873474(spmd)/1[tcp]{fffffffd|ffffffff} with timeout(20.000s) Reason:: RPC: Port mapper failure - RPC: Unable to send

[Nov 21 04:20:13]: 0x801c04200: 0: NOTICE: RpcConnectionCache: markFailure: markFailure: New Quarantine for localhost:536873474(spmd)/1[tcp]{fffffffd|ffffffff} with timeout(20.000s) Reason:: RPC: Port mapper failure - RPC: Unable to send

.
Terminated
Uptime: 17s
-\|/-\|/-\|/

Booting...               
-\|/-\|/x86_64/freebsd/image1/kernel data=0xd0c878+0x61cf48 syms=[0x8+0x696f0+0x8+0x4d7ed]
-\|/-\|/x86_64/freebsd/image1/platform.ko text=0x2771c0 data=0x60220+0x84b50 syms=[0x8+0x1b348+0x8+0x15d57]
-\|/-\|/-\x86_64/freebsd/image1/xenhvm.ko size 0x3ca80 at 0x196f000
|/-\|/-\|/-\NetApp Data ONTAP 9.2RC1D1
 at device/vif/0 feature-sg feature-gso-tcp4
Copyright (C) 1992-2017 NetApp.
All rights reserved.
cryptomod_fips: Executing Crypto FIPS Self Tests.
cryptomod_fips: Crypto FIPS self-test: 'CPU COMPATIBILITY' passed.
cryptomod_fips: Crypto FIPS self-test: 'AES-128 ECB, AES-256 ECB' passed.
cryptomod_fips: Crypto FIPS self-test: 'AES-128 CBC, AES-256 CBC' passed.
cryptomod_fips: Crypto FIPS self-test: 'CTR_DRBG' passed.
cryptomod_fips: Crypto FIPS self-test: 'SHA1, SHA256, SHA512' passed.
cryptomod_fips: Crypto FIPS self-test: 'HMAC-SHA1, HMAC-SHA256, HMAC-SHA512' passed.
cryptomod_fips: Crypto FIPS self-test: 'PBKDF2' passed.
cryptomod_fips: Crypto FIPS self-test: 'AES-XTS 128, AES-XTS 256' passed.
cryptomod_fips: Crypto FIPS self-test: 'Self-integrity' passed.
Starting DHCP client on e0a
Unable to 'wget' EC2 Instance user-data.
ifconfig e0a
e0a: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
  options=503<RXCSUM,TXCSUM,TSO4,LRO>
  ether 02:be:0b:18:cc:de
  inet 10.62.48.42 netmask 0xfffffe00 broadcast 10.62.49.255 NODEMGMTLIF Vserver ID: -1
  media: Ethernet manual
  status: active
netstat Active Internet connections Proto Recv-Q Send-Q Local Address Foreign Address (state) VCTX Role tcp4 0 0 10.62.48.42.18041 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.60798 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.22545 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.54881 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.43512 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.18245 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.61150 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.58667 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.13322 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.61280 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.49549 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.34464 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.37473 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.52339 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.38618 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.45164 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.24141 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.40178 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.18313 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.61349 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.55879 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.16710 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.55038 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.47773 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.35454 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.32965 169.254.169.254.http TIME_WAIT -1 ANY tcp4 0 0 10.62.48.42.17480 169.254.169.254.http TIME_WAIT -1 ANY Active UNIX domain sockets Address Type Recv-Q Send-Q Inode Conn Refs Nextref Addr fffffe004f31a630 stream 0 0 fffffe004f323000 0 0 0 
/var/run/rpcbind.sock fffffe004f33ed68 dgram 0 0 0 fffffe004f240108 0 0 fffffe004f241420 dgram 0 0 0 fffffe004f240210 0 0 fffffe004f247000 dgram 0 0 fffffe004f1645a0 0 0 0 /var/run/control fffffe004f245948 dgram 0 0 fffffe004f18a1e0 0 0 0 /var/run/log fffffe004f240108 dgram 0 0 fffffe004f1bb1e0 0 fffffe004f33ed68 0 /var/run/logpriv fffffe004f241528 dgram 0 0 fffffe004f1525a0 0 0 0 /var/run/mlog_ems fffffe004f240210 dgram 0 0 fffffe004f1bb3c0 0 fffffe004f241420 0 /var/run/mlog
cat /var/db/dhclient.leases.e0a
lease {
  interface "e0a";
  fixed-address 10.62.48.42;
  option subnet-mask 255.255.254.0;
  option routers 10.62.48.1;
  option domain-name-servers 10.62.48.2;
  option host-name "ip-10-62-48-42";
  option domain-name "tr-tax-nonprod.aws-int.thomsonreuters.com";
  option broadcast-address 10.62.49.255;
  option dhcp-lease-time 3600;
  option dhcp-message-type 5;
  option dhcp-server-identifier 10.62.48.1;
  renew 2 2017/11/21 04:50:11;
  rebind 2 2017/11/21 05:12:41;
  expire 2 2017/11/21 05:20:11;
}
lease {
  interface "e0a";
  fixed-address 10.62.48.42;
  option subnet-mask 255.255.254.0;
  option routers 10.62.48.1;
  option domain-name-servers 10.62.48.2;
  option host-name "ip-10-62-48-42";
  option domain-name "tr-tax-nonprod.aws-int.thomsonreuters.com";
  option broadcast-address 10.62.49.255;
  option dhcp-lease-time 3600;
  option dhcp-message-type 5;
  option dhcp-server-identifier 10.62.48.1;
  renew 2 2017/11/21 04:50:47;
  rebind 2 2017/11/21 05:13:17;
  expire 2 2017/11/21 05:20:47;
}
ERROR: Failure detected while running pre-ONTAP initialization script.
ERROR: Data ONTAP startup failed - initiating reboot.

 

 

NetApp AWS Marketplace URL: https://aws.amazon.com/marketplace/pp/B01H4LVJ84

 

AMI ID: ami-e1e8978e

 

 

Any help or guidance here would be greatly appreciated!

 

Thank you!

 


Thor

We have seen this type of failure before, though it is rare and has previously happened only within our own lab/account setup in AWS. We believe the cause is an issue on the host VM side (as opposed to ONTAP), though this is debatable. It somehow trips up networking such that 'wget' requests to the AWS metadata service fail or do not return the right data. We are in the middle of adding code to the initialization phase that will help us debug this. There is an open case with AWS regarding this, though that case might concern a different error, one we also believe to be in the AWS infrastructure.
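
For anyone who wants to check whether the metadata service answers at all, here is a rough Python sketch of the kind of request the boot script appears to make with 'wget'. It cannot be run on the ONTAP instance itself, so the idea is to run it from an ordinary instance in the same subnet for comparison. The function name `probe_metadata` and the timeout are made up for illustration, and the sketch assumes the endpoint accepts a plain unauthenticated GET (instances that require session tokens would answer with an HTTP error instead).

```python
import urllib.error
import urllib.request

# Link-local metadata address seen in the netstat output above.
USER_DATA_URL = "http://169.254.169.254/latest/user-data"


def probe_metadata(url=USER_DATA_URL, timeout=2.0, opener=urllib.request.urlopen):
    """Return (answered, detail) for a single GET against the metadata service.

    `opener` is injectable only so the function can be exercised without
    network access; by default it is urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            body = resp.read()
            return True, f"HTTP {resp.status}, {len(body)} bytes"
    except urllib.error.HTTPError as exc:
        # The service answered, but with an error status (for example 404
        # when the instance has no user data).
        return True, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # No answer at all: timeout, connection refused, no route, etc.
        return False, f"no answer: {exc}"
```

Calling `probe_metadata()` and printing the result shows whether the service answered and with what status. An answer with an unexpected status or an empty body would be consistent with the "not returning the right data" case rather than a plain connectivity failure.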

 

To get around this, you could redeploy, if that is possible. Alternatively, go into the AWS Management Console and stop and then start the instances manually. This forces the instances to relocate to a different host, which in our experience has always solved the problem.
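
If you would rather script the stop/start than click through the console, a minimal sketch follows. It assumes the boto3 library and credentials that are allowed to stop and start the instances; the function name `stop_then_start` and the instance ID in the usage comment are placeholders, not anything from the ONTAP tooling. Note that it performs a full stop followed by a start, not a reboot, since a reboot normally keeps the instance on the same host.

```python
def stop_then_start(ec2, instance_ids):
    """Stop the given instances, wait until stopped, then start them again.

    `ec2` is expected to be a boto3 EC2 client, for example
    boto3.client("ec2", region_name="ap-south-1").
    """
    ids = list(instance_ids)

    # Full stop, so that the next start can place the instance on another host.
    ec2.stop_instances(InstanceIds=ids)
    ec2.get_waiter("instance_stopped").wait(InstanceIds=ids)

    # Start again and block until EC2 reports the instances as running.
    ec2.start_instances(InstanceIds=ids)
    ec2.get_waiter("instance_running").wait(InstanceIds=ids)


# Example usage (placeholder instance ID, requires boto3 to be installed):
#   import boto3
#   client = boto3.client("ec2", region_name="ap-south-1")
#   stop_then_start(client, ["i-0123456789abcdef0"])
```

For an HA pair you would pass the IDs of every instance that is stuck in the reboot loop, then watch the system log again to confirm that the bootstrap gets past the user-data fetch.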
