ONTAP Discussions
As most of you already know, several bugs prevent 8.1RC2 from sending AutoSupport messages through an Exchange relay (2007 and 2010, and also 2003, which I tested myself).
And, as you also know, the FAS2240 ships only with this Data ONTAP release!
This causes a lot of embarrassment for us as an ASP company: we have to tell customers that neither we nor they will be able to receive their AutoSupport messages unless they install... Sendmail (or some other Linux machine with an SMTP relay other than Exchange)... This is unacceptable and, let me say it, ridiculous in an environment where Exchange reigns...
Another source of embarrassment is that sending an AutoSupport message has always been very easy and has now become a nightmare because of this silly bug: Data ONTAP does not end the message with CRLF.CRLF but with LF (where's the programmer who forgot the CR?!?)
To avoid any misunderstanding: every fix and/or action has been applied on the SMTP relays as instructed... Looking at the relay's log, the cause of the bug is clear: the transmission times out because of format errors.
So, briefly: RC2 has been supported and on the market for months... how much longer do we have to wait for a fix?
Thanks
https://communities.netapp.com/message/71152
https://now.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=549239
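For readers unfamiliar with the CRLF.CRLF rule mentioned above: SMTP expects the DATA payload to end with the sequence CRLF "." CRLF, and a strict relay will keep waiting (and eventually time out) if it only sees bare LF line endings. The following is a minimal Python sketch of what a well-formed payload looks like. The function names are hypothetical and this is not Data ONTAP code, only an illustration of the framing rule.

```python
import re

SMTP_END_OF_DATA = b"\r\n.\r\n"  # CRLF "." CRLF marks the end of the DATA section


def has_valid_terminator(payload: bytes) -> bool:
    """Return True if the payload ends with CRLF.CRLF as a strict relay expects."""
    return payload.endswith(SMTP_END_OF_DATA)


def frame_smtp_data(body: bytes) -> bytes:
    """Frame a message body for the SMTP DATA phase (illustrative helper)."""
    # Normalize every line ending (CRLF, bare CR, bare LF) to CRLF.
    body = re.sub(rb"\r\n|\r|\n", b"\r\n", body)
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    # Dot-stuffing: a line that starts with "." gets an extra "." so it is
    # not mistaken for the end-of-data marker.
    body = re.sub(rb"(?m)^\.", b"..", body)
    return body + b".\r\n"


if __name__ == "__main__":
    broken = b"Subject: test\n\nbody\n.\n"      # LF-only ending, as described in the post
    fixed = frame_smtp_data(b"Subject: test\n\nbody\n")
    print(has_valid_terminator(broken), has_valid_terminator(fixed))
```

A payload that fails `has_valid_terminator` is the kind of message a strict relay would never consider complete.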
Hi!
Can you send me options autosupport output if possible, please?
Thanks
Klemen
Klemen, this is the autosupport:
autosupport.cifs.verbose off
autosupport.content complete (value might be overwritten in takeover)
autosupport.doit USER_TRIGGERED (test5)
autosupport.enable on (value might be overwritten in takeover)
autosupport.from netapp-dc1-a@xxx.xxx.br (value might be overwritten in takeover)
autosupport.local_collection on (value might be overwritten in takeover)
autosupport.mailhost smtpapp.xxx.xxx.br (value might be overwritten in takeover)
autosupport.max_http_size 10485760 (value might be overwritten in takeover)
autosupport.max_smtp_size 5242880 (value might be overwritten in takeover)
autosupport.minimal.subject.id systemid (value might be overwritten in takeover)
autosupport.nht_data.enable on (value might be overwritten in takeover)
autosupport.noteto (value might be overwritten in takeover)
autosupport.partner.to (value might be overwritten in takeover)
autosupport.payload_format 7z (value might be overwritten in takeover)
autosupport.performance_data.doit DONT
autosupport.performance_data.enable on (value might be overwritten in takeover)
autosupport.periodic.tx_window 1h (value might be overwritten in takeover)
autosupport.retry.count 15 (value might be overwritten in takeover)
autosupport.retry.interval 4m (value might be overwritten in takeover)
autosupport.support.enable on (value might be overwritten in takeover)
autosupport.support.proxy 0 (value might be overwritten in takeover)
autosupport.support.put_url support.netapp.com/put/AsupPut (value might be overwritten in takeover)
autosupport.support.to autosupport@netapp.com (value might be overwritten in takeover)
autosupport.support.transport smtp (value might be overwritten in takeover)
autosupport.support.url support.netapp.com/asupprod/post/1.0/postAsup (value might be overwritten in takeover)
autosupport.throttle on (value might be overwritten in takeover)
autosupport.to filer@xxx.com (value might be overwritten in takeover)
autosupport.validate_digital_certificate on (value might be overwritten in takeover)
I just replaced my customer's information with "xxx".
We had a problem sending SMTP e-mail to an address outside their domain, but it was an Exchange issue, nothing to do with NetApp. You can check in "/etc/log/mlog/notifyd.log" whether the Exchange relay is authorizing you to send that e-mail.
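When comparing option listings like the one above between two controllers, it can help to turn the text into a dictionary first. This is a small Python sketch that assumes the output format shown in this post (option name, optional value, optional takeover note). The function name `parse_options` is hypothetical.

```python
TAKEOVER_NOTE = "(value might be overwritten in takeover)"


def parse_options(text: str) -> dict:
    """Map each autosupport.* option name to its value (empty string if unset)."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("autosupport."):
            continue  # skip prompts, blank lines and anything else
        if line.endswith(TAKEOVER_NOTE):
            line = line[: -len(TAKEOVER_NOTE)].rstrip()
        name, _, value = line.partition(" ")
        options[name] = value.strip()
    return options


if __name__ == "__main__":
    sample = """\
autosupport.doit USER_TRIGGERED (test5)
autosupport.mailhost smtpapp.xxx.xxx.br (value might be overwritten in takeover)
autosupport.noteto (value might be overwritten in takeover)
autosupport.support.transport smtp (value might be overwritten in takeover)
"""
    opts = parse_options(sample)
    print(opts["autosupport.support.transport"], repr(opts["autosupport.noteto"]))
```

Two such dictionaries (one per node) can then be compared key by key to spot a differing option.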
Regards,
With Data ONTAP 8.1RC3 the AutoSupport bug should be fixed.
Bug Id:
Now we receive a daily autosupport message with the subject management_log.
Which option should I set to get only a weekly_log message?
Thanks for your reply!
Regards,
Florian
Florian,
AutoSupport in 8.1 has been significantly overhauled. From the release notes:
Excellent and updated information on AutoSupport is available in the System Administration Guide for 8.1.
https://now.netapp.com/NOW/knowledge/docs/ontap/rel81rc3/pdfs/ontap/sysadmin.pdf
One of the improvements is related to smoothing out the WEEKLY_LOG AutoSupport traffic at both customer sites and back at NetApp, as well as to solve problems with very large AutoSupport messages. To help with these issues:
1. Log files are separated from the WEEKLY LOG and sent daily, i.e. 7 chunks instead of 1 big weekly chunk
2. Performance data is now sent daily (PERFORMANCE DATA), i.e. 7 chunks instead of 1 big weekly chunk
3. Event-based AutoSupport messages send context-sensitive information
I hope that helps clarify matters.
I wasn't aware of the autosupport commands introduced in 8.1. The man page has full details but here are a couple that seem handy.
1) Check history of ASUPs:
salt> autosupport history show -fields seq-num,status,subject,uri,error,last-update
seq-num destination last-update status subject uri error
------- ----------- ------------------- --------------- ------------------------------- ------------------------ -----
1088 smtp "3/2/2012 11:09:43" sent-successful "USER_TRIGGERED (COMPLETE:now)" mailto:nobody@netapp.com -
1088 http "3/2/2012 11:09:43" ignore "USER_TRIGGERED (COMPLETE:now)" - -
1088 noteto "3/2/2012 11:09:38" ignore "USER_TRIGGERED (COMPLETE:now)" mailto: -
1087 smtp "3/2/2012 11:09:12" sent-successful "USER_TRIGGERED (test)" mailto:nobody@netapp.com -
1087 http "3/2/2012 11:09:12" ignore "USER_TRIGGERED (test)" - -
1087 noteto "3/2/2012 11:09:07" ignore "USER_TRIGGERED (test)" mailto: -
1087 retransmit "3/2/2012 11:24:59" sent-successful "USER_TRIGGERED (test)" mailto:nobody@netapp.com -
1086 smtp "3/2/2012 10:58:13" sent-successful "USER_TRIGGERED (test)" mailto:nobody@netapp.com -
1086 http "3/2/2012 10:54:03" ignore "USER_TRIGGERED (test)" - -
1086 noteto "3/2/2012 10:54:03" ignore "USER_TRIGGERED (test)" mailto: -
1085 smtp "3/2/2012 10:58:09" sent-successful "USER_TRIGGERED (COMPLETE:now)" mailto:nobody@netapp.com "Domain not found"
1085 http "3/2/2012 10:44:59" ignore "USER_TRIGGERED (COMPLETE:now)" - -
1085 noteto "3/2/2012 10:44:57" ignore "USER_TRIGGERED (COMPLETE:now)" mailto: -
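If you capture the history output above to a file, it can be turned into records for filtering. This Python sketch assumes the seven-column layout shown above, with multi-word values in double quotes. The name `parse_history` and the field names are hypothetical, not part of any NetApp tool.

```python
import shlex

FIELDS = ("seq_num", "destination", "last_update", "status", "subject", "uri", "error")


def parse_history(text: str) -> list:
    """Parse 'autosupport history show' rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        try:
            parts = shlex.split(line)  # honours the double-quoted columns
        except ValueError:
            continue  # unbalanced quotes: not a data row
        # Data rows have exactly seven columns and start with the sequence number.
        if len(parts) != len(FIELDS) or not parts[0].isdigit():
            continue
        rows.append(dict(zip(FIELDS, parts)))
    return rows


if __name__ == "__main__":
    sample = '''\
seq-num destination last-update status subject uri error
1088 smtp "3/2/2012 11:09:43" sent-successful "USER_TRIGGERED (COMPLETE:now)" mailto:nobody@netapp.com -
1085 smtp "3/2/2012 10:58:09" sent-successful "USER_TRIGGERED (COMPLETE:now)" mailto:nobody@netapp.com "Domain not found"
1085 noteto "3/2/2012 10:44:57" ignore "USER_TRIGGERED (COMPLETE:now)" mailto: -
'''
    for row in parse_history(sample):
        if row["error"] != "-":
            print(row["seq_num"], row["destination"], row["status"], row["error"])
```

Note that a row can show `sent-successful` and still carry a last error, as sequence 1085 does in the listing.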
2) Get full details about a specific autosupport:
salt> autosupport history show -instance -seq-num 1088
AutoSupport Sequence Number: 1088
Destination for this AutoSupport: smtp
Trigger Event: callhome.invoke.all
Time of Last Update: 3/2/2012 11:09:43
Status of Delivery: sent-successful
Delivery Attempts: 1
AutoSupport Subject: USER_TRIGGERED (COMPLETE:now)
Delivery URI: mailto:nobody@netapp.com
Last Error: -
AutoSupport Sequence Number: 1088
Destination for this AutoSupport: http
Trigger Event: callhome.invoke.all
Time of Last Update: 3/2/2012 11:09:43
Status of Delivery: ignore
Delivery Attempts: 1
AutoSupport Subject: USER_TRIGGERED (COMPLETE:now)
Delivery URI: -
Last Error: -
AutoSupport Sequence Number: 1088
Destination for this AutoSupport: noteto
Trigger Event: callhome.invoke.all
Time of Last Update: 3/2/2012 11:09:38
Status of Delivery: ignore
Delivery Attempts: 1
AutoSupport Subject: USER_TRIGGERED (COMPLETE:now)
Delivery URI: mailto:
Last Error: -
3 entries were displayed.
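The `-instance` output above is a series of "Label: value" lines, with a new record starting at each "AutoSupport Sequence Number" line. A minimal Python sketch for splitting it into one dictionary per destination follows. It assumes exactly the labels shown above, and `parse_instances` is a hypothetical name.

```python
RECORD_START = "AutoSupport Sequence Number"


def parse_instances(text: str) -> list:
    """Split '-instance' output into one dict per (sequence number, destination)."""
    records = []
    for line in text.splitlines():
        # Split on the first colon only: values such as timestamps and
        # mailto: URIs contain colons of their own.
        label, sep, value = line.partition(":")
        if not sep:
            continue  # e.g. the prompt line or "3 entries were displayed."
        label, value = label.strip(), value.strip()
        if label == RECORD_START:
            records.append({})
        if records:
            records[-1][label] = value
    return records


if __name__ == "__main__":
    sample = """\
AutoSupport Sequence Number: 1088
Destination for this AutoSupport: smtp
Time of Last Update: 3/2/2012 11:09:43
Status of Delivery: sent-successful
Delivery URI: mailto:nobody@netapp.com
AutoSupport Sequence Number: 1088
Destination for this AutoSupport: http
Status of Delivery: ignore
Delivery URI: -
3 entries were displayed.
"""
    for rec in parse_instances(sample):
        print(rec["Destination for this AutoSupport"], rec["Status of Delivery"])
```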
3) Retransmit a specific autosupport to an email address (this can also be a completely different email address):
salt> autosupport history retransmit -seq-num 1087 -uri mailto:nobody2@netapp.com
I saw it in the "?" output, but it is much better to know exactly how to use it.
Thank you for your information. It will be very useful.
Regards,
I have a FAS2240-4 with 8.1RC3 and Exchange SMTP which works just fine!
However, there seems to be another "issue"...
I have a single FAS2240, but the ASUPs it sends out have the subject "HA Group Notification", which seems wrong to me. Is anybody else experiencing this too?
Peter
Yep, 8.0 shipped with AutoSupport always sending "HA Group Notification" even for single nodes. BURT 371076
AutoSupport can't really use the test "hey partner are you there?" to report HA mode since the partner may be down, rebooting, improperly cabled, etc.
OK, fine for me then.
Well … an ASUP is sent from a single node, isn't it? It does not collect the ASUP from the partner and send it, does it? So it can speak only for itself.
The simple fact that it is causing confusion is indication that it probably is … confusing ☺
Congrats... Now I'm confused ...
Supportability features such as AutoSupport are doing more and more for High Availability diagnosis. For example, AutoSupport does things such as collect information about why the partner rebooted and what the local node sees of the other node. In Cluster-mode, AutoSupport reports even more information about other nodes including its storage failover partner.
As an engineer, I wish I didn't have to deal with complexity of thinking about single node issues and multiple node issues but it is my job to work on hiding the complexity and making it simple for the customer and support.
ok, thank you!
Hi Rudy - I have a case open on this.
One node of our 3270 cluster is logging
"transmission-failed MANAGEMENT_LOG support.netapp.com/put/AsupPut "couldn't connect to host" "
(see history below)
It is able to email us the autosupport, but the HTTPS connection to support.netapp.com is failing for this node
With support we've verified routing is the same, autosupport options are the same, traceroute looks the same for both nodes, etc.
The engineer wants me to add a static route for support.netapp.com, but the default route is working for the partner and we have any egress firewall rules...
Any ideas what else to check?
thanks!
na04*> autosupport history show -fields seq-num,status,subject,uri,error,last-update
seq-num destination last-update status subject uri error
------- ----------- -------------------- ------ -------------- --- -----
347 smtp "6/14/2012 00:21:07" ignore MANAGEMENT_LOG - -
347 http "6/14/2012 02:48:26" transmission-failed MANAGEMENT_LOG support.netapp.com/put/AsupPut "couldn't connect to host"
347 noteto "6/14/2012 00:21:07" ignore MANAGEMENT_LOG - -
346 smtp "6/14/2012 00:14:34" ignore "PERFORMANCE DATA" - -
346 http "6/14/2012 01:29:31" transmission-failed "PERFORMANCE DATA" support.netapp.com/put/AsupPut "couldn't connect to host"
346 noteto "6/14/2012 00:14:34" ignore "PERFORMANCE DATA" - -
345 smtp "6/13/2012 01:18:38" ignore MANAGEMENT_LOG - -
345 http "6/13/2012 03:14:23" transmission-failed MANAGEMENT_LOG support.netapp.com/put/AsupPut "couldn't connect to host"
345 noteto "6/13/2012 01:18:38" ignore MANAGEMENT_LOG - -
344 smtp "6/13/2012 00:40:33" ignore "PERFORMANCE DATA" - -
344 http "6/13/2012 01:55:28" transmission-failed "PERFORMANCE DATA" support.netapp.com/put/AsupPut "couldn't connect to host"
344 noteto "6/13/2012 00:40:33" ignore "PERFORMANCE DATA" - -
343 smtp "6/12/2012 00:46:44" ignore MANAGEMENT_LOG - -
343 http "6/12/2012 03:20:36" transmission-failed MANAGEMENT_LOG support.netapp.com/put/AsupPut "couldn't connect to host"
343 noteto "6/12/2012 00:46:44" ignore MANAGEMENT_LOG - -
342 smtp "6/12/2012 00:57:16" ignore "PERFORMANCE DATA" - -
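To see at a glance which destination is failing and with what error, a history listing like the one above can be tallied. This is a rough Python sketch assuming the same seven-column layout. `failure_summary` is a hypothetical helper, not a NetApp command.

```python
import shlex
from collections import Counter


def failure_summary(text: str) -> Counter:
    """Count transmission-failed rows per (destination, uri, error)."""
    counts = Counter()
    for line in text.splitlines():
        try:
            parts = shlex.split(line)  # keeps quoted columns together
        except ValueError:
            continue
        if len(parts) != 7 or not parts[0].isdigit():
            continue  # header, separator or prompt line
        _seq, destination, _updated, status, _subject, uri, error = parts
        if status == "transmission-failed":
            counts[(destination, uri, error)] += 1
    return counts


if __name__ == "__main__":
    sample = '''\
347 smtp "6/14/2012 00:21:07" ignore MANAGEMENT_LOG - -
347 http "6/14/2012 02:48:26" transmission-failed MANAGEMENT_LOG support.netapp.com/put/AsupPut "couldn't connect to host"
346 smtp "6/14/2012 00:14:34" ignore "PERFORMANCE DATA" - -
346 http "6/14/2012 01:29:31" transmission-failed "PERFORMANCE DATA" support.netapp.com/put/AsupPut "couldn't connect to host"
'''
    for (destination, uri, error), n in failure_summary(sample).items():
        print(f"{n} x {destination} -> {uri}: {error}")
```

In the listing above every failure falls on the `http` destination with the same error, which points at connectivity for that transport rather than at a particular message type.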
7-mode? Try looking at the /etc/log/mlog/notifyd.log files to see more detailed information about the AutoSupport process.
If you ask the support engineer to open a consult request in our internal AutoSupport knowledge exchange, folks like Rudy might be able to help out further.
Reviewing the firewall logs, I see the good head is logging the outgoing ASUP HTTPS connection to support.netapp.com on its vif associated with the default route, as expected.
The bad head is logging a vif NOT associated with the default route and the firewall is logging AGE OUT for this (since support.netapp.com is not responding before the timeout)
The routing tables for both show the same default route
Why is the bad head "preferring" the wrong vif for sending autosupports ?
thanks
Because it has a routing table telling it to do so. Double check it.
I did - the only routing table entry that _should_ apply is the default route (which the partner head _does_ use successfully)
I added a dumb static route to force it to use the proper vif and autosupports now succeed connecting to support.netapp.com:443
This should not be necessary - feels like a bug.
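As background on why the static route changes the outcome: route selection normally picks the most specific (longest-prefix) matching entry, so a host route wins over both the default route and any broader entry. The Python sketch below illustrates only that general rule. The interface names and addresses are made up (documentation ranges), and it does not model Data ONTAP's actual routing code or explain the root cause on this node.

```python
import ipaddress


def select_route(routes, destination):
    """Return the interface of the longest-prefix route matching destination.

    routes is a list of (prefix, interface) pairs; returns None if nothing matches.
    """
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, interface in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, interface))
    if not matches:
        return None
    # The most specific route (largest prefix length) wins.
    return max(matches, key=lambda m: m[0])[1]


if __name__ == "__main__":
    # Hypothetical routing table: a default route plus a broader-than-expected entry.
    routes = [
        ("0.0.0.0/0", "vif_default"),
        ("198.51.100.0/24", "vif_other"),
    ]
    target = "198.51.100.10"  # stand-in address, not the real support site
    print(select_route(routes, target))

    # Adding a host route forces the intended interface for that one destination.
    routes.append(("198.51.100.10/32", "vif_default"))
    print(select_route(routes, target))
```

If an unexpected entry like the hypothetical /24 above exists on only one node, it would produce the behaviour described, which is why comparing the full routing tables (not just the default route) is worth doing.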
Please open a technical case with NetApp Support to investigate this further.
Thanks.
I had 2003210902 open the whole time - when I requested "consult request in our internal AutoSupport knowledge exchange" another engineer took over the case.
Then NetApp support recommended the static route. It worked, but I wanted them to look for the root cause: why it was necessary in the first place on just this node.
thanks