Tech ONTAP Blogs

Leveraging Cloud Insights to Monitor and Collect Logs Using Fluent Bit

ronnyf
NetApp

In today's technology landscape, effective management and monitoring of IT infrastructure are crucial to ensure optimal performance and seamless operations of any organization. 
NetApp StorageGRID, a software-defined object storage solution, provides a robust platform for managing and protecting data at scale.

 

In this blog post, we'll explore how Cloud Insights provides a comprehensive solution for collecting logs, monitoring performance, and sending alerts for your NetApp StorageGRID environment.

Cloud Insights is a monitoring and optimization tool that gives you visibility across your entire infrastructure, including NetApp StorageGRID. Its unified view helps you quickly identify and resolve potential issues.

 

Benefits of using Cloud Insights with NetApp StorageGRID

Integrating Cloud Insights into your StorageGRID environment offers several advantages:

  • Enhanced Visibility: Cloud Insights provides a holistic view of your entire IT infrastructure, making it easier to monitor and manage StorageGRID components.
  • Faster Troubleshooting: The ability to collect and analyze logs simplifies root cause investigation, enabling quicker identification and resolution of issues.
  • Improved Performance: Continuous monitoring and real-time alerts help optimize StorageGRID's performance, ensuring uninterrupted access to critical data.

Add StorageGRID logs to Cloud Insights:

We will use Fluent Bit to process and forward the logs, parsing them with regular expressions derived from grok patterns.
Let's follow these steps to install and configure Fluent Bit:

  • Install Fluent Bit by following the installation links provided in the official documentation here.
  • Copy the fluent-bit.conf file (output below) to /etc/fluent-bit/fluent-bit.conf
  • Copy the parsers.conf file (output below) to /etc/fluent-bit/parsers.conf
  • In fluent-bit.conf, replace <host> with the fully qualified domain name (FQDN) of the Cloud Insights (CI) tenant (without https:// or a trailing /)
  • In fluent-bit.conf, replace <token> with an API key that has read/write permissions for log ingestion here
  • Restart Fluent Bit
  • Use the command sudo journalctl -u fluent-bit to review the Fluent Bit logs if necessary
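The placeholder substitution in the steps above can be scripted. The following is only a minimal sketch in Python: the helper names normalize_host and render_config, the example tenant name, and the example token are all hypothetical and not part of Cloud Insights or Fluent Bit. It strips the scheme and trailing slash from the tenant address, as the steps require, and then fills in <host> and <token>.

```python
from urllib.parse import urlparse


def normalize_host(value: str) -> str:
    """Return a bare FQDN: no scheme, no path, no trailing slash."""
    value = value.strip()
    # urlparse only fills netloc when a "//" separator is present.
    parsed = urlparse(value if "://" in value else "//" + value)
    return parsed.netloc.rstrip("/")


def render_config(template: str, host: str, token: str) -> str:
    """Substitute the <host> and <token> placeholders used in fluent-bit.conf."""
    return template.replace("<host>", normalize_host(host)).replace("<token>", token)


# Two lines taken from the [OUTPUT] section of fluent-bit.conf.
template = "Host <host>\nHeader x-cloudinsights-apikey <token>\n"

# Hypothetical tenant address and token, for illustration only.
rendered = render_config(template, "https://tenant.example.com/", "EXAMPLE_TOKEN")
print(rendered)
```

In practice the template would be the full contents of fluent-bit.conf, and the rendered text would be written to /etc/fluent-bit/fluent-bit.conf before restarting Fluent Bit.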

 

Configure StorageGRID to Send Logs to Fluent Bit

  • Access the syslog server configuration wizard by referring to the official documentation here
  • During the configuration, enter the IP address and port of the system running Fluent Bit (port 5140 in the fluent-bit.conf shown below) and select TCP
  • Send test messages to verify the setup
  • Once you see the expected messages, click "Finish"
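If you want to check the Fluent Bit listener independently of StorageGRID, you can send it a hand-built test line. The sketch below is an assumption-laden illustration rather than a NetApp tool: the function names, host name, and program name are hypothetical, and it assumes the TCP syslog input accepts newline-terminated messages in the BSD syslog (RFC 3164) layout.

```python
import socket
from datetime import datetime


def build_rfc3164_message(facility: int, severity: int, host: str,
                          program: str, text: str, when: datetime) -> str:
    """Build a BSD-syslog style line: <PRI>Mmm dd HH:MM:SS host program: text."""
    pri = facility * 8 + severity  # PRI encoding used by syslog
    # RFC 3164 pads single-digit days with a space rather than a zero.
    stamp = f"{when.strftime('%b')} {when.day:2d} {when.strftime('%H:%M:%S')}"
    return f"<{pri}>{stamp} {host} {program}: {text}"


def send_tcp(message: str, host: str, port: int, timeout: float = 5.0) -> None:
    """Send one newline-terminated message to a TCP syslog listener."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(message.encode("utf-8") + b"\n")


# Hypothetical host and program names; facility 1 (user), severity 5 (notice).
line = build_rfc3164_message(1, 5, "sg-test", "probe", "hello from test",
                             datetime(2023, 8, 30, 1, 2, 3))
print(line)
```

Calling send_tcp(line, fluent_bit_ip, 5140), where fluent_bit_ip is the address of the system running Fluent Bit, would deliver the line to the listener; sudo journalctl -u fluent-bit can then help confirm whether it was received.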

As part of this workflow, Fluent Bit collects event log data from one or more StorageGRID systems and routes it directly to Cloud Insights.

In the screenshot below, we filter by logs.storagegrid to view all of the StorageGRID logs that Fluent Bit routed to Cloud Insights:

 

ronnyf_0-1693357478154.png


You can now analyze your StorageGRID event logs in a single location for any time range, track trends in StorageGRID events, and filter by time and date, severity, and message. You can also create monitors based on filtered alerts to track StorageGRID failures and events, which supports proactive monitoring.

 

With Cloud Insights, you can easily monitor the storage capacity and performance of StorageGRID. Its pre-built capabilities allow for a comprehensive view of your system's overall health with out-of-the-box alerts to ensure best practices are applied:

 

ronnyf_1-1693356974619.png

 

 

 

Conclusion: Cloud Insights is a powerful tool that simplifies the process of monitoring and managing your NetApp StorageGRID environment. By collecting logs, providing performance metrics and real-time alerts, Cloud Insights ensures the smooth operation of your IT infrastructure, improving performance and reducing the risk of system downtime. Don't let the complexity of your IT environment impact your organization; leverage the power of Cloud Insights and experience the benefits of comprehensive monitoring and infrastructure optimization. 

You can sign up here for a free 30-day trial.

 

 

fluent-bit.conf

[SERVICE]
    Flush 1
    Parsers_File parsers.conf

[INPUT]
    Name syslog
    Parser all
    Tag syslog
    Listen 0.0.0.0
    Port 5140
    Mode tcp

[FILTER]
    Name parser
    Match syslog
    Parser all
    Key_Name log

[FILTER]
    Name parser
    Match syslog
    Parser sgrid-audit
    Parser sgrid-bycast
    Parser sgrid-bycast-event
    Parser syslog-rfc3164-custom
    Key_Name log

# Keep log messages that have a syslog_pri value
[FILTER]
    Name grep
    Regex syslog_pri .+
    Match syslog

[FILTER]
    Name record_modifier
    Match syslog
    Record type logs.storagegrid
    Record source ${HOSTNAME}

#[OUTPUT]
#    Name stdout
#    Match *

# Output to CI
[OUTPUT]
    Name http
    tls On
    Match syslog
    Port 443
    Format json
    URI /rest/v1/logs/ingest
    json_date_key timestamp
    json_date_format double
    Host <host>
    Header x-cloudinsights-apikey <token>
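To make the [OUTPUT] section more concrete, here is a rough Python sketch of the kind of HTTPS request it describes. The URI, port, and header name come from the configuration above. Everything else is an assumption for illustration: the helper name build_ingest_request, the tenant name, the token, the message field, and the choice of a JSON array of records as the body. The request is only constructed, not sent.

```python
import json
import time
import urllib.request


def build_ingest_request(host: str, token: str, records: list) -> urllib.request.Request:
    """Build (but do not send) a POST to the log ingestion URI from fluent-bit.conf."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/rest/v1/logs/ingest",  # tls On, Port 443, URI from the config
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-cloudinsights-apikey": token,  # header name from the config
        },
    )


# Hypothetical record: "type" and "source" mirror the record_modifier filter,
# "timestamp" mirrors json_date_key with a floating-point (double) value.
record = {
    "timestamp": time.time(),
    "type": "logs.storagegrid",
    "source": "fluent-bit-host",
    "message": "hello from test",
}
request = build_ingest_request("tenant.example.com", "EXAMPLE_TOKEN", [record])
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it; that step is left out here because it needs a real tenant and API key.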

 

parsers.conf

 

[PARSER]
    Name syslog-rfc5424
    Format regex
    Regex ^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*?)\]|-)) (?<message>.+)$
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    Time_Keep On

[PARSER]
    Name syslog-rfc3164-local
    Format regex
    Regex ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
    Time_Key time
    Time_Format %b %d %H:%M:%S
    Time_Keep On

[PARSER]
    Name syslog-rfc3164
    Format regex
    Regex /^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
    Time_Key time
    Time_Format %b %d %H:%M:%S

[PARSER]
    Name all
    Format regex
    Regex (?<log>.*)

[PARSER]
    Name syslog-rfc3164-custom
    Format regex
    Regex ^\<(?<syslog_pri>[0-9]+)\>(?<timestamp>[^ ]* {1,2}[^ ]* [^ ]*) (?<logsource>[^ ]*) (?<program>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>(?:.*[\|\]]\s*(?<loglevel>[A-Z]+) )?.*)$
    Time_Key timestamp
    Time_Format %b %d %H:%M:%S

[PARSER]
    Name sgrid-audit
    Format regex
    # GROK: <%{POSINT:syslog_pri}>%{SYSLOGBASE} %{TIMESTAMP_ISO8601:event_time} \[AUDT:%{GREEDYDATA:audit_details}\]
    Regex ^<(?<syslog_pri>\b(?:[1-9][0-9]*)\b)>(?<timestamp>\b(?:Jan(?:uary|uar)?|Feb(?:ruary|ruar)?|M(?:a|ä)?r(?:ch|z)?|Apr(?:il)?|Ma(?:y|i)?|Jun(?:e|i)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|O(?:c|k)?t(?:ober)?|Nov(?:ember)?|De(?:c|z)(?:ember)?)\b +(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]) (?!<[0-9])(?:2[0123]|[01]?[0-9]):(?:[0-5][0-9])(?::(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?))(?![0-9])) (?:<(?<facility>\b(?:[0-9]+)\b).(?<priority>\b(?:[0-9]+)\b)> )?(?<logsource>(?:(?:((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?|(?<![0-9])(?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(?![0-9]))|\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b))) (?<program>[\x21-\x5a\x5c\x5e-\x7e]+)(?:\[(?<pid>\b(?:[1-9][0-9]*)\b)\])?: (?<event_time>(?>\d\d){1,2}-(?:0?[1-9]|1[0-2])-(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])[T ](?:2[0123]|[01]?[0-9]):?(?:[0-5][0-9])(?::?(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?))?(?:Z|[+-](?:2[0123]|[01]?[0-9])(?::?(?:[0-5][0-9])))?) \[AUDT:(?<audit_details>.*)\]$
    Time_Key timestamp
    Time_Format %b %d %H:%M:%S

[PARSER]
    Name sgrid-bycast
    Format regex
    # GROK: <%{POSINT:syslog_pri}>%{SYSLOGBASE} \|%{NUMBER:node_id} %{NUMBER:process_id} %{WORD:module} %{DATA:code} %{TIMESTAMP_ISO8601:event_time}\| %{LOGLEVEL:loglevel}\s*%{NUMBER:msg_id}(?> %{BASE16NUM:traceid})? %{WORD:module2}\: %{GREEDYDATA:message}
    Regex ^<(?<syslog_pri>\b(?:[1-9][0-9]*)\b)>(?<timestamp>\b(?:Jan(?:uary|uar)?|Feb(?:ruary|ruar)?|M(?:a|ä)?r(?:ch|z)?|Apr(?:il)?|Ma(?:y|i)?|Jun(?:e|i)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|O(?:c|k)?t(?:ober)?|Nov(?:ember)?|De(?:c|z)(?:ember)?)\b +(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]) (?!<[0-9])(?:2[0123]|[01]?[0-9]):(?:[0-5][0-9])(?::(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?))(?![0-9])) (?:<(?<facility>\b(?:[0-9]+)\b).(?<priority>\b(?:[0-9]+)\b)> )?(?<logsource>(?:(?:((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?|(?<![0-9])(?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(?![0-9]))|\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b))) (?<program>[\x21-\x5a\x5c\x5e-\x7e]+)(?:\[(?<pid>\b(?:[1-9][0-9]*)\b)\])?: \|(?<node_id>(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))) (?<process_id>(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))) (?<module>\b\w+\b) (?<code>.*?) (?<event_time>(?>\d\d){1,2}-(?:0?[1-9]|1[0-2])-(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])[T ](?:2[0123]|[01]?[0-9]):?(?:[0-5][0-9])(?::?(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?))?(?:Z|[+-](?:2[0123]|[01]?[0-9])(?::?(?:[0-5][0-9])))?)\| (?<loglevel>([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?))\s*(?<msg_id>(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))))(?> (?<traceid>(?<![0-9A-Fa-f])(?:[+-]?(?:0x)?(?:[0-9A-Fa-f]+))))? (?<module2>\b\w+\b)\: (?<message>.*)$
    Time_Key timestamp
    Time_Format %b %d %H:%M:%S

[PARSER]
    Name sgrid-bycast-event
    Format regex
    # GROK: <%{POSINT:syslog_pri}>%{SYSLOGBASE}\s*\[%{DATA:event}\] %{LOGLEVEL:loglevel}\s*%{GREEDYDATA:message}
    Regex ^<(?<syslog_pri>\b(?:[1-9][0-9]*)\b)>(?<timestamp>\b(?:Jan(?:uary|uar)?|Feb(?:ruary|ruar)?|M(?:a|ä)?r(?:ch|z)?|Apr(?:il)?|Ma(?:y|i)?|Jun(?:e|i)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|O(?:c|k)?t(?:ober)?|Nov(?:ember)?|De(?:c|z)(?:ember)?)\b +(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]) (?!<[0-9])(?:2[0123]|[01]?[0-9]):(?:[0-5][0-9])(?::(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?))(?![0-9])) (?:<(?<facility>\b(?:[0-9]+)\b).(?<priority>\b(?:[0-9]+)\b)> )?(?<logsource>(?:(?:((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?|(?<![0-9])(?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(?![0-9]))|\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b))) (?<program>[\x21-\x5a\x5c\x5e-\x7e]+)(?:\[(?<pid>\b(?:[1-9][0-9]*)\b)\])?:\s*\[(?<event>.*?)\] (?<loglevel>([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?))\s*(?<message>.*)$
    Time_Key timestamp
    Time_Format %b %d %H:%M:%S
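The parser regexes can be hard to read, so it may help to try one against a sample line outside Fluent Bit. The sketch below ports the syslog-rfc3164 regex to Python, which needs the named groups rewritten as (?P<name>...). Note that Python's re module is a different engine from the one Fluent Bit uses, so this is only an approximation for experimenting. The sample line is made up for illustration and is not real StorageGRID output.

```python
import re

# syslog-rfc3164 parser regex from parsers.conf, with (?<name>...) rewritten
# as (?P<name>...), the spelling that Python's re module uses.
SYSLOG_RFC3164 = re.compile(
    r"^\<(?P<pri>[0-9]+)\>(?P<time>[^ ]* {1,2}[^ ]* [^ ]*) (?P<host>[^ ]*) "
    r"(?P<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?P<pid>[0-9]+)\])?(?:[^\:]*\:)? *"
    r"(?P<message>.*)$"
)

# Hypothetical sample line in BSD syslog layout.
sample = "<13>Aug 30 01:02:03 sg-node1 bycast[1234]: hello world"
match = SYSLOG_RFC3164.match(sample)
fields = match.groupdict() if match else {}

# PRI packs facility and severity as facility * 8 + severity.
facility, severity = divmod(int(fields["pri"]), 8)
print(fields)
print(facility, severity)
```

The same approach can be applied to the longer sgrid-* regexes, although constructs in them may behave differently or need adjusting in Python depending on the Python version.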
