Integrating StorageGRID with the open-source ELK stack to enhance customer experience

angelacheng · 2022-08-15

Though you may already know about the NetApp® StorageGRID® object storage platform and all of its great features, did you know it can be easily integrated with open-source solutions to enhance the customer experience?

In this post, I focus on StorageGRID integration with the Elasticsearch, Logstash, and Kibana (or ELK) stack for log aggregation, analysis, and visualization. Elasticsearch is a search and analytics engine. Logstash is a server‑side data processing pipeline that ingests and transforms data before sending it to a "stash" like Elasticsearch. Kibana provides data visualization dashboards for Elasticsearch.

Three use cases are covered here.

Enhancing customer experience with object metadata search

StorageGRID is highly scalable, supporting up to a few hundred billion S3 objects in a single namespace. As the number of objects grows, the ability to quickly search through them and find the information you need becomes crucial.

S3 is a simple key-based object store whose scalability and low cost make it ideal for storing large datasets. Its design gives S3 excellent performance for storing and retrieving objects by a known key. Finding objects by any other attribute, however, requires a linear search using the LIST operation, which makes attribute-based queries in S3 challenging to implement. A common solution is to build an external index that maps queryable attributes to the S3 object key.
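The external-index idea can be illustrated with a minimal Python sketch. It is a toy in-memory model, not how StorageGRID or Elasticsearch implements indexing; the class name, object keys, and attribute names are all hypothetical.

```python
from collections import defaultdict


class AttributeIndex:
    """Toy external index: maps (attribute, value) pairs to S3 object keys."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, key, attributes):
        # Record the object key under every attribute/value pair it carries.
        for name, value in attributes.items():
            self._index[(name, value)].add(key)

    def find(self, **criteria):
        # Intersect the key sets of all criteria; no LIST scan is involved.
        key_sets = [self._index.get(pair, set()) for pair in criteria.items()]
        if not key_sets:
            return set()
        return set.intersection(*key_sets)


# Hypothetical object keys and attributes, for illustration only.
index = AttributeIndex()
index.add("scans/2022/a.dcm", {"patient": "p-17", "modality": "MRI"})
index.add("scans/2022/b.dcm", {"patient": "p-17", "modality": "CT"})
index.add("scans/2022/c.dcm", {"patient": "p-42", "modality": "MRI"})

matches = index.find(patient="p-17", modality="MRI")
print(matches)
```

Each lookup touches only the entries for the requested attribute values, so its cost does not grow with the total number of objects the way a LIST scan does.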

In 2018, StorageGRID introduced the search integration feature, which sends StorageGRID S3 object metadata to a configured Elasticsearch index. This feature leverages Elasticsearch data repositories that are built for fast indexing and lookups. Customers can search across multiple buckets and quickly find the information they need in Elasticsearch, while StorageGRID continues to serve S3 client requests with optimal performance.
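Once object metadata is in Elasticsearch, a search is an ordinary Query DSL request against the configured index. The sketch below only builds a request body. The field names `bucket` and `metadata.<name>` are assumptions about how the metadata documents are laid out, and the bucket and metadata values are hypothetical, so check the documents in your own index before relying on them.

```python
import json


def build_metadata_query(bucket, user_metadata):
    """Build an Elasticsearch Query DSL body filtering on bucket and user metadata."""
    # Assumption: "bucket" is mapped as a keyword field, so "term" matches it exactly.
    filters = [{"term": {"bucket": bucket}}]
    for name, value in sorted(user_metadata.items()):
        # Assumption: user metadata is stored under a "metadata" object field.
        filters.append({"match": {f"metadata.{name}": value}})
    return {"query": {"bool": {"filter": filters}}}


# Hypothetical bucket name and metadata value.
body = build_metadata_query("medical-images", {"modality": "MRI"})
print(json.dumps(body, indent=2))
```

The printed body can be pasted into the Kibana Dev Tools console as the body of a `GET <index>/_search` request, with `<index>` replaced by the index configured for search integration.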

Refer to this guide for complete instructions on configuring StorageGRID search integration with Elasticsearch.

Enhancing StorageGRID log analytics and troubleshooting experience

Most businesses are required to archive and analyze logs as part of their compliance regulations. They must regularly perform system log monitoring and analysis to search for errors, anomalies, or suspicious or unauthorized activity. Log analysis allows them to recreate the chain of events that led up to a problem and effectively troubleshoot it.

StorageGRID 11.6 supports external syslog servers:

Messages from various sources (logs and nodes) are collected in an external centralized system (or database).
The centralized system provides a correlated view of all the log data generated by StorageGRID.
Offloading logs keeps them from filling up the StorageGRID local file system and keeps older logs from being deleted.
Logs can be archived to ensure compliance with governance and regulatory mandates.
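When hunting for errors on a central syslog server, severity is usually the first filter. In the standard syslog format, each message begins with a `<PRI>` value equal to facility × 8 + severity. The Python sketch below decodes that value; the sample line, including its host name and message text, is hypothetical and not actual StorageGRID output.

```python
import re

# Severity names in numeric order (0 = emergency ... 7 = debug).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_priority(line):
    """Return (facility, severity_name) from a syslog line's leading <PRI> field."""
    match = re.match(r"<(\d{1,3})>", line)
    if match is None:
        raise ValueError("line does not start with a <PRI> field")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, SEVERITIES[severity]


# Hypothetical message for illustration.
sample = "<131>Aug 15 10:02:11 dc1-sn1 example: hypothetical message text"
facility, severity = decode_priority(sample)
print(facility, severity)
```

In an ELK deployment, Logstash typically performs this decoding during ingestion, so Kibana dashboards can filter on severity directly.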

ELK has become one of the most popular log analytics solutions for software-driven businesses, with thousands of organizations relying on it for log analysis and management. Watch this NetApp TV video and follow the instruction document to see how ELK can enhance the StorageGRID log analysis and troubleshooting experience.

Enhancing Elasticsearch data protection using StorageGRID

Now that you are ready to use Elasticsearch to index StorageGRID metadata and log messages, you might want to protect this valuable data. Elasticsearch's "snapshot and restore" feature backs up snapshots to an off-cluster storage location called a snapshot repository. The snapshot repository can be on cloud storage such as AWS S3, Google Cloud Storage, or Microsoft Azure, or on on-premises S3-compatible storage such as StorageGRID.

Create a snapshot repository using StorageGRID

1. Add the S3 client access key and secret key to the Elasticsearch keystore. See the Elasticsearch guide for details.

Sample commands:

/usr/share/elasticsearch/bin/elasticsearch-keystore add s3.client.sg.access_key

/usr/share/elasticsearch/bin/elasticsearch-keystore add s3.client.sg.secret_key

2. Create an s3 repository. From the Kibana UI, select Management > Dev Tools > Console.

Sample command:

PUT _snapshot/sg_s3_repository
{
  "type": "s3",
  "settings": {
    "bucket": "es-snapshot",
    "client": "sg",
    "endpoint": "sgdemo.netapp.com:10443",
    "path_style_access": "true"
  }
}

bucket: The StorageGRID bucket must already exist and be empty.
client: The client name must match the name used in step 1.
endpoint: The fully qualified domain name (FQDN) or IP address of the StorageGRID S3 endpoint. If this endpoint uses a self-signed certificate or a certificate issued by a CA that is not publicly trusted, add the certificate to the Elasticsearch Java CA trust store and restart Elasticsearch before creating this snapshot repository.
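With the repository registered, a snapshot is created by a `PUT` to `_snapshot/<repository>/<snapshot>`. The Python sketch below builds that request without sending it. The Elasticsearch URL and the snapshot name are hypothetical, and authentication and TLS settings are left out because they depend on your cluster.

```python
import urllib.request


def snapshot_request(es_url, repository, snapshot):
    """Build (but do not send) the request that creates a snapshot in a repository."""
    # wait_for_completion=true makes the call return only after the snapshot finishes.
    url = f"{es_url}/_snapshot/{repository}/{snapshot}?wait_for_completion=true"
    return urllib.request.Request(url, method="PUT")


# Hypothetical cluster URL and snapshot name; the repository name comes from step 2.
request = snapshot_request(
    "https://localhost:9200", "sg_s3_repository", "snapshot-2022-08-15"
)
print(request.get_method(), request.full_url)
# To send it, pass the request to urllib.request.urlopen() along with the
# credentials and CA settings your cluster requires.
```

The same request can be issued from the Kibana Dev Tools console as `PUT _snapshot/sg_s3_repository/snapshot-2022-08-15?wait_for_completion=true`.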

StorageGRID can easily integrate with third-party or open-source solutions to enhance usability, meet your application requirements, and protect your data.

Click here for more StorageGRID enablement solutions. Or go to the StorageGRID website for more information on StorageGRID.