Tech ONTAP Blogs

Integrating ServiceNow with Astra Control for IT operations management

MichaelHaigh
NetApp

 

Introduction 

 

As organizations embrace the transformative capabilities of Kubernetes for modernizing their IT infrastructure, the challenge of effectively managing application data in this dynamic environment becomes increasingly critical. NetApp® Astra™ Control offers a robust and reliable solution for Kubernetes data management, with industry-leading recovery point objective (RPO) and data protection for your Kubernetes applications.

 

Although Astra Control provides detailed alerting and notification features in its user interface, existing ServiceNow customers might prefer a more streamlined approach to managing their operations. The good news is that the Astra Control API makes it simple to integrate its notifications with the ServiceNow event and/or incident management system.

 

In this blog post, we walk through two techniques for integrating Astra Control and ServiceNow, highlighting the pros and cons of each. By understanding the benefits of this integration, organizations can enhance their data management practices and optimize incident response, improving overall IT service delivery in their Kubernetes environments.

 

Kubernetes CronJob backups

 

A simple way to integrate Astra Control with ServiceNow is to use a Kubernetes CronJob to manage application backups and/or snapshots. This approach allows you to set custom schedules directly within the CronJob definition, and if a backup operation fails, a ServiceNow event and/or incident can be created within the same framework.

 

This approach is straightforward in smaller environments, but it can be difficult to manage at scale. It involves four steps:

 

  1. Secret creation
  2. Backup script modification
  3. CronJob update and apply
  4. CronJob verification

 

Secret creation 

 

To initiate application backups against Astra Control, the Kubernetes CronJob must have appropriate access information and privileges mounted to the backup pod. The following example uses the Astra Control SDK, so a config.yaml file is needed. This file contains the API authorization header, the account ID, and the Astra project (the Astra Control FQDN).

 

To create this file, run the following commands, substituting your Astra Control account ID, API authorization token, and project name. If you’re not sure about these values, additional information can be found in the authentication section of the main SDK readme page on GitHub.

 

API_TOKEN=NL1bSP5712pFCUvoBUOi2JX4xUKVVtHpW6fJMo0bRa8=
ACCOUNT_ID=12345678-abcd-4efg-1234-567890abcdef
ASTRA_PROJECT=astra.netapp.io
cat <<EOF > config.yaml
headers:
  Authorization: Bearer $API_TOKEN
uid: $ACCOUNT_ID
astra_project: $ASTRA_PROJECT
EOF

 

Your config.yaml file should look like this:

 

$ cat config.yaml
headers:
  Authorization: Bearer NL1bSP5712pFCUvoBUOi2JX4xUKVVtHpW6fJMo0bRa8=
uid: 12345678-abcd-4efg-1234-567890abcdef
astra_project: astra.netapp.io
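
Before creating the secret, it can help to confirm that all three values were actually substituted, because an unset shell variable silently produces an empty entry. The following is a minimal sketch and not part of the SDK: check_astra_config is a hypothetical helper name, and it only checks that each entry is present and non-empty, not that the values are valid.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Astra Control SDK): confirm that
# config.yaml contains a non-empty value for each of the three entries
# written by the heredoc above.
check_astra_config() {
    file=$1
    for key in 'Authorization: Bearer ' 'uid: ' 'astra_project: '; do
        # The trailing "." requires at least one character after the key
        if ! grep -q "${key}." "${file}"; then
            echo "missing or empty entry: ${key}" >&2
            return 1
        fi
    done
    return 0
}

# Usage:
#   check_astra_config config.yaml && echo "config.yaml looks complete"
```

If the check fails, re-export the variable named in the message and regenerate the file.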

 

Next, apply your secret to the namespace of the application to be protected:

 

NAMESPACE=wordpress
kubectl -n $NAMESPACE create secret generic astra-control-config --from-file=config.yaml

 

If the Astra Control backup fails, the CronJob script can automatically create a ServiceNow incident or event. For this functionality, ServiceNow authentication information (for an account that can open an event or incident via an API call) must be stored as a secret within the Kubernetes namespace.

 

To store this authentication information, run the following command, substituting in the appropriate ServiceNow information:

 

kubectl -n $NAMESPACE create secret generic servicenow-auth \
    --from-literal=snow_instance='dev99999.service-now.com' \
    --from-literal=snow_username='admin' \
    --from-literal=snow_password='thisIsNotARealPassword'
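
Because the backup script builds its request URLs as https://${snow_instance}/..., the snow_instance value should be a bare hostname without a scheme or trailing slash. The sketch below shows one way to normalize a pasted URL before storing it. The function name normalize_snow_instance is hypothetical and used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reduce a pasted ServiceNow URL to the bare hostname
# expected in the snow_instance secret value.
normalize_snow_instance() {
    host=$1
    host=${host#https://}   # strip a leading https:// if present
    host=${host#http://}    # strip a leading http:// if present
    host=${host%%/*}        # drop everything from the first slash onward
    printf '%s\n' "${host}"
}

# Usage:
#   kubectl -n $NAMESPACE create secret generic servicenow-auth \
#       --from-literal=snow_instance="$(normalize_snow_instance 'https://dev99999.service-now.com/')" \
#       --from-literal=snow_username='admin' \
#       --from-literal=snow_password='thisIsNotARealPassword'
```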

 

Backup script modification

 

The backup script displayed below (and stored in the examples/servicenow section of the SDK repository) carries out the following actions:

 

  • Accepts two arguments, the app ID and the number of backups to keep on the system, both of which are controlled via the cron.yaml environment variables. 
  • Initiates (and monitors) an Astra Control backup. 
  • Checks whether the number of successful backups is greater than the desired amount, and if so, deletes the necessary number of backups. 
  • If there are any errors with the previous two steps, the create_sn_event() function is called, which creates a ServiceNow event with the relevant details. 
  • The file_sn_incident() function, which opens a ServiceNow incident, is not called by default, but it can be invoked in the same manner as the create_sn_event() function, depending on business practices.

 

#!/bin/sh

# This variable is used for uniqueness across backup names, optionally change to a more preferred format
BACKUP_DESCRIPTION=$(date "+%Y%m%d%H%M%S")

# Error Codes
ebase=20
eusage=$((ebase+1))
eaccreate=$((ebase+4))
eaclist=$((ebase+5))
eacdestroy=$((ebase+6))
esnowticket=$((ebase+7))

file_sn_incident() {
    errmsg=$1
    app=$2
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx), so the rc check below catches them
    curl "https://${snow_instance}/api/now/table/incident" \
        --fail \
        --request POST \
        --header "Accept:application/json" \
        --header "Content-Type:application/json" \
        --data "{\"short_description\": \"${app}: ${errmsg}\", \"urgency\": \"2\", \"impact\": \"2\"}" \
        --user "${snow_username}":"${snow_password}"
    rc=$?
    if [ ${rc} -ne 0 ] ; then
        echo "--> Error creating ServiceNow incident with error message: ${errmsg}"
        exit ${esnowticket}
    fi
}

create_sn_event() {
    errmsg=$1
    app=$2
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx), so the rc check below catches them
    curl "https://${snow_instance}/api/global/em/jsonv2" \
        --fail \
        --request POST \
        --header "Accept:application/json" \
        --header "Content-Type:application/json" \
        --user "${snow_username}":"${snow_password}" \
        --data @- << EOF
{
    "records": [
        {
            "source": "Instance Webhook",
            "node": "Astra Control",
            "resource": "${app}",
            "type":"Astra Control Disaster Recovery Issue",
            "severity":"3",
            "description":"${errmsg}",
            "additional_info": "{\"optional-key1\": \"optional-value1\", \"optional-key2\": \"optional-value2\"}"
        }
    ]
}
EOF
    rc=$?
    if [ ${rc} -ne 0 ] ; then
        echo "--> Error creating ServiceNow event with error message: ${errmsg}"
        exit ${esnowticket}
    fi
}

astra_create_backup() {
    app=$1
    echo "--> creating astra control backup"
    actoolkit create backup ${app} cron-${BACKUP_DESCRIPTION}
    rc=$?
    if [ ${rc} -ne 0 ] ; then
        ERR="error creating astra control backup cron-${BACKUP_DESCRIPTION} for ${app}"
        create_sn_event "${ERR}" "${app}"
        exit ${eaccreate}
    fi
}

astra_delete_backups() {
    app=$1
    backups_keep=$2

    echo "--> checking number of astra control backups"
    backup_json=$(actoolkit -o json list backups --app ${app})
    rc=$?
    if [ ${rc} -ne 0 ] ; then
        ERR="error running list backups for ${app}"
        create_sn_event "${ERR}" "${app}"
        exit ${eaclist}
    fi
    num_backups=$(echo $backup_json | jq  -r '.items[] | select(.state=="completed") | .id' | wc -l)

    while [ ${num_backups} -gt ${backups_keep} ] ; do

        echo "--> backups found: ${num_backups} is greater than backups to keep: ${backups_keep}"
        oldest_backup=$(echo ${backup_json} | jq '.items[] | select(.state=="completed")' | jq -s | jq -r 'min_by(.metadata.creationTimestamp) | .id')
        actoolkit destroy backup ${app} ${oldest_backup}
        rc=$?
        if [ ${rc} -ne 0 ] ; then
            ERR="error running destroy backup ${app} ${oldest_backup}"
            create_sn_event "${ERR}" "${app}"
            exit ${eacdestroy}
        fi

        sleep 120
        echo "--> checking number of astra control backups"
        backup_json=$(actoolkit -o json list backups --app ${app})
        rc=$?
        if [ ${rc} -ne 0 ] ; then
            ERR="error running list backups for ${app}"
            create_sn_event "${ERR}" "${app}"
            exit ${eaclist}
        fi
        num_backups=$(echo $backup_json | jq  -r '.items[] | select(.state=="completed") | .id' | wc -l)
    done

    echo "astra control backups at ${num_backups}"
}

#
# "main"
#
app_id=$1
backups_to_keep=$2
if [ -z "${app_id}" ] || [ -z "${backups_to_keep}" ] ; then
    echo "Usage: $0 <app_id> <backups_to_keep>"
    exit ${eusage}
fi

astra_create_backup ${app_id}
astra_delete_backups ${app_id} ${backups_to_keep}
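
The functions above decide success from the exit code of curl, which by default reflects transport-level failures rather than HTTP error responses such as 401 or 403. One possible refinement, sketched here under the assumption that you want to act on the HTTP status itself, is to capture the status code with the curl --write-out option. The is_http_success helper is a hypothetical name and is not part of the SDK script.

```shell
#!/bin/sh
# Hypothetical helper: succeed (return 0) only for 2xx HTTP status codes.
is_http_success() {
    case "$1" in
        2[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (commented out so the block can be sourced without network access).
# This also works as a quick credentials check against the incident table:
#   status=$(curl --silent --output /dev/null --write-out '%{http_code}' \
#       --user "${snow_username}:${snow_password}" \
#       "https://${snow_instance}/api/now/table/incident?sysparm_limit=1")
#   is_http_success "${status}" || echo "ServiceNow returned HTTP ${status}"
```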

 

If the default script functionality matches your business needs, you can leave it as is. If you need to modify the script (for instance, switching to the file_sn_incident() function, or customizing the event creation payload), then you should either fork the SDK repository or store the file on a local file share under your organization’s control. Be sure to update line 38 of cron.yaml to the new URL (detailed in the next section).
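
If you do customize the payload, keep in mind that the error message is interpolated directly into a JSON string, so a double quote or backslash in the message would produce invalid JSON. The following sketch shows one way to escape those two characters with sed. The json_escape name is hypothetical, and the helper deliberately does not handle newlines or other control characters.

```shell
#!/bin/sh
# Hypothetical helper: escape backslashes and double quotes so that a value
# can be embedded in a JSON string. Newlines and other control characters
# are NOT handled by this sketch.
json_escape() {
    printf '%s\n' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Usage inside a function such as create_sn_event:
#   errmsg=$(json_escape "$1")
```

Backslashes are escaped first so that the backslashes added for the quotes are not escaped a second time.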

 

CronJob update and apply

 

A sample CronJob is displayed below and can also be found in the examples/servicenow directory of the SDK repository. At a high level, this CronJob is based on a vanilla Alpine container image, installs the actoolkit Python package, and downloads and executes the backup.sh script. 

 

Customize the following lines for your environment. (Line numbers can be viewed within the GitHub repository.)

 

  • 6. (Optional) Modify the CronJob schedule per business requirements. 
  • 25-26. Update the APP_ID environment variable to match your application ID. 
  • 27-28. (Optional) Update the BACKUPS_TO_KEEP environment variable to set the number of successful backups that should be kept. 
  • 38. If the backup.sh script needs to be modified for your environment, then update the download path to a repository or file share under your organization’s control. 

 

apiVersion: batch/v1
kind: CronJob
metadata:
  name: astra-backup
spec:
  schedule: "0 23 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          volumes:
            - name: astra-control-config
              secret:
                secretName: astra-control-config
          containers:
          - name: alpine-actoolkit
            image: alpine:latest
            imagePullPolicy: IfNotPresent
            envFrom:
            - secretRef:
                name: servicenow-auth
            env:
              - name: ACTOOLKIT_VERSION
                value: "2.6.6"
              - name: APP_ID
                value: "3baa9263-4cac-4168-9556-2bc290539c33"
              - name: BACKUPS_TO_KEEP
                value: "3"
            command: ["/bin/sh"]
            args:
            - -c
            - >
              echo "Starting install" &&
              apk add py3-pip curl jq &&
              python3 -m pip install --upgrade pip &&
              python3 -m pip install actoolkit==$ACTOOLKIT_VERSION &&
              echo "Starting file download and execution" &&
              curl -sLO https://raw.githubusercontent.com/NetApp/netapp-astra-toolkits/main/examples/servicenow/backup.sh &&
              sh backup.sh $APP_ID $BACKUPS_TO_KEEP
            volumeMounts:
              - mountPath: /etc/astra-toolkits
                name: astra-control-config
                readOnly: true
          restartPolicy: Never

 

To apply the Kubernetes CronJob, run the following command: 

 

kubectl -n $NAMESPACE apply -f cron.yaml
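
A malformed schedule is an easy mistake to make when editing cron.yaml. As a rough pre-check, the sketch below counts the whitespace-separated fields of a schedule string; the standard cron form used in the sample has five. The cron_field_count name is hypothetical, and the helper does not validate the field values themselves.

```shell
#!/bin/sh
# Hypothetical helper: print the number of whitespace-separated fields in a
# cron schedule string.
cron_field_count() {
    # Run in a subshell with globbing disabled so that '*' is not expanded
    # to file names when the string is split into fields.
    ( set -f; set -- $1; echo $# )
}

# Usage:
#   [ "$(cron_field_count '0 23 * * *')" -eq 5 ] || echo "unexpected schedule format"
```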

 

CronJob verification

 

To verify the status of the CronJob, run the following command: 

 

kubectl -n $NAMESPACE get cronjobs

 

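The following output indicates that the CronJob has not yet executed (LAST SCHEDULE is <none>). The example output was captured with a */10 * * * * test schedule rather than the nightly schedule in the sample above: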

 

$ kubectl -n $NAMESPACE get cronjobs
NAME           SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
astra-backup   */10 * * * *   False     0        <none>          78s

 

Wait for the interval specified by your schedule to elapse, at which point the LAST SCHEDULE field is populated:

 

$ kubectl -n $NAMESPACE get cronjobs
NAME           SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
astra-backup   */10 * * * *   False     0        3m57s           86m
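
Rather than waiting for the next scheduled run, a one-off Job can be created from the CronJob template with kubectl create job --from. The wrapper below is an illustrative sketch: the trigger_cronjob_now function and the -manual- job name suffix are assumptions, and the KUBECTL variable exists only so that the command can be swapped out when testing.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example, with a stub) for testing.
KUBECTL=${KUBECTL:-kubectl}

# Hypothetical helper: create a one-off Job from an existing CronJob.
trigger_cronjob_now() {
    namespace=$1
    cronjob=$2
    # Time-based suffix keeps repeated manual runs from colliding on name
    job_name="${cronjob}-manual-$(date "+%H%M%S")"
    ${KUBECTL} -n "${namespace}" create job --from="cronjob/${cronjob}" "${job_name}"
}

# Usage:
#   trigger_cronjob_now "$NAMESPACE" astra-backup
```

The resulting pod can then be inspected with the same get pods and logs commands used for scheduled runs.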

 

To view the status of the backup, first get the pod name: 

 

$ kubectl -n $NAMESPACE get pods
NAME                          READY   STATUS      RESTARTS   AGE
astra-backup-27903990-d2w82   0/1     Completed   0          4m10s
wordpress-597fbbf884-pxk58    1/1     Running     0          24h
wordpress-mariadb-0           1/1     Running     0          24h

 

And then view the logs of the astra-backup pod:

 

$ kubectl -n $NAMESPACE logs astra-backup-27903990-d2w82 | tail
Starting file download and execution
--> creating astra control backup
{"type": "application/astra-appBackup", "version": "1.1", "id": "af6ac8e9-ee14-4ea7-a5c3-a75984fe4c67", "name": "cron-20230120183019", "bucketID": "361aa1e0-60bc-4f1b-ba3b-bdaa890b5bac", "state": "pending", "stateUnready": [], "metadata": {"labels": [{"name": "astra.netapp.io/labels/read-only/triggerType", "value": "backup"}], "creationTimestamp": "2023-01-20T18:30:25Z", "modificationTimestamp": "2023-01-20T18:30:25Z", "createdBy": "8146d293-d897-4e16-ab10-8dca934637ab"}}
Starting backup of 30f1b2c2-ff63-4431-a8af-94db8e4671d3
Waiting for backup to complete..complete!
--> checking number of astra control backups
--> backups found: 4 is greater than backups to keep: 3
Backup b6ceaedb-3d07-438b-9f51-86a6bbb58e1d destroyed
--> checking number of astra control backups
--> backups at 3

 

ServiceNow application 

 

Although the workflow just described works well for smaller environments with only a few apps, it can be difficult to manage at scale. For larger environments, NetApp recommends creating a simple application directly within ServiceNow, which then queries the Astra Control notifications API endpoint periodically.

 

This app consists of four main components:

 

  • A table that stores the Astra Control notifications
  • A REST message that queries the Astra Control notifications endpoint
  • Event registration, which allows the creation of new events
  • A scheduled script that periodically:
    • Gets the most recent notification from the table
    • Makes an API call to the notifications endpoint
    • Adds new notifications to the table
    • Creates an event for new notifications
    • Optionally opens an incident for critical or warning notifications (commented out by default)
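
The deduplication idea behind the scheduled script can be illustrated outside ServiceNow: only notifications whose sequenceCount is greater than the largest value already stored are treated as new. The shell sketch below applies that comparison to a list of sequence counts read from standard input. The function name filter_new_sequence_counts is hypothetical, and the real logic runs as JavaScript inside ServiceNow.

```shell
#!/bin/sh
# Hypothetical helper: read one sequenceCount per line on stdin and print
# only those greater than the last value already stored.
filter_new_sequence_counts() {
    last=$1
    # "+ 0" forces a numeric (not string) comparison in awk
    awk -v last="${last}" '$1 + 0 > last + 0 { print $1 }'
}

# Usage:
#   printf '41\n42\n43\n' | filter_new_sequence_counts 41
```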

 

To create this ServiceNow application, you can either follow the steps below or you can import the update set stored in the Astra Control SDK repository. If you are using the update set, be sure to read Scheduled script creation, later in this document, for details about how to update the script for your environment.

 

App creation

 

In the All dropdown menu of your ServiceNow instance, enter System Applications and then click Studio.

MichaelHaigh_0-1690466368255.png

In the new browser tab that opens automatically, click Create Application.

MichaelHaigh_1-1690466368256.png

In the wizard pop-up, click Let’s Get Started.

MichaelHaigh_2-1690466368258.png

Enter a name (such as Astra Control) and a description (such as Astra Control Notifications), and optionally upload an image. Click Create.

MichaelHaigh_3-1690466368259.png

Click Continue in Studio (Advanced). 

MichaelHaigh_4-1690466368260.png

Select the Astra Control application. 

MichaelHaigh_5-1690466368262.png

 

Table creation

 

Click Create Application File and leave Data Model and Table selected in the pop-up. Click Create.

MichaelHaigh_6-1690466368263.png

Create a label (such as AC-Notifications), deselect the Create Module button, and then click Submit.

MichaelHaigh_7-1690466368264.png

Click the New button in the Table Columns section.

MichaelHaigh_8-1690466368266.png

Enter String in the Type field, enter Body in the Column Label field, and enter 255 in the Max Length field (this value is not enforced). Click Submit.

MichaelHaigh_9-1690466368268.png

Close the Body column tab (which returns you to the AC-Notifications table tab), and then click the New button to create another column.

MichaelHaigh_10-1690466368269.png

Enter Integer in the Type field and enter SequenceCount in the Column Label field. Click Submit.

MichaelHaigh_11-1690466368271.png

Scroll down, select the Database Indexes tab, and then click New.

MichaelHaigh_12-1690466368273.png

In the pop-up, select the SequenceCount field in the Available column and move it to the Selected column. Select the Unique Index checkbox and then click Create Index. 

MichaelHaigh_13-1690466368274.png

You do not need to request a notification for when the index creation completes; because the table is empty, the index is created almost immediately.

 

REST message creation

 

Click Create Application File, select Outbound Integrations in the left pane, select REST Message, and then click Create.

MichaelHaigh_14-1690466368276.png

Fill in the following fields, leaving all others at their defaults, and then click Submit:

 

  • Name: Astra Control - List Notifications
  • Description: Astra Control - List Notifications REST call.
    endpoint = "core/v1/notifications"
  • Endpoint: https://${AstraFQDN}/accounts/${AccountID}/core/v1/notifications
  • HTTP Request: HTTP Header 
    • Authorization
    • ${API_Token}

MichaelHaigh_15-1690466368278.png

Scroll down and select the Default GET method. 

MichaelHaigh_16-1690466368279.png

Append ?orderBy=eventTime%20desc to the end of the endpoint URL and then click Update. Close the Default GET tab.

MichaelHaigh_17-1690466368281.png
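
To preview what the REST message will return, the same endpoint can be queried from a terminal. The sketch below assembles the URL from the Astra Control FQDN and account ID. The notifications_url function name is hypothetical, and the commented usage assumes that the API_TOKEN and ACCOUNT_ID variables from the CronJob section are still set.

```shell
#!/bin/sh
# Hypothetical helper: build the notifications endpoint URL, including the
# orderBy query string appended to the Default GET method.
notifications_url() {
    fqdn=$1
    account_id=$2
    # %%20 in the printf format emits a literal %20 (URL-encoded space)
    printf 'https://%s/accounts/%s/core/v1/notifications?orderBy=eventTime%%20desc\n' \
        "${fqdn}" "${account_id}"
}

# Usage (commented out so the block can be sourced without network access):
#   curl --silent --header "Authorization: Bearer ${API_TOKEN}" \
#       "$(notifications_url astra.netapp.io "${ACCOUNT_ID}")" | jq '.items[0]'
```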

 

Event registration creation

 

As mentioned previously, this integration can create events and/or open incidents, depending on whether your organization uses event management. If you do not use event management, skip to the next section.

 

Click Create Application File, select Server Development in the left pane, select Event Registration, and then click Create.

MichaelHaigh_18-1690466368283.png

Fill in the Event Registration fields with the following values, or according to business practices, and then click Submit:

 

  • Suffix: AstraControlEventNotification
  • Table: The table previously created (AC-Notifications)
  • Fired by: Scheduled Script Execution - Astra Control List Notifications Cron
  • Description: This event is fired for each Astra Control Notification

MichaelHaigh_19-1690466368284.png

 

Scheduled script creation

 

Click Create Application File, select Server Development in the left pane, select Scheduled Script Execution, and then click Create.

MichaelHaigh_20-1690466368286.png

Enter a name, such as List Notifications Cron, and then change the Run dropdown menu to Periodically. In the Repeat Interval field that appears, choose a minute value according to business practices (for example, anything from 1 to 20). The lower the value, the more frequently ServiceNow will query Astra Control. Optionally modify the starting time, leave all other fields as default, and click Submit.

MichaelHaigh_21-1690466368287.png

Paste the following script in the script box, update the following lines, and click Update. (Line numbers can be found within the GitHub repository.)

 

  • 1-3 must be updated with your table, REST message, and event queue names, respectively. These values can be found in the open tabs.
  • 8 can be left alone if you’re using Astra Control Service, but if you’re running Astra Control Center, it should be updated with your ACC FQDN or IP.
  • 9-10 can be gathered from the API access section of the Astra Control UI.
  • 42-52 define a function that creates an incident; however, by default it is not invoked. Optionally change any parameters per business practices.
  • 55-60 define a function that creates an event; optionally change any parameters according to business practices.
  • 72 invokes the createEvent() function for each new notification; optionally comment out if not using event management.
  • 75-78 are the commented-out lines that call the openIncident() function for any new warning or critical notification; optionally uncomment them to open incidents directly.

 

TABLE_NAME = "x_1067116_astra_co_ac_notifications";
REST_NAME = "x_1067116_astra_co.Astra Control - List Notifications";
EVENTQ_NAME = "x_1067116_astra_co.AstraControlEventNoti";

// Performs a GET API call against the notifications endpoint in Astra Control
function makeApiCall() {
    var r = new sn_ws.RESTMessageV2(REST_NAME, "Default GET");
    r.setStringParameterNoEscape("AstraFQDN", "astra.netapp.io");
    r.setStringParameterNoEscape("AccountID", "12345678-abcd-4efg-1234-567890abcdef");
    r.setStringParameterNoEscape("API_Token", "Bearer thisIsJustAnExample_token-replaceWithYours==");
    var response = r.execute();
    var responseBody = response.getBody();
    return JSON.parse(responseBody);
}

// Finds and returns the largest sequencecount value from the database
function getSequenceCount() {
    var lastSequenceCount = 0;
    var target = new GlideRecord(TABLE_NAME);
    target.query(); 
    while(target.next()) {
        if (Number(target.sequencecount) > lastSequenceCount) {
            lastSequenceCount = Number(target.sequencecount);
        }
    }
    gs.info("last sequencecount in DB: " + lastSequenceCount);
    return lastSequenceCount;
}

// Adds a new notification object to the database
function addToDB(jObject) {
    gs.info("Adding notificationID " + JSON.stringify(jObject.id) + " with sequenceCount " + jObject.sequenceCount + " to DB");
    var rinsert = new GlideRecord(TABLE_NAME);
    rinsert.initialize();
    rinsert.setValue("sequencecount", Number(jObject.sequenceCount));
    rinsert.setValue("body", JSON.stringify(jObject));
    rinsert.insert();
    return rinsert;
}

// Opens an incident based on a notification json
function openIncident(jObject) {
    gs.info("Opening case for notificationID " + jObject.id + " with summary: " + jObject.summary);
    var inc = new GlideRecord("incident");
    inc.initialize();
    inc.short_description = "Astra Control: " + jObject.summary;
    inc.description = jObject.description;
    inc.description += "\n\n" + JSON.stringify(jObject, null, 4);
    inc.impact = 2;
    inc.urgency = 2;
    inc.insert();
}

// Creates an event
function createEvent(jObject, glideRecordObject) {
    gs.info("Creating event for notificationID " + jObject.id + " with summary: " + jObject.summary);
    var summary = "Astra Control: " + jObject.summary;
    var description = JSON.stringify(jObject, null, 4);
    gs.eventQueue(EVENTQ_NAME, glideRecordObject, summary, description);
}

try {
    var lastSequenceCount = getSequenceCount();
    var rjson = makeApiCall();
    // Loop through the notifications response
    for (var i = 0; i < rjson.items.length; i++) {
        var sequenceCount = Number(JSON.stringify(rjson.items[i].sequenceCount));
        // If true, then it's a new notification
        if (sequenceCount > lastSequenceCount) {
            var glideRecordObject = addToDB(rjson.items[i]);
            // Create an event for all notification types
            createEvent(rjson.items[i], glideRecordObject);

            // Optionally uncomment if you prefer to directly open a case
            //var severity = JSON.stringify(rjson.items[i].severity);
            //if (severity.contains("warning") || severity.contains("critical")) {
                //openIncident(rjson.items[i]);
            //}
        }
    }
}
catch(ex) {
    gs.error(ex.message);
}

 

MichaelHaigh_22-1690466368289.png

Either wait for the cron to execute the script or click the Execute Script button to run the script manually.

 

Verification

 

When the script has executed, switch to (or open) the AC-Notifications Table tab, scroll down, and click Show List under Related Links.

MichaelHaigh_23-1690466368291.png

The AC Notifications Table opens, populated with entries that match your Astra Control notifications.

MichaelHaigh_24-1690466368292.png

Note: If the table is not populated with your entries, a step was likely missed or a value in the script is incorrect. Follow the next two steps to view the system log, which should contain additional information about what went wrong. Optionally add gs.info() statements to the script to further debug it.

 

Switch to the main ServiceNow browser tab, enter System Log in the All dropdown menu, and then select All under System Log.

MichaelHaigh_25-1690466368295.png

You should see informational logging about actions carried out during the script execution.

MichaelHaigh_26-1690466368297.png

If you are using event management, enter Event Log in the All dropdown menu and then select Event Log.

MichaelHaigh_27-1690466368299.png

Verify that events have been created as expected.

MichaelHaigh_28-1690466368301.png

If you are using incident creation, enter Incidents in the All dropdown menu, then select Incidents in the Service Desk section.

MichaelHaigh_29-1690466368303.png

Verify that incidents have been created as expected.

MichaelHaigh_30-1690466368305.png

 

Conclusion

 

In this blog post, we explored why and how to integrate NetApp Astra Control with ServiceNow's IT management capabilities for Kubernetes data management. We highlighted two methods for this integration:

 

  • Kubernetes CronJob: Automates Astra Control backups and files ServiceNow events and incidents in case of failures. 
  • ServiceNow application: Queries Astra Control notifications API endpoint to create events and incidents directly in ServiceNow. 

 

By using either of these integration techniques, organizations can proactively monitor their Kubernetes environment, enhance incident management, and drive operational efficiency. Embracing this alliance enables businesses to excel in the fast-evolving digital landscape and stay ahead of the competition.

 

Take advantage of NetApp’s continuing innovation

 

To see for yourself how easy it is to protect persistent Kubernetes applications with Astra Control, by using either its UI or the powerful Astra Toolkit, apply for a free trial. Get started today! 
