Tech ONTAP Blogs
As organizations embrace the transformative capabilities of Kubernetes for modernizing their IT infrastructure, the challenge of effectively managing application data in this dynamic environment becomes increasingly critical. NetApp® Astra™ Control offers a robust and reliable solution for Kubernetes data management, with industry-leading recovery point objective (RPO) and data protection for your Kubernetes applications.
Although Astra Control boasts detailed alerting and notification features in its user interface, existing ServiceNow customers might prefer a more streamlined approach to manage their operations. The good news is that the Astra Control API makes it remarkably simple to integrate its notifications seamlessly with the ServiceNow event and/or incident management system.
In this blog post, we delve into a couple of techniques for integrating these two platforms, Astra Control and ServiceNow, highlighting the pros and cons of each. By understanding the benefits of this integration, organizations can enhance their data management practices and optimize incident response, improving overall IT service delivery in their Kubernetes environments.
A simple way to integrate Astra Control with ServiceNow is to use a Kubernetes CronJob to manage application backups and/or snapshots. This approach lets you set custom schedules directly in the CronJob definition, and if a backup operation fails, the same job can create a ServiceNow event and/or incident.
This solution is straightforward in smaller environments, but it can be difficult to manage at scale. It involves four steps: creating the Astra Control access secret, creating the ServiceNow authentication secret, reviewing the backup script, and deploying the CronJob.
To initiate application backups against Astra Control, the Kubernetes CronJob must have appropriate access information and privileges mounted to the backup pod. The following example uses the Astra Control SDK, so it needs a config.yaml file that holds the API authorization header, the account ID (uid), and the Astra Control project name (astra_project).
To create this file, run the following commands, substituting your Astra Control account ID, API authorization token, and project name. If you’re not sure about these values, additional information can be found in the authentication section of the main SDK readme page on GitHub.
API_TOKEN=NL1bSP5712pFCUvoBUOi2JX4xUKVVtHpW6fJMo0bRa8=
ACCOUNT_ID=12345678-abcd-4efg-1234-567890abcdef
ASTRA_PROJECT=astra.netapp.io
cat <<EOF > config.yaml
headers:
  Authorization: Bearer $API_TOKEN
uid: $ACCOUNT_ID
astra_project: $ASTRA_PROJECT
EOF
Your config.yaml file should look like this:
$ cat config.yaml
headers:
  Authorization: Bearer NL1bSP5712pFCUvoBUOi2JX4xUKVVtHpW6fJMo0bRa8=
uid: 12345678-abcd-4efg-1234-567890abcdef
astra_project: astra.netapp.io
Next, apply your secret to the namespace of the application to be protected:
NAMESPACE=wordpress
kubectl -n $NAMESPACE create secret generic astra-control-config --from-file=config.yaml
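Before storing the file as a secret, it can help to confirm that it contains the keys shown above. The following is a minimal sketch, not part of the Astra Control SDK: check_astra_config is a hypothetical helper name, and it only checks that each key is present, not that the values are valid.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Astra Control SDK): confirm that a
# config.yaml contains the three keys used in this post.
check_astra_config() {
    cfg=$1
    for key in Authorization uid astra_project; do
        # Each key must start a line, optionally preceded by indentation
        if ! grep -q "^[[:space:]]*${key}:" "${cfg}"; then
            echo "missing ${key}"
            return 1
        fi
    done
    echo "ok"
}

# Example usage (assumes config.yaml is in the current directory):
#   check_astra_config config.yaml && \
#     kubectl -n "$NAMESPACE" create secret generic astra-control-config --from-file=config.yaml
```

Running the check first avoids creating a secret that the backup pod cannot use.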
If the Astra Control backup fails, the CronJob script can automatically create a ServiceNow incident or event. To support this, credentials for a ServiceNow account that can open an event or incident through an API call must be stored as a secret within the Kubernetes namespace.
To store this authentication information, run the following command, substituting in the appropriate ServiceNow information:
kubectl -n $NAMESPACE create secret generic servicenow-auth \
    --from-literal=snow_instance='dev99999.service-now.com' \
    --from-literal=snow_username='admin' \
    --from-literal=snow_password='thisIsNotARealPassword'
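The CronJob shown later in this post loads this secret with envFrom, so each key becomes an environment variable of the same name inside the backup pod. As a rough sketch, a pre-flight check like the one below could be added to a customized script to fail early when a key is missing. The function name require_snow_env is hypothetical and is not part of the SDK.

```shell
#!/bin/sh
# Hypothetical pre-flight check: the secret's keys are expected to arrive as
# the environment variables snow_instance, snow_username, and snow_password.
require_snow_env() {
    for var in snow_instance snow_username snow_password; do
        # Indirectly read the variable named in ${var}; empty if unset
        eval "val=\${${var}:-}"
        if [ -z "${val}" ]; then
            echo "missing ${var}"
            return 1
        fi
    done
    echo "ok"
}

# Example usage at the top of a customized backup script:
#   require_snow_env || exit 1
```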
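The backup script displayed below (and stored in the examples/servicenow directory of the SDK repository) creates a timestamped backup of the application, deletes the oldest completed backups until only the requested number remain, and creates a ServiceNow event if any of these operations fails: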
#!/bin/sh
# This variable is used for uniqueness across backup names, optionally change to a more preferred format
BACKUP_DESCRIPTION=$(date "+%Y%m%d%H%M%S")
# Error Codes
ebase=20
eusage=$((ebase+1))
eaccreate=$((ebase+4))
eaclist=$((ebase+5))
eacdestroy=$((ebase+6))
esnowticket=$((ebase+7))
file_sn_incident() {
    errmsg=$1
    app=$2
    # --fail makes curl return non-zero on HTTP 4xx/5xx responses, so the check below catches them
    curl "https://${snow_instance}/api/now/table/incident" \
        --fail \
        --request POST \
        --header "Accept:application/json" \
        --header "Content-Type:application/json" \
        --data "{\"short_description\": \"${app}: ${errmsg}\", \"urgency\": \"2\", \"impact\": \"2\"}" \
        --user "${snow_username}:${snow_password}"
    rc=$?
    if [ ${rc} -ne 0 ] ; then
        echo "--> Error creating ServiceNow incident with error message: ${errmsg}"
        exit ${esnowticket}
    fi
}
create_sn_event() {
    errmsg=$1
    app=$2
    # The JSON body is read from the here-document; the closing EOF must start at column 0.
    # --fail makes curl return non-zero on HTTP 4xx/5xx responses.
    curl "https://${snow_instance}/api/global/em/jsonv2" \
        --fail \
        --request POST \
        --header "Accept:application/json" \
        --header "Content-Type:application/json" \
        --user "${snow_username}:${snow_password}" \
        --data @- << EOF
{
  "records": [
    {
      "source": "Instance Webhook",
      "node": "Astra Control",
      "resource": "${app}",
      "type": "Astra Control Disaster Recovery Issue",
      "severity": "3",
      "description": "${errmsg}",
      "additional_info": "{\"optional-key1\": \"optional-value1\", \"optional-key2\": \"optional-value2\"}"
    }
  ]
}
EOF
    rc=$?
    if [ ${rc} -ne 0 ] ; then
        echo "--> Error creating ServiceNow event with error message: ${errmsg}"
        exit ${esnowticket}
    fi
}
astra_create_backup() {
    app=$1
    echo "--> creating astra control backup"
    actoolkit create backup "${app}" "cron-${BACKUP_DESCRIPTION}"
    rc=$?
    if [ ${rc} -ne 0 ] ; then
        ERR="error creating astra control backup cron-${BACKUP_DESCRIPTION} for ${app}"
        # Quote both arguments so the multi-word message is passed as a single parameter
        create_sn_event "${ERR}" "${app}"
        exit ${eaccreate}
    fi
}
astra_delete_backups() {
    app=$1
    backups_keep=$2
    echo "--> checking number of astra control backups"
    backup_json=$(actoolkit -o json list backups --app "${app}")
    rc=$?
    if [ ${rc} -ne 0 ] ; then
        ERR="error running list backups for ${app}"
        create_sn_event "${ERR}" "${app}"
        exit ${eaclist}
    fi
    # Count only the backups that finished successfully
    num_backups=$(printf '%s\n' "${backup_json}" | jq '[.items[] | select(.state=="completed")] | length')
    # Delete the oldest completed backup until only backups_keep remain
    while [ "${num_backups}" -gt "${backups_keep}" ] ; do
        echo "--> backups found: ${num_backups} is greater than backups to keep: ${backups_keep}"
        oldest_backup=$(printf '%s\n' "${backup_json}" | jq -r '[.items[] | select(.state=="completed")] | min_by(.metadata.creationTimestamp) | .id')
        actoolkit destroy backup "${app}" "${oldest_backup}"
        rc=$?
        if [ ${rc} -ne 0 ] ; then
            ERR="error running destroy backup ${app} ${oldest_backup}"
            create_sn_event "${ERR}" "${app}"
            exit ${eacdestroy}
        fi
        # Give Astra Control time to remove the backup before listing again
        sleep 120
        echo "--> checking number of astra control backups"
        backup_json=$(actoolkit -o json list backups --app "${app}")
        rc=$?
        if [ ${rc} -ne 0 ] ; then
            ERR="error running list backups for ${app}"
            create_sn_event "${ERR}" "${app}"
            exit ${eaclist}
        fi
        num_backups=$(printf '%s\n' "${backup_json}" | jq '[.items[] | select(.state=="completed")] | length')
    done
    echo "astra control backups at ${num_backups}"
}
#
# "main"
#
app_id=$1
backups_to_keep=$2
# Quote the variables so the emptiness test behaves predictably when an argument is missing
if [ -z "${app_id}" ] || [ -z "${backups_to_keep}" ] ; then
    echo "Usage: $0 <app_id> <backups_to_keep>"
    exit ${eusage}
fi
astra_create_backup "${app_id}"
astra_delete_backups "${app_id}" "${backups_to_keep}"
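The retention rule in the script above (keep the newest N completed backups and delete the rest, oldest first) can be illustrated in isolation. This is a simplified sketch rather than the script's actual implementation: backups_to_delete is a hypothetical function, and it assumes the input has already been reduced to one "creationTimestamp id" pair per line.

```shell
#!/bin/sh
# Hypothetical illustration of the retention rule.
# stdin: one "creationTimestamp id" pair per line; $1: number of backups to keep.
# Prints the IDs that exceed the retention count, oldest first.
backups_to_delete() {
    keep=$1
    # ISO 8601 UTC timestamps sort chronologically as plain text
    sort | awk -v keep="${keep}" '
        { ids[NR] = $2 }
        END { for (i = 1; i <= NR - keep; i++) print ids[i] }'
}

# Example: four backups with a retention count of three
printf '%s\n' \
    '2023-01-20T18:30:25Z backup-d' \
    '2023-01-17T18:30:25Z backup-a' \
    '2023-01-19T18:30:25Z backup-c' \
    '2023-01-18T18:30:25Z backup-b' | backups_to_delete 3
```

Unlike this sketch, the script deletes one backup at a time and lists the backups again after each deletion, so its count always reflects the current state in Astra Control.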
If the default script functionality matches your business needs, you can leave it as is. If you need to modify the script (for instance, switching to the file_sn_incident() function, or customizing the event creation payload), then you should either fork the SDK repository or store the file on a local file share under your organization’s control. Be sure to update line 38 of cron.yaml to the new URL (detailed in the next section).
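If you customize the payload, keep in mind that an error message containing a double quote or backslash would produce invalid JSON when interpolated directly. The sketch below shows one way to guard against that. The helper names json_escape and incident_payload are hypothetical and not part of the SDK, and the sketch does not handle control characters such as newlines.

```shell
#!/bin/sh
# Hypothetical helper: escape backslashes and double quotes so that a string
# can be embedded in a JSON value. Control characters are not handled.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build an incident body with the same fields that file_sn_incident() sends.
incident_payload() {
    app=$(json_escape "$1")
    errmsg=$(json_escape "$2")
    printf '{"short_description": "%s: %s", "urgency": "2", "impact": "2"}\n' "${app}" "${errmsg}"
}

# Example: the quotes inside the message are escaped in the output
incident_payload "wordpress" 'backup "cron-20230120183019" failed'
```

The resulting string could then be passed to curl with --data in place of the inline JSON.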
A sample CronJob is displayed below and can also be found in the examples/servicenow directory of the SDK repository. At a high level, this CronJob starts from a vanilla Alpine Docker container, installs the actoolkit Python package, and then downloads and executes the backup.sh script.
Customize the following lines for your environment. (Line numbers can be viewed within the GitHub repository.)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: astra-backup
spec:
  schedule: "0 23 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          volumes:
            - name: astra-control-config
              secret:
                secretName: astra-control-config
          containers:
            - name: alpine-actoolkit
              image: alpine:latest
              imagePullPolicy: IfNotPresent
              envFrom:
                - secretRef:
                    name: servicenow-auth
              env:
                - name: ACTOOLKIT_VERSION
                  value: "2.6.6"
                - name: APP_ID
                  value: "3baa9263-4cac-4168-9556-2bc290539c33"
                - name: BACKUPS_TO_KEEP
                  value: "3"
              command: ["/bin/sh"]
              args:
                - -c
                - >
                  echo "Starting install" &&
                  apk add py3-pip curl jq &&
                  python3 -m pip install --upgrade pip &&
                  python3 -m pip install actoolkit==$ACTOOLKIT_VERSION &&
                  echo "Starting file download and execution" &&
                  curl -sLO https://raw.githubusercontent.com/NetApp/netapp-astra-toolkits/main/examples/servicenow/backup.sh &&
                  sh backup.sh $APP_ID $BACKUPS_TO_KEEP
              volumeMounts:
                - mountPath: /etc/astra-toolkits
                  name: astra-control-config
                  readOnly: true
          restartPolicy: Never
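The schedule value uses the standard five-field cron syntax, so "0 23 * * *" runs the job once a day at 23:00. As a small illustration, the sketch below splits a schedule string into its named fields. The function name describe_schedule is hypothetical, and it only checks the field count, not whether each field is a valid cron expression.

```shell
#!/bin/sh
# Hypothetical helper: split a five-field cron schedule into named fields.
describe_schedule() {
    set -f          # disable globbing so "*" stays literal while splitting
    set -- $1       # split the schedule string on whitespace
    set +f
    if [ $# -ne 5 ]; then
        echo "invalid: expected 5 fields, got $#"
        return 1
    fi
    echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
}

describe_schedule "0 23 * * *"
```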
To apply the Kubernetes CronJob, run the following command:
kubectl -n $NAMESPACE apply -f cron.yaml
To verify the status of the CronJob, run the following command:
kubectl -n $NAMESPACE get cronjobs
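The following output indicates that the CronJob has not yet executed (this sample was captured with a */10 * * * * schedule rather than the nightly schedule shown above):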
$ kubectl -n $NAMESPACE get cronjobs
NAME           SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
astra-backup   */10 * * * *   False     0        <none>          78s
Wait the amount of time specified by your schedule, at which point the last schedule field will be populated:
$ kubectl -n $NAMESPACE get cronjobs
NAME           SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
astra-backup   */10 * * * *   False     0        3m57s           86m
To view the status of the backup, first get the pod name:
$ kubectl -n $NAMESPACE get pods
NAME                          READY   STATUS      RESTARTS   AGE
astra-backup-27903990-d2w82   0/1     Completed   0          4m10s
wordpress-597fbbf884-pxk58    1/1     Running     0          24h
wordpress-mariadb-0           1/1     Running     0          24h
And then view the logs of the astra-backup pod:
$ kubectl -n $NAMESPACE logs astra-backup-27903990-d2w82 | tail
Starting file download and execution
--> creating astra control backup
{"type": "application/astra-appBackup", "version": "1.1", "id": "af6ac8e9-ee14-4ea7-a5c3-a75984fe4c67", "name": "cron-20230120183019", "bucketID": "361aa1e0-60bc-4f1b-ba3b-bdaa890b5bac", "state": "pending", "stateUnready": [], "metadata": {"labels": [{"name": "astra.netapp.io/labels/read-only/triggerType", "value": "backup"}], "creationTimestamp": "2023-01-20T18:30:25Z", "modificationTimestamp": "2023-01-20T18:30:25Z", "createdBy": "8146d293-d897-4e16-ab10-8dca934637ab"}}
Starting backup of 30f1b2c2-ff63-4431-a8af-94db8e4671d3
Waiting for backup to complete..complete!
--> checking number of astra control backups
--> backups found: 4 is greater than backups to keep: 3
Backup b6ceaedb-3d07-438b-9f51-86a6bbb58e1d destroyed
--> checking number of astra control backups
--> backups at 3
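If you check these logs regularly, a small filter can confirm that the completion message appears. This is only a sketch: backup_log_status is a hypothetical helper, and it assumes the toolkit keeps printing the "Waiting for backup to complete" line in the form shown above, which may change between actoolkit versions.

```shell
#!/bin/sh
# Hypothetical helper: read pod logs on stdin and report whether the
# completion message from the sample output above is present.
backup_log_status() {
    # \.* matches any number of literal progress dots before "complete!"
    if grep -q 'Waiting for backup to complete\.*complete!'; then
        echo "backup completed"
    else
        echo "backup not confirmed"
        return 1
    fi
}

# Example usage (requires kubectl access to the cluster):
#   kubectl -n "$NAMESPACE" logs astra-backup-27903990-d2w82 | backup_log_status
```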
Although the workflow just described works well for smaller environments with only a few apps, it can be difficult to manage at scale. For larger environments, NetApp recommends creating a simple application directly within ServiceNow, which then queries the Astra Control notifications API endpoint periodically.
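This app consists of four main components: a table that stores the notifications already retrieved, a REST message that queries the Astra Control notifications endpoint, an optional event registration, and a scheduled script that ties them together.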
To create this ServiceNow application, you can either follow the steps below or you can import the update set stored in the Astra Control SDK repository. If you are using the update set, be sure to read Scheduled script creation, later in this document, for details about how to update the script for your environment.
In the All dropdown menu of your ServiceNow instance, enter System Applications and then click Studio.
In the new browser tab that opens automatically, click Create Application.
In the wizard pop-up, click Let’s Get Started.
Enter a name (such as Astra Control) and a description (such as Astra Control Notifications), and optionally upload an image. Click Create.
Click Continue in Studio (Advanced).
Select the Astra Control application.
Click Create Application File and leave Data Model and Table selected in the pop-up. Click Create.
Create a label (such as AC-Notifications), deselect the Create Module button, and then click Submit.
Click the New button in the Table Columns section.
Enter String in the Type field, enter Body in the Column Label field, and enter 255 in the Max Length field (this value is not enforced). Click Submit.
Close the Body column tab (which returns you to the AC-Notifications table tab), and then click the New button to create another column.
Enter Integer in the Type field and enter SequenceCount in the Column Label field. Click Submit.
Scroll down, select the Database Indexes tab, and then click New.
In the pop-up, select the SequenceCount field in the Available column and move it to the Selected column. Select the Unique Index checkbox and then click Create Index.
You don’t need to request a notification for when the index is created; because the table is empty, the index is created almost immediately.
Click Create Application File, select Outbound Integrations in the left pane, select REST Message, and then click Create.
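Fill in the REST message fields, leaving all others at their defaults, and then click Submit. The scheduled script shown later in this post expects the message to be named Astra Control - List Notifications and supplies three variables at run time: AstraFQDN, AccountID, and API_Token.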
Scroll down and select the Default GET method.
Append ?orderBy=eventTime%20desc to the end of the endpoint URL so that the newest notifications are returned first, and then click Update. Close the Default GET tab.
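To preview what ServiceNow will request, you can build the same URL outside ServiceNow. The sketch below makes an assumption: the path /accounts/<account ID>/core/v1/notifications follows the usual Astra Control API layout but is not stated in this post, so confirm it against the Astra Control API documentation. The function name notifications_url is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the notifications URL with the orderBy parameter.
# ASSUMPTION: the endpoint path below is not given in this post; verify it
# against the Astra Control API documentation before relying on it.
notifications_url() {
    fqdn=$1
    account_id=$2
    # %% emits a literal % so the encoded space (%20) survives printf
    printf 'https://%s/accounts/%s/core/v1/notifications?orderBy=eventTime%%20desc\n' "${fqdn}" "${account_id}"
}

notifications_url "astra.netapp.io" "12345678-abcd-4efg-1234-567890abcdef"

# To try the request (requires curl and a valid token in $API_TOKEN):
#   curl --silent --fail --header "Authorization: Bearer ${API_TOKEN}" \
#     "$(notifications_url astra.netapp.io "$ACCOUNT_ID")"
```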
As mentioned previously, this integration can create events and/or open incidents, depending on whether your organization uses event management. If you do not use event management, skip to the next section.
Click Create Application File, select Server Development in the left pane, select Event Registration, and then click Create.
Fill the Event Registration fields with the following values, or according to business practices, and then click Submit:
Click Create Application File, select Server Development in the left pane, select Scheduled Script Execution, and then click Create.
Enter a name, such as List Notifications Cron, and then change the Run dropdown menu to Periodically. In the Repeat Interval field that appears, choose a minute value according to business practices (for example, anything from 1 to 20). The lower the value, the more frequently ServiceNow will query Astra Control. Optionally modify the starting time, leave all other fields as default, and click Submit.
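Paste the following script in the script box, update the values for your environment, and click Update. At a minimum, review the three constants at the top (TABLE_NAME, REST_NAME, and EVENTQ_NAME) and the AstraFQDN, AccountID, and API_Token values in makeApiCall(). (Line numbers can be found within the GitHub repository.)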
TABLE_NAME = "x_1067116_astra_co_ac_notifications";
REST_NAME = "x_1067116_astra_co.Astra Control - List Notifications";
EVENTQ_NAME = "x_1067116_astra_co.AstraControlEventNoti";
// Performs a GET API call against the notifications endpoint in Astra Control
function makeApiCall() {
    var r = new sn_ws.RESTMessageV2(REST_NAME, "Default GET");
    r.setStringParameterNoEscape("AstraFQDN", "astra.netapp.io");
    r.setStringParameterNoEscape("AccountID", "12345678-abcd-4efg-1234-567890abcdef");
    r.setStringParameterNoEscape("API_Token", "Bearer thisIsJustAnExample_token-replaceWithYours==");
    var response = r.execute();
    // Fail early on a non-200 response so that the catch block logs the cause
    if (response.getStatusCode() != 200) {
        throw new Error("Astra Control API returned HTTP " + response.getStatusCode());
    }
    return JSON.parse(response.getBody());
}
// Finds and returns the largest sequencecount value from the database
function getSequenceCount() {
    var lastSequenceCount = 0;
    var target = new GlideRecord(TABLE_NAME);
    target.query();
    while (target.next()) {
        if (Number(target.sequencecount) > lastSequenceCount) {
            lastSequenceCount = Number(target.sequencecount);
        }
    }
    gs.info("last sequencecount in DB: " + lastSequenceCount);
    return lastSequenceCount;
}
// Adds a new notification object to the database
function addToDB(jObject) {
    // Read the sequence count from the notification itself rather than relying on a global variable
    var sequenceCount = Number(jObject.sequenceCount);
    gs.info("Adding notificationID " + jObject.id + " with sequenceCount " + sequenceCount + " to DB");
    var rinsert = new GlideRecord(TABLE_NAME);
    rinsert.initialize();
    rinsert.setValue("sequencecount", sequenceCount);
    rinsert.setValue("body", JSON.stringify(jObject));
    rinsert.insert();
    return rinsert;
}
// Opens an incident based on a notification json
function openIncident(jObject) {
    gs.info("Opening case for notificationID " + jObject.id + " with summary: " + jObject.summary);
    var inc = new GlideRecord("incident");
    inc.initialize();
    inc.short_description = "Astra Control: " + jObject.summary;
    inc.description = jObject.description;
    inc.description += "\n\n" + JSON.stringify(jObject, null, 4);
    inc.impact = 2;
    inc.urgency = 2;
    inc.insert();
}
// Creates an event
function createEvent(jObject, glideRecordObject) {
    gs.info("Creating event for notificationID " + jObject.id + " with summary: " + jObject.summary);
    var summary = "Astra Control: " + jObject.summary;
    var description = JSON.stringify(jObject, null, 4);
    gs.eventQueue(EVENTQ_NAME, glideRecordObject, summary, description);
}
try {
    var lastSequenceCount = getSequenceCount();
    var rjson = makeApiCall();
    // Loop through the notifications response
    for (var i = 0; i < rjson.items.length; i++) {
        var sequenceCount = Number(rjson.items[i].sequenceCount);
        // If true, then it's a new notification
        if (sequenceCount > lastSequenceCount) {
            var glideRecordObject = addToDB(rjson.items[i]);
            // Create an event for all notification types
            createEvent(rjson.items[i], glideRecordObject);
            // Optionally uncomment if you prefer to directly open a case
            //var severity = String(rjson.items[i].severity);
            //if (severity.indexOf("warning") !== -1 || severity.indexOf("critical") !== -1) {
            //    openIncident(rjson.items[i]);
            //}
        }
    }
}
catch(ex) {
    gs.error(ex.message);
}
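The script treats a notification as new only when its sequenceCount is greater than the largest value already stored in the table. To reason about that rule outside ServiceNow, the same comparison can be sketched in shell. The function new_notifications is hypothetical, and it assumes the notifications have been reduced to tab-separated "sequenceCount, summary" lines.

```shell
#!/bin/sh
# Hypothetical illustration of the script's deduplication rule.
# stdin: "<sequenceCount><TAB><summary>" per line
# $1:    highest sequenceCount already stored in the table
new_notifications() {
    # Adding 0 forces a numeric comparison instead of a string comparison
    awk -F '\t' -v last="$1" '($1 + 0) > (last + 0) { print $1 "\t" $2 }'
}

# Example: with 10 already stored, only entries 11 and 12 are treated as new
printf '12\tBackup failed\n11\tBackup completed\n10\tSnapshot completed\n' | new_notifications 10
```

Because only the highest stored value is compared, running the scheduled script repeatedly does not create duplicate events or incidents for notifications that were already processed.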
Either wait for the cron to execute the script or click the Execute Script button to run the script manually.
When the script has executed, switch to (or open) the AC-Notifications Table tab, scroll down, and click Show List under Related Links.
The AC Notifications Table opens, populated with entries that match your Astra Control notifications.
Note: If the table is not populated with your entries, a step was likely missed or misconfigured. Follow the next two steps to view the system log, which should contain additional information about what went wrong. Optionally, add gs.info() statements to the script to debug it further.
Switch to the main ServiceNow browser tab, enter System Log in the All dropdown menu, and then select All under System Log.
You should see informational logging about actions carried out during the script execution.
If you are using event management, enter Event Log in the All dropdown menu and then select Event Log.
Verify that events have been created as expected.
If you are using incident creation, enter Incidents in the All dropdown menu, then select Incidents in the Service Desk section.
Verify that incidents have been created as expected.
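In this blog post, we explored why integrating NetApp Astra Control with ServiceNow's IT management capabilities benefits Kubernetes data management. We highlighted two methods for this integration: a Kubernetes CronJob that runs backups and opens a ServiceNow event or incident on failure, which suits smaller environments, and a ServiceNow application that periodically queries the Astra Control notifications API, which NetApp recommends for larger environments.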
By using either of these integration techniques, organizations can proactively monitor their Kubernetes environment, enhance incident management, and drive operational efficiency. Embracing this alliance enables businesses to excel in the fast-evolving digital landscape and stay ahead of the competition.
To see for yourself how easy it is to protect persistent Kubernetes applications with Astra Control, by using either its UI or the powerful Astra Toolkit, apply for a free trial. Get started today!