Active IQ Unified Manager Discussions

Forcing OCUM to update from WFA

Jim_Robertson
8,401 Views

When I run a workflow from our WFA 4.1 server, WFA does not seem to be immediately aware that the workflow has run. For example, we have workflows that name volumes incrementally: Vol_Prod1, Vol_Prod2, Vol_Prod3, etc. If I run the workflow back to back, it creates the first volume successfully, but when I run it again, it tries to create the same volume name a second time. If I manually do a rediscover in OCUM and then an "Acquire Now" from WFA, WFA becomes aware of the new volume and the workflow then runs correctly.

I was under the impression that WFA has its own local database that is updated when a workflow runs, so it should know about the changes it just made.  Isn't that what reservations are for?

If this is not the case, is there a way to have WFA force OCUM to do a rediscover and then run an "Acquire Now" before kicking off a workflow, to make sure the WFA database has the most recent information?

 

1 ACCEPTED SOLUTION

mbeattie
8,306 Views

Hi Jim,

 

I looked into this for you. As of WFA 4.0, the commands to refresh a cluster in OCUM and to acquire the data source are certified content within WFA.

The certified WFA commands are:

 

Refresh a Cluster discovery in OCUM:

 

  1. Refresh cluster on OnCommand Unified Manager server
  2. Wait for cluster refresh on OnCommand Unified Manager server

Acquire the OCUM Data Source in WFA:

 

  1. Acquire data source
  2. Wait for data source acquisition

Thanks to @sinhaa for providing the above details.

 

It's possible to use these to force a cluster rediscovery and an OCUM datasource update; however, I wouldn't recommend doing that in every workflow. Reservations are designed to avoid this.

It sounds like a reservation issue within the volume creation. Did you use the certified command content or a custom command to create the volume? Are there any reservation errors in the logs?

 

/Matt

If this post resolved your issue, help others by selecting ACCEPT AS SOLUTION or adding a KUDO.


6 REPLIES

coreywanless
8,372 Views

Check to make sure the workflow you are using is considering reserved elements. 

 

1. Edit the workflow

2. Go to the Details tab

3. Make sure "Consider Reserved Elements" is checked.

 

Most of the built-in commands in WFA will create a reservation entry that reserves the name in the database. This lets the next-name algorithm know that the volume already exists.
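For illustration, a next-name query of that kind might look roughly like this (a sketch using Jim's `Vol_Prod` prefix; the actual user-input query in your workflow will differ):

```sql
-- Hypothetical sketch: compute the next numeric suffix for volumes
-- named Vol_Prod1, Vol_Prod2, ... in the WFA cache (cm_storage schema).
-- When "Consider Reserved Elements" is checked, WFA evaluates queries
-- like this against pending reservations as well as the acquired cache,
-- so a volume created seconds ago is already counted.
SELECT COALESCE(MAX(CAST(SUBSTRING(vol.name, 9) AS UNSIGNED)), 0) + 1 AS next_suffix
FROM cm_storage.volume vol
WHERE vol.name REGEXP '^Vol_Prod[0-9]+$';
```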


Jim_Robertson
8,198 Views

Thanks @mbeattie!

Just out of curiosity, what is the downside to running a rediscover before each workflow?  Obviously it would add to the time the workflow takes to complete, but I can't see any other downside.  And if the customer isn't too worried about runtime, it may make sense to prevent this issue.

I did just confirm that if I run a workflow twice in a row, it tries to create the same volume name twice (despite the fact that it is supposed to increment the name) unless I force the updates between runs.  The workflow(s) I'm using originated from the certified workflows, but have been heavily modified from them.  Some of them were modified before I started here by someone who is no longer with the company.

In the Details tab of the workflow, I do have the box checked for "Consider Reserved Elements".  After I run the workflow, if I look at "Reservations", it lists the job as successful, but it says "NO" under "Cache Updated."  If I wait a few minutes, that eventually changes to "YES".


On the one I just ran, I do see a few error messages similar to the following in the workflow logs:
2017-10-18 10:54:35,390 INFO  [com.netapp.wfa.engine.reservations.ReservationsRunnerImpl] (default task-30) Failed to run reservation script for reservation with id '11': Column count doesn't match value count at row 1

 

 

This is the code under the "Reservation Script" of the volume creation command.  It looks the same as the certified command, but maybe our modifications to the command invalidated something in this code?  The reservation script seems like a lot more code than is necessary for just a volume creation:

 

# Create a new volume
INSERT
INTO
    cm_storage.volume
    SELECT
        NULL AS id,
        vs.id AS vserver_id,
        ag.id AS aggregate_id,
        NULL AS parent_volume_id,
        '${VolumeName}' AS name,
        '${Size}' AS size_mb,
        0 AS used_size_mb,
        IF('${SnapshotReservePercentage}' IS NULL || '${SnapshotReservePercentage}'=-1,
        '${Size}',
        '${Size}'-'${SnapshotReservePercentage}'*'${Size}'/100) AS available_size_mb,
        IF('${VolumeType}' IS NULL,
        "rw",
        '${VolumeType}'), -- type
        IF ('${State}'<>'',
        '${State}',
        'online') AS state,
        IF('${JunctionPath}' IS NULL,
        CONCAT("/",
        '${VolumeName}'),
        '${JunctionPath}') AS junction_path,
        '${SpaceGuarantee}' AS space_guarantee,
        0 AS snapshot_used_mb,
        IF('${SnapshotReservePercentage}' IS NULL || '${SnapshotReservePercentage}'=-1,
        0,
        '${SnapshotReservePercentage}') AS snapshot_reserved_percent,
        TRUE AS snapshot_enabled,
        'flex' AS style,
        '${AutosizeMaxSize}' AS max_autosize_mb,
        '64-bit' AS block_type,
        '${SecurityStyle}' AS security_style,
        '${Deduplication}' AS dedupe_enabled,
        '${AutosizeIncrementSize}' AS auto_increment_size_mb,
        sp.id AS snapshot_policy_id,
        NULL AS export_policy_id,
        NULL AS autosize_enabled,
        '${Compression}' AS compression,
        NULL AS deduplication_space_saved_mb,
        NULL AS compression_space_saved_mb,
        NULL AS percent_deduplication_space_saved,
        NULL AS percent_compression_space_saved,
        NULL AS hybrid_cache_eligibility,
        NULL AS inode_files_total,
        NULL AS inode_files_used,
        NULL AS auto_size_mode,
        NULL AS sis_last_op_begin_timestamp,
        NULL AS sis_last_op_end_timestamp,
        NULL AS flexcache_origin_volume,
        NULL AS flexcache_min_reserve_mb,
        NULL AS constituent_role,
        NULL AS is_managed_by_service,
        NULL AS storage_class,
        NULL AS snap_diff_enabled,
        NULL AS max_namespace_constituent_size_mb,
        NULL AS max_data_constituent_size_mb,
        NULL AS efficiency_policy_id,
        qos.id AS qos_policy_group_id,
        IF('${Language}' IS NULL,
        vs.language,
        '${Language}') AS language,
        NULL AS data_daily_growth_rate_mb,
        NULL AS data_days_until_full,
        0 AS auto_delete_enabled,
        NULL AS auto_delete_commitment,
        NULL AS auto_delete_delete_order,
        NULL AS auto_delete_defer_delete,
        NULL AS auto_delete_target_free_space,
        NULL AS auto_delete_trigger,
        NULL AS auto_delete_prefix,
        NULL AS auto_delete_destroy_list
    FROM
        cm_storage.aggregate ag
    JOIN
        cm_storage.node n
            ON ag.name = '${AggregateName}'
            AND ag.node_id = n.id
    JOIN
        cm_storage.vserver vs
            ON vs.name = '${VserverName}'
    JOIN
        cm_storage.cluster cl
            ON (
                cl.primary_address='${Cluster}'
                OR cl.name='${Cluster}'
            )
            AND vs.cluster_id = cl.id
            AND n.cluster_id = cl.id
    LEFT JOIN
        cm_storage.export_policy ep
            ON ep.vserver_id = vs.id
            AND ep.name = '${ExportPolicy}'
    LEFT JOIN
        cm_storage.qos_policy_group qos
            ON qos.name = '${PolicyGroupName}'
            AND qos.vserver_id = vs.id
            AND vs.cluster_id = cl.id
    LEFT JOIN
        cm_storage.snapshot_policy sp
            ON sp.name = CAST( IF('${SnapshotPolicy}' is not null,
        '${SnapshotPolicy}',
        'default') AS CHAR(255) )
        AND sp.vserver_id = vs.id
        AND vs.cluster_id = cl.id ;
#If the '${SnapshotPolicy}' input is not specified, set the volume.snapshot_policy_id to the default snapshot policy id associated with the admin vserver
        IF '${SnapshotPolicy}' is null THEN UPDATE
            cm_storage.volume vol
        JOIN
            cm_storage.vserver vs
                ON vol.name = '${VolumeName}'
                AND vs.type = 'admin'
        JOIN
            cm_storage.cluster cl
                ON (
                    cl.primary_address = '${Cluster}'
                    OR cl.name = '${Cluster}'
                )
                AND cl.id = vs.cluster_id
        JOIN
            cm_storage.snapshot_policy sp
                ON sp.cluster_id = cl.id
                AND sp.name= 'default'
        SET
            vol.snapshot_policy_id = sp.id;
        END IF;
# change aggregate's used capacity
        IF '${SpaceGuarantee}' = 'volume' THEN UPDATE
            cm_storage.aggregate ag
        JOIN
            cm_storage.node n
                ON ag.name = '${AggregateName}'
                AND ag.node_id = n.id
        JOIN
            cm_storage.cluster cl
                ON (
                    cl.primary_address='${Cluster}'
                    OR cl.name='${Cluster}'
                )
                AND n.cluster_id = cl.id
        SET
            ag.used_size_mb = ag.used_size_mb + '${Size}',
            ag.available_size_mb = ag.available_size_mb - '${Size}'
        WHERE
            ag.available_size_mb > 0;
        END IF;
# change aggregate's volume count
        UPDATE
            cm_storage.aggregate ag
        JOIN
            cm_storage.node n
                ON ag.name = '${AggregateName}'
                AND ag.node_id = n.id
        JOIN
            cm_storage.cluster cl
                ON (
                    cl.primary_address='${Cluster}'
                    OR cl.name='${Cluster}'
                )
                AND n.cluster_id = cl.id
        SET
            ag.volume_count = ag.volume_count + 1 ;

 

 

 

 

Reservation Representation:
"Create volume `" + VolumeName + "` of size " + Size + "MB in aggregate `" + AggregateName + "` of cluster `" + Cluster + "` (Storage Virtual Machine `" + VserverName + "`)"
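To start narrowing down the column-count error, I'm assuming I can compare the number of expressions in the SELECT list above against what cm_storage.volume actually has in our WFA version, since the INSERT must supply exactly as many columns as the table defines and the cache schema changes between releases (a sketch, run against the WFA MySQL cache):

```sql
-- Sketch: count the columns cm_storage.volume actually has in this
-- WFA version, then compare against the number of expressions in the
-- reservation script's SELECT list.
SELECT COUNT(*) AS column_count
FROM information_schema.columns
WHERE table_schema = 'cm_storage'
  AND table_name   = 'volume';
```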

 

mbeattie
8,178 Views

Hi Jim

 

"What is the downside to running a rediscover before each workflow?" I do not recommend this approach; it is not scalable or efficient. It would require OCUM to invoke ZAPIs to rediscover the cluster, and then have WFA update the OCUM datasource, each and every time the workflow is executed. Yes, it will work, but it may have unintended consequences in terms of performance and load. It depends on your environment and how frequently the workflow will be invoked. Reservations are designed to avoid having to immediately rediscover the storage and update the WFA datasource. I'd be more inclined to work on resolving the error in the SQL reservation code.

 

/Matt


sinhaa
8,099 Views

@Jim_Robertson @mbeattie

 

"What is the downside to running a rediscover before each workflow?"

 

There are pros and cons in both approaches.

 

I first created these rediscover and data-source-acquire commands when WFA didn't expose the reservation tab for commands, so this approach works even on old WFA versions. Its biggest advantages are simplicity and reliability, which makes it a good fit for beginner-to-intermediate WFA users.

 

 

The biggest problem with the reservation approach is exactly the one you are facing: the complexity of building a query that works correctly. That is the very reason the reservation script wasn't exposed for editing in older WFA versions. You need to take care of a lot of things entirely on your own, and that can be a problem for many WFA users; otherwise you'll be pulled into a rabbit hole that can be difficult to get out of. Reservations are recommended for advanced WFA users.

 

There are other trade-offs as well. The advantage of the reservation approach is that it makes no API calls to OCUM and requires no data source acquisition, so that time is saved. Plus, other very useful WFA features, such as reservations within workflows, element-existence validation, and incremental naming, are only possible with this approach, not with the previous one.

 

 

sinhaa

 


Jim_Robertson
6,789 Views

Reviving this old thread because I have a related question...

 

After this original discussion, I was able to create a pretty simple workflow that refreshes the cluster data on OCUM, and then acquires the data from OCUM.  It's been working very well, so thanks again to everyone for the help putting that in place.

The only problem I have now is that I have to manually define the data source and OCUM server names as strings in the User Inputs.  This isn't a huge deal, but if I have multiple environments (e.g., R&D vs. Production), I have to remember to change that input when I copy the workflow between environments.  Is there a way to pull those names automatically?  It looks like that information is available in the "wfa.data_source" table, but when I try to run a SQL query against it, I get the error:

"Failed to execute query for user input '$WFADataSource':
SELECT command denied to user 'restricted'@'localhost' for table 'data_source'"

Is there another way to pull that data, or to give that user access to that table?  The user I'm logged into WFA as and the local wfa user are both set as admins in WFA.
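For reference, I assume the kind of grant I'm asking about would look something like the following, run as a MySQL admin on the WFA server (a sketch only; I don't know whether an upgrade would revert it, and the 'restricted' user is presumably limited on purpose so user-input queries can only see the cache schemas):

```sql
-- Sketch only: grant read access on wfa.data_source to the account
-- WFA uses for user-input SQL queries. Run as a MySQL admin user.
GRANT SELECT ON wfa.data_source TO 'restricted'@'localhost';
FLUSH PRIVILEGES;
```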

Thanks!
