WFA issue with cache updates

dcornely1 · ‎2013-11-12

Hello,

I'm running the latest version of WFA v2.1.0.70.32 in my environment and I've come across an issue I thought wouldn't exist because I'm using certified commands. The issue is that WFA does not appear to be aware of changes it has made before the OnCommand cache updates occur. Here is the scenario:

Step 1)

I'm creating a new CDOT export policy, export rule, and volume. This flow works without issue and is called ss_cdot_CreateNFSVolumeWithExport

Step 2)

Before WFA and/or OnCommand has had a chance to learn about the change from the step 1 workflow via scheduled cache updates, I attempt to run a workflow that creates a new rule in the policy created in step 1. This fails and will continue to fail until WFA's cache is updated from OnCommand and it learns about this new policy.

This flow is called ss_cdot_CreateExportRule

All the commands in both flows are certified so I had thought that would avoid this issue. I had originally been using a modified No-Op command in the first create flow for the reasons behind this post but even after removing that single non-certified command the problem remains. The only thing I can think of is that I created these flows in WFA v2.0 and recently upgraded to v2.1.

I'm either missing something or have encountered a bug in WFA regarding export policies and cache awareness of them. I'm leaning towards an error I made somewhere and haven't found yet. I'm attaching both flows in the hope that they will reveal where I've tripped up. Hopefully it's something simple. Thanks in advance.

-Dave

ag · ‎2013-11-13

Dave,

How do you provide the input for the export rule specification? It would take me a lot of time to work that out from the workflow internals.

It will be easier if you let me know. Please give an example.

dcornely1 · ‎2013-11-13

The input is based on one of the example workflows that came with WFA, specifically "Create a Clustered Data ONTAP NFS Volume". It's a loop that adds each rule; the number of iterations equals the number of rules requested, and it leverages a couple of different functions.

Both workflows I attached do work successfully independent of each other. The issue is that WFA for some reason is not aware of the export policy it created for the new volume before the next OnCommand cache update occurs, something I thought I was able to work around by using certified commands.

Here is the description from the command I copied:

Export Specification Rule is an input that combines all the Export Rules for a Policy. Each Export Specification Rule is of the form:

<client-specification IP>;read-only rule;read-write rule;superuser rule

Individual rules are separated by an ampersand (&).

For example, the export rule spec '10.10.10.10;ntlm;krb5;sys&20.20.20.20;krb5;sys;ntlm' specifies two export rules:

Export Rule 1) client-specification = 10.10.10.10

read-only rule = ntlm

read-write rule = krb5

superuser rule = sys

Export Rule 2) client-specification = 20.20.20.20

read-only rule = krb5

read-write rule = sys

superuser rule = ntlm
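
As a rough illustration of the format described above, here is a minimal Python sketch that splits an export rule spec into its individual rules. The names `ExportRule` and `parse_export_rule_spec` are made up for this example and are not part of WFA; the sketch only assumes the `&` and `;` separators given in the command description.

```python
from typing import List, NamedTuple


class ExportRule(NamedTuple):
    """One export rule, in the field order given by the command description."""
    client: str
    read_only: str
    read_write: str
    superuser: str


def parse_export_rule_spec(spec: str) -> List[ExportRule]:
    """Split a spec on '&' into rules, then each rule on ';' into four fields."""
    rules = []
    for chunk in spec.split("&"):
        fields = [field.strip() for field in chunk.split(";")]
        if len(fields) != 4 or not all(fields):
            raise ValueError(f"expected 4 non-empty fields, got: {chunk!r}")
        rules.append(ExportRule(*fields))
    return rules


rules = parse_export_rule_spec("10.10.10.10;ntlm;krb5;sys&20.20.20.20;krb5;sys;ntlm")
for rule in rules:
    print(rule)
```

The number of parsed rules is what would drive the loop count in the workflow.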

mgoddard · ‎2013-11-13

Hi Dave,

If you check the Reservations tab, you can see which commands get reservations created. It appears the Create Export Policy/Create Export Rule certified commands do not add reservations. Looks like a bug to me. I've tried creating a few other types of objects with built-in commands for cDOT and they don't have reservations either.

I got around this by defining the object if the search fails and I know it's there, using other attributes from objects that do exist. However, that means if it's not actually there, it fails at execution time rather than evaluation time, which isn't ideal.

Here's my initial list of cDOT commands missing reservations so you don't hit them. It would be really nice if we could create custom reservations for our commands easily in the field!

Missing Reservations

- Create Export Policy

- Create Export Rule

- Create CIFS Share

- Remove CIFS share ACL

- Add CIFS share ACL

Sorry I don't have a better answer, but I can definitely reproduce your problem (and have been working around it myself).

For now, I would suggest defining the object instead of searching for it, since you are passing in the rule name anyway, and the vserver search will work because it's not a new item.

cheers,

- Michael.

ag · ‎2013-11-19

Hi Dave,

I am able to reproduce the issue.

To get around this issue, you can reduce the DFM option "sharemoninterval" to a low value like 30 seconds or 1 minute. That will acquire the data on export policies. Also reduce the acquire interval for data sources on WFA to a low value. But then again, it is a tedious job and you may end up waiting a few minutes between the two workflows.

As Michael said, export policies do not have reservations. It is not a problem with the command.

I am curious as to why you are not using the certified workflow "Create a clustered Data ONTAP NFS volume", which does the same job as your two workflows combined?

Thanks,

Anil

dcornely1 · ‎2013-11-20

Thanks for the info on the timeouts. I've got the updates down low already but don't want to make the OnCommand one so low that it adds unnecessary load to the cluster.

I've split the 2 flows out because I have to solve for a use case where a customer first provisions a new volume/NFS share and then immediately turns around to add more export rules, perhaps because they forgot about them initially. I don't make up these use cases nor do I establish what is a reasonable SLA for these use cases. I just have to do my best to achieve the SLA for the use cases.

mgoddard · ‎2013-11-19

Hi Dave,

Good news! I've created a custom Create Export Policy command that includes the reservations missing from the certified command. You could use it to avoid the problem in a more robust manner. It is attached below.

I tested by replacing the certified command in the first workflow (CreateNFSVolWithExport), and the second workflow now finds the policy before a polling cycle.

Hope that's useful!

Kind Regards,

- Michael.

dcornely1 · ‎2013-11-20

Michael, thank you very much! I'll see if I have time to get this in play before we deploy CDOT this weekend.

francoisbnc · ‎2014-02-27

Michael,

I'm looking for a way to use reservations in my custom commands, for caching purposes.

I saw this in the XML file:

INSERT INTO cm_storage.export_policy

SELECT NULL as id,

PolicyName as name,

vs.id as vserver_id

FROM cm_storage.vserver vs

JOIN

cm_storage.cluster cl

ON (cl.primary_address=Cluster OR cl.name=Cluster)

AND vs.cluster_id = cl.id

AND vs.name = VserverName;

Is it the clue?

Regards,

François

abhit · ‎2014-02-27

Reservation is not supported in custom commands.

This feature may be available in a future release.

Michael, you have used a certified command in the workflow which has a reservation script.

Regards

Abhi

francoisbnc · ‎2014-02-28

Hi abhit,

I exported my custom command to a DAR file and edited the XML to include the <reservationScript> section from the certified command "Clone Volume".

My problem is fixed.

Do you see anything dangerous in using it this way?

François

abhit · ‎2014-02-28

Hi Francois:

Wow. That is a great workaround.

You have to test extensively and qualify it for usage, since it is not a supported or recommended process.

Regards

Abhi

yannb · ‎2014-03-17

Great tip François.

When I did the same thing, it seemed to work, until I ran an acquisition in WFA.

I don't know why yet, but looking at the reservations in the WFA web UI, I got "Cache Updated" YES for an export policy that was not refreshed in OCUM yet (it was "NO" before acquisition). The volume reservation correctly showed "NO" (i.e. waiting for OCUM to report it).

I might have done something wrong. I need to do some research.

francoisbnc · ‎2014-03-18

Hi Yann,

Where did you retrieve the <reservationScript> XML portion from? As certified commands are not exportable, I had to look directly in the MySQL DB.

For that, I installed a separate MySQL server where I have root access, and I restored the full WFA DB. wfa.command_definition was then accessible.

All the information was in

SELECT reservation_script FROM wfa.command_definition

WHERE command LIKE '%clone%';

François

yannb · ‎2014-03-18

I used yours, actually: I copied and pasted the reservation block, but kept my own copy of the Export Policy Create command.

It looked consistent, even if I did not understand how the variable substitution was done.

yannb · ‎2014-03-18

Well, actually it is even weirder... Qtree creation does not populate the qtree table either, but that command is supposed to use reservations... really odd.

ag · ‎2014-03-18

Yann,

To answer this specific question, reservations are not directly committed to the specific tables. In your case, qtrees that are newly created are not directly committed to the qtree table. Rather, they are stored in the wfa.reservation table and will be committed once the acquisition from OCUM confirms the object's presence. However, from the WFA UI, if you were to use a filter to find the newly created qtree, it will use this reservation data and make it appear as though it were taken from the qtree table itself.
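
The behavior described above can be pictured with a small toy model in Python. This is only a sketch of the idea, not how WFA is implemented: the class `ReservationCache` and its methods are hypothetical, with one set standing in for a cache table such as the qtree table and another standing in for wfa.reservation.

```python
class ReservationCache:
    """Toy model: cached rows plus pending reservations (hypothetical, not WFA code)."""

    def __init__(self):
        self.cache = set()         # stands in for a cache table, e.g. the qtree table
        self.reservations = set()  # stands in for wfa.reservation

    def reserve(self, name):
        """A command with a reservation script records the object it just created."""
        self.reservations.add(name)

    def find(self, name):
        """A filter sees cached rows and reserved rows as if they were one table."""
        return name in self.cache or name in self.reservations

    def acquire(self, discovered):
        """Acquisition replaces the cache and retires reservations it confirms."""
        self.cache = set(discovered)
        self.reservations -= self.cache


rc = ReservationCache()
rc.reserve("qtree1")

visible_before_acquire = rc.find("qtree1")   # found through the reservation
in_table_before_acquire = "qtree1" in rc.cache

rc.acquire([])                               # the monitor has not discovered it yet
visible_after_empty_acquire = rc.find("qtree1")

rc.acquire(["qtree1"])                       # the monitor now reports the qtree
in_table_after_discovery = "qtree1" in rc.cache
pending_after_discovery = "qtree1" in rc.reservations
```

In this model the new qtree is visible to a filter right away, but only appears in the table itself once an acquisition reports it.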

yannb · ‎2014-03-19

Yep, I finally figured it out, thanks for the answer!

So, I have a workflow that creates a qtree and an export policy, empty at first.

Then another workflow adds rules to the export policy.

Here is what the reservations show when I run the first workflow:

Here, cache is not updated, good, I expect that. My understanding is that "NO" means "I didn't get that one from OCUM yet"

Then I run "Acquire now" on my OCUM data source in WFA, and here is how it changes reservations :

Export policy is then marked as cache updated... but not the Qtree. It does not make sense because OCUM did not discover that export policy yet.

So now, my second workflow will fail, saying that "No results were found. The following filters have returned empty results:".

It looks like there is an inconsistency between the process that refreshes the cache and the one that populates the database: the reservations say I got the export policy from OCUM, but the export_policy table does not list it.

If I re-discover in OCUM, then run an acquisition from WFA, everything is back to normal and I can reference my export policy again, and both entries are marked as cache updated.

Does that make sense? Would that be a problem with the "hack" or with the SQL query defined in the reservation section?

yannb · ‎2014-03-19

Ok, I got it...

What is missing with this "hack" is the "congruence_test", I guess that one is used in the middle to check the cache for an object.

Full DAR file attached

For the record, here is the test I implemented :

SELECT e.id FROM cm_storage.export_policy e JOIN cm_storage.vserver vs ON e.vserver_id = vs.id AND vs.name = '${VserverName}' JOIN cm_storage.cluster c ON (c.primary_address='${Cluster}' OR c.name='${Cluster}') AND vs.cluster_id = c.id WHERE e.name='${PolicyName}';
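
To try a query like this outside WFA, here is a hedged Python sketch that runs it against an in-memory SQLite database. The three tables contain only the columns referenced in this thread (the real cm_storage schema is assumed to be larger), the sample cluster, vserver and policy names are invented, and bound parameters stand in for WFA's own `${...}` substitution. The helper `policy_is_in_cache` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the cm_storage. prefix resolves.
conn.execute("ATTACH DATABASE ':memory:' AS cm_storage")
conn.executescript("""
CREATE TABLE cm_storage.cluster (id INTEGER PRIMARY KEY, name TEXT, primary_address TEXT);
CREATE TABLE cm_storage.vserver (id INTEGER PRIMARY KEY, name TEXT, cluster_id INTEGER);
CREATE TABLE cm_storage.export_policy (id INTEGER PRIMARY KEY, name TEXT, vserver_id INTEGER);
INSERT INTO cm_storage.cluster VALUES (1, 'cluster1', '192.0.2.10');
INSERT INTO cm_storage.vserver VALUES (1, 'vs1', 1);
""")

# Same joins as the test above, with named parameters instead of ${...}.
CONGRUENCE_TEST = """
SELECT e.id
FROM cm_storage.export_policy e
JOIN cm_storage.vserver vs
  ON e.vserver_id = vs.id AND vs.name = :VserverName
JOIN cm_storage.cluster c
  ON (c.primary_address = :Cluster OR c.name = :Cluster) AND vs.cluster_id = c.id
WHERE e.name = :PolicyName
"""


def policy_is_in_cache(cluster, vserver, policy):
    """Return True when the query finds a matching export policy row."""
    params = {"Cluster": cluster, "VserverName": vserver, "PolicyName": policy}
    return conn.execute(CONGRUENCE_TEST, params).fetchone() is not None


before = policy_is_in_cache("cluster1", "vs1", "policy1")

# Simulate an acquisition that finally brings the policy into the cache table.
conn.execute(
    "INSERT INTO cm_storage.export_policy (name, vserver_id) VALUES ('policy1', 1)"
)
after = policy_is_in_cache("192.0.2.10", "vs1", "policy1")

print(before, after)
```

The second call looks the cluster up by address rather than by name, which exercises the OR in the cluster join.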

francoisbnc · ‎2014-03-19

Anil,

I see the same behavior, even with the certified commands "clone volume" and "remove volume".

My simple workflow deletes the cloned volume first, before cloning again.

In the first round:

Existence was tested with "if volume was not found: disable this command", so the remove volume step was omitted. Good.

Clone created successfully.

In the second round the cache works fine: "remove volume" is executed, and the clone works fine.

As Yann said, if I force an OCUM acquire, cache updated changes to YES, and when I relaunch the workflow, the first step is omitted again. So the delete doesn't occur and the clone fails.

What is wrong?

ag · ‎2014-03-19

Francois,

This looks a bit strange, because I find that there are reservations and congruence tests in both the remove volume and clone volume commands.

With the given description I cannot figure out much.

I will be able to help if you can attach both workflows and a backup of your WFA.