Hi Craig, Thanks for the info, I'll give that a try and report back. I'd still like to know why the limitation exists at all. I'm sure there's a good reason for it, but for the life of me I can't think of any reason to stop SDW from snapshotting a cloned vol. C
Hi, We are using SM SAP 3.3 and SDW 6.4.2 on Windows 2008 Server (VM). We've successfully created a clone of a database and mounted it on another server - this works very well. However, we would like to be able to use SMSAP to take snapshots of the cloned database. It appears that SnapDrive (and therefore SMSAP) doesn't support taking snapshots of LUNs backed by a clone.

We used the volume clone method, so the cloned LUNs are (effectively) within their own volume, so I don't understand why SnapDrive has this limitation. We can snapshot the cloned volume on the storage console, so I don't see why SnapDrive cannot do the same.

Does anyone have a view on this, or know of a workaround, or whether this is a limitation that might disappear in a future release? Looking at the docs for SDW 5.0, it appears to still be a limitation in that release also.

Thanks for any help, Craig
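PS - just to be clear what I mean by snapshotting the cloned volume on the storage console: it's nothing more than a standard manual snapshot, e.g. (the volume and snapshot names below are made up for illustration):

    filer> snap create sap_clone_datavol test_snap

...which works fine, so the cloned volume itself clearly supports snapshots; it only seems to be blocked when going through SnapDrive/SMSAP.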
Thanks for the response Thomas, I agree totally, and I'm not worried, just curious. Well... that, and people keep asking why DFM is alerting for high CPU, and I'm having trouble explaining why it's nothing to worry about! It would be good to have an understanding of what is happening - whether it's an error in the cpu_busy counter, or a domain that isn't included in the sysstat/statit output?
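PS - for reference, the per-domain CPU breakdown I'm comparing against comes from sysstat; something like the following (the -M option needs advanced privilege, I believe):

    filer> priv set advanced
    filer*> sysstat -M 1

...which splits CPU time across the various domains (network, storage, raid, kahuna, etc.), whereas cpu_busy (and hence DFM) reports a single overall figure.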
Having the exact same problem here. Did anyone ever find a solution? First login attempt via LDAP/AD fails every time. Second attempt is OK. Thanks, Craig
Hi Tony, Hope you are well.

Can you check whether the filer(s) are set to send SNMP traps to the DFM server? Check the output of the 'snmp' command on the filer(s): the traphost list should include the hostname or IP of the DFM server, and 'init' should be set to '1'. If not, the following should do the trick:

    snmp traphost add <hostname | IP address>
    snmp init 1

Regards, Craig
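PS - as a concrete example (the DFM server name below is made up; substitute your own hostname or IP):

    filer> snmp traphost add dfm01.example.com
    filer> snmp init 1
    filer> snmp

Running 'snmp' on its own afterwards just redisplays the SNMP settings, so you can confirm the traphost and init values took effect.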
It's a limitation of a given host, ESX or otherwise. The 256 LUN limit applies to each ESX host, but since an RDM needs to be presented to all ESX hosts, the limit is effectively per cluster. It's also a concern here, but so far I've drawn a blank as far as finding a solution (other than creating a new cluster, that is!!). If anyone has any solutions, I'd love to hear them - I know we're not alone!! Craig
Yes, that's correct. As long as a block is locked by other snapshot(s) and/or the AFS, deleting the snapshot will not free it. You may already know this, but you can also use the 'snap reclaimable' command to find out how much space would be reclaimed by deleting a specified snapshot or snapshots.
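For example, to see how much space would be freed by deleting one or more snapshots together (the volume and snapshot names below are just made up for illustration):

    filer> snap reclaimable ora_data_vol nightly.1 nightly.2

Bear in mind it can take a little while to calculate the answer on a large volume.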
Comments inline below (mine marked with >>>):

matt090385 wrote:

Hi, Could someone help me get my head around snapshots and how, in some instances, deleting one snapshot doesn't save me space? My understanding of snapshot technology is also a little sketchy and this doesn't help me with my question above. Here's a step-by-step guide of what I think I know.

AFS contains blocks A, B, C.
T0 snapshot occurs - size of the snapshot is zero (although I'm assuming this is not strictly true, as I believe what happens is that the pointers to the active blocks are copied into the snapshot, so I'll say zero even though it might be a couple of KB). Blocks A, B, C are locked in place.
Block D is added; this goes into the active file system. The snapshot doesn't change size at all, as new data has no effect.
T1 snapshot occurs. Size is zero - blocks locked by this snapshot are A, B, C, D... or is it just block D, as A, B, C are locked by the previous snapshot?

>>> A snapshot 'locks' every block in the AFS at the time, regardless of whether they are already locked by any other snapshot.

T0 snapshot is still zero.
Block D is deleted. T1 snapshot grows by the size of block D. T0 snapshot is still zero.

>>> Correct, but see below...

Block E is created - this has no effect on any snapshots.
Block A is deleted. T1 snapshot grows by the size of block A (so it is now A+D big). T0 snapshot stays at zero.

>>> Here's probably where you are getting confused. I find it's better not to think of individual snapshots having sizes, but rather of the size of all snapshots in a given volume. The reason is that the .snapshot usage at this point will be the sum of blocks locked by snapshots that are not currently part of the active filesystem. So, at this point the snapshots will consume A and D. Both T0 and T1 have locks on block A; deleting one snapshot or the other will not release block A - only deleting both snapshots will.

T2 snapshot is created and is zero - blocks being locked are B, C, E.
Block E is deleted. T2 snapshot goes up by the value of E. T1 is still A+D. T0 is still zero.

>>> Think of it in terms of blocks, and what has a lock on what. So (and I think I've followed you correctly so far):
>>> A is locked by T0, T1
>>> B is locked by T0, T1, T2, AFS
>>> C is locked by T0, T1, T2, AFS
>>> D is locked by T1
>>> E is locked by T2

Is this all correct so far? Right, so if I were to delete T0 I would gain back nothing - sure, I get that. If I were to delete T1, which currently holds A+D, the space reclaimable would be zero, as the two blocks would roll over to T0 since they are still referenced in this snapshot... correct?

>>> Not quite; T1 only has an exclusive lock on block D, so only block D would be freed by deleting snapshot T1.

Does anyone have a way of summing this up, therefore making it easier to understand?

>>> Hope that helps.
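>>> PS - to sum it up using your own example: the space you get back by deleting a single snapshot is just the blocks that snapshot locks exclusively (i.e. blocks not locked by any other snapshot or by the AFS):

    Snapshot   Blocks locked    Exclusively locked   Freed if deleted on its own
    T0         A, B, C          (none)               nothing
    T1         A, B, C, D       D                    D
    T2         B, C, E          E                    E

>>> Deleting T0 and T1 together would additionally free block A, since nothing else would reference it any more.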
Ah, OK, I see: so if either of those options is selected, it changes the criteria from "do this if not found" to "do this if found". A subtle difference, but it works as you say. Thank you very much for the help, Craig
Thanks Sivaprasad K, I appreciate your help with this!

Here's how I had it set up originally (by specifying the dataset name implicitly):

Here's the new setup (searching for the dataset by name):
Hi Sivaprasad K, Thanks for the reply. That works, but it introduces another problem for me: I'm also using the Advanced tab in the command to determine whether to run the command based on the value of the user variable $ProtectionLevel. So, if $ProtectionLevel is Gold, Silver or Bronze it will execute the command; otherwise it will not.

When I set the Dataset name option to "by searching for an existing Dataset" as you suggest, it changes the 'Advanced' tab behaviour for the command - it now only executes the command if the Dataset was not found AND $ProtectionLevel is Gold, Silver or Bronze. What I want is to run the command if the $ProtectionLevel variable is correct and the Dataset was found.

Thanks in advance, Craig
Hi All, Scratching my head here - wondering if someone can help?

I have 4 datasets which do the same thing, but on staggered schedules. This is because I can't update all our Oracle mirrors on the same schedule, as it overloads the source filers. So I have datasets called 'ORA 1', 'ORA 2', 'ORA 3' & 'ORA 4'. When we provision storage for a new database, it gets added to the dataset with the fewest resources (thus keeping them more or less evenly balanced).

I want to do the same thing within a workflow using the 'Add volume to dataset' command. I've created a filter which finds datasets by name prefix and returns the dataset names in order of the number of resources, using the following SQL:

SELECT
    dataset.name,
    dataset.qtree_id,
    dataset.id,
    dataset.dfm_name,
    dataset.volume_id,
    dataset.uuid,
    count(*)
FROM
    storage.dataset
WHERE
    dataset.name like '${name}%'
GROUP BY
    dataset.name
ORDER BY
    count(*) asc

So, if $name is 'ORA %' then it returns my 4 datasets in the correct order. The question is, how can I use this filter/finder in the 'Add Volume to Dataset' command? On the Dataset tab, I only get the option of using an 'Incremental Naming Wizard' for the dataset name field, and I'm not sure this is what I'm looking for. Any thoughts?

WFA 2.0, Windows. Many thanks, Craig
We're having similar concerns with the number of LUNs per ESX cluster. To answer the question about the number of LUNs, one case in point is SAP (on Windows). In order to meet best practice (SAP's and NetApp's for SM SAP) we need 7 RDM LUNs per SAP instance (sapdata1-4, origlogA-B, oraarch). We've moved as much as we can onto VMDKs, but 7 seems to be the fewest we can have in our environment. As a result we are limited to around 37 SAP instances per cluster, which is not many once you account for dev/test and the clones used for verifications, etc. Craig
Hi, I've configured a Protection Manager dataset with a provisioning policy to create secondary (SnapMirror) storage when a primary storage volume is added to the dataset. The dataset protection policy is just a single Mirror. The provisioning policy defines a resource pool containing 4 aggregates at DR, and has the default options (i.e. it just requires RAID-DP).

The process is: add a volume to the dataset; the dataset defines a SnapMirror relationship using a 'Secondary' type provisioning policy for the Mirror, so it will create a secondary volume using a resource pool called 'DR - SATA'. Resource pool 'DR - SATA' contains 4 aggregates, all using 1TB SATA and all the same size. The utilization on these aggregates is as follows:

drfiler1:aggr00_sata = 69%
drfiler1:aggr01_sata = 74%
drfiler2:aggr02_sata = 33%
drfiler2:aggr03_sata = 40%

The question is about how the provisioning policy selects the aggregate to provision the SnapMirror destination volumes. I've tested this but, strangely, it is selecting aggr00_sata for the mirror destination volumes. Based on usage, I would expect it to choose the one with the most free space (drfiler2:aggr02_sata). Generally, the disk I/O and filer CPU are significantly lighter on drfiler2, so I don't think it can be selecting drfiler1 based on performance.

Does anyone know if there are any logs, etc. which can be used to determine what the decision-making process was? Thanks, Craig
Hey Jeremy, OK, now things are making more sense. The db_vol.array.ip did actually work, but that's because I'd previously tried setting the db_vol.array field to 'db_vol.aggregate.array.ip' - after unchecking 'Show only attributes used by Create Volume'. If I remove this from db_vol.array, you are correct, it doesn't work. With it removed, I changed the filter to use db_vol.aggregate.array.ip as you suggest, and it works again. So I guess you could use either, but I think I'll go with your suggestion. Thanks, Craig
Hi All, What I'm trying to do is a little hard to explain, but I'll give it a go... I'm using WFA 2.0 / DFM 5.0.2. I'm creating an Oracle provisioning workflow based on our specific requirements. I'm mostly there, but got stuck on one part.

In an earlier step (db_vol), I create a volume, allowing WFA to select the aggregate by available space from a resource pool. This works just fine, but later I want to create a qtree for redo logs. Say we have 2 filers: one has a volume called ORA_REDO_01, the other a vol called ORA_REDO_02, for example. These exist prior to running the workflow. I want to create the qtree in either ORA_REDO_01 or ORA_REDO_02, depending on which filer the 'db_vol' was created on.

Here's how the workflow looks so far: I'm trying to use a filter to identify the volume for the 'redo_qtree' command (it should be ORA_REDO_01 or ORA_REDO_02 depending on the filer). I use the filter 'volume in array by name pattern', then specify 'db_vol.array' for the 'Array IP or Name' field, and 'ORA_REDO' as the pattern to search for. This fails with the error:

"Failed to evaluate resource selector. Found variable - expected literal At command 'Create QTree', tab 'Qtree', variable 'redo_qtree', property 'volume'"

So it's not expecting a variable here. If I replace the db_vol.array variable with a string value (one of the filer names in quotes) to test, it works. If I put the db_vol.array variable in quotes in the filter and run a preview, I get the error:

"Workflow aborted. No results were found. The following filters have returned empty results: volume in array by name pattern At command 'Create QTree', tab 'Qtree', variable 'redo_qtree', property 'volume'"

Can anyone give me any pointers to resolve this? Thanks in advance, Craig
Hi, "-Initiate a backup of datastore A in site A with our templates on the primary site using VSC, which will cause snapmirror replication to take place from site A to site B." >> Correct. "-Site A must refrain from writes to datastore A until snapmirror replication completes, but Site A can continue reads" >> No, Site A can continue reads and writes, but it would make sense to have the templates in a static state (ie not being changed) when the SnapMirror update is initiated (ie the VSC job is run). The filer will create a snapshot of datastore A which will be used for the mirror update to site B. You can still write to Datastore A, but only data that was present when the snapshot was created will be replicated. "-Site B can continue reads from datastore B during the replication. (It can never write to datastore B)" >> Technically, Site B can continue reads from Datastore B, but it is possible data will change as the mirror update completes. It would be good practice not to deploy from any templates at site B until the mirror update is completed. The mirror destination volume cannot be written to from Site B unless the mirror is broken. "-After the replication, Site A can resume writes." >> As above. I'm assuming you have a dedicated datastore for Templates? If not, I would recommend this. Hope that helps, Craig
Thanks Bill and Adai,

Yesterday I enabled a tree quota on that qtree and, sure enough, it generates alerts/alarms in DFM. All good. This makes sense.

I'm now trying to figure out why the qtree was showing full from the client, however, and this I'm struggling with. So, to recap: the volume has 197GB free and only one qtree. The 'effective' used, accounting for dedupe savings, meant the qtree was effectively full. That made sense in a way, until I decided to look at some other volumes which also have /etc/quotas entries that apply no limits (they are just there so I get capacity stats in DFM). Here's an example that appears to break the previous theory:

df -h xxxxxx_fsdep2
Filesystem                       total     used    avail  capacity  Mounted on
/vol/xxxxxx_fsdep2/             3250GB   2931GB    318GB       90%  /vol/xxxxxx_fsdep2/
/vol/xxxxxx_fsdep2/.snapshot     812GB     74GB    737GB        9%  /vol/xxxxxx_fsdep2/.snapshot

df -hs xxxxxx_fsdep2
Filesystem                 used    saved  %saved
/vol/xxxxxx_fsdep2/      2931GB   1112GB     28%

The effective used is 2931GB + 1112GB = 4043GB, which is more than the total size of the volume, right? This volume has 24 qtrees, which is the only significant difference I can see. The /etc/quotas entry looks like the following, and I've confirmed quotas are on for this vol:

* tree@/vol/xxxxxx_fsdep2 - - - - -

From a Windows CIFS client, I can mount one of the qtrees in this volume and Windows reports 318GB free of 3.17TB, which matches the df -h output, not the effective used accounting for dedupe. Am I missing something?

Thanks, Craig
Hi, We've come across a strange situation today; I was wondering if you could advise how best to deal with it.

We have a qtree which is very nearly full. It is the only qtree in the containing volume and there is no other data outside the qtree in that volume. The containing volume has 197GB free:

Filesystem                        total    used    avail  capacity  Mounted on
/vol/xxxxxx_fsdata1/              820GB   622GB   197GB       76%   /vol/xxxxxx_fsdata1/
/vol/xxxxxx_fsdata1/.snapshot     502GB   355GB   147GB       71%   /vol/xxxxxx_fsdata1/.snapshot

There is only one qtree in that volume (called Profiles). According to quota report there is 806GB used in this qtree:

vfilerxx@filerxx> quota report
                                        K-Bytes                   Files
Type  ID  Volume          Tree             Used  Limit       Used  Limit  Quota Specifier
----- --- --------------- --------  ----------- ------  ---------  -----  ---------------
tree  *   xxxxxx_fsdata1  -                   0      -          0      -  *
tree  1   xxxxxx_fsdata1  Profiles    845595168      -   12409822      -  /vol/xxxxxx_fsdata1/Profiles

Note that there are no usage limits set on this qtree - it should just be using free space in the volume. The /etc/quotas entry looks like this:

* tree@/vol/xxxxxx_fsdata1 - - - - -

The Profiles qtree is shared via CIFS. When connecting via a Windows client, it shows as nearly full (e.g. 818GB used of 820GB). I believe this difference is due to the dedupe savings. If I look at the volume usage including dedupe savings, it looks like this:

vfilerxx@filerxx> df -hs xxxxxx_fsdata1
Filesystem                used   saved  %saved
/vol/xxxxxx_fsdata1/     622GB   208GB     25%

...so I'm assuming the qtree usage is accounting for the effective used, rather than the actual used values.

The main area of concern here is that Operations Manager (version 5.0.1) has not generated any alerts for this condition. Looking at the event history for the qtree, I see no alerts at all. The quota nearly full and quota full global alert thresholds are set to the defaults (i.e. 80% and 90%) and the quota object in Operations Manager does not have any custom alerts set. If I look at the quota summary in Operations Manager, it shows the Capacity Used as 99%, so why no alert?

Any help to understand this would be appreciated. Craig
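PS - the numbers do roughly line up with that theory: df shows 622GB used plus 208GB of dedupe savings, i.e. around 830GB of logical data, which is in the same ballpark as the ~806GB the quota report shows for the qtree (845595168 KB / 1024 / 1024 ≈ 806GB) and the ~818GB used that the Windows client reports.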