The vFilers are showing up now - but it's really frustrating that there is no VAAI support with vFilers. Anyone know when VAAI + vFiler support is coming? thanks!
So I found some vFilers are being discovered now, but others are showing up as unknown, and it seems those don't have SSL enabled. secureadmin shows that SSH is enabled, but where is SSL configured at the vFiler level?

secureadmin status
ssh2    - active
ssh1    - active

thanks
According to http://communities.netapp.com/docs/DOC-6534, vFilers should be discovered as of VSC 2.0 and ONTAP 7.3.1.1 - so I'll need to troubleshoot why they are not in my case.

MultiStore vFiler unit support

Does VSC support MultiStore vFiler units?
Yes, vFilers are supported as of VSC 2.0. vFilers are discovered in the same manner as physical controllers and have the same ONTAP version restrictions as physical controllers. All vFiler hostnames will be prepended with "MultiStore:" in the hostname of the Overview panel. VSC only communicates with the vFiler; it cannot determine the physical controller (aka vFiler0). vFiler storage totals are calculated by adding up all the available storage owned by the vFiler. vFilers don't have a concept of disks the way physical controllers do, so the Total Capacity and Total Allocated columns will be the same.

What version of Data ONTAP is required for vFiler support?
Data ONTAP 7.3.1.1+ is required for VSC to discover vFilers. Please see the NetApp Interoperability Matrix for specific supported configurations.

What VSC features are not available on a MultiStore vFiler unit?
vFilers do not support SSL.
Supported protocols column - displays the licensed protocols allowed by the controller. As mentioned above, VSC only talks to the actual vFiler, so when a controller is a vFiler unit this column presents the protocols "in use". For instance, if an NFS export is found by VSC, NFS will be listed; if an iSCSI LUN is discovered as well, both NFS and iSCSI will be listed.
Overall Status column - displays the operational state of the controller. If VSC is communicating with the vFiler, it's in the "running" state. To be consistent with physical controllers, a state of "running" is presented as "Normal". vFilers in any of the other four states (these are non-operational states) will not be discoverable by VSC.
Overall Status Reason - displays the detailed status returned by the controller. vFilers do not return a status reason, so the following string is printed instead: "This controller is a MultiStore vFiler unit."
The following columns are not applicable to vFilers and will be left blank:
Controller partner
CFmode
CFmode status
FilerView cannot be launched from VSC for vFilers.
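Since the 7.3.1.1 minimum keeps coming up, here's a minimal sketch of the version gate VSC effectively applies, using plain tuple comparison on the dotted version string. The helper names are mine, not VSC's; only the 7.3.1.1 threshold comes from the FAQ above.

```python
import re

def parse_ontap_version(version_string):
    """Extract the dotted numeric version from e.g. 'NetApp Release 7.3.5.1P2'."""
    match = re.search(r"(\d+(?:\.\d+)+)", version_string)
    if not match:
        raise ValueError("no version number found: %r" % version_string)
    return tuple(int(part) for part in match.group(1).split("."))

def supports_vfiler_discovery(version_string, minimum=(7, 3, 1, 1)):
    """True when the ONTAP release meets the 7.3.1.1 floor for vFiler discovery."""
    # Python compares tuples lexicographically, so (7, 3, 5, 1) >= (7, 3, 1, 1)
    # and a short prefix like (7, 3, 1) correctly falls below (7, 3, 1, 1).
    return parse_ontap_version(version_string) >= minimum
```

So a 7.3.5.1P2 controller passes the check, while a bare 7.3.1 would not - which is why the post below running 7.3.5.1P2 should, in principle, be discoverable.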
Hi, I just upgraded to the latest VSC 2.1.1 and was hoping to see the long-awaited MultiStore (vFiler) support. We are 100% NetApp vFiler datastores, and with vSphere 5 we're looking forward to the VAAI goodness. But does VSC 2.1.1 support vFilers? I'm running NetApp Release 7.3.5.1P2 and I don't see them appearing in VSC - it's all blank. Are vFilers still not supported? If so, what extra config is needed? thanks http://vmadmin.info
I opened a case to make sure all the issues are covered and to formulate a plan for an initial reallocation, then scheduled reallocation jobs. For instance, I read there are free-space requirements for volume reallocate jobs. I'll post our findings once we get some before-and-after reallocation numbers. thanks
We added a 3rd DS4243 shelf (24 x 15K RPM disks) to our existing aggregate. I ran some benchmarks before and after: http://www.vmadmin.info/2011/08/more-spindlesmore-vm-throughput.html Which led me to the whole reallocate-volume question. I've run 'reallocate measure', and the values returned recommend reallocation on about half my volumes. These volumes are 2-4 TB and run 200 VMs on a 3270 cluster. Is it safe to run the reallocate live, after hours? Or will this cause too much load? What other things are there to be aware of? I am looking to optimize the volumes for VM throughput and latency, and I plan to re-run the benchmark once reallocation is complete. thanks
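To triage which volumes to reallocate first, one option is to script over the measurement results. This is only a sketch: the log-line format and the threshold of 4 are my assumptions here - check /etc/log/reallocate and the reallocate man page on your filer for the real output format and the recommended threshold.

```python
import re

# ASSUMED line format for a 'reallocate measure' result; adjust the regex
# to whatever your filer actually logs.
MEASURE_LINE = re.compile(r"on '(?P<vol>[^']+)' is (?P<value>\d+)")

def volumes_needing_reallocation(log_lines, threshold=4):
    """Return (volume, value) pairs whose measured value exceeds the threshold,
    worst first. The threshold default is an assumption, not NetApp guidance."""
    candidates = []
    for line in log_lines:
        match = MEASURE_LINE.search(line)
        if match and int(match.group("value")) > threshold:
            candidates.append((match.group("vol"), int(match.group("value"))))
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)
```

Running the worst offenders first, one per night, would also answer the "too much load" worry empirically before committing to a full schedule.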
The good news is our before-and-after VM benchmarks show a near-linear spindle:VM throughput improvement with the addition of the 21 disks to the 46-disk aggregate: http://www.vmadmin.info/2011/08/more-spindlesmore-vm-throughput.html One open question: will WAFL re-stripe the existing VM volumes over the newly added disk spindles over time? (We CLONED the VM to ensure the VM was striped across all disks for the AFTER benchmark.) Will existing VMs need a similar operation to gain the full benefits of the added spindles? thanks
I know this will all go away when we upgrade to 64-bit aggregates - but this was very frustrating. We have an existing aggregate of 2 x DS4243 shelves (48 disks x 266 GB right-sized). ONTAP shows the existing capacity as 9.8 TB (via df -Ah). We went to add one more shelf (24 x 266 GB) and found the size slightly larger than the 16 TB limit:

aggr add aggr2 -d 3d.02.0 3d.02.1 3d.02.2 3d.02.3 3d.02.4 3d.02.5 3d.02.6 3d.02.7 3d.02.8 3d.02.9 3d.02.10 3d.02.11 3d.02.12 3d.02.13 3d.02.14 3d.02.15 3d.02.16 3d.02.17 3d.02.18 3d.02.19 3d.02.20 3d.02.21
Note: preparing to add 20 data disks and 2 parity disks.
Continue? ([y]es, [n]o, or [p]review RAID layout) y
Aggregate size 16.08 TB exceeds limit 16.00 TB
File system size 16.08 TB exceeds maximum 15.99 TB
aggr add: Can not add specified disks to the aggregate because the aggregate size limit for this system type would be exceeded.

Ok, fine, we'll add 21 of the 24 disks instead (overkill on the spares for now). Now that is not the most frustrating part (remember, we will get this back when we go 64-bit). The frustrating part is that when we added 21 disks we ended up not with ~15.8 TB (as you'd expect: 16.08 - 0.266 = 15.81 TB) but with only 14 TB (note the aggregate snapshot reserve is disabled/zero):

df -Ah
Aggregate                total       used      avail capacity
aggr2                     14TB     8324GB     6257GB      57%
aggr2/.snapshot            0TB        0TB      0TB       ---%

Can someone explain why ONTAP appears to keep two sets of books for these calculations? If there is overhead, it should be included in the final usable numbers so these discrepancies are eliminated. thanks Fletcher http://vmadmin.info
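For what it's worth, the numbers above do reconcile if you assume the 16 TB limit is checked against data-disk capacity before the WAFL reserve, while df -Ah reports capacity after a ~10% WAFL reserve. The disk counts below (42 data + 6 parity in the existing aggregate, 19 data + 2 parity in the 21-disk add) are my assumptions about the RAID-DP layout, so treat this as a back-of-the-envelope check, not an official formula.

```python
RIGHT_SIZED_GB = 266   # right-sized capacity per disk, from the post
WAFL_RESERVE = 0.10    # assumed WAFL overhead reserve

def limit_tb(data_disks):
    """Size checked against the 16 TB cap: data disks only, before WAFL reserve."""
    return data_disks * RIGHT_SIZED_GB / 1024

def usable_tb(data_disks):
    """What df -Ah shows as 'total': data-disk capacity minus the WAFL reserve."""
    return data_disks * RIGHT_SIZED_GB * (1 - WAFL_RESERVE) / 1024

# Existing aggregate: assumed 42 data disks -> ~9.8 TB usable, as df reported.
# Proposed add of 22 disks (20 data): 62 data disks -> ~16.1 TB, over the cap.
# Actual add of 21 disks (19 data): 61 data disks -> ~14.2 TB usable, the "14TB".
print(round(usable_tb(42), 1), round(limit_tb(62), 2), round(usable_tb(61), 2))
```

So there aren't really two sets of books, just two reference points: the limit is enforced pre-reserve, and df reports post-reserve - but agreed, it would be far less confusing if both were stated in usable terms.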
Yes, so the GOS is aligned, but some operations may still not be aligned (like Oracle logging). Other questions:
1) How do we gauge the relative significance of this unaligned I/O?
2) Why are there multiple counters listed for the same file?
3) What do the values of the counters mean?
4) When should any action be taken on this data?
http://vmadmin.info
Well, the NetApp engineer not-so-helpfully just emailed a link, completely ignoring the nfsstat -d output:

"Please refer to the following knowledge base article link which shows how to identify and fix misaligned Windows Virtual Machine disks in your environment: https://kb.netapp.com/support/index?page=content&id=1011402 Please let me know if you need any further assistance in this regard."

I just came across a "soon to come" teaser from http://www.vmdamentals.com/ - "The devil is in the details: How aligned VMs may still be misaligned". Sounds like our issue...
Friday at 8pm we experienced a latency event (spike) that was logged by 30+ VMs. I've opened a case on this to see what role misaligned I/O, as reported by nfsstat -d, is playing. thanks
Keith, thanks for the tip on scheduling the dedup BEFORE the nightly snapshots - I moved the dedup job up to two hours before the midnight snapshots. It's too early to tell for sure, but last night's snap delta of 47.48 GB (as reported by FilerView) was less than half the pre-schedule-change deltas. On this volume we have 94 VM images averaging 22 GB used per VM. I've moved the dedup job up for our second-biggest VM volume too, to see if the snap delta reduction is realized there as well. Our goal is to be able to retain more daily snaps in the same space. thanks!
We're doing nightly snapshots, which by default happen at midnight (can you change the timing of nightly snaps?). Dedup for this volume was scheduled for midnight as well. I've moved the dedup to 10pm and we'll see if this helps. There has to be a way to tell which files (VMs) are contributing the most to snap deltas - if NetApp is serious about being a cloud storage solution, they need a tool for gauging this. I don't have an obvious culprit VM like a DB or defrag job. thanks
Hi - we have a 3.5 TB NFS datastore running about 30 VMware virtual machines. We try to maintain 21 days of daily NetApp snapshots, but lately the daily deltas are >100 GB/day and it's becoming challenging to keep 21 days without growing the volume. (The data:snapshot space ratio is about 1:2 - the snapshots take twice as much space as the actual VM images - and this is with dedup ON.) How can we best determine which set of VMs is contributing the most to the daily snapshot delta (that 100 GB)? Armed with this information we can then make decisions about potentially Storage vMotioning VMs to other datastores to meet the 21-day retention SLA. thanks http://vmadmin.info
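Since 7-Mode has no per-file snap delta, one rough workaround is to sample per-VM directory sizes on the NFS mount (e.g. 'du -s *' from a Linux host) a day apart and rank VMs by apparent growth. This is only a proxy under a stated assumption: in-place overwrites also consume snapshot space without growing the file, so a VM can rank low here and still be a big delta contributor.

```python
def rank_vm_growth(sizes_before_kb, sizes_after_kb, top_n=5):
    """Rank VMs by apparent size growth between two {vm_dir: size_kb} samples.
    New VMs count their full size; shrunken VMs rank low or negative."""
    all_vms = set(sizes_before_kb) | set(sizes_after_kb)
    growth = {
        vm: sizes_after_kb.get(vm, 0) - sizes_before_kb.get(vm, 0)
        for vm in all_vms
    }
    return sorted(growth.items(), key=lambda item: item[1], reverse=True)[:top_n]
```

Feeding this two days of 'du -sk' output per VM directory would at least shortlist candidates for Storage vMotion before anyone starts guessing.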
Hi - we run a daily report using the mbrscan utility to check all VMDK files for alignment. Recently (>7.3.5?) NetApp added an nfsstat -d switch to report "Files Causing Misaligned IO's". I am finding mbrscan reports a VMDK as aligned:Yes, but nfsstat -d reports the same file's misaligned I/O counter increasing. I zeroed out the stats with nfsstat -z to be sure, and yes, the counters increase by several thousand between nfsstat -d runs (about 1 minute apart) - in the case of mcomm below:

[root@backup-02 mcomm]# /opt/netapp/santools/mbrscan *flat*vmdk
--------------------
mcomm_1-flat.vmdk p1 (EBR ) lba:64 offset:32768 aligned:Yes
mcomm_1-flat.vmdk e1 (NTFS) lba:128 offset:65536 aligned:Yes
--------------------
mcomm-flat.vmdk p1 (NTFS) lba:64 offset:32768 aligned:Yes

nfsstat -d output:

Files Causing Misaligned IO's
[Counter=3404], Filename=vm65/mcomm/mcomm-flat.vmdk

Which tool is correct? FWIW - the Partial Writes over limit (pwol) counter is not increasing: http://www.vmadmin.info/2010/07/quantifying-vmdk-misalignment.html

Also, nfsstat -d lists a record without a filename - how do I determine what this is?

Files Causing Misaligned IO's
[Counter=4093], FSID=95966634, Fileid=21607809

thanks
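Both tools can be "correct" because they measure different things. A sketch of the distinction, with the 4 KB WAFL block size as the only hard assumption: mbrscan checks static partition geometry (start LBA x 512-byte sectors on a 4 KB boundary), while nfsstat -d counts individual runtime requests whose offset or length isn't 4 KB aligned - which an aligned partition can still generate (database logging is a classic case).

```python
SECTOR_BYTES = 512
WAFL_BLOCK_BYTES = 4096

def partition_is_aligned(start_lba):
    """Roughly what mbrscan checks: partition byte offset on a 4 KB boundary."""
    return (start_lba * SECTOR_BYTES) % WAFL_BLOCK_BYTES == 0

def io_is_aligned(offset_bytes, length_bytes):
    """Roughly what nfsstat -d counts against: a 4 KB-aligned request."""
    return (offset_bytes % WAFL_BLOCK_BYTES == 0
            and length_bytes % WAFL_BLOCK_BYTES == 0)

print(partition_is_aligned(64))         # True  -> mbrscan's "lba:64 ... aligned:Yes"
print(io_is_aligned(32768 + 512, 512))  # False -> a 512-byte write inside an
                                        #          aligned partition still counts
```

So a rising counter on an aligned VMDK points at the workload inside the guest, not the partition table - consistent with the "aligned VMs may still be misaligned" teaser mentioned elsewhere in this thread.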
VMware released an award-winning product called vCenter Operations - it takes in all the diagnostic information and uses dynamic heuristics to identify resource constraints in key areas. This tool helped me identify some important issues I could not have seen unless I was constantly watching, collecting, and analyzing data. I realize I need the same kind of tool for ONTAP. Our fibre loops have a switch for 1/2/4 Gb and are currently set to 2 Gb, so one question I have is: how close are we getting to loop saturation? thanks
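As a first-order sanity check on that loop question, you can compare sysstat's disk read+write throughput against the loop's payload ceiling. The ~200 MB/s figure for a 2 Gb FC loop (8b/10b encoding) and the 80% warning line are assumptions of mine - real headroom depends on protocol overhead and loop topology.

```python
# Assumed usable payload per loop speed (MB/s); 8b/10b makes 2 Gb/s ~ 200 MB/s.
LOOP_MBPS = {1: 100, 2: 200, 4: 400}

def loop_utilization(read_kb_s, write_kb_s, loop_gb=2):
    """Fraction of loop bandwidth consumed by one sysstat read+write sample."""
    return (read_kb_s + write_kb_s) / 1024 / LOOP_MBPS[loop_gb]

# Example sysstat sample: 120,000 KB/s read + 60,000 KB/s write on a 2 Gb loop.
sample = loop_utilization(read_kb_s=120000, write_kb_s=60000)
print(round(sample, 2), "saturating" if sample > 0.8 else "headroom left")
# -> 0.88 saturating
```

Sustained samples near that line would argue for flipping the loops to 4 Gb (or splitting shelves across loops) before chasing subtler bottlenecks.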
By watching /etc/log/auditlog on the SnapMirror source, I was able to determine the root cause was a snapmirror.access config issue. This log entry clued me in to check the snapmirror options:

options snapmirror.checkip.enable

The other (working) heads were set to legacy for snapmirror.access. I resolved this with "options snapmirror.access legacy". thanks
Rebooted (actually upgraded the destination cluster to 7.3.5.1P2 and rebooted), but the error persists:

irt-na02> vfiler dr configure pw-vf-01@irt-na04
irt-na04's Administrative login: root
irt-na04's Administrative password:
irt-na04: An error occurred while trying to commit the registry (Registry operation error).

I've opened a case on this. thanks
Having successfully migrated the vFilers to the new 3270, I now want to re-establish the DR vFiler relationships in the reverse direction. I am running into this error on this pair of heads (irt-na02 & irt-na04) - the other pair accepts the vfiler dr configure fine:

irt-na02> vfiler dr configure vdi-vf-01@irt-na04
irt-na04's Administrative login: root
irt-na04's Administrative password:
irt-na04: An error occurred while trying to commit the registry (Registry operation error).

irt-na02> vfiler status
vfiler0 running

irt-na02> version
NetApp Release 7.3.3: Thu Mar 11 22:29:52 PST 2010

There are no errors reported on the irt-na04 console. thanks for any tips/info, http://vmadmin.info
OK, I've re-engaged with support to update the testpoint to work around the max-volume issue. Yes, we can resolve the volume-name conflicts separately on our own. thanks
The destination is a 3270 (we are using a testpoint supplied by NetApp support to bypass the model check). Apparently it is defaulting to a low max-volume value for migrations! How can I persuade NMC & DFM to use the correct, higher value for the 3270 destination? This is the last vFiler I need to migrate, so any help is much appreciated. thanks
Am I forced to do an offline migration for this 7-volume vFiler? thanks

=== SEVERITY ===
Error: Attention: Failed to select a resource.

=== ACTION ===
Select a destination resource for migrating

=== REASON ===
Storage system : 'irt-na03.'(119):
- Destination storage system 'irt-na03.'(119) can contain maximum of 4 volumes for Automated Online Migration. But the migrating vFiler unit 'str-vf-02'(1838) has 7 migrating volumes for Automated Online Migration
- Volumes by same name as 'strdata', 'strweb2' already exist on the storage system 'irt-na03.'(119).

=== SUGGESTION ===
Suggestions related to storage system 'irt-na03.'(119):
- Choose/Add a storage system, which can contain minimum of 7 volumes per vFiler unit.
- Destroy the volumes on the storage system and refresh host information either by executing 'dfm host discover' CLI or navigate to 'Hosts > Storage Systems' page in Management Console and press 'Refresh' button.
Hi, we've just undergone a major upgrade from 3040 clusters with 1 Gb VIFs to 3270s with 10 Gb networking and new 15K RPM aggregates. I was reviewing a performance-tuning doc by NetApp's Tom Hamilton, and it seems a ripe opportunity for a NetApp tool to take the statit and sysstat outputs and perform at least the initial analysis - threshold checks for latency, CPU domain, loop saturation, etc. NetApp is famous for creating great tools for everything - does NetApp have, or plan to create, such an "expert system" tool for diagnosing performance bottlenecks? My sense is that with our upgrade the bottleneck has shifted from the network to somewhere else (loop saturation?) - for example, the analysis for diagnosing loop saturation is not clear to me. thanks for any feedback http://vmadmin.info
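The "expert system" idea is essentially a rules engine over parsed statit/sysstat samples, which is easy to prototype. A toy sketch follows - the thresholds (CPU 90%, disk utilization 80%, NFS latency 20 ms) are illustrative assumptions of mine, not NetApp guidance, and would need tuning from the performance-tuning doc for a given platform.

```python
# Each rule: (metric key in the sample dict, predicate, finding to report).
RULES = [
    ("cpu_pct",        lambda v: v > 90, "CPU domain may be the bottleneck"),
    ("disk_util_pct",  lambda v: v > 80, "disks/loops approaching saturation"),
    ("nfs_latency_ms", lambda v: v > 20, "NFS latency above comfort threshold"),
]

def diagnose(sample):
    """Run one metrics sample (a plain dict, e.g. parsed from sysstat -x)
    through the rule list and return the findings that fire, in rule order."""
    return [finding for key, check, finding in RULES
            if key in sample and check(sample[key])]
```

Feed it one dict per sysstat interval and anything it prints repeatedly across samples is worth a statit deep-dive - a poor man's vCenter Operations for the filer side.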
Has anyone seen this error? I'm doing a vFiler migrate via NMC 3.0.2 from a 3170 running 7.3.3 to a 3270 running 7.3.5.1P2. The error occurs during cutover, right after a successful "putting vfiler unit 'xyz-vfiler' in 'migrate prepare' mode". thanks