Hi,
One question I have: can you assign all of the data partitions to a single node and put them into a single aggregate? (i.e. to have either an active/passive node setup, or to have all of the ADP disks on one node while other shelves are used on the other node.)
This means that the single data aggregate would have 2 raidgroups: 23 x data1 partitions and 23 x data2 partitions.
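For illustration, the sort of ownership change I have in mind would be something like the following (node name and disk ID are placeholders, and I'm not certain of the exact syntax of the partition flags on every release, so please treat this as a rough sketch rather than a tested procedure):
   set advanced
   storage disk assign -disk 1.0.12 -owner cluster1-01 -data2 true -force true   (repeat per partition that needs to move)
   storage disk show -partition-ownership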
Is this supported?
Cheers,
John
... View more
Hi Philip, did you ever get an answer to this one? I'm looking at doing the same thing, and I think the way you go about it is to enable this option in your vfiler: dns.update.enable secure. You can only have secure updates if you have Windows DNS servers and CIFS is configured on your controller(s). The ONTAP Network Admin Guide has further information on this. Cheers, John
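PS: a quick way to check and then set it from the CLI (the vfiler name 'vf_cifs1' is just an example):
   vfiler run vf_cifs1 options dns.update.enable
   vfiler run vf_cifs1 options dns.update.enable secure
If CIFS/DNS is configured on vfiler0 rather than in a separate vfiler, just run 'options dns.update.enable secure' directly.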
... View more
Hi, I've seen SQL LUNs in the past only doing partial writes (because of the nature of the DB, which writes in 512B chunks), which can be misinterpreted as misalignment (or show up as indeterminate). In that case you wouldn't need to worry about it. Windows 2008 LUNs shouldn't have issues with alignment (if you use the correct multiprotocol type, as you have) as Windows 2008 uses a 1MB partition offset. You can check the offset in msinfo32 by going to Components --> Storage --> Disks and checking the value of "Partition Starting Offset" - as long as it's divisible by 4KB then you should be OK. Cheers, John
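PS: if you'd rather check from the command line than dig through msinfo32, something like this shows the same value (standard Windows WMI, nothing NetApp-specific):
   wmic partition get Name, StartingOffset
A StartingOffset of 1048576 (the 1MB offset) or any other multiple of 4096 means the partition is aligned.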
... View more
If you're after IOPS on each disk then the easiest way to see them all at a glance is with 'statit'. Run 'statit -b', wait a period of time for it to gather data (30-60 seconds), then run 'statit -e'. Make sure you have a wide CLI session with sufficient buffering/logging. Look for the section which looks like this:

Disk Statistics (per second)
        ut% is the percent of time the disk was busy.
        xfers is the number of data-transfer commands issued per second.
        xfers = ureads + writes + cpreads + greads + gwrites
        chain is the average number of 4K blocks per command.
        usecs is the average disk round-trip time per 4K block.

disk       ut%  xfers  ureads--chain-usecs  writes--chain-usecs  cpreads-chain-usecs  greads--chain-usecs  gwrites-chain-usecs
/aggr0/plex0/rg0:
0a.27       1   1.26   0.32  1.00  17000    0.95  1.00      0    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.26       1   1.26   0.32  1.00  17000    0.95  1.00      0    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.28       2   2.52   2.52  1.00  24500    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.29       2   2.21   2.21  1.00  18000    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.22       1   1.89   1.89  1.00  14167    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.21       1   0.95   0.95  1.00  17000    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.20       1   1.26   1.26  1.00  18000    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.19       2   1.26   1.26  1.00  49000    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.25       2   0.63   0.63  1.00  92500    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.17       1   1.89   1.89  1.00  14667    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.24       2   2.52   2.52  1.00  25375    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
0a.16       3   1.58   1.58  1.00  58600    0.00  ....      .    0.00  ....      .    0.00  ....      .    0.00  ....      .
Aggregate statistics:
Minimum     1   0.63   0.32                 0.00               0.00               0.00               0.00
Mean        2   1.58   1.26                 0.00               0.00               0.00               0.00
Maximum     3   2.52   2.52                 0.95               0.00               0.00               0.00

This is a very quiet aggregate with only 1 raidgroup, but you will be able to see all of your disks and compare their IOPS. This is a much easier way to see the performance of all disks at a single point in time rather than using Performance Advisor (where you need to view each disk separately)... Performance Advisor is better if you want to see individual disk performance over time. Cheers, John
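PS: for completeness, the capture itself is just the begin/end pair (the 'filer>' prompt is illustrative):
   filer> statit -b
   ... wait 30-60 seconds while the workload runs ...
   filer> statit -e
statit -e reports everything gathered since the -b, so the longer you wait the more the numbers are averaged out.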
... View more
Hi Mark, I'm having the same issue by the looks of it with SC not calling the SMSQL backup command, but if I manually run the command displayed in the agent log from the command prompt it completes successfully in ~30 secs (my SC_AGENT_TIMEOUT is currently at 300 - tried 900 but it didn't make any difference). I've tried replacing all the hyphens in the powershell parameters in the config file but still no dice - here are my key SMSQL settings:

SMSQL_PS_CONF="C:\Program Files\NetApp\SnapManager for SQL Server\SmsqlShell.psc1"
SMSQL_BACKUP_OPTIONS=-server 'VM-JOHNH-W2K3' -d 'VM-JOHNH-W2K3', '2', 'AdventureWorks', 'JHtest' -RetainBackups 7 -lb -bksif -RetainSnapofSnapInfo 7 -trlog -gen -mgmt standard

On top of this, port 9090 for the agent appears to get stuck in CLOSE_WAIT status and I need to restart the agent service to clear it before trying again. I have these problems with both SC 3.5 and 3.6.

Here is the output from the server debug log:
=====================================================
[Fri Aug 17 08:26:05 2012] DEBUG: GMT - Thu Aug 16 22:26:05 2012
[Fri Aug 17 08:26:05 2012] DEBUG: Version: NetApp Snap Creator Framework 3.5.0
[Fri Aug 17 08:26:05 2012] DEBUG: Profile: test_profile
[Fri Aug 17 08:26:05 2012] DEBUG: Config Type: STANDARD
[Fri Aug 17 08:26:05 2012] DEBUG: Action: snap
[Fri Aug 17 08:26:05 2012] DEBUG: Application Plugin: smsql
[Fri Aug 17 08:26:05 2012] DEBUG: File System Plugin: null
[Fri Aug 17 08:26:05 2012] DEBUG: Policy: hourly
[Fri Aug 17 08:26:05 2012] DEBUG: Snapshot Name: sc_smsqltest-hourly_recent
[Fri Aug 17 08:26:05 2012] INFO: Logfile timestamp: 20120817082605
########## Parsing Environment Parameters ##########
[Fri Aug 17 08:26:05 2012] DEBUG: Parsing VOLUMES - controller: 10.10.150.93 volume: jhsql_db_1
[Fri Aug 17 08:26:05 2012] DEBUG: Parsing VOLUMES - controller: 10.10.150.93 volume: jhsql_log_1
[Fri Aug 17 08:26:05 2012] DEBUG: Parsing VOLUMES - controller: 10.10.150.93 volume: jhsql_snapinfo_1
[Fri Aug 17 08:26:05 2012] DEBUG: Parsing NTAP_USERS - controller: 10.10.150.93 user: sc_user
[Fri Aug 17 08:26:05 2012] DEBUG: Parsing NTAP_SNAPSHOT_RETENTIONS - policy: hourly retention: 4
########## PRE APPLICATION QUIESCE COMMANDS ##########
[Fri Aug 17 08:26:05 2012] INFO: No commands defined
########## PRE APPLICATION QUIESCE COMMANDS FINISHED SUCCESSFULLY ##########
########## Application quiesce ##########
[Fri Aug 17 08:31:06 2012] ERROR: 500 read timeout at /<C:\Program Files\Netapp\NetApp_Snap_Creator_Framework35\scServer3.5.0\snapcreator.exe>SnapCreator/Agent/Remote.pm line 474
[Fri Aug 17 08:31:06 2012] [10.10.150.158:9090(3.5.0.1)] ERROR: [scf-00053] Application quiesce for plugin smsql failed with exit code 1, Exiting!
########## Application unquiesce ##########
[Fri Aug 17 08:36:06 2012] ERROR: No valid response
[Fri Aug 17 08:36:06 2012] ERROR: [scf-00054] Application unquiesce for plugin smsql failed with exit code 1, Exiting!
########## PRE EXIT COMMANDS ##########
[Fri Aug 17 08:36:06 2012] INFO: No commands defined
########## PRE EXIT COMMANDS FINISHED SUCCESSFULLY ##########
[Fri Aug 17 08:36:06 2012] DEBUG: Exiting with error code - 2
=====================================================

And here is the agent debug log:
====================================================
[Fri Aug 17 08:26:06 2012] DEBUG: Executing command [%SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe -psconsolefile "C:\Program Files\NetApp\SnapManager for SQL Server\SmsqlShell.psc1" -command "new-backup -server 'VM-JOHNH-W2K3' -d 'VM-JOHNH-W2K3', '2', 'AdventureWorks', 'JHtest' -RetainBackups 7 -lb -bksif -RetainSnapofSnapInfo 7 -trlog -gen -mgmt standard"]
[Fri Aug 17 08:26:06 2012] INFO: Starting watchdog with [-3964], forced unquiesce timeout [305] second(s)
[Fri Aug 17 08:31:12 2012] INFO: Skipping unquiesce, nothing needed for SMSQL integration
====================================================

Was it only the hyphens in the SMSQL_BACKUP_OPTIONS field you replaced? Cheers, John
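PS: for anyone wondering how I'm spotting the stuck agent session, I'm just checking the agent port on the SQL host after a failed run (plain Windows netstat, nothing SC-specific):
   netstat -ano | findstr :9090
If that shows a connection sitting in CLOSE_WAIT, I restart the Snap Creator Agent service before trying again.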
... View more
Hey Keith, being able to specify multiple agents to run commands on looks very promising - do you have a timeframe for when SC 4.0 will be released? Cheers, John
... View more
Not sure if you've seen the update for TSB-1111-01 which was sent out today, however it's now official that, for Data ONTAP 8.1.1 (7-mode only):
- FAS/V3210 systems running Data ONTAP 8.1.1 (7-mode) will support the 256GB Flash Cache in the same manner as with Data ONTAP 8.0.x releases.
- FAS/V3140 systems running Data ONTAP 8.1.1 will support the full 256GB for wear-leveling, but to deliver on the full value of Data ONTAP 8.1.1, the maximum working dataset supported with the 256GB Flash Cache on FAS/V3140 will be 128GB.
... View more
Hi All,
I have a customer with a large number of AIX servers who just experienced timeouts on 3 servers during an NDU ONTAP upgrade, and again during a non-disruptive Flash Cache card install (the same 3 servers both times).
On both occasions the takeover/giveback took over 30 seconds. I wouldn't have expected this to be a problem, seeing as NetApp openly advertises that it won't take more than 180 seconds, and these servers had the Host Utilities for AIX installed/configured.
Turns out that IBM identified a few MPIO patches missing on these servers which is more than likely the root cause, but it got me thinking: if a takeover/giveback can take up to 180 seconds, why do the Host Utilities set the timeout value to 30 seconds?
I've attached a screenshot which shows the output of lsattr which highlights the rw_timeout value:
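(In case the screenshot doesn't come through, the value in question is just the rw_timeout attribute on each hdisk - an illustrative check, with the disk name as a placeholder:
   lsattr -El hdisk4 -a rw_timeout
   rw_timeout 30 READ/WRITE time out value True )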
The Host Utilities manual also states that this is the correct value: http://support.netapp.com/knowledge/docs/hba/aix/relaixhu50/pdfs/host_set.pdf (page 7)
All of the other servers continued to run normally during the takeover/giveback, but I'd be interested if anyone is able to explain why the timeout is set to 30 seconds when a takeover can take up to 180 seconds.
Many thanks,
John
... View more
Hi, I'm seeing the same issue when trying to add a Windows 2003 OSSV host - OSSV 3.1 and agent 2.7 with OnCommand 5. I rectified an issue where NDMP port 10000 on the host was already in use by switching to port 10099 instead, but I still get stuck in this loop when trying to add the host. Svinstallcheck comes up clean, and I already have some OSSV relationships established back to the secondary FAS; they're just not visible to DFM/OnCommand, as would be expected. Anyone had any luck? Cheers, John
... View more
Hi Amir, to answer your Qs:
- DFM will be running on a Windows (2003 or 2008) server/VM.
- This will be used for separate environments across multiple customers, so it could be anything from an HA pair (2 controllers) up to 14+ controllers.
- The key option I'm trying to set is snapshot_clone_dependency, but I'm also looking at having a template/structure so that I can use it for other options/values in the future as well.
Thanks for your help! Cheers, John
... View more
Hi All, I would like to create a custom provisioning script for Provisioning Mgr that sets some specific volume options, but I'm having trouble finding any information/documentation on exactly how to do this. Does the post-provisioning script:
- run from the OnCommand server, or from the workstation where NMC is installed?
- have to be scripted in a way that it rsh/ssh's onto the controller and then runs each command against the target volume, etc?
Is there a list anywhere of the variables required? If anyone has an example of a post-provisioning script they would like to share, as well as the syntax which is used in the provisioning policy's post-provisioning command field, it would be greatly appreciated. Many thanks, John
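In case it helps clarify what I'm after, this is roughly the shape of script I have in mind - purely a sketch of the rsh approach, and the %1/%2 arguments are my own assumption since I don't know which variables Provisioning Manager actually passes to the script:

@echo off
rem Hypothetical post-provisioning wrapper - argument order is assumed, not documented:
rem   %1 = controller name/IP, %2 = volume name
set CONTROLLER=%1
set VOLUME=%2
rem Set the volume option(s) we care about on the newly provisioned volume
rsh %CONTROLLER% -l root vol options %VOLUME% snapshot_clone_dependency on

It would obviously need the controller to allow rsh from the OnCommand server (or the rsh call swapped for an ssh client).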
... View more
Thanks for that - that's the conclusion I came to as well although I couldn't find it documented that Citrix actually recommends that. Do you have a URL (off the top of your head:) that shows that? I found some posts which stated that any XS VMs which had Snapdrive installed (for use with SnapManager etc) had trouble booting and running normal backups with VSS if their non-NetApp disks (ie C: drive) were not on NFS. Something to do with XS StorageLink creating a separate LUN for every disk on a VM (including the C: drive) and mapping through the physical device name (ie "NETAPP LUN") which was confusing Snapdrive when the VSS hardware provider was called. Cheers, John
... View more
Hmmm.. just found this on FieldPortal which says NO to all the above: https://fieldportal.netapp.com/ci_getfile.asp?method=1&uid=6347&docid=32820
... View more
Hi Gerry, just wondering how you got on with this, as I've got a similar environment where XenServer 5.6 with FC LUNs will be used and they would also like to use SMSQL & SME for the MSSQL and Exchange VMs. Did you need to directly map the LUNs via iSCSI to the VMs for Snapdrive/snapmanager to work, or were you able to get FC LUNs to work ok with Snapdrive on the VM through XenServer? Cheers, John
... View more
Hi, thanks for your response. The whole purpose of CSVs is to allow multiple Hyper-V servers to store & access VMs on the same LUN simultaneously (similar to a VMFS datastore in ESX), so following a guideline of only allowing a single Hyper-V server to store VMs on a specific CSV kind of defeats the purpose. There is no mention in TR-3805 (SMHV best practices) of any best practices regarding the use of CSVs together with SMHV, but you may be thinking of the traditional shared-storage LUNs which were previously used in Hyper-V clusters - the best practice for those types of LUNs is to only have one VM per LUN. My customer currently has 180 VMs in their Hyper-V environment & growing, so going down this road would make the storage a nightmare to manage as well as negate all of the dedup benefits which we're currently achieving.

The example I gave was greatly simplified - the real environment contains:
- 8 x Hyper-V servers in a failover cluster.
- 24 x 2TB CSV LUNs located in separate qtrees within 12 x FlexVols. (The 2 x 2TB LUNs per vol is a workaround for a UCS bug in their FCoE firmware - UCS wouldn't recognise any LUN over 2TB over FCoE with Hyper-V. The 4TB per FlexVol allows us to maximise the dedup on their FAS3140.)
- 180+ VMs which are grouped on the storage according to their data type for maximum dedup benefits.

The SMHV datasets are broken down to include all VMs residing on the 2 x LUNs in a particular FlexVol (e.g. all VMs on CSV LUNs 1 & 2, which reside in FlexVol #1, are configured in dataset #1, etc).

If I were to create separate SMHV datasets based on which Hyper-V server the VMs contained in a CSV were running on:
PROs:
- I would at least know that the VMs in that snapshot were consistent... until they start to migrate to different Hyper-V servers over time.
CONs:
- I would still have as many snapshots of the CSV FlexVol per daily backup as I do now.
- The SMHV interface is a nightmare in large-scale deployments - you can't sort VMs when creating a dataset and they aren't listed in any sort of order... you have to trawl the list time and again to find the specific VMs you want. Try this when you have 180 VMs with very similar names. It took forever just to create 12 datasets for their 180 VMs - this approach could conceivably mean I'd need to create & manage 96 datasets (12 FlexVol-based datasets x 8 cluster nodes).
- This might work on day #1, but it will require constant maintenance to ensure that each VM is in the correct dataset as they migrate between servers over time.

Hmmmm... the more I think about it the more SMHV doesn't appear ready for enterprise environments (but then that could be said for Hyper-V in general!). Thanks, John
... View more
Hi, that looks like a good solution if you're able to keep all of the VMs on a CSV running on a particular Hyper-V server in the cluster. I believe you may have issues in the future, however, if VMs on a single CSV begin to shuffle between Hyper-V cluster nodes (due to performance management, outages etc), as SMHV will take separate snapshots for each Hyper-V server which is running VMs in the dataset, even if they're contained on the same CSV.

E.g. (this is a greatly simplified view of my customer's environment):
* 4-node Hyper-V cluster.
* 1 x 4TB CSV LUN.
* 12 x VMs stored on the same CSV LUN.
* Each Hyper-V server in the cluster has 3 x VMs from the CSV running on it.
* 1 x dataset within SMHV to back up all VMs on this CSV once per day.

My customer only has a requirement to snapshot their VMs once per day. Since SMHV takes 2 snapshots per Hyper-V server per dataset (the base snap and then the "xxxxx_backup" snap), I will end up with a total of 8 snapshots for each daily SMHV backup (4 nodes x 2 snapshots). The problem with this situation is that only 3 VMs are consistent in any one "xxxxxxxx_backup" snapshot - the other 9 VMs are merely crash-consistent. Those other 9 VMs are consistent within the other 3 x "xxxxxx_backup" snapshots.

This being the case, you would then need to greatly complicate your post-script to determine which VMs were quiesced as part of that snapshot; back up only those VMs from your backup server; then mount the snapshot for the next Hyper-V server in this dataset and repeat the process.

I would be greatly interested if anyone has a solution to this conundrum, as I'm trying to find a way to SnapVault this environment for long-term retention. Currently the easiest option for long-term management appears to be installing OSSV on all of the VMs and having each of them individually replicate to the DR site, managed with Protection Manager. Cheers, John
... View more
Hi Amrita, any updates on the VSC 2.0.1 release date? I heard from the local NetApp guys that it was supposed to be Sept 30 but that has come and gone and v2.0.1 is still not available for download. Cheers, John
... View more
Hi, Radek's on the money here - you would be better off creating a single RAID-DP aggregate with 11 x 1TB disks and leaving 1 x hot spare. This would give you ~6.7TB usable capacity (with the aggregate snap reserve set to 0). I would recommend that you upgrade to ONTAP 7.3.3 (if not there already) as it will allow you to expand the aggregate past 7TB in the future without needing to leave a 2nd hot spare (or have a separate root aggregate) - this is a restriction with FAS2020s on ONTAP 7.3.x releases prior to 7.3.3. I would steer well clear of having 1TB disks in a RAID4 configuration - you can only have a max of 7 disks in each RAID4 raid group (so you would need to use a 2nd parity disk for the 2nd raid group anyway), and the long RAID rebuild time of a failed 1TB disk increases your risk of then having a double-disk failure while it's rebuilding. RAID-DP will protect you against a double-disk failure during a RAID rebuild. Hope that helps. Cheers, John
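PS: the rough maths behind the ~6.7TB figure (the right-sized number is from memory, so double-check it for your exact disk model):
   11 disks - 2 parity (RAID-DP) = 9 data disks
   9 x ~847GB (right-sized capacity of a 1TB SATA disk) = ~7.6TB
   ~7.6TB - 10% WAFL reserve = ~6.8TB, and with the aggregate snap reserve at 0% you land at roughly 6.7TB usable.
You can confirm the actual number after creating the aggregate with 'df -A <aggr_name>'.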
... View more
Hi, the FAS3020 used to use hardware-based disk ownership by default, which made moving shelves between controllers much easier. If 'disk show -v' shows you a list of all the disks on your controller, then you're using software-based disk ownership. These steps assume that your new destination controller needs to remain up while the shelves are attached. If you are moving to a different ONTAP version, then you should verify compatibility in your environment via the matrix on NOW (i.e. key things such as Windows iSCSI initiators, NetApp Host Utilities, SnapDrive and SnapManager products - for starters - will need to be checked).

Basically the steps would involve:
1. Rename any aggregates and volumes which are moving so that they don't conflict with any already on the destination controller.
2. Shut down any apps/servers which are served from the aggrs you want to move.
3. Remove any CIFS shares / NFS exports / LUN mappings for data contained on the aggr.
4. Take the aggregate(s) contained on the shelves offline.
5. Shut down the FAS3020 (and the cluster partner if in an HA pair).
6. Disconnect and move the shelves/loops to the new location (do not connect them to the new destination yet).
7. Check that the shelf ID numbers are correct for whatever loops you're connecting them to (remember that you can only connect to loops of the same type - i.e. SATA shelves to SATA loops, FC shelves to FC loops).
8. Power on the moved shelves and connect them together as desired - DO NOT connect them to any existing loops until you've connected the moved shelves to each other in the manner you need for the end result.
9. Connect the shelves to the new controller(s).
10. Check that you can see all of the disks with 'disk show -v' and 'sysconfig -a'.
11. Assign the disks to the correct controller by running this command for each disk: disk assign <disk> -o <new_filername> -f
    OR, if you're using hardware-based disk ownership, just type "disk assign all".
    If you're running ONTAP 7.3.x then you'll see a stack of 'critical' errors scroll up your screen for each of these disks, which you can ignore.
12. Check that all disks are assigned to the correct controller by running: disk show -v
13. You should've seen messages saying that it detected a new aggregate - run this command to see if your aggregate is listed and offline: aggr status -v
14. If all of the correct disks are present for the aggregate, bring the aggregate online by running: aggr online <aggr_name>
15. Check that all vols contained on the aggr are online, and then recreate any CIFS shares / NFS exports / LUN mappings from the new controller.

These steps don't include any SAN zoning, persistent binding, DFS updates etc, or any host-based reconfiguration if required. There's a quick command summary below. HTH. Cheers, John
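For reference, the controller-side commands from steps 10-15 in one place (angle brackets are placeholders):
   disk show -v
   sysconfig -a
   disk assign <disk> -o <new_filername> -f     (repeat per disk; or 'disk assign all' with hardware ownership)
   disk show -v
   aggr status -v
   aggr online <aggr_name>
   vol status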
... View more
Yeah, but the problem is that the LUNs are being created by XenServer StorageLink - NOT manually within ONTAP and then mapped to an igroup etc. There is no facility in StorageLink to specify which LUN multiprotocol type should be used (i.e. either 'xen' or 'linux' in this instance). The core problem is more likely an issue with XenServer StorageLink itself rather than anything to do with ONTAP, but the question I need answered is: will using LUNs with 'image' as the multiprotocol type cause any block alignment issues, or will it be OK in this instance? Many thanks, John
... View more
Hi, I'm installing a FAS2040A w/ ONTAP 7.3.2 into my first XenServer 5.5 environment and the customer has chosen to use the 'Citrix Essentials' StorageLink way of sharing storage (i.e. as opposed to basic iSCSI or FC LUNs manually created and assigned to each XenServer). I've just been checking over their environment and it appears that StorageLink has created all of the LUNs with a multiprotocol type of 'image' rather than 'linux'. I think I remember reading somewhere that Linux could use either the 'image' or 'linux' multiprotocol type in previous versions of ONTAP, but I just wanted to check whether this is normal or if it's going to create any issues down the track (i.e. block alignment etc). Does anyone else with XS 5.5 use StorageLink and see this behaviour? Cheers, John
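PS: for anyone wanting to check their own environment, I'm just looking at the "Multiprotocol Type" line in the verbose LUN listing on the controller (the path below is only an example):
   lun show -v /vol/xenserver_sr1/lun0
Run it without a path to list every LUN and its type.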
... View more