No, you said you didn't want to grant every box root permissions... By exporting the volumes with 'anon=0' you just make everyone root, and there is a bit of a difference. I know of no other way of dealing with filesystem permissions than from a remote system. You are probably going to have to accept the fact that some server with "root" mount rights will be needed to set up the rights on the filesystem for the others. The situation is the same for CIFS filesystems. EoD from me.
CASCADIAN wrote: ... Then we have another group of small databases on the same server that share a LUN/Volume for their .MDFs, etc. ... Don't do this. Each database should have a LUN for database files and a LUN for logs. Remember, snapshots are volume based, so putting a number of databases in the same LUN (and volume) is just going to cause immense problems. Use volume mountpoints to get past the overuse of "drive letters".
Like I mentioned earlier, I've used all the protocols with VMWare and basically, it's a toss-up for me between iSCSI and NFS. The advantages of the one over the other are mitigated a lot by what your organization can deliver, not just what the VMWare or storage guys want. iSCSI does allow multi-pathing (which probably doesn't work quite right on VMWare), which is a definite advantage bandwidth-wise on the small systems with 1 GE interfaces, where NFS is basically not going to balance evenly across vif's/ifgrp's. The benchmarks are dated, and CPU load isn't a problem unless you are already CPU bound. CPU capacity is growing faster than disk I/O anyway, so the problem is probably going to come down to not enough disk before either your VMWare hosts or filer CPUs are red-lining, at least on the small units. Benchmarking on a 3070 with 84 disks is a different world from a 2020 with 14 disks. My experience is that NFS response times are higher, even if NFS tolerates load better than iSCSI. Both are problematic in large environments without dedicated networks. The people working with VMWare usually don't know enough not to just dump everything in one datastore, so some of the advantages get lost anyway.
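On the balancing point: the filer only balances a multi-mode interface group per connection (IP/MAC hash), which is why a single NFS datastore over 1 GE links spreads so poorly. A rough 7-mode sketch, just to show what I mean (interface names and the address are made up, and older releases use 'vif create' instead of 'ifgrp create'):
ifgrp create lacp ifgrp1 -b ip e0a e0b
ifconfig ifgrp1 192.168.10.10 netmask 255.255.255.0
Check it afterwards with 'ifgrp status ifgrp1'. It helps, but it does not turn two 1 GE links into one 2 GE link for a single NFS datastore.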
I have VMWare setups running FC, iSCSI and NFS. The least problematic are frankly the FC setups: it's pretty much just set it up and let it run. iSCSI can be a real PITA if you don't have your network people behind you. Then you have to really school your windows people on getting all the OS timeouts right... which should be the rule generally anyway, but iSCSI seems to be a bit less tolerant than NFS. In large traditional IT companies, if you have SAN, then there is very little interest in setting up "storage networks" to run NFS with a high level of reliability and security. The fact that NFS might be free on the junior boxes doesn't play much of a role for most of us. It's a market positioning strategy to get NetApp in the door (the first fix is always free, hehe). Whether the numbers still crunch correctly when these shops graduate to the 3000 series is a different cup of tea. VMWare guys that have used NFS generally like it a lot, but I don't find that it makes much difference to me on the storage side. The VMWare guys usually abuse the storage enough that most of the advantages are lost.
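On the Windows timeout point: the usual suspect is the disk timeout in the registry, which the NetApp Host Utilities / SnapDrive normally set for you. If you end up doing it by hand, it is along the lines of the following (the value 60 is from memory, check the Host Utilities documentation for what they currently recommend):
reg add HKLM\SYSTEM\CurrentControlSet\Services\Disk /v TimeOutValue /t REG_DWORD /d 60 /f
A reboot is needed for it to take effect.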
Hi, NFS is usually about 10% slower than iSCSI, at least if you read the tests that NetApp does. VMWare's NFS implementation is also still maturing and isn't the world's best, imho. If it's linked to linux NFS in any way, then this is probably understandable. iSCSI and FC SAN are really not pretty pictures either, given their lack of true multi-pathing over many years. The main element here is probably cost of implementation. NFS licenses are hugely expensive compared to iSCSI, so if you don't have a large unix environment or can't get a good rebate, then NFS for VMWare just isn't feasible financially when weighed against iSCSI.
Hi, So what did 'ssh -o BatchMode=yes -2 -ax orion -n -l root "version"' give you for output? What user are you supposed to be using on the filer if not root?
Hi, Your best option, even if it means a one-time change for your users, is to implement DFS. This way you can have any number of underlying shares that can be moved around, even to other machines, and only require an update to the DFS information. Such moves will be much easier in the future and you will be able to balance growth, backup windows and so on with very little disturbance to your users. You might be able to add a symlink in the CIFS filesystem itself pointing to the new directory's destination, but I am unfortunately not too versed in such matters. I didn't have any luck using widelinks a few years back because one needed, iirc, to have the data in a unix qtree, but I might have forgotten the details. Good luck.
Hi, Key authentication works just fine. You need to actually read what the script is telling you in your example: "Perfstat: Invalid perfstat invocation: './perfstat.sh perfstat -S orion -t 2 -i 2 -I' Please consult ./perfstat.sh -help." You just haven't read carefully. Why are you writing './perfstat.sh perfstat ...'? That extra "perfstat" argument makes no sense, and it is no wonder that the script is telling you that you are doing something wrong... because you are. I can't tell if the last switch/option is "-I" (capital i) or "-l" (small L).

This should work (if the user you are using to invoke it has its public key on the filer), with "perfstat.output" containing all of the output besides what goes to STDERR (standard error):

./perfstat.sh -f filer_name -t 2 -i 2 -S -I > perfstat.output    (the last option is a capital "i")

You might want to add the -F option to avoid collecting everything about the linux/unix host, and perhaps also -p to just capture performance info instead of also getting a full configuration dump of the filer, if that's not necessary. -t should probably be a lot longer, and you can adjust the sampling lengths within "-t" by using the "-i" option with multiple values... like "-i 10,10" or something. Again, ./perfstat.sh -help will tell you all of this, should you actually want to take the time to read it carefully. The script works fine with public keys. As far as I can see from your examples, you are just using it wrong.
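For the record, the key setup itself is trivial. On the linux host where you run perfstat (assuming the root user, a 7-mode filer and that ssh/secureadmin is already enabled on it), generate a key pair:
ssh-keygen -t rsa
Then append the contents of ~/.ssh/id_rsa.pub to /etc/sshd/root/.ssh/authorized_keys on the filer's root volume (via an NFS/CIFS mount of vol0, or 'wrfile -a' on the console) -- that path is from memory, so double-check it in the docs. Test it with:
ssh -l root filer_name version
If that returns the ONTAP version without asking for a password, perfstat will work too.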
Hi, You will perhaps get a good deal of differing opinions here, but some fundamental common sense is probably going to get you the farthest in the long run.

'A-SIS' is a data manipulation (filesystem) tool that is primarily concerned with removing (consolidating) duplicate blocks from the filesystem. Even if it has been refined over time to try to reduce fragmentation, the tool's main goal is to optimize space savings. How performance is affected is largely going to be the result of how deduplication affects seek times and the number of reads needed to access the blocks requested. This should be relatively easy to deduce from a simple understanding of disk-based hard drives. 'sis' is a tool with a specific use in mind. Like most tools, you can try to use it for other things, but the results may be suboptimal. (You can use a knife to loosen a screw, but you might ruin the screw or the knife in doing so.)

There is one complicating/mediating factor here as well: system memory. The larger the system memory, the more easily the system can cache frequently accessed blocks without having to go to disk. This is also why PAM-II cards can have an amplified advantage on de-duplicated data. The 2050, unfortunately, isn't going to have many advantages here.

We might then deduce that 'sis' isn't the right tool for filesystems (flexvols) that require optimal access times. We can reasonably estimate a number of scenarios where 'sis' would be useful and some where it wouldn't be. The key here, to reiterate, is to segregate the data in a way that makes these decisions more clear-cut. Where 'sis' is useful:

1) VMWare volumes: datasets that are highly duplicate, relatively static, and require little or only slow access: system "C:" drives, for example. The similarity of the data here should lead one to perhaps keep C: drives together exclusively (without pagefile data).
2) VMWare volumes: datasets that are moderately duplicate and require moderate access times.
3) CIFS/NFS data that is moderately duplicate and requires moderate/slow access times. The sizes of the filesystems here can result in significant savings in terms of GB for normal user data.

Conversely, there are probably many datasets where 'sis' is going to give minimal savings or sub-optimal performance:

1) Datasets with random and/or unique data: seismic data, encrypted files, compressed files, swap files, certain application data.
2) Datasets with application data that has optimized internal data structures or that requires fast access times: databases.
3) Datasets that are too small for significant savings.
4) Datasets that are dynamic and require fast access.

The common sense comes, then, in using the tool for what it was meant for. Segregate the data into reasonably good sets (flexvols) where 'sis' can be used with success and where it shouldn't be used at all. In the end, the goal for most IT operations isn't to save filesystem blocks at all costs, i.e. without considering performance. There are other maintenance routines that can help with access times, like the use of 'reallocate', but one needs to read the docs and use a little common sense there too. Normal fragmentation will affect most filesystems over time, but that isn't a situation limited to de-duplicated filesystems. Hope this helps.
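PS: to make it concrete, turning 'sis' on for a volume you have decided is a good candidate is nothing more than the following (the volume name is just an example):
sis on /vol/vm_c_drives
sis start -s /vol/vm_c_drives    (the -s scans the data that is already in the volume)
sis status /vol/vm_c_drives
df -s /vol/vm_c_drives           (shows the space saved)
If access times start to suffer later, look at 'reallocate measure /vol/vm_c_drives' before deciding whether a 'reallocate start' is worth it, and read the docs on both before letting them loose on production volumes.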
Hi, As far as I know, you need to have your mailbox databases and your transaction logs on separate volumes and LUNs. You can put your "snapinfo" directory in the same LUN as the logs, so links can be used to minimize storage requirements. I believe all of this is in the Best Practices documentation for SME, but it has been a while since I read the docs.
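As a rough sketch of what that tends to look like (drive letters are only examples, check the SME Best Practices for the currently supported layout):
M:\ --> LUN for the mailbox database(s), in its own volume
L:\ --> LUN for the transaction logs, in its own volume
L:\SnapInfo --> directory inside the log LUN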
I sympathize with your confusing research. The "Library" is incredibly hard to search through, and at least some of the documents give confusing or, at times, conflicting information. Levels of detail differ. Some documents will lead you to a setup that you later can't even use well with snapshots or SnapManager products (my experience with the Exchange "Best Practices"). Basically, the volume is the unit you need to remember when dealing with snapshots. One LUN per volume is a good idea unless you are going to exceed the platform limits in doing so, and there are only a few small disadvantages: you need a little more time for allocation, and boot times for the filer get just a bit longer, but you win in flexibility. Like most such situations, there is a trade-off between complexity and flexibility. Getting it done quickly at the start might mean a lot more work later. Snapshot functionality is basically the same as it has been for the last 20+ years. There have been many improvements in many areas over the years, but as far as provisioning LUNs on NetApp storage goes, I don't think there is anything specific. The whole development clone discussion is more or less a matter of available space. You can use consistent snapshots, but you can't keep them lying around very long if you have lots of block changes in your volumes. You will have to "split" the clones off to separate volumes (an internal ONTap function) if you want to keep them longer than your normal snapshots and don't want to fill up your volumes. Hope this helps.
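PS: a sketch of the clone part, with invented names. A development clone is just
vol clone create dev_clone -s none -b prod_vol nightly.0
and if you need to keep it after the parent snapshot would normally age out, split it off:
vol clone split start dev_clone
vol clone split status dev_clone
The split copies the shared blocks into the new volume, so make sure the aggregate has room for it ('vol clone split estimate dev_clone' gives you an idea beforehand).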
Hi, It is a little hard to comment on the setup without knowing what you are going to use the SATA disks for.

Option 1 doesn't look good: wasting 3 disks on such a small system just for the root volume. I guess I would set up filer 1 with all 12 300GB SAS disks and run raid4 (I can already hear the NetApp "Bible-bangers" moaning). Make sure you set the raid size to 11 before you get going. You can convert a raid_dp aggregate to raid4 with 'aggr options aggr0 raidtype raid4' and then set the raid size with 'aggr options aggr0 raidsize 11'. Setting the raidgroup size to 11 saves you from accidentally adding your only spare to the aggregate. It can always be changed later if you buy more disks. Basically, you are going to need the I/O, so the more disks you can write to/read from, the better off you are.

Filer 2 can use the SATA disks for both the root aggregate and data. Again, you don't have a lot of disks. raid4 would be useful here as well, but I think you will just end up with a small raidgroup size (which needs a hack to change), so no real win. Just run with raid_dp and perhaps set the raidgroup size to 13 to start with. That will prevent anyone from adding the last disk and leaving you with no spares. Spares are per controller and per disk size.

A controller failure will result in a failover of functionality to the surviving controller. Assuming you make no configuration errors, you should not lose any functionality. You will, of course, have potentially reduced performance due to having both instances running on the same controller hardware.

You have a lot more to read about setting up your VMWare instances correctly and about the storage maintenance routines you need when using LUNs. There is also potentially a lot of design planning to be done to match your datastores to NetApp functionality and your backup/restore SLAs. There are thousands of pages on VMWare usage with NetApp... or you hire in someone that has done this before. Good luck
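PS: to be concrete about the raid4 bit on filer 1 (assuming the aggregate is called aggr0, adjust to taste):
aggr options aggr0 raidtype raid4
aggr options aggr0 raidsize 11
aggr status -r aggr0     (verify the layout and that you still have a spare)
The main thing is that the raidsize is set before anyone adds disks to the aggregate.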
Not sure if this will help or not. Older versions of DFM used to manipulate the snapmirror.conf file. Basically, you should either use Protection Manager or local configurations, but not both. I believe newer versions of DFM/OM leave the file alone. At least you have the previous version of the file in a snapshot on the root volume.
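Something along these lines should at least let you look at what was there before (the snapshot name is just whichever one predates the change):
snap list vol0
rdfile /vol/vol0/.snapshot/nightly.0/etc/snapmirror.conf
Then you can merge the old entries back into /etc/snapmirror.conf by hand.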
Just export the filesystems with 'anon=0'. Then everyone is root. It is going to be a complete mess, but it will be very easy to set up. I guess security and authentication mechanisms are unknown where you are. Otherwise, just export the volume to an admin machine where you set the rights (+sticky bits) on the underlying qtree, and then export the qtree to your lusers.
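A sketch of what I mean, straight out of /etc/exports (hostnames and paths are made up):
/vol/users        -sec=sys,rw=adminhost,root=adminhost
/vol/users/home   -sec=sys,rw
Set the ownership and sticky bits from adminhost, run 'exportfs -a' on the filer, and none of the other clients ever needs root.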
The correct invocation can be found, with examples, on the page preceding the download page for perfstat. I would suggest you try reading it again. Try using a switch that avoids collecting local stats from your ubuntu server if you don't need them.
Hi, What OS and version is accessing the files? Are you using terminal servers? Are you using vscan services? Are there indexing/content or accounting services scanning your CIFS shares? Have you checked for AD authentication or network errors? Are your PC clients synchronized with a time server (NTP/AD server with NTP)? What, if any, messages do you see in the messages file on the filer?
There is very little information here. LUN's for which OS? Which applications are manipulating data on the LUN's? Have you read any documentation?
Hi, Being new to NetApp, I would suggest you take a good deal of time and research the Best Practices documentation and the system documentation. Some of these things are just too time-consuming and too long to explain in forums. Having a well-running storage operation, like most things in life, requires a certain amount of knowledge and experience. While I can appreciate the desire to simplify by using automagical GUI's and minimal installs, understanding a NetApp storage system really can't be achieved this way. Reading and using the cli are going to give you more information.

1) A short summary of the advantages of one volume per lun and one lun per volume might be:
a) Control over growth/encroachment over other datasets (databases, NAS shares, etc)
b) Control over data integrity. Errors will affect smaller amounts of your services (deletions, software corruption, user errors)
c) More fine-grained snapshot and snapmirror/snapvault policies are possible
d) Better performance control with reallocate and priority settings

2) Read up on thin-provisioning, volume auto-grow and perhaps snap autodelete (a short sketch of the relevant commands follows at the end of this post). Not having the system adjust volume sizes when you use snapshots is a surefire way to get burned. Unless you have 100% control over block changes, snapshots will at some point grow unexpectedly, fill up your volume and take your lun offline. +20% is a very conservative estimate depending on how long you are going to keep your snapshots. A single decision to increase logging one day to monitor some little problem could easily fill up a log lun in just a few hours, ending in the same volume-full, offline-lun situation.

3) This is basically part of thin-provisioning. There is a TR on this as well. Removing all guarantees works if you monitor your systems and use volume autosizing.

That is the short version.
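The sketch promised under point 2 (volume name and sizes are only examples):
vol autosize db_vol -m 600g -i 20g on
snap autodelete db_vol trigger volume
snap autodelete db_vol on
vol options db_vol try_first volume_grow
That combination grows the volume first and only starts eating snapshots if it can't grow any further, which is usually what you want for LUN volumes.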
Hi, I never suggested that you use a single server; I'm not sure where you got that idea. Just because you can run linux on VMWare doesn't mean it is always a good idea for everything. I can't really see how one can say that it is "natural" to migrate this into a VMWare environment. Without some idea of how it is going to perform, there is no logical reason to do this at all except for not having to buy more equipment. The age of your previous machines isn't really a factor so long as they work and their costs are not higher than new machines. The NFS load, and the load in general, will probably increase now that you are adding a number of machines that have their OS's on the same storage, in addition to cramming your mysql database over to VMWare. All I can say is good luck. You are making, for the most part, a wild guess that this migration will give you satisfactory results.
Hi, This is a little beside the point perhaps, but I always cringe a bit (mainly because I have been a unix administrator for a lot of years) when I see so many "drive letters" in use. One of the really useful abstractions that Microsoft came up with back in w2k3 was the concept of "volume mount points", that is, mounting resources below the drive-letter level. If you ever want to run a lot of instances on a server (the servers are generally powerful enough to run a number of instances simultaneously), you will run out of "drive letters" before you are actually utilizing your hardware. We have run all of our MSSQL implementations for the last couple of years with one drive letter per instance. The setup sort of looks like this:

E:\ --> lun for binaries for the entire server (not shared in a cluster environment)
F:\ (for example) --> 5GB lun for additional mountpoints and the system DBs
F:\instance_data --> lun for database files, sized as needed
F:\instance_log --> lun for log files, sized as needed
F:\instance_tempdb --> lun for tempdb
F:\instance_snapinfo --> generally 4GB or so for snapinfo

You need 5 luns (one lun per volume) and some idea of how to use mountpoint disks, and this works very well, even for MSCS clusters (you'll need a shared lun for the quorum disk, of course). You can then run up to 20 or so instances per server/cluster with the available drive letters. It also lends some structure to what belongs to an individual instance. This way you can also split your log lun, for example, and put it on a different aggregate or controller to balance the load. Reallocation really only needs to be run on the database lun as well, which saves you some cycles. You can use 'priority' to prioritize I/O to each volume (and lun if you use one lun per volume) to tweak I/O for individual databases too. Hope this helps.
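PS: the 'priority' bit at the end looks roughly like this (FlexShare has to be switched on first, and the volume name is just an example):
priority on
priority set volume instance_data_vol level=high
priority show volume instance_data_vol
Just don't set everything to high, or you are back where you started.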
Hi, You probably need to brush up a bit on the documentation. There is also a "TR" on RBAC for NetApp, even if it is not an easy concept to implement. Basically, you just need to add a domain user or group with sufficient rights, and 'useradmin' is your friend here. Something like 'useradmin domainuser add YOUR_AD\the_admin_group -g administrators' should do the job.
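If I remember the syntax right, you can check the result afterwards with something like:
useradmin domainuser list -g administrators
which should list the SIDs of the domain users/groups that are now in the local administrators group.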
Hi, There are a few software bugs in 8.0.1Px... that can cause this. IIRC, they are fixed in 8.0.1P4 or P5. This might be the cause of your LED error. I actually have a 3070 cluster on P3 that also has an amber status LED lit for no apparent reason. I guess I'll have to get around to trying the upgrade too. Hope this helps.
Hi, I think raid-dp and a-sis/dedupe are, at best, tangential. Raid-dp is a parity protection mechanism for data. 'sis' identifies duplicate blocks (although there are checksum calculations in de-duplication as well), moves pointers and removes the duplicate blocks from the filesystem. Not being part of the larger discussion you refer to makes it difficult to understand the angle you are coming from, however. 'sis' is a subset of WAFL functionality. WAFL can do "de-dupe" just as well with raid4, and probably with raid0; you just lose some of the protection you have against disk/hardware failures affecting data integrity. Comparing the compression results of other vendors, like the former Data Domain, which used tons of CPU cores for compression, with a-sis deduplication (and compression) would be an interesting discussion. Even the total cost of the disk infrastructure for a given data set to reach a certain level of savings from de-dupe or compression (or both where possible) would be interesting as far as "paring down" primary storage goes. RAID overhead is on the fringe of such discussions, however, even if it is important to get it out there that WAFL is optimized for the raid4 and raid-dp raid types. 🙂