About joostvandrenth

joostvandrenth · ‎2011-05-24

I was wondering whether a completely full qtree has an adverse effect on performance, the way a full volume or aggregate would (even though the volume in which the qtree lives might still have some spare capacity)?

joostvandrenth · ‎2011-05-16

I have noticed the same problem on some of our customer sites, very annoying! It has been a problem for us through 3.8, 4.0 and 4.0x. I suppose it has to do with some Java settings; I have tried to force the NMC to use a different (older) version but have not been successful in doing so.

joostvandrenth · ‎2011-03-31

Keith Aasen has given his input to this over at his blog: http://blogs.netapp.com/virtualization/2011/03/the-4-most-common-misconfigurations-with-netapp-deduplication-.html Basically saying to run dedupe every night, which in practice should shorten the runtime enough to prevent resource contention.

joostvandrenth · ‎2011-03-31

Maybe the question was somewhat vague. If the switch to which the NetApp HA pair is connected (an edge device) does not itself suffer a failure, but other connections to the central network core do (resulting in loss of communication between servers/clients and the storage), am I able to 'catch' this failure and instigate a failover without manual intervention?

joostvandrenth · ‎2011-03-16

Having evolved from a traditional volume (which can be seen as 1-1 relation between a RaidGroup and a volume) I see an aggregate as the physical to virtual transformation of storage capacity - it is the construct from which you can provision volumes.

joostvandrenth · ‎2011-03-15

I do not know why you would want to, but maybe traditional volumes are something to look into?

joostvandrenth · ‎2011-03-14

We were wondering whether a VIF failure would lead to a cluster failover. The first step would be to enable CF failure on network failure and set NFO on the appropriate interface. Our setup involves a core switch, with servers, clients and storage connecting through other, intermediary switches. This means a LINK FAILURE as such might not occur on the switch the NetApp controllers are connected to, while the underlying connection between the central and decentral switches is affected. Is there a way to detect a failure of this kind, or to initiate a failover on it? This would be more a failure of communications than a direct link failure. Same question for a multilevel VIF: a multimode connection to 1 stack with single mode on top to another network stack. Will it fail over when there is not a direct link failure but an underlying one?

joostvandrenth · ‎2011-02-21

Indeed, to order a CIFS or any other a la carte license (or bundle) for a system, you order 1 license for a single controller system and place 1 order for a CIFS license for an HA pair. Cost-wise, ordering CIFS for an HA pair of controllers is (of course!) not the same as ordering 2x a single controller CIFS license (though technically this is exactly how it works).

joostvandrenth · ‎2011-02-09

I was just wondering about deduplication schedules. Now that more and more of our systems run dedupe, scheduling the runs can be a bit tricky, notably on older systems that may not have much IOPS and CPU to spare. My question is whether it is possible either to schedule dedupe runs per month, or to set dedupe to run automatically but have it honor a runtime window of 20:00 -> 06:00 when it runs. The default per-week schedule option is not enough when dealing with many volumes, in my opinion. Can Operations Manager help out in some way?
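
To make the 20:00 -> 06:00 window concrete, here is a minimal sketch in plain Python of the check an external scheduler would need before kicking off a run. It is not a Data ONTAP or Operations Manager feature; the function name `in_run_window` and its parameters are made up for illustration. The only subtle part is that the window wraps past midnight.

```python
from datetime import time


def in_run_window(now, start=time(20, 0), end=time(6, 0)):
    """Return True if `now` falls inside the allowed window [start, end).

    Hypothetical helper: the defaults mirror the 20:00 -> 06:00 window
    mentioned in the post, nothing more.
    """
    if start <= end:
        # Window stays within one day, e.g. 01:00 -> 05:00.
        return start <= now < end
    # Window wraps past midnight, e.g. 20:00 -> 06:00.
    return now >= start or now < end
```

A wrapper script could call `in_run_window(datetime.now().time())` and skip or postpone a run when it returns False.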

joostvandrenth · ‎2011-01-19

First of all, I would advise you to check Performance Advisor - imho that tool gives a much better view of performance information (find it on the DFM website). Performance Advisor allows you to create easy total overviews of multiple volumes, and it also has the option to create overviews via the CLI. Rogue latency on vols that have no apparent I/O against them is, 9 out of 10 times, a scan of some kind on the volume or aggregate - a good way to find out is to create an alert against a condition where both IOPS AND latency are occurring at the same time. With such a condition, the alerts at my customer sites would normally all but disappear. As for disk utilization: normally you can count on a certain amount of IOPS per disk depending on the disk type (SATA, FC/SAS 10k, 15k, etc.), and the percentage gives an indication of how much strain is put on the disks. This is a little simplistic but I find it covers most use cases. When you reach the max amount of IOPS a disk can give at a reasonable latency, it can still give you more IOPS, but the latency will go up exponentially. (2-3 ms is normal, volume latency above 20 ms is not good news.)
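
As a rough sketch of the two ideas above (plain Python, not Performance Advisor syntax): alert only when IOPS and latency are both high at once, and express disk strain as a percentage of an estimated IOPS budget. The 20 ms limit comes from the post; `min_iops` and `iops_per_disk` are assumptions you would have to fill in for your own disk types.

```python
def should_alert(iops, latency_ms, min_iops=100.0, max_latency_ms=20.0):
    """Alert only when load and latency are high at the same time.

    min_iops is an assumed floor that filters out latency spikes on
    volumes with no real I/O (e.g. background scans).
    """
    return iops >= min_iops and latency_ms > max_latency_ms


def disk_utilization_pct(measured_iops, disk_count, iops_per_disk):
    """Rough percentage of the estimated IOPS budget that is in use.

    iops_per_disk is supplied by the caller; it depends on disk type.
    """
    budget = disk_count * iops_per_disk
    if budget <= 0:
        raise ValueError("disk_count and iops_per_disk must be positive")
    return 100.0 * measured_iops / budget
```

A volume showing 45 ms latency at 5 IOPS would not alert with these defaults, while the same latency at 800 IOPS would.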

joostvandrenth · ‎2011-01-14

I have always wondered about this, is there any update on this?

joostvandrenth · ‎2011-01-11

Is it possible to share an NFS datastore between different ESX clusters (ESX 3.5 and 4)? I have come across a site that has the same mountpoint on different clusters - this sounds dangerous to me, but the only reference I could find on the VMware communities site said it is no problem. I could not find a reference to this in the ESX admin manual or the NetApp TRs.

joostvandrenth · ‎2010-12-02

I understand it is already up in a lab but we will have to wait for it for some time. I wish we could hurry this up...

joostvandrenth · ‎2010-11-24

I actually like SAN volumes to be thin and to hold a snap reserve against which to monitor the level of snap capacity usage: the thin provisioned vol will always leave some additional capacity, while having a snap reserve allows a level of automation to curtail unwanted snap growth. Just my 2 cents.

joostvandrenth · ‎2010-11-21

We have the same issue with a customer over at the IBM side of things, and will try to get a case opened as well - let's see if we can get this fixed. In the meantime we use the -f option, which pretty much delivers the same result, right? I mean, if it rewrites current data across all new and existing RGs, the layout will be fixed as well.

joostvandrenth · ‎2010-10-26

I am still running into this problem. We had an AT-FCx module fail on us, luckily the passive module of the two. Having received a replacement with firmware version 35 (IBM, shame on you!) twice (shame on you... twice?), we had no way of updating that particular module only. A command to update only a channel, or even a channel and shelf combination, will first update the targeted module(s) and THEN run the update on ALL eligible shelves on the system... which would have brought down services to 160 TB of storage for a long time. Even though NDU upgrades are possible, I still run into sites without the necessary cabling or software versions to support them. Is there a way around this?

joostvandrenth · ‎2010-10-26

spot on! much obliged!

joostvandrenth · ‎2010-10-26

I have often wondered why NetApp would not release a less limited version of this software, maybe for a fee? I see HP making money with the LeftHand software to create very rudimentary DR sites...

joostvandrenth · ‎2010-09-14

No Snapvault licenses, NDMP would actually be possible - will need to check on this. Sysstat -x shows me overall performance, with another 10 aggregates active on the system I see no way to isolate this to anything particular to my aggregate....

joostvandrenth · ‎2010-09-14

I am looking at a filer with a lot of aggregates consisting of SATA. To make out what is what and move some vols to higher performance aggregates, I am trying to use Performance Advisor to help me figure out which vols are using a lot of I/O. I made some groupings of all the vols inside an aggregate and compared top objects -> vol IOPS to the IOPS generated by the underlying aggregate. Surprise, surprise: I have 3 aggregates of 16 SATA drives each, generating a consistent 1000-1200 IOPS with NO I/O showing for the volumes/LUNs inside each aggregate. No reallocates are running (also no measurements done), and there are no snapmirrors on those aggrs. I cannot find out which process is generating those 1,000+ I/O per aggregate.
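
The comparison described above boils down to subtracting the summed volume IOPS from the aggregate IOPS. A minimal sketch in plain Python, with made-up names and hand-entered numbers (this is not a Performance Advisor export format). Aggregate and volume counters are not necessarily measured the same way, so a small gap may be normal; the interesting case is a large, steady one.

```python
def unexplained_iops(aggregate_iops, volume_iops):
    """IOPS seen on the aggregate that no contained volume accounts for.

    volume_iops maps a (hypothetical) volume name to its measured IOPS.
    The result is clamped at zero in case the volume counters sum higher.
    """
    return max(0.0, aggregate_iops - sum(volume_iops.values()))
```

For the situation in the post, an aggregate at roughly 1100 IOPS whose volumes all report 0 leaves the full 1100 unexplained.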

joostvandrenth · ‎2010-03-31

Although general guidelines for disk queue length would indicate a problem, I would focus on other stats as well, latency in particular. You could check the latency from the Windows or even the SQL side of things; a high disk queue spike in and of itself is nothing to worry about, especially since you probably have a greater number of disks in the aggregate than you would have with a regular RAID set. Another thing could be the queue depth: SQL likes it to be somewhat higher (depending on the SQL instance) than regular apps do. MPIO and iSCSI is a nice subject as well: can you confirm from the Windows side of things that you are actually using more than 1 interface simultaneously for the SQL data? Keywords to look for when investigating are MCS or multiple connections per session. But the first thing is to establish whether you really have a problem in the first place.

joostvandrenth · ‎2010-03-30

I am sorry for this simple question, but I just cannot find the answer. I suppose there is nothing special between a shelf and controller connection, so normal distance/speed/cable limitations apply, but just to be sure: are there any caveats when placing shelves a good distance from the controllers (for instance 100m @ 2 Gb)?

joostvandrenth · ‎2010-03-03

Just to clarify a little further: we do indeed have a FAS3020A, or cluster, but I think this should not impact the initial question. There are 3 X1035B cards in the system and 1 X1007A card to add, so one X1035B card will have to go. This part is central to answering the question and I am a bit unclear on its exact meaning: "When looking at network adapters, you will see that the "Maximum Number of Adapters" will span multiple part numbers. In this case no combination of those part numbers can exceed the max number of cards for that particular type. The fact that you are crossing over in to two different types means that you now may have only a max in the particular category. " The logic in your list of preferred slot uses does not correspond with how I read that last part: I can have at most 3 X1035B cards and 2 X1007A cards in the system, as per the table. I take it to mean that each maximum adapter number applies to its respective card type: i.e. if we assume there was no overall limit of 3 network cards, we could have 3 X1035B cards and 2 X1007A cards without violating the rules.
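
My reading of the rule can be written down as a small check, sketched here in plain Python. The per-type maxima (3 and 2) and the overall limit of 3 are the figures as I read them above, not values confirmed from the config guide, and `config_allowed` is just an illustrative name.

```python
def config_allowed(cards, per_type_max, overall_max=None):
    """Check a card mix against per-type maxima and an optional overall cap.

    cards and per_type_max map a part number to a count. A part number
    missing from per_type_max is treated as not allowed at all.
    """
    for part, count in cards.items():
        if count > per_type_max.get(part, 0):
            return False
    if overall_max is not None and sum(cards.values()) > overall_max:
        return False
    return True


# Figures as read from the post; treat them as assumptions.
PER_TYPE_MAX = {"X1035B": 3, "X1007A": 2}
```

Under this reading, `config_allowed({"X1035B": 2, "X1007A": 1}, PER_TYPE_MAX, overall_max=3)` passes, while keeping all three X1035B cards next to the X1007A fails only because of the overall cap.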

joostvandrenth · ‎2010-03-03

Silly question: I have a FAS3020 with 3 dual port ethernet cards. Now we want to put in a quad port ethernet card, so one 2 port card has to go. The config manual says I can have 3 x 2 ports and 2 x 3 ports; now the question becomes, can I have 2 x 2 ports and 1 x 3 ports in the system?

joostvandrenth · ‎2010-03-02

Somebody please speak to IBM about this, as they bumped my tech request on approving any version of switch firmware because the ones available are newer than stated in the matrix. Please note that the currently listed fw for our switches is not available anymore, which means we always need to contact tech support.

Impact of a full Qtree on performance

Re: Performance Advisor Hosts invisible

Re: Deduplication schedule options

Re: Cluster Failover on link communication failure

Re: aggregate need?

Re: aggregate need?

Cluster Failover on link communication failure

Re: FAS3200 Licensing

Deduplication schedule options

Re: Relation between latency and IOPS

Re: Events in performance advisor don't appear

Share NFS datastores across datacenter

Re: FAS 3200 Metro Cluster Config rules

Re: Help: Understanding Volume available space

Re: reallocate on a metrocluster - split the mirror?!

Re: Shelf firmware upgrade disruption

Re: Mystical Aggregate Performance

Re: Simulator disk limitation

Re: Mystical Aggregate Performance

Mystical Aggregate Performance

Re: High disk queue readings on SQL server

Maximum distance between filer/controller and shelf

Re: Slot expansion question

Slot expansion question

Re: Switch firmware version Metrocluster