About aborzenkov

aborzenkov · ‎2011-07-01

NetApp does provide witness support (MetroCluster tiebreaker); in the past it was a separate solution (I believe integrated with OM); today it is offered as part of ApplianceWatch PRO. See for example:
http://communities.netapp.com/servlet/JiveServlet/downloadBody/6314-102-1-9571/Partner%20Academy%20Workshop%20MetroCluster%20June%202010.pptx
http://communities.netapp.com/servlet/JiveServlet/download/49558-22659/ApplianceWatchPROBestPracticesGuide.pdf

Unfortunately it is very poorly documented and marketed; the only available material is a NetApp-internal link, a couple of paragraphs in the ApplianceWatch PRO documentation and whatever you can find on community or KB sites.

You mention in your blog that NetApp MetroCluster needs 4 FC connections – do you count the backend only? MC requires 2 ISLs; 4 can be used, but that is optional. I wonder how VPLEX implements simultaneous write support on both sites without introducing read latency for local access (due to the necessity to verify that data has not been changed remotely).

aborzenkov · ‎2011-06-30

No, this configuration is not OK. Running a NetApp system with a single disk is not likely to be supported. You need at least two for RAID protection. Please check knowledge base article 3011274, which contains a detailed description of how NetApp uses disk space.

aborzenkov · ‎2011-06-29

Depending on the amount of existing data, it may be possible to copy the existing volumes onto an aggregate on the other head before destroying the aggregate. Or temporarily take all spare and d-parity disks to build an aggregate to copy the data over.

aborzenkov · ‎2011-06-28

For every interface you need to specify the corresponding partner interface, so the filer knows where traffic should go in takeover. This is done with the “partner” argument of ifconfig. You can find a detailed description in the normal documentation. It would help if you provided the output of ifconfig -a on both heads and told us exactly which interface(s) should be taken over. It was always possible to specify either the partner interface name or the partner IP address, but I have heard that recent releases support only the interface name.

aborzenkov · ‎2011-06-28

You can’t map one CIFS user to another CIFS user. What you can do is let domain users connect to resources as local users, but as already mentioned this quickly becomes rather unmanageable …

aborzenkov · ‎2011-06-27

Queue depth on Qlogic (if you really mean queue depth, not execution throttle) is actually per LUN, so what is relevant is the number of LUNs each host sees, not directly the number of hosts. That is, if you expect all hosts to be active at full speed at the same time. And that is not from an “optimal” PoV but simply to prevent error responses/delays due to a queue full condition. Personally I’d leave it at the default unless you have real evidence that queue depth is causing issues.
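
To make the per-LUN arithmetic concrete, here is a minimal sketch. All numbers (LUN counts, the queue depth setting and the target port queue limit) are assumptions chosen for illustration, not values from this thread or from QLogic/NetApp documentation, and the function names are hypothetical.

```python
# Hypothetical numbers for illustration only; none of them come from
# the thread or from vendor documentation.

def worst_case_outstanding(luns_per_host, per_lun_queue_depth):
    """Worst-case commands in flight if every host drives every LUN at full depth."""
    return sum(luns * per_lun_queue_depth for luns in luns_per_host)


def max_safe_queue_depth(luns_per_host, target_port_queue_limit):
    """Largest per-LUN queue depth that cannot overflow the target port queue."""
    total_luns = sum(luns_per_host)
    if total_luns == 0:
        raise ValueError("no LUNs are visible to any host")
    return target_port_queue_limit // total_luns


luns_per_host = [4, 4, 8]        # LUNs seen by each of three hosts (assumed)
per_lun_queue_depth = 32         # assumed HBA setting
target_port_queue_limit = 256    # assumed limit of the target port

outstanding = worst_case_outstanding(luns_per_host, per_lun_queue_depth)
print(outstanding, outstanding > target_port_queue_limit)
print(max_safe_queue_depth(luns_per_host, target_port_queue_limit))
```

Note that the result depends on the total number of LUN paths, not on the number of hosts as such, and that it only matters if all hosts really are busy at the same time.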

aborzenkov · ‎2011-06-27

I checked the external library before posting the link, but it was not available there. The TR is not marked as NDA, so I presume it is just a matter of time.

aborzenkov · ‎2011-06-27

Could you remind me where your blog is?

aborzenkov · ‎2011-06-27

I was always uneasy about read reallocation. It appears to take place at exactly the wrong time – after we have already paid the penalty of a non-sequential read ☺ Depending on the environment, by the next time we need to read the data it may have been fragmented again … so it apparently makes sense only in a highly static environment. I wish this TR explained how we can estimate (or better, get real counters for) how effective read reallocation was. Something about how often reallocated data was subsequently read before being fragmented again.

aborzenkov · ‎2011-06-26

TR-3929 Reallocate Best Practices Guide https://fieldportal.netapp.com/viewcontent.asp?qv=1&docid=33904 Not that it contains anything that was not already beaten to death here ...

aborzenkov · ‎2011-06-20

Someone or something removed the base snapshot. You could try “snapmirror resync” to see if any common snapshot is still available. If not, your only option is to reinitialize snapmirror from scratch. You have to check why it happened. It could be a too aggressive snap autodelete policy.
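
The idea of a “common snapshot” can be sketched as follows. This is only an illustration of the concept, not how Data ONTAP implements resync: the snapshot names and timestamps are made up, the helper name is hypothetical, and matching purely by name is a simplifying assumption.

```python
def newest_common_snapshot(source_snaps, destination_snaps):
    """Return the name of the newest snapshot present on both sides, or None.

    Each argument is a list of (name, creation_time) tuples; creation_time
    can be anything sortable, for example an epoch timestamp.
    """
    dest_names = {name for name, _ in destination_snaps}
    common = [(ctime, name) for name, ctime in source_snaps if name in dest_names]
    if not common:
        return None  # nothing to resync from; only a new baseline is left
    return max(common)[1]


# Made-up snapshot lists: the base snapshot "base.2" was deleted on the source.
source = [("base.1", 100), ("hourly.0", 300)]
destination = [("base.1", 100), ("base.2", 200)]

print(newest_common_snapshot(source, destination))
print(newest_common_snapshot([("hourly.0", 300)], destination))
```

In the first case an older common snapshot survives, so a resync still has something to work from; in the second nothing is shared and a fresh initialization is the only way forward.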

aborzenkov · ‎2011-06-19

Site requirements are available at http://now.netapp.com/public/knowledge/docs/hardware/NetApp/site/pdf/site.pdf Fieldportal does not have anything with filed-tech BTW ☺

aborzenkov · ‎2011-06-17

Use “iscsi interface disable” on the filer to disable the iSCSI protocol on specific interface(s).

aborzenkov · ‎2011-06-16

During acceptance tests the customer noticed that an interconnect link failure does not produce any visible notification. There is apparently no trap, no autosupport is sent, and the overall system status remains OK. Is there any notification sent from the filer when one link breaks? This is with 7.3.5; if something changed in 8.x, that is also good to know. Thank you!

aborzenkov · ‎2011-06-10

This is 7.3.5.1.

aborzenkov · ‎2011-06-08

You need to configure VLAN interfaces for failover.

aborzenkov · ‎2011-06-07

After rechecking the cabling I realized that it did not correspond to the pictures in TR-3548. Although it is nowhere stated that the cabling shown is mandatory (and not just an example), I changed it to be precisely as shown in the TR (specifically Appendix G, figure 22). I have not observed the Mixed-HA warning since then. Of course it may be just a coincidence. I wish NetApp were more explicit about requirements. E.g. – does it matter that port 0a is connected to ESH B and port 0b to ESH A? Or could it just as well be reversed? Etc …

aborzenkov · ‎2011-06-03

FAS3140 FMC. After testing the power feeds to the cabinet (switching off half of the PSUs, including one of the Brocade switches) I noticed the alarm LED on one node (I do not know whether the other node had it as well, it was a bit too far away to check). Filer View claimed the status is normal, nothing to worry about. /etc/messages confirmed it with the message "status returned to normal". All disks were properly MP-HAd. Cluster was enabled. There were no environmental failures on any shelf or head. I am not sure what else to check. How can I find out why the alarm LED is lit? More importantly, how should I explain it to the customer? The only missing piece right now is that setup is not yet complete, so I still need to mirror the root aggregate. But if this is the reason, why was the alarm LED not lit before, although the aggregate was not mirrored then either ...

aborzenkov · ‎2011-06-02

What about the total load on the aggregate? Total number of IOPS, how busy the disks are, etc.? An aggregate is a shared resource, so maybe other systems keep it busy? In this case playing with flexshare could offer some improvement.

aborzenkov · ‎2011-06-02

I believe I have seen a similar problem posted recently. FMC with 2 x FAS3140, 7.3.4 (factory delivery), 4 HBAs used. After fully booting, the node complains that some disks are not multipathed. Indeed, out of 4 available paths Data ONTAP selects two going via the same switch (i.e. both to the same A or B side of the shelf). After several rounds of unplugging and replugging the A and B channels for this stack it suddenly settles on the correct A/B paths. But it is not clear why it selects suboptimal paths, nor how to force path re-selection by less intrusive means. Is it a known issue? I am going to update to 7.3.5.1P3 anyway (but am a bit uneasy as it is still not explicitly listed in the compatibility matrix); this is the first time the system has been booted after assembly and completion of the switch interconnect.

aborzenkov · ‎2011-05-26

Have you tried changing wafl.default_qtree_mode? Volumes are qtrees in some sense, so this may apply to the volume root directory as well.

aborzenkov · ‎2011-05-25

"my entire aggr went inconsistent because one of the disks has failed and with no spare disk, the DOT couldn't reconstruct the raid, causing (after the 24 h grace) multiple panics"

If you had a multi-disk failure, WAFL_check is not going to help you. OTOH I have yet to see a true multi-disk failure: all of them were caused by loop stability, environmental (like switching off a shelf) or operational (disk assigned off a working head ... ) issues. So I would repeat the advice you were given: contact support to determine and fix the cause of the multiple failures.

Hmm ... mentioning that the system "panicked" exactly after 24 hours makes me suspect that there was actually no panic at all, just a system warning. This is normal behaviour: NetApp will shut down after 24 hours if a degraded raid group is not repaired. This is to protect you from a possible second disk failure and losing data. Any chance you misinterpreted the system message?

aborzenkov · ‎2011-05-19

So by “duplicated” you mean “having the same name”? It is still not clear what you mean by “group and personal qtrees”. Every qtree is just a directory inside a volume. Deleting it removes this directory together with its files. It is up to you to know whether the data inside two qtrees is the same and one copy can be removed. BTW there is no way to remove a qtree on NetApp. You need to mount the volume (CIFS or NFS) and use standard host commands to delete the directory.

aborzenkov · ‎2011-05-19

How do you change permissions? CLI, Filer View, Windows MMC?

aborzenkov · ‎2011-05-18

Please explain what you mean by "duplicated qtrees". The best way is to show command output that displays them. If these qtrees are in different volumes, they are totally independent and are not duplicates.

Re: Netapp FAS vs EMC VNX

Re: Please Help With this

Re: netapp 2040

Re: CF Takeover Question

Re: CIFS Multi Domain

Re: QLogic Execution Throttle Setting

Re: reallocate TR?

Re: reallocate TR?

Re: reallocate TR?

Re: reallocate TR?

Re: lag hours are high in snapvault status

Re: Where can I find the power draw/usage and heat output of the DS4243 shelf?

Re: RH Linux/Snapdrive - discovering new lun(s) ...failed

Notification in case of one interconnect link failure in (F)MC

Re: Alarm LED on without any visible reason

Re: 10gBit LACP VIFs and partner failover confiuration

Re: Fabric MetroCluster sporadically selectes non-HA pathes to shelf

Alarm LED on without any visible reason

Re: Improving Disk Latency Issues and Overall Performance for Microsoft SQL

Fabric MetroCluster sporadically selectes non-HA pathes to shelf

Default UID/GUID on volume creation

Re: inconsistent root aggregate

Re: Need Help Understanding What Happens When I Delete Duplicate Qtrees

Re: Administrativ Shares on NTFS Volumes

Re: Need Help Understanding What Happens When I Delete Duplicate Qtrees