We're likely missing each other. Just to be sure, I checked with a packet trace. The client never receives the list of delegated servers; it's up to the DNS server to go up to the filer.
We don't have stretched VLANs to DR, so I can't take the IPs with me to the DR site. And for legacy reasons I cannot use a new separate zone with CNAMEs in DNS; I must always use the SVM name. (I can in theory just add a step to the DR procedure to add and remove IPs from there. But then there's a dependency on the actual person and on replication, and I will probably want to fall back to old-fashioned dynamic DNS just so the process will be as short as possible.)
I've now set up a delegation with 20 fake addresses and one real one, with a 1 min TTL everywhere. I'm going to test it through the weekend with nslookup every X random minutes and measure the response time.
Unfortunately I can't think of a better test for now.
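For anyone who wants to repeat this kind of test, here is a minimal sketch of the idea in Python: resolve a name at random intervals and record how long each lookup takes. It is not the exact procedure I ran (that was plain nslookup). The function names, the hostname `svm1.example.com` and the wait bounds are all made up for illustration. Note that `socket.getaddrinfo` goes through the OS resolver, so a local client cache can hide the delay in a way nslookup would not.

```python
import random
import socket
import time


def time_lookup(hostname, resolve=socket.getaddrinfo):
    """Time a single name lookup; returns (elapsed_seconds, ok)."""
    start = time.perf_counter()
    try:
        resolve(hostname, None)
        ok = True
    except OSError:  # socket.gaierror is a subclass of OSError
        ok = False
    return time.perf_counter() - start, ok


def probe(hostname, rounds, min_wait_s, max_wait_s,
          resolve=socket.getaddrinfo, sleep=time.sleep):
    """Run `rounds` lookups, sleeping a random time after each one."""
    results = []
    for _ in range(rounds):
        elapsed, ok = time_lookup(hostname, resolve)
        results.append((time.time(), elapsed, ok))
        sleep(random.uniform(min_wait_s, max_wait_s))
    return results


# Example (hypothetical name and intervals):
# for ts, elapsed, ok in probe("svm1.example.com", 100, 60, 600):
#     print(f"{time.ctime(ts)}  {elapsed * 1000:.0f} ms  ok={ok}")
```

The `resolve` and `sleep` parameters are only there so the loop can be dry-run without touching real DNS.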
OK, I have the results, and I'm not happy with them (again, DNS RFE/Microsoft to blame, not NetApp):
Every time the TTL of the referrals expires, the client sees around a second of delay per dead referral. In the test I described earlier with 20 dead LIFs, I got around 7-13 sec of hang (as it goes through the LIFs in random order until it finds the live one). Every time I removed a few dead ones the result got better, proving a direct relation. (Where I have only healthy LIFs, the latency is around 25 ms, and then the answer is cached until TTL expiration.)
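A rough back-of-the-envelope model of this, as a sketch only: I'm assuming the DNS server tries the referrals in a uniformly random order without repeats and pays a fixed ~1 sec per dead one before it hits a live LIF. Neither assumption is confirmed by the trace, and the function names are made up. Under that model the expected number of dead LIFs tried first is dead / (live + 1).

```python
import random


def expected_delay_s(dead, live, per_dead_s=1.0):
    """Mean delay if referrals are tried in uniformly random order.

    The expected number of dead entries ahead of the first live one
    in a random permutation is dead / (live + 1).
    """
    return per_dead_s * dead / (live + 1)


def simulate_delay_s(dead, live, per_dead_s=1.0, trials=20000, seed=0):
    """Monte Carlo check of the same model (assumed, not measured)."""
    rng = random.Random(seed)
    lifs = [False] * dead + [True] * live  # True marks a live LIF
    total = 0.0
    for _ in range(trials):
        rng.shuffle(lifs)
        total += per_dead_s * lifs.index(True)  # dead tried before first live
    return total / trials


# 20 dead + 1 live, ~1 s per dead referral:
# expected_delay_s(20, 1) gives 10.0 seconds on average
```

With 20 dead and 1 live LIF the model gives about 10 sec on average, which sits inside the 7-13 sec I measured, and it drops as dead LIFs are removed or live ones are added.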
I guess in most environments this might still be acceptable (only 1-2 sec, every few minutes, for one random client). But here I don't want to limit my scalability (adding more LIFs when the cluster expands) or introduce a known cause of latency (even if it's small) when I can implement it differently and avoid it.