2009-03-26 03:07 AM
I read in the Protection Manager Best Practices Guide that the maxActiveDataTransfers should be set to 140.
What is special about the number 140?
Can this number be calculated somehow depending on the number of primary and secondary filers and the size (model) of the systems?
2009-03-26 03:22 AM
A better way is to calculate it based on the number of relationships per filer (both incoming and outgoing) and
the amount of headroom you would like to leave for user operations, so that the system CPU is not hogged by the transfers alone.
The maximum number of streams supported differs from filer to filer.
2009-03-26 03:28 AM
Thanks for your answer.
So is there a formula for calculating the maxActiveDataTransfers?
For example, I have 2 6080 filers as primaries and 1 3170 as secondary. I have about 1000 relationships. All systems are running 7.3.1P2.
How would I calculate the maxActiveDataTransfers in this scenario?
2009-03-26 11:39 AM
Nothing is really special about 140, but it is based upon the number of simultaneous transfer streams that a storage system controller could support. In Protection Manager (PM) scalability testing, we had controllers (FAS6070 running GB with nearstore) that could handle 128 streams, but because PM processing overhead is not zero we do not immediately start a new transfer as soon as one completes. We set maxActive… a bit higher to have some threads available to start transfers as soon as possible.
These extra threads hang around and retry periodically when no session is available. There is some concern that all these retries put a burden on the storage controller. At this point, that concern is unsubstantiated. Certainly if a controller can support 128, then maxActive… may be set to 128 or below with no problem. Setting it at or below the filer’s session limit may lower the overall throughput, because some potential transfer sessions will sit idle. Setting it higher than 128 may increase the throughput, but at some risk of taxing the filer with retries (in theory this means overall lower performance, not higher).
I suspect (also unsubstantiated) that any given load profile may have its own ideal limit on transfers. A good strategy would be to start with 140. Then try something a bit lower to see if your performance improves.
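To make the "start at 140, then try a bit lower" strategy concrete, here is a rough sketch in Python. It is not anything PM provides: the function name, the 12 spare threads (140 minus 128), and the step size of 4 are all assumptions for illustration, and the stream limit has to come from the documentation for your controller model.

```python
from typing import List


def candidate_limits(stream_limit: int, spare_threads: int = 12, step: int = 4) -> List[int]:
    """Return maxActiveDataTransfers values to try, highest first.

    stream_limit  -- simultaneous transfer streams the controller supports
    spare_threads -- extra threads kept ready to start transfers (assumed: 140 - 128)
    step          -- how far to lower the value between test runs (assumed)
    """
    if stream_limit <= 0 or spare_threads < 0 or step <= 0:
        raise ValueError("stream_limit and step must be positive, spare_threads non-negative")
    start = stream_limit + spare_threads
    # Walk down from the starting value, stopping at the controller's own stream limit.
    return list(range(start, stream_limit - 1, -step))


if __name__ == "__main__":
    print(candidate_limits(128))  # [140, 136, 132, 128]
```

You would run your normal backup window at each value and keep the one with the best throughput and the fewest retries.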
DataFabric Manager (DFM) version 3.8 will support a new mechanism. The maxActive setting is a per storage controller limit. Storage controllers running 7.3.1 and beyond will support a new ZAPI that indicates how many transfers are available. The 3.8 version of DFM will use this to set an equivalent, automatic limit on those controllers supporting the new ZAPI. We still respect the limit if maxActive is set, but if not, we automatically pick the limit for you.
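The precedence described above can be sketched as follows. This is only a model of the rule as stated here, not DFM code: the function and parameter names are hypothetical, and what DFM does when neither value is known is not covered in this thread, so the sketch simply returns None in that case.

```python
from typing import Optional


def effective_limit(configured: Optional[int], reported: Optional[int]) -> Optional[int]:
    """Pick the per-controller transfer limit.

    configured -- maxActiveDataTransfers if an administrator set it, else None
    reported   -- limit derived from the controller's new ZAPI, or None if the
                  controller does not support it (hypothetical representation)
    """
    if configured is not None:
        # An explicit setting is still respected.
        return configured
    # Otherwise fall back to the automatic limit, which may be unknown (None).
    return reported


if __name__ == "__main__":
    print(effective_limit(140, 128))   # 140
    print(effective_limit(None, 128))  # 128
```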
2009-03-26 11:59 AM
I generally set this based upon the limits described in the DOT online backup and recovery guide.
Retries are logged in the database job log. If you have a lot of retries, it can cause massive database bloat, which can then result in database instability and performance problems. In 2 weeks we watched a customer's database grow from 2G to nearly 10G. After we set maxactivedatatransfers to sane levels on all the filers, pruned the job logs, and executed a database reload, the database shrank from 10G back down to 1.5G. After that, the database continued to grow, but at a much saner rate.
2009-03-26 12:03 PM
You will set this option on a per-host basis. That is, each storage controller may have its own limit. Protection Manager (PM) will not exceed the limit to/from any storage system. For instance, if you only had two controllers (Pri and Sec), and Pri supported 500 streams while Sec supported 100, then you could set the limits to 500 and 100 respectively. When managing transfers between Pri and Sec, PM will not start more than 100 streams.
If you added another controller (Sec2) with a limit of maxActiveDataTransfers=100, PM would be allowed to start another 100 streams between Pri and Sec2. Thus, Pri may have 200 simultaneous transfers running while Sec and Sec2 each have 100.
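The Pri/Sec/Sec2 arithmetic above can be sketched in a few lines of Python. This is only a model of the per-host rule, not PM's scheduler: the function names are made up, and handing out the primary's budget to secondaries in list order is an assumption that only matters when the primary's limit is the bottleneck.

```python
from typing import Dict, List


def pair_limit(limits: Dict[str, int], src: str, dst: str) -> int:
    """Streams between two controllers cannot exceed either controller's limit."""
    return min(limits[src], limits[dst])


def allocate(limits: Dict[str, int], primary: str, secondaries: List[str]) -> Dict[str, int]:
    """Return how many streams the primary could run to each secondary."""
    remaining = limits[primary]
    streams = {}
    for sec in secondaries:
        # Each pair is capped by the secondary's limit and by what the
        # primary has left (greedy, in list order: an assumption).
        n = min(remaining, limits[sec])
        streams[sec] = n
        remaining -= n
    return streams


if __name__ == "__main__":
    limits = {"Pri": 500, "Sec": 100, "Sec2": 100}
    plan = allocate(limits, "Pri", ["Sec", "Sec2"])
    print(plan)                # {'Sec': 100, 'Sec2': 100}
    print(sum(plan.values()))  # 200 simultaneous transfers on Pri
```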
I personally do not know whether the limits allowed are optimal for all load profiles. I suspect not, but I don't have data to back this up. However, my guess is that the optimum will differ between profiles consisting of many long transfers and profiles consisting of many short transfers. The latter has a high overhead-to-data ratio.
What is my point? If I had one, it would be that limits and optimum values are different. Actual mileage may vary.