Hi,
You will perhaps get a good deal of differing opinions here, but some fundamental common sense is probably going to get you the farthest in the long run. 'A-SIS' is a data manipulation (filesystem) tool that is primarily concerned with removing (consolidating) duplicate blocks from the filesystem. Even if it has been refined over time to try to reduce fragmentation, the tool's main goal is to optimize space savings. How performance is affected is largely a result of how deduplication affects seek times and the number of reads needed to fetch the blocks requested. This is relatively easy to deduce from a basic understanding of disk-based hard drives. 'sis' is a tool with a specific use in mind. Like most tools, you can try to use it for other things, but the results may be suboptimal. (You can use a knife to loosen a screw, but you might ruin the screw or the knife in doing so.)
There is one complicating/mediating factor here as well: system memory. The larger the system memory, the more easily the system can cache frequently accessed blocks without having to go to the disks, and since a single deduplicated block may be read on behalf of many logical files, every cache hit on it pays off repeatedly. This is also why PAM-II cards can have an amplified advantage on deduplicated data. The 2050, unfortunately, isn't going to have many advantages here.
We might then deduce that 'sis' isn't the right tool for filesystems (flexvols) that require optimal access times. We can reasonably sketch out a number of scenarios where 'sis' would be useful and some where it wouldn't be. The key here, to reiterate, is to segregate the data in a way that makes these decisions more clear-cut (a rough command sketch follows the first list below).
1) VMWare volumes: datasets that are highly duplicated, relatively static, and require little or only slow access: system "C:" drives, for example. The similarity of the data here argues for keeping the C: drives together by themselves (without pagefile data).
2) VMWare volumes: datasets that are moderately duplicated and require moderate access times.
3) CIFS/NFS data that is moderately duplicated and requires moderate/slow access times. The sizes of the filesystems here can result in significant savings, in terms of GB, for normal user data.
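As a rough illustration of what that segregation buys you at the command line: turning 'sis' on for a good candidate volume looks something like the following (this is a 7-mode style sketch, and the volume name /vol/vm_cdrives is just a made-up example):

  sis on /vol/vm_cdrives (enable deduplication on the flexvol)
  sis start -s /vol/vm_cdrives (deduplicate the data already sitting in the volume)
  sis config -s sun-sat@0 /vol/vm_cdrives (schedule the nightly dedup pass for midnight)
  sis status /vol/vm_cdrives (check progress and the last run)

The point is that dedup is switched on per flexvol, which is exactly why grouping similar data into its own volume makes the on/off decision clean.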
Conversely, there are probably many datasets where 'sis' is going to give minimal savings or sub-optimal performance.
1) Datasets made up of random and/or unique data: seismic data, encrypted files, compressed files, swap files, certain application data.
2) Datasets containing application data that has its own optimized internal structures or that requires fast access times: databases.
3) Datasets that are too small for significant savings.
4) Datasets that are dynamic and require fast access.
The common sense comes, then, in using the tool for what it was meant for. Segregate the data into reasonably good sets (flexvols): some where 'sis' can be used with success, and some where it shouldn't be used at all. In the end, the goal for most IT operations isn't to save filesystem blocks at all costs, i.e. without regard to performance.
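It's also worth checking, after the fact, whether a volume is actually saving enough to justify being deduplicated. Again a 7-mode style sketch, with a made-up volume name:

  df -s /vol/userdata (shows space used, space saved by dedup, and the percentage saved)

If the saved column is trivial and the volume is access-sensitive, that's a good hint 'sis' is pointed at the wrong dataset.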
There are other maintenance routines that can help with access times, like 'reallocate', but one needs to read the docs and use a little common sense here too. Normal fragmentation will affect most filesystems over time, and that isn't a situation limited to deduplicated filesystems.
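For what it's worth, the basic 'reallocate' workflow on a 7-mode system looks roughly like this (volume name again made up; read the docs before turning it loose on production data):

  reallocate on (enable reallocation jobs on the controller)
  reallocate measure /vol/userdata (measure the current layout and report how badly it needs optimizing)
  reallocate start /vol/userdata (kick off an optimization job on the volume)
  reallocate status (see what the jobs are doing)

Measure first; there's no point paying for a reallocation pass on a volume whose layout is already fine.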
Hope this helps.