Network and Storage Protocols

How do you size VDI accurately?

chriskranz

Taking into account a variety of factors, and bundling a few questions together...

  • Storage Capacity vs. Storage & Protocol Performance
    • Given that NFS is slightly favoured, how does the lack of multipathing affect the storage performance? (looking forward to pNFS?)
    • How do we size PAM cards and can this be actively tuned depending on the workloads?
    • Given all the space savings of A-SIS and RCU and the performance gains of PAM cards, are we seeing a move towards low spindle SATA configurations in larger VDI deployments?
  • Total Users vs. Active Users
    • This is always a difficult question in any environment: a customer may have 5000 users, but only 3000 active at any time. Should we size for the worst case scenario? Boot storms and other such events can cause major trouble on a system that is sized for everyday use and no more.
  • Shared Storage Platforms (looking at the bigger picture)
    • NetApp is multi-protocol, so how do we size all the above factors if it is a shared platform and we have the applications to also consider and give priority to? In many cases you may be reading data from one area of a filer to present to another area of a filer (Exchange, Home Directories, etc. -> VDI).
    • How does this tie in with the PAM cards, could you for instance dedicate a PAM card to a specific area of storage?
    • This is potentially more important with V-Series and multiple storage tiers. We may have RAM-SAN for VDI, FC for Exchange and SATA for home directories. How can we make the best use of the storage system as a whole in these tiered environments?
    • Are there any ways to help fence off storage to prevent out of control systems affecting the performance of the entire system?

Sorry, quite a few areas and questions, but I'm very keen on being able to size VDI accurately and making it hugely scalable. With all the new features, plugins and filer functionality, there is a lot to consider and a lot of different routes that we could potentially go down.


3 REPLIES

keitha

Wow Chris, that is quite the batch of questions. Sizing is actually very easy: Step 1, you call Abhinav. Step 2, you're done!

Actually you are right on: this is a very complex topic, and one that is still evolving from project to project as we learn and add new technologies to our toolbox. I will leave many of the sizing questions to Abhinav and Trey, as I suspect they have sized many more environments than I have, but I will take a stab at some of your other questions.

    • Given that NFS is slightly favoured, how does the lack of multipathing affect the storage performance? (looking forward to pNFS?)

You can in fact multipath with NFS. It just doesn't happen automatically, and you have to do a little planning up front. How to create redundant, load-balanced paths with NFS datastores is covered in the NetApp best practice guide TR-3428. You do have to size your datastores so that you don't exceed what you can push across a single link, but I think you might be surprised how many VMs can run across a 1GbE link! Certainly pNFS and 10GbE are going to help here.
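To put a very rough number behind that last point, here is a quick back-of-the-envelope sketch. All of the figures in it are my own assumptions for illustration (they are not from TR-3428), so measure your own desktops before relying on them:

    # Back-of-the-envelope: how many steady-state VMs fit on one 1GbE NFS path?
    # All figures are illustrative assumptions -- measure your own workload.
    GIGABIT_LINK_MBPS = 125        # approximate usable payload of a 1GbE link, MB/s
    PER_VM_IOPS = 10               # assumed steady-state IOPS per desktop
    AVG_IO_SIZE_KB = 16            # assumed average NFS I/O size

    per_vm_mbps = PER_VM_IOPS * AVG_IO_SIZE_KB / 1024.0
    vms_per_link = int(GIGABIT_LINK_MBPS // per_vm_mbps)
    print(f"~{per_vm_mbps:.2f} MB/s per VM -> roughly {vms_per_link} VMs per 1GbE path")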

    • How do we size PAM cards and can this be actively tuned depending on the workloads?

By sizing do you mean how many do you need? If so, the very cool VDI sizer that Abhinav and Chris Gebhardt built does factor in how many PAM cards you should plan for, based on the working set of the images you are using. I am going to have to defer to them to provide the link, as I can't find the external link right now.
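I can't reproduce the sizer's logic here, but as a loose sketch of the idea behind counting cards from the working set (the card capacity and working-set figures below are placeholders I've assumed, not sizer output or NetApp guidance):

    import math

    # Placeholder sketch: estimate PAM card count from an assumed hot working set.
    PAM_CARD_USABLE_GB = 256         # assumed usable read cache per card (varies by model)
    GOLDEN_IMAGES = 4                # distinct master images
    HOT_GB_PER_IMAGE = 20            # assumed hot blocks per image (OS + core apps)
    HOT_GB_PER_ACTIVE_USER = 0.1     # assumed unique hot data per active user
    ACTIVE_USERS = 3000

    working_set_gb = GOLDEN_IMAGES * HOT_GB_PER_IMAGE + ACTIVE_USERS * HOT_GB_PER_ACTIVE_USER
    cards = math.ceil(working_set_gb / PAM_CARD_USABLE_GB)
    print(f"Estimated hot working set: {working_set_gb:.0f} GB -> {cards} PAM card(s)")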

    • Given all the space savings of A-SIS and RCU and the performance gains of PAM cards, are we seeing a move towards low spindle SATA configurations in larger VDI deployments?

This is certainly something I can see happening, although I haven't made the move on one of my projects yet. Once the PAM card grows in size (and it will!), this will certainly become more of a realistic option.

    • NetApp is multi-protocol, so how do we size all the above factors if it is a shared platform and we have the applications to also consider and give priority to? In many cases you may be reading data from one area of a filer to present to another area of a filer (Exchange, Home Directories, etc. -> VDI).

One of the best kept secrets at NetApp helps here. FlexScale allows you to adjust the priority of data volumes and ensures your high priority applications receive the disk performance they need. When adding a VDI environment to an existing controller, though, you must be very thorough and understand what the impact of that environment will be on the storage controller. The last project I worked on knew that although they were only starting with 2000 users, the environment would grow, so they opted for separate dedicated controllers that they can tune for VDI and grow if needed.
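To make "understand the impact" a bit more concrete, a simple headroom check along these lines is worth doing before dropping VDI onto a shared controller. The controller ceiling and workload peaks below are made-up numbers, so substitute measured values from your own environment:

    # Hypothetical headroom check before adding VDI to a shared controller.
    # Replace every number with measured peaks from your own environment.
    controller_iops_ceiling = 60000          # assumed sustainable IOPS for this controller
    existing_peak_iops = {"exchange": 9000, "home_dirs": 4000, "file_shares": 3000}
    vdi_scenarios = {
        "steady state": 3000 * 10,           # 3000 active users x 10 IOPS (assumed)
        "boot storm": 3000 * 40,             # assumed boot-storm multiplier, before caching
    }

    for label, vdi_iops in vdi_scenarios.items():
        total = sum(existing_peak_iops.values()) + vdi_iops
        headroom = controller_iops_ceiling - total
        print(f"{label}: {total} IOPS total, headroom {headroom} ({'OK' if headroom > 0 else 'OVER'})")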

    • This is always a difficult question in any environment: a customer may have 5000 users, but only 3000 active at any time. Should we size for the worst case scenario? Boot storms and other such events can cause major trouble on a system that is sized for everyday use and no more.

True, but the good news is that events like boot storms are when the PAM card really shines. You do still have to size for events like these, but part of that sizing is ensuring you have sufficient CPU on the controllers and sufficient fabric bandwidth from the servers to the storage (and enough CPU cycles on the virtualization servers too!). So I do tend to size for the worst case scenarios, but the difference between worst case and normal state isn't as large as you might expect.
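Here is a rough illustration of why the gap between the two cases is smaller than the raw numbers suggest. The per-desktop IOPS, read ratios, and cache hit rates are all assumptions on my part, not measured figures:

    # How much of a boot storm actually reaches disk once the PAM card absorbs
    # repeated reads of the shared image blocks? All rates are assumptions.
    USERS = 3000

    def disk_iops(iops_per_user, read_ratio, cache_hit):
        total = USERS * iops_per_user
        reads, writes = total * read_ratio, total * (1 - read_ratio)
        return reads * (1 - cache_hit) + writes   # only read misses and writes hit disk

    steady = disk_iops(iops_per_user=10, read_ratio=0.75, cache_hit=0.80)
    storm = disk_iops(iops_per_user=40, read_ratio=0.90, cache_hit=0.95)  # boots re-read cached blocks
    print(f"disk IOPS -- steady state: {steady:.0f}, boot storm: {storm:.0f}")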

    • How does this tie in with the PAM cards, could you for instance dedicate a PAM card to a specific area of storage?

Yes, you can dedicate the PAM card to a particular volume, or exclude a volume from being cached on the card.

    • This is potentially more important with V-Series and multiple storage tiers. We may have RAM-SAN for VDI, FC for Exchange and SATA for home directories. How can we make the best use of the storage system as a whole in these tiered environments?

I'm not sure I understand this question, and I think it might be out of my realm. What I will say is that technologies such as A-SIS and file-level FlexClone can really help with the business case for the more expensive tiers and improve the performance of the lower cost tiers. I know that likely doesn't help, but it's true!

    • Are there any ways to help fence off storage to prevent out of control systems affecting the performance of the entire system?

Again, this can be handled by FlexScale. I told you it was a great secret!

Whew. That was a lot for a Sunday afternoon.

Keith

abhinavj

Keith has covered it all. I will touch on a few points here. Sizing VDI is a complex process and easily a 2-3 hour discussion.

The VDI sizer is intelligent caching/PAM aware and factors that into sizing. It has been developed with lessons learned from real world deployments, internal scalability testing, continuous feedback, etc. Please contact your NetApp SE or partner to help you size the customer environment correctly and factor in the savings and performance acceleration achieved as a result of FlexClone, dedupe, intelligent caching, and PAM. I did a brief whiteboard about the VDI sizer here:

http://www.youtube.com/watch?v=NUNYWXCc_GQ&feature=channel_page

Also, check this blog post by Chris Gebhardt on Intelligent caching/PAM:

http://blogs.netapp.com/virtualization/2009/03/netapp-and-vmware-view-vdi-best-practices-for-solution-architecture-deployment-and-management-part-7.ht...

Once you know the customer's IOPS requirement and read/write ratio (typically 70-75% reads), the VDI sizer will determine what percentage of the IOPS must be served from disk. Next it will give options for selecting the type of disk drive and output the number of spindles you need for that disk type. From a footprint perspective, since the NetApp storage efficiency capabilities significantly reduce the capacity requirements, performance becomes the critical factor for sizing. With the cost of disks coming down, and since one 15K RPM FC disk can serve far more IOPS than a 7200 RPM SATA disk, I currently still see FC disks for VDI. This also keeps the footprint smaller, with fewer spindles.
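As a hedged illustration of that last step (the per-disk IOPS figures and the cache offload fraction below are rough rules of thumb I'm assuming, not output from the sizer), a simplified spindle estimate by disk type might look like this:

    import math

    # Simplified spindle estimate once cache has absorbed part of the read IOPS.
    # Per-disk IOPS and the offload fraction are rough assumptions, not sizer output.
    total_iops = 30000             # customer requirement
    read_ratio = 0.75              # typical VDI mix, as noted above
    cache_offload = 0.70           # assumed fraction of reads served from PAM/WAFL cache

    disk_iops = total_iops * read_ratio * (1 - cache_offload) + total_iops * (1 - read_ratio)
    per_disk_iops = {"15K FC": 180, "10K FC": 140, "7.2K SATA": 75}   # rule-of-thumb values
    for disk, iops in per_disk_iops.items():
        print(f"{disk}: ~{math.ceil(disk_iops / iops)} spindles")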

NetApp has custom sizers for mixed workload. NetApp SEs or partners can definitely help you size in scenarios where there are multiple workloads.

Ultimately it comes down to the cost of the solution and providing a best-in-class end user experience. The best thing to do is present the solution design, associated cost, and pros and cons for both the worst case and the not-so-worst-case scenarios to the customer, and let them decide which way they want to go. The decision will vary from customer to customer.

Feel free to email me (abhinavj@netapp.com). I'll be happy to discuss this in a lot more detail.

Hope this helps.

Regards,

Abhinav

chriskranz

Brilliant, cheers guys! Sorry, I got a bit carried away as it was my first question and put in a lot. Sizing is always tricky, and you are completely right, it can be quite time consuming, but it really pays off to get it right.

I've been to visit customers who have had deployments that weren't sized properly at the start. The overall result is that the end users and application administrators get the impression that it is the new system that has caused a performance drop, and so they lose faith in the technology. In the future, as soon as there's a problem, the new system instantly gets the blame before any proper troubleshooting is done.

I'm always keen to set expectations and make sure things are sized thoroughly and properly from the start. This makes sure there is no performance or functional drop for users; if anything, they get more!
