VMware Solutions Discussions

VMware vMotion Best Practices

tkutil1962
7,612 Views

I'm looking for some suggestions, since this will be our first attempt at using our NetApp 2050 iSCSI storage and vMotion.

#1 I am assuming that I need to create a LUN that will be mapped to all of the ESX servers I intend to use for vMotion. Is this assumption correct?

#2 Should I create one ginormous LUN and put all of my VMs in it, or should I create multiple LUNs and point all ESX servers at each LUN for vMotion purposes?

e.g. 9 ESX servers, each with 5 VMs

1 Big LUN accessible by all 9 ESX servers

3 smaller LUNs accessible by all 9 ESX servers, with the VMs divided among those 3 LUNs

How does backup play into this? If I want to back up a LUN, I am assuming more smaller LUNs are better than one big LUN?

TIA

Troy Kutil

8 REPLIES

reinoud7
7,613 Views

All the LUNs must be accessible by all the ESX servers. We put no more than 10 virtual machines in one LUN (when using FCP or iSCSI).

If you are thinking of using NFS, then you can use a very large volume if you like.

rickymartin
7,613 Views

#1 I am assuming that I need to create a LUN that will be mapped to all of the ESX servers I intend to use for vMotion. Is this assumption correct?

Correct - the VMware datastore needs to be visible to each ESX host that will participate in vMotion.

#2. Should I create one ginormous LUN and put all of my VMs in it, or should I create multiple LUNs and point all ESX servers at each LUN for vMotion purposes?

The general best practice from most SAN vendors is no more than 10 virtual machines in a single datastore.

Having said that, like most best practices, this is a conservative setting that may not be the best for your specific environment.

The limit of 10 machines is related to the relatively high cost of writing new metadata into the VMFS filesystem and the performance impact this may have. Because this filesystem will be shared by multiple servers, updating the metadata requires a SCSI reserve and release. VMware's efficiency in using these SCSI reserve and release requests has improved in the latest versions of ESX; however, it still pays to be cautious.

The things that cause metadata updates include:

1. Starting and stopping a VM

2. vMotioning a VM

3. Using VMware snapshots

4. Using VCB (VMware Consolidated Backup - as it uses VMware snapshots)

If you don't think you'll be doing much, if any, of these kinds of operations, then having more than 10 virtual machines per VMFS datastore/LUN may be OK; if, on the other hand, you think you'll be using lots of VMware snapshots with lots of write I/O, then 10 may be too many.

If you use NFS, these issues no longer apply, as WAFL handles metadata updates without needing SCSI reserve/release mechanisms. As a result, you can safely put a lot more VMs into a datastore without having to worry about performance; this is particularly useful in VDI deployments.

The trouble with lots of smaller LUNs is that you have the potential to lose a lot of space in the unused portions of your VMFS datastores. This can be mitigated through thin provisioning of the LUNs, which works beautifully and elegantly; however, it also requires that you set up appropriate monitoring/alerting mechanisms to ensure that you don't run into any problems.
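As a very rough illustration of the kind of check I mean (the volume names, sizes and 85%/95% thresholds below are made-up example values, not recommendations):

    # Minimal sketch of a thin-provisioning usage check.
    # Volume names, sizes and thresholds are made-up examples.
    volumes = {
        "vmfs_datastore1": {"provisioned_gb": 500, "used_gb": 430},
        "vmfs_datastore2": {"provisioned_gb": 500, "used_gb": 210},
    }

    WARN, CRITICAL = 0.85, 0.95  # alert thresholds as a fraction of provisioned space

    for name, v in volumes.items():
        usage = v["used_gb"] / v["provisioned_gb"]
        if usage >= CRITICAL:
            print(f"CRITICAL: {name} at {usage:.0%} - grow the volume or move VMs now")
        elif usage >= WARN:
            print(f"WARNING: {name} at {usage:.0%} - plan to add space")

In practice you'd feed it real usage figures from the filer rather than hard-coded numbers; the point is simply that some automated check needs to be in place before you thin provision.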

If you're new to NetApp, or are unfamiliar with thin provisioning, it would be wise to take the time to learn how this works before rushing into things.

Regards

John

rickymartin
7,612 Views

Oh I forgot your last question.

How does backup play into this? If I want to back up a LUN, I am assuming more smaller LUNs are better than one big LUN?

If you are using VCB, then having multiple LUNs may improve your backup performance, as you may be able to back up multiple VMs simultaneously. As mentioned previously, VCB-style backups can cause a reasonable amount of metadata to be written to the VMFS datastores due to their use of VMware snapshots. Given that you have 9 ESX servers, I suspect that you will run into VCB's scalability limitations fairly quickly.

If you are using ESX Ranger or similar, then you may find that backing up from multiple LUNs will improve your speeds.

One thing to keep in mind is that on NetApp storage, all the LUNs are striped across all the disks in the aggregate underlying the FlexVol(s) that contain your LUN(s). The net result is that reading from multiple LUNs has less incremental benefit on NetApp storage than it will on traditional storage, because Data ONTAP automatically gets very good read performance by using as many spindles (usually more than ten) as possible to satisfy both read and write requests. On OVS (other vendors' storage), you may find that reading from one LUN means reading from only two or three spindles, reading from three LUNs simultaneously means reading from six to eight spindles, and so on.

Having said that, reading data in a single stream from one LUN subjects you to "Little's Law", and the aggregate performance from multiple simultaneous read requests is almost always going to be better than the performance of a single-threaded read request.
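To put some rough numbers on that (the 5 ms service time and 64KB read size below are made-up figures, purely to show the shape of the effect):

    # Little's Law illustration: throughput = outstanding requests / latency.
    # The 5 ms per request and 64KB read size are assumed, made-up values.
    latency_s = 0.005        # assumed service time per read request
    read_size_mb = 0.0625    # 64KB per request

    for outstanding in (1, 4, 16):
        iops = outstanding / latency_s
        print(f"{outstanding:>2} outstanding: {iops:,.0f} IOPS, {iops * read_size_mb:,.1f} MB/s")

A single-threaded stream never has more than one request outstanding, so its throughput is capped by per-request latency no matter how many spindles sit behind the LUN.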

The good news, though, is that if you are using OSSV or SnapManager for Virtual Infrastructure, the number of LUNs is pretty much irrelevant, as NetApp minimises data movement through the use of snapshots and block-level incremental backups. If you're not using these technologies, or planning on it, you should probably take a look at them before you invest too much more in other approaches.

Regards

John

jsykora
7,612 Views

Question #1 was answered by other posts.

Question #2: I prefer to split all my VMs up by assigning them to tiers. I base these tiers on our BCP/DR plan's Business Impact Analysis. Tier4 is test/dev and non-production stuff, Tier3 is important stuff, Tier2 is very important stuff, and Tier1 is CRITICAL stuff that can't stand to be down for more than a few minutes. Tier1 tends to be domain controllers, DNS, and other infrastructure without which nothing else can function. Tier2 ends up being SQL, Exchange, and very important end-user app servers and data mounts that should never be down for more than 4 hours. Tier3 is mostly web servers and such, with no real data stored on them, that can stand to be down for 8-12 hours. Tier4 stuff can come back up whenever.

I carve out volumes and LUNs by tier and place my VMs on those VMFS LUNs based on tier. This allows me to set up snapshot, SnapMirror, and offline backup policies by recovery tier (i.e. SnapMirror the Tier3 stuff once per week, since it really contains no data, and the Tier1 stuff hourly, as it changes constantly). If one particular tier has what I deem to be too many servers in it (based on both performance and data size), I create another volume/LUN for that tier. This also lets me keep my volume sizes small enough to dedupe everything that I want to dedupe.
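If it helps, I really just think of it as a mapping from tier to protection policy - the schedules below are illustrative examples along the lines of what I described, not a recommendation:

    # Example tier-to-policy mapping (illustrative schedules only).
    tier_policy = {
        "Tier1": {"snapshots": "hourly",        "snapmirror": "hourly", "offline_backup": "daily"},
        "Tier2": {"snapshots": "every 4 hours", "snapmirror": "daily",  "offline_backup": "daily"},
        "Tier3": {"snapshots": "daily",         "snapmirror": "weekly", "offline_backup": "weekly"},
        "Tier4": {"snapshots": "none",          "snapmirror": "none",   "offline_backup": "none"},
    }

    def policy_for(tier):
        """Look up the protection policy a VM inherits from its datastore's tier."""
        return tier_policy[tier]

    print(policy_for("Tier3"))

Every VM placed on a Tier3 LUN automatically inherits the Tier3 schedules, which is the whole point of carving out the datastores this way.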

I hope to be able to budget in NFS licensing next year. That may change my strategy a bit, but, because of the volume size constraints for dedupe, probably not by much.

Question #3 about backup was answered well from my point of view.

timteller
7,612 Views

I think this thread is the best read I have had all year. When we first set up VMware ESX 3 on the SAN, there weren't many suggestions from the VMware engineers or our storage vendor (Sun). This year we purchased 3040 NAS gateways with ALL the extra options (Snap-everything).

At the moment, we have one big datastore and a couple of LUNs because of growth. Now, with the purchase of the NetApp gateways and a big purchase of FC drives, I would like to remap the datastores.

I currently have 130 VMs in our ESX farm. Although we do not have anyone complaining about performance, I would like to remap the datastores following best-practice methods so we can 1) get the best performance possible, 2) get the fastest recovery time, and 3) utilize VCB rather than the traditional client/server backup method on each VM.

I have talked to our storage engineer and he is on board with what I want to do, but I want to get advice before I put the whole project plan/change control together. Here's what I would like to do; let me know if I should do something differently. BTW, I love Storage vMotion!

Create 3 datastores (like Jim Sykora does) - Tier 1, Tier 2 and Tier 3

Add multiple LUNs to the datastores

On each of the NetApp 3040s, connect one of the fibre ports to the ESX zone

Use SnapMirror to replicate the VMDKs and dedupe the VMs to our alternate data center (which also has NetApp 3040s and storage)

Here are my questions:

Can I use the NetApp 3040s to do what I want? If I can put the 3040s in the same SAN zone, can I create an ESX volume on the NetApp to snap?

Is it best practice to have one LUN per datastore, or multiple LUNs per datastore depending on the tier?

Thank you,

Tim Teller

jsykora
7,612 Views

Tim,

If I had a set of FAS3040s with an "everything" license I'd consider looking at using NFS to access your tiered datastores instead of FC. Then you don't need to worry about SAN zones or LUNs (although you might not be able to get your storage engineer on board with this as it would put him/her out of a job).

You'd just have 3 FlexVols set up (Tier1, Tier2, Tier3) that are each as large as your model's dedupe limit will allow (not sure what the FAS3040's dedupe limit per volume is), or however large you figure you might ever need for that tier class's present and future storage needs. If your storage needs for a specific tier are larger than the applicable dedupe limit, you might need to have multiple FlexVols for a tier (i.e. Tier1a, Tier1b, Tier1c, Tier2, Tier3a, Tier3b). Then set up each FlexVol for NFS access and connect to your datastores via NFS from ESX.
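As a quick back-of-the-envelope (the 1TB dedupe limit and the per-tier capacities below are placeholder numbers - check the real per-volume limit for your model and ONTAP version), the split works out like this:

    # Back-of-the-envelope FlexVol count per tier.
    # The dedupe limit and tier capacities are placeholder numbers, not real limits.
    import math

    dedupe_limit_gb = 1000     # assumed per-volume dedupe limit for the model
    tier_needs_gb = {"Tier1": 800, "Tier2": 2600, "Tier3": 1200}

    for tier, needed in tier_needs_gb.items():
        flexvols = math.ceil(needed / dedupe_limit_gb)
        print(f"{tier}: {needed}GB needed -> split across {flexvols} FlexVol(s)")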

Sure, the NetApp 3040s will do what you want, but just keep in mind that there may be a better way. I'd run your scenario past a NetApp engineer, as I don't have all the details due to my small environment (I have almost zero FC experience). Get some whitepapers and presentations from NetApp regarding ESX on NFS so you can compare the benefits vs. using FC. Sure, NetApp does FC, but it takes NFS to a whole new level.

Ideally I think you don't want any more than 10-15 VMs per LUN, whether iSCSI or FC, for performance reasons (there is also a maximum VMs-per-LUN limit enforced by ESX, but don't try to push that), so you may need to carve out multiple LUNs per tiered FlexVol, which you can surely do. To me, thinking in LUNs is old FC ideology. I'd rather think in virtualized storage terms and consider NFS. There are practical limits on how many VMs can reside on an NFS mount as well, but I'm not sure what they are, as I'm not even close to hitting them.
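For your FC numbers the arithmetic is simple - assuming a cap of, say, 12 VMs per LUN (an assumed figure within the 10-15 guideline above):

    # Rough LUN-count estimate for ~130 VMs at an assumed 12 VMs per LUN.
    import math

    total_vms = 130
    max_vms_per_lun = 12   # assumed cap within the 10-15 guideline

    luns_needed = math.ceil(total_vms / max_vms_per_lun)
    print(f"{total_vms} VMs / {max_vms_per_lun} per LUN -> at least {luns_needed} LUNs across the tiers")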

Jim Sykora

carlo_wejszko
7,612 Views

Hi all. First-time poster here, but I thought I'd chip in with my experience of this.

Because my predecessor didn't purchase NFS, we've also gone down the multiple-LUN route for VMs. Just for simplicity (there is a whole bunch of other stuff), we've created a FlexVol on our Tier 1 storage (10k SCSI) for operating systems and a FlexVol on our Tier 2 storage (SATA) for data, rightly or wrongly, and then created multiple LUNs inside each of these volumes - the 'sweet spot' from what we've read being 575GB per LUN.

The problem is, once we filled a volume up with LUNs (even if those LUNs are not full), we stopped being able to snapshot it. The '200% + delta' space reservation is the culprit: the LUNs are all space-reserved, with fractional reserve set to 100(%).
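To make the '200% + delta' point concrete (the sizes below are made up, and this is only my reading of how space-reserved LUNs with 100% fractional reserve behave, so treat it as a sketch rather than gospel):

    # Sketch of the '2x + delta' sizing rule for space-reserved LUNs
    # with fractional reserve at 100%. All sizes are made-up examples.
    volume_size_gb = 2000
    lun_sizes_gb = [575, 575, 575]      # three 'sweet spot' LUNs
    snapshot_delta_gb = 200             # assumed rate of change held in snapshots

    luns_total = sum(lun_sizes_gb)      # 1725GB of space-reserved LUNs
    overwrite_reserve = luns_total      # 100% fractional reserve: same again for overwrites
    required = luns_total + overwrite_reserve + snapshot_delta_gb

    print(f"LUNs: {luns_total}GB, overwrite reserve: {overwrite_reserve}GB, delta: {snapshot_delta_gb}GB")
    print(f"Required volume size: {required}GB vs actual {volume_size_gb}GB")
    print("Snapshots will run the volume out of space" if required > volume_size_gb else "OK")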

Can't say I've got my head around this completely yet (we put the call in on Monday), but if you guys are considering a topology like ours, you might want to consider the above and look at how you can get around it. Please feel free to let me know where I'm going wrong with this, too.

http://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb23574

http://now.netapp.com/NOW/knowledge/docs/ontap/rel722/html/ontap/bsag/4cr-f3.htm

PS: I know 10k SCSI isn't exactly Tier 1 (15k SCSI is) and SATA isn't Tier 2 (10k SCSI is), but it's all we've got 😕

adamfox
7,612 Views

Please keep in mind that using space reservations is only one method of handling snapshot blocks with LUNs. Many customers prefer using volume autogrow or snap autodelete, or a combination of those, along with provisioning some level of extra space in the volume to handle deltas. You can deploy these measures and not use space reservations at all.
