ONTAP Discussions

Question regarding small SQL deployment,

marcconeley
5,515 Views

Hi,

I am looking for some technical guidance here regarding the setup of small virtual SQL servers where I don't want to use SMSQL and iSCSI LUNs (if possible).

I have a small VMware host cluster connecting to a FAS2020 NFS volume over iSCSI.

As most of our SQL environments are fairly small (<50Gb of data) we run our SQL servers virtually. For production SQL servers, we use SnapManager SQL and setup 4x iSCSi LUNs per server following the best practices - this I am OK with.

However, I have a lot of test SQL servers where I would like an easy setup and simply create my virtual server with 2/3 vmware virtual disks (e.g. C:\ and D:\) and run SQL server's databases and logs from these 2 disks - and not have to bother with SMSQL, iSCSi LUNs and assigning large amounts of storage from my Filer (the overhead of using SMSQL with 4 LUNs with adequate space is quite high). So:

- Is it possible to run SQL server from local Vmware virtual disks in my environment, or should I always use block storage (iSCSi LUNs) for dbs and logs?

- Is it correct to assume that as NFS is a file-based system, and not block based - and as all my VMDK files are stored on this NFS vol - that SQL performance which typically has high sequential read I/O would perform particularly poorly if its databases and logs were placed within these VMDK files? (is this how it would work?).

- Does anyone have any better/other recommendations for setting up small virtual SQL server installments that don't require SMSQL? (should I always use iSCSI LUNs anyway?).

This has become important to me as I've been seeing some performance issues. When database backups or verifications run on SQL servers which have their dbs and logs within VMDK files, all VMs on that datastore start experiencing slow system disk write times and symmpi errors occur in the event logs (sometimes causing the servers to crash). I can also see very high I/O on my VM datastore (NFS vol) and NFS latency spikes.

The other benefit of running small and test SQL environments on local vmware disks is that the VMware admins can manage the disks themselves without touching the filer volumes. But if the downside to this is that heavy SQL use or database backups/verifications cause all my VMs to timeout, then this obviously isn't a solution.

Hey, and what happens for customers who don't have SMSQL licenses? Surely not everyone is running SQL server on iSCSI LUNs? I would assume its fairly common to have SQL databases & logs on vmdk disks?

Any guidance or help much appreciated.

Thanks.

8 REPLIES 8

thomas_glodde
5,516 Views

Marc,

usualy it isnt a problem to run SQL server from vmdks on nfs if there is no huge i/o. of course, like in every virtualization, you should schedule tasks like backups and verifies apart from each other, eg. first server at 1am, 2nd server at 2am etc..

If you experience performance troubles, make sure all of your filesystems are properly alligned.

If i use SM SQL or not, i still tend to have the same layout and same storage configuration for ALL systems, just to have a proper design for ALL machines so noone gets confused, but thats just my 2cents. If you have like 20+ SQL servers and they are growing each day, then you probably do not want to mess around with iscsi luns everytime and stick with vmdks. but if you setup 1 sql like every half a year, i would keep them all the same. this enables you to install SM SQL at any time if needed ,eg if a testing database gets productive over night (which happens all the time ;-).

Regards

Thomas

marcconeley
5,516 Views

Hi Thomas,

So you're placing your SQL dbs and logs on separate vmdk disks in the same datastore as the VM itself and not getting any performance issues?

Hmmm. I have one particular database of 40Gb where a daily integrity & backup-to-disk (network share in this case) job was causing all my VMs to timeout.

Total disk I/O on the filer seemed reasonably OK though during this time.

And total network throughput on the filer head also was OK.

Therefore I was assuming that it was simply too much I/O being generated by SQL, and that placing the dbs and logs within a vmdk on a NFS volume was simply a bad idea.

This kind of made sense to me as an iSCSI LUN is pure block based, whereas I am putting a database file within a vmdk on a nfs vol - which would appear to create more of an overhead (or so I would assume).

I only have around 10 SQL servers - some of them are production servers running with SMSQL, and some are test environments or are running small IT apps with the dbs and logs on the VMDK files. But I definitely don't see performance issues on the SMSQL configured servers with their iSCSI LUNs.

I guess I could try moving the systemdbs, logs and dbs onto different vmdk files instead of having them on just 1 "data disk" inside the VM. Although I assumed this wouldn't help much as whether its 1 underlying vmdk file or 3, it's the same I/O load on the underlying NFS volume/LUN/Filer isnt it?

If all else fails, I'll simply install SnapDrive on all SQL servers move all dbs to iSCSI luns. As you said, I can then easily add SMSQL later if needed.

Do you know if there are any performance tests/metrics that show performance of SQL within an vmdk as opposed to esx lun or iSCSI disk?

Thanks for the info so far .

shaunjurr
5,515 Views

i,

We probably need to split up the problem into a few more manageable chunks.

1. SQL instances per server:  VMWare is one way to better utilize server hardware, but running multiple SQL instances per server is another.  If you use mountpoint disks and perhaps even cluster your setup, then you have a better chance of getting better server utilization without having to press everything into vmware. This way you also have SMSQL already there and your backup problems are already solved. Disk access times will also probably be easier to  tune.

2. Space usage should basically not be a problem.  If you are using thin provisioning for your MSSQL LUN's then you aren't using more space than you need anyway.  If you are not, then I can understand why you can get the idea that VMWare could save you storage.  Thin-provisioning is really the simplest way to do things.

3. If you do choose to use VMWare, you should look at separating data that has significantly different performance requirements off to their own datastores.  Here you can tune I/O on the NetApp to reflect your needs.  This basically is valid for backup policies too.

4. Depending on how you already do backup on VMWare, you are probably just going to be adding to response time problems with trying to put SQL in the mix.  It isn't that NFS is incredibly slow, but if your VMWare storage philosophy was just to dump everything into datastores randomly, then you are pretty much hopelessly a victim of a backup system that might suck the life out of your storage system many hours a day and cause you endless pain in SQL customer complaints.

Now SQL clusters can work if you have enough customers to populate them and can get the network segregation to work right. 

VMWare could work but you have all of the problems of the rest of the environment and no clear backup savings over SMSQL.  VMWare filesystems aren't know for their speed and you will have problems if you run de-dupe, etc. You can still use iSCSI on VMWare hosts.

I'm not saying that this all won't work fine on VMWare, but it really depends on how ready one is to make it work.  If the current VMWare operational model is too inflexible to make things work on a satisfactory performance level then it's a bad idea.  You basically need to take into account what the limitations of each element are.  You don't have a lot of horsepower in a FAS2020.

marcconeley
5,515 Views

Hey again Shaunjurr (you seem to be my personal NetApp consultant on these forums recently! ;-))

Yeah, as you've seen from my previous posts on various topics, I get the feeling that I am constantly investigating performance issues due to your last point: FAS2020 isn't very powerful.

I feel like I am trying every tweak and experiencing every possible performance issue because of this (I am sure if we were using a more powerful FAS, it would be more forgiving of us accidentally running multiple dedupe schedules or backups at the same time, or if we place some SQL server data within the vmdk files instead of on separate LUNs).

Regarding your above points:

1) Yep. As we started with small virtualised SQL servers and have grown, we are now actually thinking of moving our production servers to a physical SQL cluster running multiple instances. I've read and am convinced that if your business requires a SQL cluster, than you gain very little from trying to virtualise it (MS licensing costs with SQL CPU licenses also make this expensive - you end up litterally sticking 1 SQL server on 1 ESX host anyway and keeping it there. Which takes away most of the advantages of virtualisation). Ìs this what you're getting at?

2) Yep you're right. We should make more use of thin provisioning on our SQL LUNs. We already use Fractional Reserve = 0 and autogrow etc, but we don't currently thin provision as, as I've said, many of our SQL servers have their dbs and logs on vmdk files and not on the filer (as such). And the SQL servers that do have LUNs on the filer don't create much work for us to grow manually when needed (we cut them quite close to their necessary limit as to not waste space).

3) Not sure what you mean here. Do you mean that I should not place vmdk files that store my SQL dbs and logs on the same datastore as the VM itself for example?

And how would I then tune the NetApp to meet my needs? (currently I have 2 NFS datastores - 1 for production servers and 1 for less important servers. The production ds gets SMVI'd every day and snapmirrored etc. The other not. Is this what you mean?). What other NetApp tuning tweaks should I be considering? (can you prioritise different traffic etc? i.e. NFS and iSCSI over CIFs?). Oh, and every SMSQL server has 1 LUN per vol. Not multiple LUNs per vol.

4) Good question. Most of our data is backed-up using SMSQL, SME and Snapvault for CIFs to another filer. For virtual servers running other kinds of weird and wonderful applications, we typically script them to dump to disk or back them up using BackupExec agents to tape/disk (we decided against OpenVault or whatever its called!). VMs themselves get SMVI'd and snapmirrored to another filer.

Eventually, everything gets NDMP'd to tape from the secondary filer. Again, not sure what you mean by "dumping data randomly into datastores". I would hope I'm not doing that!

Thanks guys.

shaunjurr
5,515 Views

Hi,

My point on 1) was to say that it might just be easier to run multple SQL instances on one host.  It is a type of virtualization as well.  Using a simple MSCS cluster (no extra cost on the windows Enterprise server images, afaik)  would also give you some flexibility as far as errors and manual load balancing.  Using mountpoint disks, you could run up to 21(26 drive letters - A, B,C and quorum disk)  instances per cluster... depending on a few things like where you install binaries and msdtc.  Most SQL isn't terribly CPU intensive, so it is a matter of getting sufficient memory in the systems.

2). Don't worry about fractional reserve.  Just turn off space reservation and volume guarantees and monitor aggregate free space ... and volume free space for snapshot growth... you use autosize, which is recommended.  This way you can overprovision and not have to worry about increasing file system sizes so often.  Just don't go crazy or one day everything might fill before you can buy more disk.

3) Because you can prioritize I/O per volume on NetApp, you should probably try to create a datastore structure that differentiates between systems that require high I/O (i.e. are paying for it) and those which can muck things up if they use too much I/O.  You can do this with any volume so you can down-prioritize test volumes, etc.  See the TR on FlexShare and read the 'priority' manpage.

4) One of the great pains of the VMWare world is backup.  With what I wrote in 3), you don't want your backup-exec runs from some lesser important VM ruining your I/O for your SQL instances, so don't put them on the same datastore and don't give these datastores (read: flexvols) the same priority.  Things like backup and server virus-scanning can really kill your I/O in a virtualized environment.  Segregating I/O priorities can limit the damage done by these lower priority jobs.  You might want to have just a few datastores for the C: drives, for example, because they could de-dupe with huge savings while not hurting application I/O which could be on different "drives" on different datastores, depending on SLA, etc..  Make sure your pagefiles and VMWare swaps are segregated so you don't kill yourself with block deltas for snapmirror/snap reserve... etc, etc, etc.  It makes things a bit more complicated, but it will save resources and money in the long run.

I hope most of this is clear.  You can make your 2020 do a better job with a little more detailed design on how its resources are used.

marcconeley
5,515 Views

OK.

Some useful tidbits of information here - thank you.

And I also just read the Flexshare Guide TR-3459 - interesting stuff.

My resounding feeling though is that we're simply pushing too much IO through the filer. There simply isn't enough spindles and I am trying to fight my way through it by being clever with scheduling, offlloading verifications, running reallocation, creating multiple datastores and now configuring Flexshare etc. I see your point that most of this is good practice anyway, but I can't help but feel I am flogging a dead-horse - and the management costs involved to keep all these balls juggling is proving too much!

And this particular case, it's a FAS2020 and all our SQL, Exchange, VMware data is on a very small FC array - the costs of adding more spindles Vs upgrading the entire box will be a close call.

Anyway, my original question here was regarding placement of SQL dbs and logs on vmdk disks - and thanks to you both for the replies. It was good information and I think I'll take bits of both replies to change how I do things and hopefully eek a little more performance out of my filer before it's upgraded.

Thanks again guys.

Marc.

shaunjurr
5,515 Views

Hi,

FlexShare isn't difficult to establish.  Basically, you just set it up once and add the priority to new volumes after creation.  There will always be a balance between time and money.  Here again, you just need to prioritize the potential improvements: what gets you the most improvement for the least work?

Setting up 'priority' is going to be easier than redesigning your datastore concepts, for example.  The savings of the latter (through de-dupe, snapmirror replication, backup) might only be significant on a larger scale anyway.  The success of priority will, of course, be dependent on how well you can prioritize certain volumes over others and if you have VMWare datastores that have basically every sort of I/O need, then you won't be able to solve some of your problems with FlexShare.

No need to dispair.  Part of storage maintenance is trying to compensate for limitations in technology and budgets.  Even VMWare has come to the conclusion that they need more fine-grained resource allocations/limitations to be able to prioritize resources according to economic and technological realities. These will be realities no matter how big your next NetApp is.

abhisek
5,515 Views

Hi Marc ,

I have answered some of your questions.

- Is it possible to run SQL server from local Vmware virtual disks in my environment, or should I always use block storage (iSCSi LUNs) for db’s and logs?

Yes,it is possible to run SQL Server using vmdk’s. You can use NFS for this .

- Is it correct to assume that as NFS is a file-based system, and not block based - and as all my VMDK files are stored on this NFS vol - that SQL performance which typically has high sequential read I/O would perform particularly poorly if its databases and logs were placed within these VMDK files? (is this how it would work?).

NFS is file based system. You need to architect your environment keeping VM files and SQL db files on separate datastores. You would require separate vmdk’s for system db’s, user db’s, log files, temp db and log files.

Keep in mind that you have 8 datastores by default and can reach 64 datastores as maximum.

- Does anyone have any better/other recommendations for setting up small virtual SQL server installments that don't require SMSQL? (should I always use iSCSI LUNs anyway?).

SMSQL is required to maintain database consistency and preventing corruption. You can use SMSQL for vmdk’s also.

Regards,

Abhishek

Public