2012-10-03 11:26 AM
An important customer has 2 MetroCluster systems, i.e. 4 controllers.
Each controller has a SAS volume that is presented to VMware 4.1 over NFS. These volumes then become datastores.
From these datastores, the VMware administrator created several virtual disks for a Windows virtual machine.
In the Windows Server, the administrator created a dynamic disk using the virtual disks from the 4 filers. In other words, each part of that dynamic disk is located in a different datastore (on a different controller).
Is this supported? Has anyone ever seen this?
There is a message in Windows Device Manager. Those dynamic disks are in the Healthy (At Risk) state. (This indicates that the dynamic volume is currently accessible, but I/O errors have been detected on the underlying dynamic disk. If an I/O error is detected on any part of a dynamic disk, all volumes on the disk display the Healthy (At Risk) status and a warning icon appears on the volume.)
They have to use NFS (10GB) and dynamic disks, because "vmdk" disks are 2 TB max and their dynamic disks are 20TB.
2012-10-03 08:55 PM
I would be very wary of this configuration, even if it is supported.
Even with MetroCluster, this still depends on 4 separate controllers. If one of them is down, that may cause corruption on that dynamic disk.
Yes, a takeover/giveback should not cause a problem and should prevent a controller-down situation, but there are many reasons why a takeover (or giveback) may not work correctly from time to time.
For the I/O errors, my suspicion is that the disk timeout is not correct. Check that this registry value is set to 190:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk\TimeOutValue
Please verify the above value/key with other sources; I am typing this from memory.
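To make the registry check above concrete, here is a minimal sketch in Python that only builds the text of a .reg file for that value. It assumes TimeOutValue is a REG_DWORD measured in seconds and takes the key path and the value 190 from this post, so both should be confirmed before importing anything. The function name build_reg_file is made up for illustration.

```python
# Sketch only: generate .reg file text for the disk timeout discussed above.
# The key path and the value 190 come from this post (typed from memory),
# so verify them against vendor documentation before importing the file.

KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk"
VALUE_NAME = "TimeOutValue"


def build_reg_file(timeout_seconds=190):
    """Return .reg file text that sets TimeOutValue as a REG_DWORD."""
    if not 0 <= timeout_seconds <= 0xFFFFFFFF:
        raise ValueError("timeout must fit in a 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + KEY_PATH + "]",
        # .reg files write DWORD values as eight hexadecimal digits.
        '"{}"=dword:{:08x}'.format(VALUE_NAME, timeout_seconds),
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file())
```

The output can be saved to a file and reviewed before anyone imports it on the guest. Nothing in the sketch touches the registry itself.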
I believe that one large 20+ TB 64-bit sync-mirrored aggregate on a single controller would be the place to put the virtual disks that make up that one dynamic disk.
Anyone else agree?
I have a sense of deja vu about this configuration... years ago I spent all night restoring a corrupted 36 GB dynamic disk when one of the four 9 GB disks failed.
Then a few months later we moved those file shares onto NetApp RAID-4 and used snapshots and SnapVault instead of tape backup.
Rodrigo, is this data backed up? I hope so...
2012-10-04 10:37 AM
Thanks for the answer. We are trying to double-check this registry value with Microsoft.
This VM has just 1 full backup on tape. They do not have a suitable window to back up the 40TB SQL Server.
I will update as soon as I have new information.
2012-10-04 10:52 AM
Ahh, SQL server.... No reason whatsoever I can think of why the server was built this way.
Could have spread the one single database over 20x2 TB filesystems, or 40x1 TB ones....
Use Windows mount points to not run out of drive letters. Maybe even use SnapManager for SQL to perform a backup in just a few seconds while the database is live, no 'window' really needed.
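As a rough illustration of the arithmetic behind that suggestion, the sketch below counts how many filesystems of a given size cover 40 TB and names one mount point per filesystem. The root folder D:\SQLData and the volNN naming are hypothetical, and the function plan_mount_points is made up for this example.

```python
import math


def plan_mount_points(total_tb, fs_size_tb, root="D:\\SQLData"):
    """List one mount-point path per filesystem needed to hold total_tb.

    The root folder and the volNN naming are hypothetical placeholders.
    """
    if total_tb <= 0 or fs_size_tb <= 0:
        raise ValueError("sizes must be positive")
    # Round up so a partial remainder still gets its own filesystem.
    count = math.ceil(total_tb / fs_size_tb)
    return [root + "\\" + "vol{:02d}".format(i) for i in range(1, count + 1)]


if __name__ == "__main__":
    for size_tb in (2, 1):
        paths = plan_mount_points(40, size_tb)
        print(size_tb, "TB filesystems:", len(paths), "mount points, first is", paths[0])
```

Either layout keeps every virtual disk at or under the 2 TB vmdk limit mentioned earlier in the thread, without needing a dynamic disk that spans controllers.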
2012-10-04 11:00 AM
Neither do I.
They started out saying that it was just a VM for testing, and now it is in production.
They don't use LUNs directly, only 'vmdk'. That's why we don't use SnapManager. And they do not have that much space on the NetApp, because the volume of changed blocks after a snapshot is quite considerable.