Netapp FAS vs EMC VNX

dejanliuit

Hi.

This year we have to decide if we should keep our IBM N-series 6040 (Netapp 3140) stretch metrocluster, upgrade to Netapp 3200 (or rather the IBM alternative) or move to another manufacturer.

And from what I can see, EMC VNX is one of the most serious alternatives. My boss agrees, so we have arranged a meeting with EMC to hear about their storage solution, including ATMOS.

 

So, I would like to hear from you other guys what I should be aware of if we decide to go with EMC VNX instead of keeping to the Netapp/IBM track.

It could be implementation-wise or things like "hidden" costs, e.g. volume-based licensing.

I'm having trouble finding EMC manuals to see what can be done and what can't.

 

Our CIO has set one big goal for future user/file storage: the storage must cost no more than it would if you went out and bought a new Netgear/DLink NAS (with mirrored disks) every year.

This would mean that $/MB for the system has to be as close as possible to this goal. Today the cost is at least ten times higher.

Unless we come close to that, we will have a hard time convincing the professors with their own funding to store their files in our storage instead of running to the nearest hardware store to buy a small NAS (or two) for their research team.
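To make the comparison concrete, here is a rough sketch of the arithmetic behind that goal. All prices, capacities and lifetimes below are made-up placeholders, not our real figures; they are only chosen so that the gap comes out at roughly the tenfold difference mentioned above. The function name `cost_per_tb_year` is just for illustration.

```python
def cost_per_tb_year(purchase_price, usable_tb, lifetime_years, yearly_running_cost=0.0):
    """Cost of one usable TB for one year, spread over the system's lifetime."""
    total_cost = purchase_price + yearly_running_cost * lifetime_years
    return total_cost / (usable_tb * lifetime_years)


# Hypothetical consumer NAS: replaced every year, 2 TB usable after mirroring.
consumer_nas = cost_per_tb_year(purchase_price=500, usable_tb=2, lifetime_years=1)

# Hypothetical central storage: kept for 4 years, with yearly support costs.
central = cost_per_tb_year(
    purchase_price=800_000,
    usable_tb=100,
    lifetime_years=4,
    yearly_running_cost=50_000,
)

ratio = central / consumer_nas
print(f"consumer NAS: {consumer_nas:.0f} per TB per year")
print(f"central storage: {central:.0f} per TB per year")
print(f"central storage is {ratio:.1f}x the consumer NAS cost")
```

Plugging in real quotes (including support renewals after year 3) into the same function would show how far each vendor's offer is from the CIO's target.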

It's called "academic freedom" working at a university...

The initial investment might be a little higher, but the storage volume cost has to be as low as possible.

 

Today we have basic NFS/CIFS volumes, SATA for file and FC for VMware/MSSQL/Exchange 2007.

No add-on licenses except DFM/MPIO/SnapDrive. Blame the resellers for not being able to convince us why we needed Snap support for MSSQL/Exchange.

We went more than two years without even having Operations Manager, and have yet to implement it as it was only recently purchased.

 

Tiering on Netapp is a story in itself.

Until a year ago our system was IOPS saturated during daytime on the SATA disks, and I had to reschedule backups to less frequent full backups (TSM NDMP backup) to avoid having 100% disk load 24/7.

So the obvious solution would be PAM and statistics show that it (512GB) would catch 50-80% of the reads.

But our FAS is fully configured with FC and cluster interconnect cards so there is no expansion slot left for PAM.

So to install PAM we have to upgrade the filer, with all the costs associated BEFORE getting in the PAM.

So the combination of lack of tiering and huge upgrade steps makes this a very expensive story.
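As a back-of-the-envelope illustration of what the 50-80% read hit estimate above would mean for the SATA disks, here is a small sketch. It assumes a simplified model where every cache hit removes exactly one read from the disks and writes are ignored; the 10,000 read IOPS figure is a hypothetical placeholder, not a measurement from our filer.

```python
def disk_read_iops(total_read_iops, cache_hit_rate):
    """Reads that still reach the disks when a read cache absorbs cache_hit_rate of them."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return total_read_iops * (1.0 - cache_hit_rate)


# Hypothetical daytime read load; replace with real sysstat numbers.
daytime_read_iops = 10_000

for hit_rate in (0.5, 0.8):
    remaining = disk_read_iops(daytime_read_iops, hit_rate)
    print(f"hit rate {hit_rate:.0%}: about {remaining:.0f} read IOPS still hit the disks")
```

Even under this crude model, the difference between the low and high end of the estimate is large, which is why the real hit rate matters before paying for a controller upgrade just to fit the card.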

 

What really bugs me is that we have a few TB of Fibre Channel storage available that could be used for tiering.

And a lot of the VM images data would be able to go down to SATA from FC.

 

EMC does it, HP (3Par) does it, Dell Compellent does it, Hitachi does it ...

But Netapp doesn't implement it, despite having an excellent WAFL that, with a "few" modifications, should have been able to support it years ago.

 

Things we require are

* Quotas

* Active Directory security groups support (NTFS security style)

* Automatic failover to remote storage mirror, e.g. during an unscheduled power failure (we seem to have at least one a year on average).

 

Things we are going to require soon due to amount of data

* Remote disaster recovery site, sync or async replication.

 

Things that would be very useful

* Multi-domain support (multiple AD/Kerberos domains)

* deduplication

* compression

* tiering (of any kind)

 

So I've tried to set up a number of good/bad things I know and what I've seen so far.

What I like with Netapp/Ontap

* WAFL with its possibilities and being very dynamic

* You can choose security style (UNIX/NTFS/Mixed) which is good as we are a mixed UNIX/Windows site.

 

Things I dislike with Netapp/Ontap/IBM

* No tiering (see my comments above)

* Large (read: expensive) upgrade steps, e.g. for memory or CPU upgrades in controllers

* Licenses are bound to the controller size and essentially have to be repurchased on upgrade (so I'm told by the IBM reseller)

* You can't revert a disk-add operation to an aggregate

* I feel great discomfort when switching the cluster, as you first shut down the service to TRY to bring it up on the other node, never being sure it will work.

* Crazy pricing policy by IBM (don't ask)

* A strong feeling that, as an IBM N-series customer, we are essentially a second-rate Netapp user.

 

Things I like so far with VNX from what I can see

* Does most, if not everything, that our FAS does, and more.

* Much better VMware integration, compared to the Netapp VMware plugin that I tried a couple of times and then dropped.

* FAST Tiering

* Much easier smaller upgrades of CPU/memory with Blades

 

I have no idea regarding negative sides, but being an EMC customer earlier I know they can be expensive, especially after 3 years.

That might counter our goal of keeping the storage costs down.

 

I essentially like Netapp/Ontap/FAS, but there are a lot of things (in my view) speaking against it right now, with Ontap losing its technological edge.

Yes, we are listening to EMC/HP/Dell and others to hear what they have to say.

 

I hope I didn't rant too much.

46 REPLIES

FISHERMANSFRIEND

It's interesting following your discussions. Have any of you experienced or heard of major disadvantages of using a FAST Cache like the one in the EMC?

dustin_cavin

It's much more difficult to do routine tasks on an EMC box.  I know that is relative, and subject to everyone's opinion.  I've worked on both, and NetApp just makes more sense to me, and it is so much easier to do basic stuff.

An example would be shrinking a volume.  With NetApp, it is one command, and the FlexVol that contains the share is grown or shrunk to whatever size you want.  With Celerra, you can't shrink the container that the share is in.  They call it a file system, and it cannot be shrunk.  If you want unused space back from a bloated file system, you've got to create a new file system, copy all the data, re-share everything from the new path, and destroy the old.  This plays hell with replication.  If you want to move your Celerra LUNs around in the storage array with the old CLARiiON LUN Migrator tool, too bad, you can't.  Again, it's create new, copy data, and delete the old.  Obviously, this would cause a loss of service to your users.

If you're running a small dedicated NAS array these may not be a big problem for you.  If you're hoping to run a large array with CIFS/NFS/iSCSI/FC with dozens or hundreds of TB behind it, then these are useful features that you'll be missing out on.

I understand that FAST is a big deal for you.  On the surface, it does sound pretty cool.  There are some drawbacks, though, and of course EMC doesn't talk about them.  Once you put disks in a FAST pool, they are there forever.  You CANNOT pull drives out of a pool.  You've got to create a new pool on new spindles, copy ALL the data, and then destroy the entire pool.  Any SAN LUNs could be moved with CLARiiON Migration, but the LUNs you've allocated to the Celerra cannot be moved this way.  It's a manual process with robocopy, rsync, or your tool of choice.  Obviously, this would cause a loss of service to your users.

Maybe some of these things have changed with VNX, but from what I understand it is still the same in these respects as Celerra.  If someone in the community knows more about VNX than I do, please correct me.

insanegeek

Having both in my environment, it's not quite as horrible as you make it out to be.

In one way I've conceptually thought of a Celerra filesystem == a NetApp aggregate: both were limited to 16TB of space (until recently) and neither could shrink.  I have a 600TB NS960 Celerra on the floor, and not having to think about balancing aggregate space for volumes on it is very nice.  I've only had to shrink a NetApp volume maybe two or three times (trying to fit a new 2TB or so volume into an existing aggregate).  Generally, in our environment, storage consumption only grows and nobody gives space back, unless they are completely done, at which point we delete the volume.  If you really want to shrink it, there are a number of easier ways than host-level migration: either nas_copy (similar to a qtree snapmirror, copying at the file rather than block level) or CDMS, where you point your clients to the new filesystem and it hot-pulls files over on demand (this still requires an outage to point your clients to the new location, but one measured in minutes rather than hours).

While not suggested (because you can hurt yourself badly if done wrong) you can move Celerra luns around in the storage array using lun migrator.  Caveat is that it needs to be the same drive type, raid type and raid layout else AVM will be confused on the state of the lun.   i.e. AVM thinks it's a mirrored FC disk, you migrate it to a raid5 SSD disk, next time AVM queries the storage it will have a lun defined in a mirrored FC pool that has a different characteristic.  If you aren't using AVM or are using thin devices from the array this might not be a problem but 99.9% of people don't run that way.

On shrinking a block level pool NetApp has the exact same drawback and they don't talk about it either.  You want to shrink an aggregate... how do you do it?  You do the same process you mention on the NetApp, drain the entire aggregate, pretty much just as painful.  Additionally, if you aren't shrinking the volume on the Celerra you would replicate it to another pool on the same or different NAS head, only if you are shrinking a filesystem would you have to do anything at a per file level.

That's all on the older Celerra NS not the VNX (but at this time they basically have the same feature set, just more power, capacity, etc).

I'd say that the EMC pools with FAST tiering are better than NetApp fixed aggregates, but if you are using both SAN & NAS I'd say that it almost becomes a throwaway value.  It's nice to have one big storage pool for the array that can go to really huge sizes (100TB is certainly bigger than 16TB, though it isn't really that huge anymore): I don't have to worry about balancing space, wide striping just happens, etc.  That's all great, but to use the same thin block pool for SAN & NAS you give up NAS filesystem thin provisioning: you present the NAS head thin luns and create a thick filesystem on it; there is no thin on thin.  While not the end of the world, as it's still thin provisioned in the backend, nobody really does it.  You have 200TB of storage and you want to split it evenly, you generally give 100TB traditional raid luns to the NAS AVM pool and 100TB thin pool luns to the SAN pool.  I haven't explored using a thick provisioned pool lun... but still nobody really does it that way so why bother.  With that quantity of storage on a NetApp you still would create 2x 100TB aggregates, but there is no issue with mixing and matching NAS & SAN in the same aggregate.

I personally have found them both annoying in their own ways to manage.  I'm more of a CLI type of person, but I'd probably put the Unisphere interface higher than NetApp if you like GUIs (the previous version of Celerra Manager not so much).

The NetApp is nice in its similarity to a traditional unix file structure: mount vol0; edit exports, quota, netgroup, etc., done.  It is a bit annoying in that, depending upon the change, you can't do everything from one location: e.g. after changing the exports file you have to log in to the filer to activate it.  I can copy all those config files somewhere else for backup before I make a change (very nice!) or to apply if I'm moving from filer to filer: replacing filer A with B, copy exports from A to B, done.  General real-time performance stats are easy to get (log in, run "sysstat -x"), detailed ones not so much.

Celerra you change those values via a command which is rather esoteric, "server_export ...",etc once figured out not a big deal, but it's not as obvious as exports line.  It's nice in that everything is done from the same system, login to the control station issue whatever command done, don't have to ssh into the NAS head to do anything.  Having said that if you are using scripts for things because most everything is a command it makes things very simple.  Don't have to edit a file, ssh in anywhere run a script and it's just done which for our environment with thousands of clients and petabytes of storage is very, very nice.  Detailed real-time performance stats are very easily accessible via, "server_stat".

They both suck in long term performance stats without adding on additional packages, they both suck in finding out oversubscription rates with qtrees, etc.

BEN_COMPTON_MCPC

Having read all these posts I feel compelled to respond.  I have used all flavors of both Netapp and EMC... The Netapp 2040 does compete with the EMC configuration.  However, a 2240-4 or 2240-2 has an ultra attractive price point these days.  One thing I would like to say about day-to-day tasks on the EMC vs the Netapp is that I do observe most things being easier and more central on the Netapp.  EMC's Unisphere has bridged the gap to some extent, but Netapp is still ahead of the curve.  Not too long ago I used a Celerra NS-120 backended by CLARiiON CX4-120s.  Most folks in a virtual environment are looking to leverage storage efficiencies and, by and large, the NAS portion of the devices.  The Celerra consistently had issues with both replication and NFS.  By issues I mean the head on the Celerra would kernel panic because of the NFS mounts and fail over to the other head.  Talk about unacceptable.  To further add to the pain, EMC admitted the issue and said there was no fix or workaround yet available.  Also, they went further to say there was no projected date to alleviate the problem, and their workaround was to present via the CLARiiON portion and use Fibre Channel.  Really?  Why did I WASTE money on a NAS head if all it could do effectively were SAN operations?  To my knowledge EMC has now remediated these issues.  However, how much confidence does this give me in EMC?  Answer: NONE!  EMC has a solid SAN that solidly replicates.  As far as NAS and deduplication: NEVER AGAIN.

RAVONATOR

Hi all,

First of all, to remove any confusion, you should know that I am 100% in favor of EMC, that's what I sell, but that does not mean I cannot talk positively about other vendors, especially NetApp, whom I always highlight as one of the 2 single best positioned storage vendors, for example for VMware projects (besides EMC of course).

A number of replies above resemble more a consumer Mac vs Windows forum, where it seems very important to create FUD,  playing with the truth.

For example:

- FAST and FAST Cache not working with NFS: Not true, it does work with NFS

- FAST Cache having the same approach as NetApp Flash Cache: Not true, FAST Cache accelerates read and write I/O's, where Flash Cache accelerates read I/O's

- When replying on why it is not possible with NetApp to "tiering without PAM" the answer is; "we don't need automated tiering as caching is better". Come on, automated tiering is true almost in every case.

- then the dialog starts on routing cables ...

Whether EMC's FAST and/or FAST Cache makes sense or not depends on the customer requirements, which can be identified through dialog and cooperation. I urge you all to read the article at the link below on relevant use cases for FAST and/or FAST Cache, and when NOT to use them. Also, I fully support the author's advice to never go negative on the other guy, and I believe strongly that focusing on how your offering can add value for the project/customer is the right thing to do. I hope we can all see more non-FUD, factual discussions.

http://virtualgeek.typepad.com/virtual_geek/2012/04/emc-vnx-fast-cache-vmware-vmmark-a-killer-result-a-killer-vroom-post.html?utm_source=feedburner&ut...
