Legacy Product Discussions

New FAS 2050 Install

__SBROUSSE2008_15459
15,607 Views

Hello All,

 

I just ordered a FAS 2050 (12TB) with SATA drives and 2 controllers. I am looking for configuration documentation or anything else that can help me get things up and running. I am replacing an S550. We are running 2 ESX servers with NFS connections back to the S550.

 

Is there a serial port where I could assign an IP address instead of using the Easy FAS wizard, then use FilerView to finish the config? I would rather do things manually to get a better understanding of what is going on.

 

Any help that could be provided would be greatly appreciated.

 

Scott

1 ACCEPTED SOLUTION

chriskranz
14,226 Views

No problem Scott,

Just remember that if you've got 12 disks in total, that's 6 disks per controller: 2 hot spares and 2 parity (if you stick with the defaults and recommended layout).

I think there was a little miscommunication in some of the posts earlier. You have SATA disks, and the NetApp recommended RAID group size for SATA is 14 disks. You don't need to tweak this; the defaults will already be in place. You don't really need to give this much consideration at the moment either. When you add more disks, the RAID group will simply grow until it reaches 14 disks, then a new RAID group will be created for you. This is all behind the scenes and automated, so don't worry too much about it; just stick with the defaults.

If you think that only having 2 data disks per controller is a bit slim on usable storage (giving you around 600-700 GB usable), you can tweak the overheads. You can drop down to 1 hot spare, or single parity if you want. Although this will potentially give you less protection, it will give you more immediate usable storage. Arguably, if you have 4-hour parts replacement, then 1 hot spare and 1 parity is more than enough on a smaller system. As you grow, you can convert this later. So if you add more disks in 6 months, you could give yourself an extra hot spare and then convert the aggregate to RAID-DP (2 parity) at that time.
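For reference, that later conversion is a one-liner on the console; a rough sketch, with the aggregate name just an example (it does take a spare disk to become the second parity disk):

aggr options aggr0 raidtype raid_dp

So it isn't something you have to lock in on day one.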

I'm a big fan of RAID-DP, so I wouldn't drop that unless I was really desperate, but you could quite happily drop one of the hot spares. The only thing you lose there is the disk garage feature. This is quite a cool feature that checks a failed disk for soft errors, reformats it, and re-labels it as a spare if it is recoverable. It's very useful on a big system with lots of spindles, where failures happen more often purely by chance, but on a smaller system it is less important. I'd personally go for 1 hot spare and keep the RAID-DP. This will give you 3 data disks on each controller, and it's nice and easy to grow into in the future as well. The only catch is that you'll need to add one of the spares to the aggregate on the command line, as FilerView will force you to keep 2 hot spares.
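The command-line part is simple enough; something like the following, where the disk name is a placeholder for whatever your spare is actually called:

aggr status -s            # list the current spare disks
aggr add aggr0 -d 0b.27   # add that specific spare to the aggregate (placeholder disk name)
sysconfig -r              # confirm the new RAID layout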

Not meaning to confuse you with too much info and too many options, but I want to make sure you're doing the right thing. Now is the time to make sure you get the configuration right, not in 6 months when it's already in production! Remember you can always add new storage and build on a good architecture. But changing the fundamentals down the road or shrinking storage can be quite tricky. NetApp have made the systems very easy to grow into, just make sure you're comfortable with the architecture from the start.

Give me a shout if you need any more pointers or you have any more questions. I work quite a lot with the smaller systems and I work closely with customers to get the most out of them, and it's always dependent on exactly what you want from the system and how you want the storage to be provisioned and laid out. The systems are nice and flexible like that!

Hopefully the above info is helpful to you though...


39 REPLIES

__SBROUSSE2008_15459
6,424 Views

Hi Chris,

I just wanted to thank you again. I have had the 2050c running for a couple of weeks now without any major problems. I have seen a couple of issues though, and I believe it's due to the spindle count. We have 12 1TB SATA disks total, 6 disks per head. Filer 1 is running iSCSI and CIFS; filer 2 is running NFS for VMware. We have 10 servers and 5 VDI desktops running on filer 2.

We are using RAID-DP as you suggested and we have 2 spares, which might have to change. I also ordered 2 more NICs (one for each controller) to add to our VIFs, plus 8 more 1TB drives, 4 drives per head. Hopefully this will give me the spindle count we need, as the box will then have 20 disks, 10 per head.

We don't have 4-hour parts replacement, we have next business day, so I'm not sure whether to get rid of the second spare. Will 1 drive make that much difference to the spindle count? I don't necessarily need the storage, just the performance.

I think it's a spindle problem because when I look at Filer At-A-Glance, filer 2 (VMware) shows network ops in blue and filer 1 shows them in red. I think red is bad. Would you agree? All of our users access data from filer 1. We also have 8 LUNs on filer 1, and My Documents is mapped to each user's directory on filer 1.

Also, I was told to order DFM (Operations Manager) to manage the filer as a single unit, and was advised it has better reporting and management features. What do you think of this add-on?

Appreciate your help,

Scott

lrhvidsten
6,285 Views

As far as the management tool, you might want to wait until NetApp System Manager comes out as it will be free. It may not give you exactly what you're desiring, but the price is right. I was told by a NetApp sales engineer that the beta is closed and the final release is very close. He stated it should be posted on NOW about mid-April. Read these links for more info:

http://blogs.netapp.com/exposed/2009/02/fas2050-wins-aw.html

http://blogs.netapp.com/simple_steve/2009/02/netapp-system-m.html

http://blogs.netapp.com/storage_nuts_n_bolts/2009/03/sneak-preview-netapp-system-manager-nsm.html

http://communities.netapp.com/message/7987

The FAS 2000/3000 platforms will be supported first, so you're in luck.

Also, there are other tools you can run for monitoring. As a new customer myself, I have yet to try them, but I plan to test some of them out with the simulator.

Check out some of the tools at the bottom of the TechOnTap Archive under Admin Tips and Tools:

http://www.netapp.com/us/communities/tech-ontap/archive/tot-archive.html

chriskranz
6,285 Views

Anytime Scott, that's what these communities are for

As for your performance, you may want to try spreading the load across both controllers. Rather than one controller doing one job and the other controller doing another, try to spread each workload across both. If you have VDI, create 2 datastores and spread the users across both. This will make use of all your spindles too!

If your networking is showing red, it won't necessarily be caused by the disks. More likely your network is pushing some real throughput there. In FilerView, under "Filer" go to "Show System Status". Run the default and you'll see network throughput. This will give you some idea of what you are pushing in real terms on the network. Gigabit Ethernet can push somewhere around 80-120 MB/sec on a good day.
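The console equivalent, if you prefer the command line, is the default sysstat output (the 5-second interval here is just an example):

sysstat 5
# the "Net kB/s" in/out columns show real network throughput;
# a single gigabit link tops out at roughly 100,000-120,000 kB/s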

Adding more disks is always a good thing if you're worried about spindle performance, but I'd look into all the details properly first. Ask NetApp support if they can analyse a perfstat report for you.

Operations Manager is great, and you get the free Performance Advisor tool with it, which is great for troubleshooting. Ops Mgr does a lot, and you can start to do things like performance and storage trending, as well as alerting on abnormal system usage. I'm only really scratching the surface, it's a great tool, and you'd probably get a demo from someone really easily.

Let me know how you get along, always good to get feedback. And if you get stuck, feel free to give us a shout anytime.

Cheers...

__gregkorten_17054
6,285 Views

If you have 12 disks, 6 on each side... each controller has 6 disks - 1 hot spare - 2 parity = 3 (1 TB SATA 7.2K) spindles for your workload. I would recommend putting all your upgrade funds towards spindles. I would be very surprised to learn that your front-end (FA) ports are a bottleneck.

greg

chriskranz
6,285 Views

Very true, and in this case you are probably entirely right. However, I think it is very good practice to get into the habit of checking where the system actually needs help before ordering upgrades. It may be that the network ports are set to 100Mb and not 1Gb, for instance, which is why the networking bar is showing up as red. There's always 2 sides to every coin!

amiller_1
6,219 Views

For a free option, try running "sysstat -x -s 1" from the console on each filer (make sure to expand your console window to handle all the columns) and watch the disk utilization column. It's not very granular but if you do have spindle count issues the number there will be high.
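Roughly what that looks like from the console (the 1-second interval is up to you; widen the window to 160+ columns or the output wraps):

sysstat -x -s 1
# -x  extended output, adds the per-protocol ops and the "Disk util" column
# -s  prints a summary of the averages when you stop it with Ctrl-C
# 1   refresh every second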

To go further, Performance Advisor (part of DFM/Ops Mgr) would be the next best step I think (it lets you see how busy things are right down to the disk level) -- you might even be able to get a 90-day demo key to test it out and demonstrate its worth before purchase.

__SBROUSSE2008_15459
6,219 Views

Andrew,

I ran the sysstat -x -s 1 command you posted and I'm seeing the following. Filer one is the CIFS and iSCSI filer; it has a consistent cache hit of 98-100%. The stats also showed consistent disk reads of about 2000 and writes of about 1000. Disk utilization is consistently around 30%, spiking up to 60 or 70% at times.

Filer two is for NFS; it also has a consistent cache hit of 98-100%. Disk utilization is roughly 25-45%, but it varies higher and lower.

NFS hasn't shown any problems so far. On the other hand, we are experiencing disconnects in Outlook because our PST files are located on the FAS. Some of our PST files are around 2GB.

Could our Outlook issues be linked to the stats on filer one?

Scott

chriskranz
5,729 Views

A high cache hit rate is good. Basically if 98-100% of the reads are cache hits, then the data is coming from cache, not from disk. This is one readout you want to be as high as possible! Also if your system is not busy, or not really doing anything, this stat will generally be high too.

Looks like you are pushing the disks a little (as expected), but not to an intolerable level.

What are your Outlook issues? Ops Mgr would certainly help you delve into more detail as to where you may be having any performance issues.

amiller_1
5,729 Views

Sorry to be slow responding here... basically Chris's analysis is spot on. High cache hit rates are good (you're getting lots of the requested data from RAM rather than disk), and disk utilization rates of anything less than 90% mean the disks aren't totally saturated.

Just in case though, make sure you're watching the sysstat output when things are slow/not running well.

Performance Advisor in Ops Mgr would be the next step as can go deeper and give historical info as well.

Also, for NFS usage, I presume that's for VMware? If so, make sure to go through TR-3428 carefully (install the Host Utilities to get optimal NFS settings, turn off atime updates on the NFS volume, check alignment and/or align (mbralign is now available), etc.).
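For the atime piece, it's just a volume option on the filer; a quick sketch, assuming your NFS datastore volume is called nfs_vm01 (substitute your own volume name):

vol options nfs_vm01 no_atime_update on    # stop access-time updates on every NFS read
vol options nfs_vm01                       # list the volume options to confirm it took

mbralign comes with the ESX Host Utilities and, as I understand it, is run from the service console against the VM's -flat.vmdk files.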

GregKorten
6,423 Views

I am also working on deployment of a FAS2050 with 20 x 300 GB SAS 15K drives. The array will only use FC and serve LUNs to a 2-node ESX cluster.

I wanted to get opinions / comments on the following deployment decisions...

I opted to use Controller B as an active controller which will only be utilized in a failure, so I changed it to RAID-4 after quite a bit of research and pestering of our SE. I am hoping that going for a wider spindle spread under my production aggregate on Controller A is better suited for the ESX workload than 2 smaller aggregates under 2 controllers. I also want to provision from a single interface. I had some hesitation about using RAID-4 at first because I was told it would require that provisioned LUNs be taken offline for upgrades of disk firmware, but after some research it looks like this is required when using any SAS or ATA disk shelf even when configured with RAID-DP.

Node A: RAID-DP, active (root flex volume resized to 64 GB from 224 GB)

3.04 TB available for production volumes

17 disks:

2 parity

14 data

1 hot spare

Node B: RAID-4, active (only used during failure of Node A)

3 disks:

1 parity

1 data

1 hot spare

chriskranz
6,423 Views

Hi Greg,

Not sure if you are asking a specific question about this, or just after general opinions?

Configuring a system as pseudo active/passive like this has its advantages, and the biggest is spindle count. If the disks are split evenly you may not have enough spindles in one place to give the performance you need for your applications. However, it also has disadvantages: if spindle count is not a requirement for you, then CPU and controller throughput may be, in which case you would want a 50/50 split.

I have done many configurations both ways, and it all depends on the customer's requirements.

On a side note, SAS drives are definitely hot-upgradable with respect to firmware. RAID-4 can cause problems with this, but I am guessing you won't be hosting any data on the RAID-4 configured controller, so this shouldn't be too much of a consideration? Again, the key here is system performance. A single data drive will affect performance, and this may have a knock-on effect on the speed of cluster failover.

In your scenario, you are only presenting storage to 2 ESX servers. I would probably favour a 50/50 split of disks and create at least 2 datastores. This would spread the system load across all spindles, but also give you double the controller throughput. Depending on what you are actually doing within ESX, I don't see this being a huge problem. The SAS disks are 15K, so they perform very well, and you will still have 6 or 7 data disks apiece.

This does need careful consideration as it is difficult to re-configure. To be honest it is probably easier to configure it 50/50 from the start, then if you don't like this reconfigure it to be active/passive at a later date. An aggregate can always (within physical limitations) be grown, but cannot be shrunk.

__gregkorten_17054
6,424 Views

Chris,


Thank you for the response. I am looking for general opinions as this is my first NetApp array. I will be migrating a handful of Windows servers, one of which is a large Windows file server with about 2.5 TB of data spread across 4 Windows volumes on a CLARiiON CX300. The main reason for moving to NetApp was to leverage array-based replication.



I have worked with EMC arrays for about 4 years (again, I am new to NetApp), but in my EMC experience I have rarely seen high storage processor or front-end utilization issues for the MS Windows application workloads we run. Utilization issues in the past have been due to underlying spindle count.

In regard to firmware upgrades of SAS disks, it seems like you are speaking from experience, which I lack on NetApp, but I found this in the Data ONTAP 7.3 upgrade guide (page 33), which I interpreted as a SAS/SATA shelf limitation:


When not to use nondisruptive upgrades: You need to update firmware for SAS-based, AT-FC-based, AT-FC2-based, or AT-FCX-based disk shelves on your system. Client services might encounter delays accessing data when disk shelf firmware is updated to SAS, AT-FC, AT-FC2, or AT-FCX modules. To prevent data loss, all session-oriented services must be terminated before you begin an update procedure.

Here is a more detailed summary of what I am working with...


2 FAS2050 arrays

2-host ESX 3.5 U3 cluster (production); leverage existing 3-node ESX cluster for DR

2-5 guest OS Windows 2003 Standard SP2 - RDM virtual machines

The FAS2050 will serve up FC LUNs to a 2-node ESX cluster hosting Windows virtual machines. All VMs will use RDM (Raw Device Mapping) LUNs for the Windows OS "C:\" and data volumes, i.e. "D:\", "E:\", etc. A small NetApp volume will be created for a single NetApp LUN which will only be configured as a VMFS datastore, used only for virtual machine config files and a small VMDK for each virtual machine. The VMDK will be configured as a dedicated Windows X:\ volume holding a second 4095 MB Windows page file within the guest OS. This NetApp volume will not be replicated. All NetApp RDM LUNs provisioned for each specific VM will be created in their own NetApp volume and replicated with SnapMirror to another FAS2050 in the DR site. The target FAS2050 will have a similar configuration, with the underlying read-only replicated NetApp volumes containing the DR copy of the production virtual machines' underlying RDMs. The DR LUNs will be pre-provisioned and configured in Virtual Center with resources matching production, but will be in a powered-off state ready for a failover if needed. The paramount requirement is replication.
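From the reading I've done so far, the per-volume SnapMirror setup should look roughly like this (the filer and volume names below are placeholders, not our real ones):

options snapmirror.enable on             # on both filers
options snapmirror.access host=dr2050    # on the source, allow the DR filer to pull
vol restrict vm01_rdm_mir                # on the DR filer, after creating a same-size destination volume
snapmirror initialize -S prod2050:vm01_rdm dr2050:vm01_rdm_mir   # baseline transfer, run on the destination
snapmirror status                        # check progress

with the ongoing schedule going into /etc/snapmirror.conf on the destination, e.g.

prod2050:vm01_rdm dr2050:vm01_rdm_mir - 0,15,30,45 * * *   # every 15 minutes; adjust to the RPO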

lrhvidsten
6,423 Views

You are also able to take advantage of double the NVRAM and double the read cache with the 50/50 setup, since both heads are active. This helped alleviate my concern about having fewer spindles per aggregate.

Correction: I don't think doubling the NVRAM from using two heads actually helps, since each node is mirroring the log data for the other's NVRAM. But I think one still gets the benefit of doubling the read cache or normal volatile memory from using two active nodes.

Message was edited by: Leif Hvidsten

__SBROUSSE2008_15459
5,867 Views

We've just ordered and received eight 1 TB drives for our 2050.  These are hot swappable, so I assume I can just install them in the box.  The RAID will automatically adjust once I assign the drives to their respective aggregates, correct?

We ordered two NICs for the 2050 as well. I'm assuming I'm going to have to take the box down to install the two NICs, right?

Scott

amiller_1
5,868 Views

For the drive install, yes, you can just install the drives in the 2050 live. You'll need to assign them to the correct 2050 (i.e. software disk ownership) and then add them to the aggregate(s).

Once you've added them to the aggregate, you might want to look at the "reallocate" command to redistribute the data across the disks faster than would happen otherwise (otherwise Data ONTAP will direct more writes to the new drives until all the drives are at the same % used).
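Roughly the console sequence, assuming the aggregate is called aggr0 and 4 new disks go to each head (the disk and volume names below are placeholders; check yours with disk show):

disk show -n                       # list new, unassigned disks
disk assign 0c.00.16 -o filer1     # software disk ownership; repeat per disk
aggr status -s                     # confirm they now show up as spares
aggr add aggr0 4                   # grow the aggregate by 4 disks
reallocate on                      # enable reallocation scans (once per filer)
reallocate start -f /vol/vm_nfs    # optional: redistribute existing data across the new spindles
reallocate status                  # watch progress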

For NICs, yes, you do have to take the 2050 down, but you can do it as an NDU I believe -- you just have to reconfigure the cluster failover partner's NICs/VIFs after both NICs are added and online. It's easiest if you have a downtime window, but doable without one (although I'd still do it after hours just in case).
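The takeover/giveback dance would look roughly like this (a sketch only; check the hardware guide for the exact module-removal steps):

cf status      # make sure the cluster is healthy before starting
cf takeover    # run on the head that will keep serving data; the partner shuts down
# pull the partner controller module, fit the NIC, reseat it and let it boot to "waiting for giveback"
cf giveback    # hand its workload back
# then repeat the procedure the other way round for the second head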

__SBROUSSE2008_15459
5,868 Views

Thanks Andrew,

Installed the drives per your direction and added 2 NICs to each VIF, then enabled LACP on our Cisco 2960G to trunk the ports together, and everything works like a charm. Also installed Operations Manager and DataFabric Manager and everything seems to be working fine.
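For anyone finding this thread later, the filer side was roughly this (the vif name, interfaces and IP are just examples; yours will differ), run on each head and then added to /etc/rc so it survives a reboot:

vif create lacp vif0 -b ip e0a e0c                              # dynamic multimode (LACP) vif with IP-based load balancing
ifconfig vif0 192.168.1.20 netmask 255.255.255.0 partner vif0   # partner = the interface the other head takes over

On the 2960G the two switch ports just went into a channel-group in LACP active mode.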

Appreciate your help.

Scott

__SBROUSSE2008_15459
5,867 Views

Chris,

Do you know where I can get some information about setting up a FAS 2050 in an active/passive configuration? I don't remember seeing anything about that during the initial setup of the box. Which is the recommended configuration, active/passive or active/active?

Scott

amiller_1
6,423 Views

I'd actually hoped to reply to this sooner....overtaken by events and I think you've got most everything out on the table.

The only comment I wanted to add was that I've been steering away from selling FAS20x0c boxes with just internal drives....simply given it makes the aggregate setup annoying (stuck with trade-offs regardless).

So yes, it's money... but I like to do one (or more) external shelves, with the external disks controlled by one head and the internal disks controlled by the other. It just makes for nice isolation and also allows for good aggregate sizing/usable space.

For example, I'll be setting up a 2050c in the next month with (2) external shelves of 300 GB 10k FC drives and (12) 1 TB internal drives -- makes the aggregate setup nice and simple. 🙂

sbrousse2008
5,867 Views

I installed four disks in each head on my FAS2050. I currently don't have any reallocation running. Do I need to reallocate my aggregates when I put the new disks in, in order to use them, and should I start a reallocation schedule anyway?

SB
