The Broome-Tioga Board of Cooperative Educational Services (BOCES) helps New York public schools control costs through shared administrative and IT infrastructure services. Because of our unique role serving some 50 separate school districts, we face a variety of challenges. Some of these, such as accommodating rapid IT growth and operational change under tight budget constraints, will be familiar to almost anyone working in IT. However, they are compounded by the fact that, in addition to managing a centralized data center that serves a variety of applications to all the districts, we provide managed IT for 14 of the school districts. In effect, we’ve been handed the keys to IT in each of those districts.
Over the past few years, service requests have steadily increased, and we’ve added more and more new applications to serve students and staff. A side effect of that growth has been data center sprawl, with much of our data effectively siloed and data storage needs doubling annually. At the same time, statewide regulatory requirements demanded long-term, off-site archiving that we simply couldn’t accomplish with the infrastructure we had. The time had come for a more flexible IT infrastructure that would improve availability and data protection.
The key for us was to start with the right storage solution that would allow us to simplify our operations and infrastructure and then grow smart. For us, that storage solution turned out to be NetApp®. By implementing NetApp storage early we were able to grow into the full NetApp feature set over time. Key achievements are summarized in Table 1.
Table 1) Broome-Tioga achievements.
|Reduced server hardware costs by 85% and saved $500,000 over three years through virtualization|
|Reduced server storage by more than 35% through Windows® storage consolidation|
|Reduced overall data storage needs by 50%|
|Decreased backup time to minutes for key applications such as Exchange, SQL Server®, and Oracle®|
|Met state mandates for getting data off site|
In this article I talk about why NetApp was the right solution for Broome-Tioga and how NetApp solutions have enabled us to keep things simple and grow smart.
Start with the Right Solution
Backup has been another persistent problem for us because of explosive data growth and shrinking backup windows. For instance, when we were using NetBackup™ to accomplish the task, the nightly backups of our SQL Server databases often overlapped with other nightly database maintenance tasks, causing failures on a regular basis.
We originally bought our first NetApp system to use as a backup target for Syncsort Backup Express, but over time we discovered that NetApp provided the right solution to our backup problems, our disaster recovery and archiving issues, and a variety of other problems as well.
We started with a FAS250 system in 2003. Since then we’ve steadily and easily expanded and upgraded this system to meet the needs of our main data center, first to a FAS270, then to a FAS3020. By that point the NetApp system was storing almost all of our data.
Today, we’ve upgraded our NetApp to a FAS3240HA with about 150TB of capacity. Because each upgrade takes the form of a simple head swap without the need for complicated data migration, there are no hassles or barriers to upgrading. The new storage controllers just attach to our existing disk shelves. We’ve got a FAS3140 system running in our DR location. The 14 districts that we manage directly also run NetApp systems, which add another 250TB of storage (for a total of 400TB). Before they implemented NetApp on our recommendation, most districts had either direct-attached storage or simple external arrays.
Figure 1) Broome-Tioga BOCES storage infrastructure.
While ease of upgrade is a great feature, the real reasons that NetApp has worked so well for us include:
- Unified storage. NetApp lets us meet the storage needs of any application either from our main data center or at the district level.
- Great efficiency. Capabilities like thin provisioning and deduplication have let us greatly reduce our appetite for storage.
- Integrated data protection. NetApp Snapshot™ copies and application backup/restore via the SnapManager® suite allow us to accomplish critical backups in much less time and for less money.
- Disaster recovery. As with backup, NetApp makes it easy to do remote backups and data replication so that we always have copies of important data off site.
I talk more about these points in the following sections.
Keep Things Simple
In the current economic climate it’s pretty clear that we’re not getting any more people, so we have to keep things simple and work smarter. Our decision to standardize on NetApp storage has been a key to our success.
We only have two administrators, and we each only spend about a quarter of our time on storage tasks, so that amounts to just half of a full-time equivalent (FTE) admin to handle all our storage systems—including those in district locations—currently with 400TB of capacity. We expect the amount of storage to continue to increase without the need for additional head count.
In particular, NetApp has helped us in three key areas:
- Windows consolidation and virtualization
- Data protection
- Disaster recovery
Windows Storage and Server Consolidation
Many of our school districts were still running Novell NetWare, so we undertook the process of converting them to Windows and consolidating our storage and server infrastructure at the same time. Having NetApp storage in place in each of the managed districts eliminated the need for standalone file servers. For instance, the Binghamton school district previously had 10 file servers that were replaced by CIFS shares on the single NetApp system. That’s a lot less management overhead, and the NetApp actually delivers better performance and allows users to restore their own data by accessing the Snapshot directory. The districts also benefit from improved access to both active and archived school data, while reducing overall service costs.
In addition to consolidating storage and eliminating file servers, we consolidated physical servers via virtualization. Today we run about 250 virtual machines on 7 physical servers in our central data center. Eliminating excess infrastructure further reduces complexity and saves us time and money.
We currently use SnapManager for Virtual Infrastructure (SMVI) to help manage our VMware® environment, but we’re in the process of implementing OnCommand™ software, which integrates the capabilities of SMVI, NetApp Operations Manager, Provisioning Manager, Protection Manager, and SnapManager f....
When it comes to server and desktop virtualization, our goal is to be hypervisor agnostic. This gives us the maximum flexibility to accommodate varying requirements (and budget constraints) across the districts. Today, about 10% of our virtual server environment is running on Microsoft Hyper-V managed by SnapManager for Hyper-V (SMHV). We’re also using Citrix XenDesktop to support our virtual desktop environment. The fact that NetApp provides direct management integration for all three hypervisors gives us a degree of comfort with all of them that we wouldn’t otherwise have. (Read previous Tech OnTap® articles on Hyper-V and XenServer.)
Data Protection
I’ve already mentioned the problems created by SQL Server database backups in our previous environment, and the problems were as bad if not worse for backup of our Oracle data warehouse. Oracle backup was being performed using a series of homegrown processes and scripts that only one DBA fully understood. When he left the organization, that created an opportunity, and an urgent need, to do something different. We switched to NetApp SnapManager for Oracle (SMO), which greatly simplified backup, restore, and replication.
Of course, Oracle DBAs are always skeptical about making changes, but soon after we made the switch a failed extract operation populated some Oracle tables with bad data and necessitated a restore. Everyone was impressed when the SMO restore process took just a few minutes, and no one complained about SMO after that. We no longer worry whether or not Oracle has been backed up, and because restores are so simple we now have six people capable of restoring Oracle if needed.
The SnapManager suite is the key to much of our backup strategy now. In addition to SMO, we use SnapManager for SQL Server (SMSQL) and SnapManager for Exchange (SME), and, as I mentioned in the previous section, we also use SMVI and SMHV. All these products provide simple, consistent, application-aware backups in seconds with fast restores.
As part of our effort to streamline operations, we centralized Exchange services in our regional data center, and most of the districts now use those services rather than maintain their own mail servers. As with the other SnapManager products, SME automates the complex, manual, and time-consuming processes associated with the backup, recovery, and verification of our Exchange Server databases and uses NetApp Snapshot technology to reduce backup times to seconds and restore times to minutes. NetApp Single Mailbox Recovery software makes it possible to recover and restore individual mailboxes, messages, or attachments quickly without disrupting other Exchange users. Because SME lets us restore quickly and easily to different points in time, we no longer need to keep a "lag" copy of the database, which saves additional storage.
For data protection at the district level we use NetApp SnapVault to back up CIFS shares and other data to either our main data center or our DR site. We also use NetApp storage as a NetBackup target for backup of some remaining physical servers. We looked to see if we could purchase cheaper storage or put this data in the cloud, but keeping the data on NetApp turned out to be the cheapest option because NetApp deduplication is efficient and costs less than NetBackup deduplication.
Disaster Recovery
Our disaster recovery strategy ties directly to our data protection strategy and also allows us to maintain separate, up-to-date copies of critical student records off site to meet state mandates.
We maintain a DR site at Binghamton University (and serve in turn as a DR site for them). A FAS3140 at the DR site serves as a target for NetApp SnapVault backups and for replication with NetApp SnapMirror. The SnapManager suite allows us to schedule replication of consistent images of our Exchange, SQL Server, and Oracle databases and our VMware and Hyper-V virtual machines.
We use VMware Site Recovery Manager (SRM) to automate the recovery of our VMware environment. For an operation our size it’s really the only way we can automate recovery and run through test plans to validate them.
This approach is not only simple and easy to manage, but it gives us a much higher level of disaster recovery. Before, the best we could do was to spin off a tape from NetBackup to our DR site on the day after the backup occurred.
Grow Smart
For us, growing smart means having the ability to flexibly meet user needs while limiting storage growth and minimizing management overhead. The most exciting thing about using NetApp storage has been that as use cases change at the district level, our storage is able to adapt to the need without big changes. For example, when we moved our data from NetWare to CIFS shares on NetApp, we didn’t have to change anything on the NetApp side because the CIFS license was already in place. It also opened up a whole range of new possibilities, including better performance and availability, better data protection, and easier, user-driven restore.
If a district wants to use Fibre Channel for its virtual server environment, that’s not a problem and neither is iSCSI. We also use NFS for VMware in some districts because we can create just one large datastore to which they can add virtual machines as needed. In our main data center we recently upgraded our core switches to fully redundant HP 10-Gigabit Ethernet switches and we’ll be migrating data from Fibre Channel to iSCSI. It’s just not a big deal to change protocols.
By using NetApp features such as thin provisioning, deduplication, and FlexClone we are able to reduce storage requirements by 30–50% depending on the application. All of our new deployments are thin provisioned, and we’re going back and reconfiguring existing volumes as well. We use NetApp Operations Manager to monitor thin provisioning and avoid shortages. Our procurement process can take as much as two months, so it’s important not to come up short. Operations Manager provides tools and reports that let us thin provision with confidence and avoid making our lives miserable.
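The reasoning behind monitoring thin provisioning against a two-month procurement window can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Operations Manager works internally: the `Aggregate` class, its field names, and the sample figures are all hypothetical, and the 60-day lead time is simply the "two months" mentioned above.

```python
from dataclasses import dataclass

# Assumption: "as much as two months" of procurement lead time, expressed in days.
PROCUREMENT_LEAD_DAYS = 60


@dataclass
class Aggregate:
    """Hypothetical snapshot of one pool of physical storage."""
    name: str
    capacity_tb: float        # physical capacity installed
    used_tb: float            # physical space actually consumed
    provisioned_tb: float     # total size promised to thin-provisioned volumes
    growth_tb_per_day: float  # observed growth of consumed space


def overcommit_ratio(agg: Aggregate) -> float:
    """How much space has been promised relative to what physically exists."""
    return agg.provisioned_tb / agg.capacity_tb


def days_until_full(agg: Aggregate) -> float:
    """Days until physical space runs out at the current growth rate."""
    if agg.growth_tb_per_day <= 0:
        return float("inf")  # no growth means no projected exhaustion
    return (agg.capacity_tb - agg.used_tb) / agg.growth_tb_per_day


def needs_order(agg: Aggregate, lead_days: int = PROCUREMENT_LEAD_DAYS) -> bool:
    """True if space would run out before newly ordered disk could arrive."""
    return days_until_full(agg) <= lead_days


if __name__ == "__main__":
    # Made-up figures for illustration only.
    example = Aggregate("district-a", capacity_tb=20.0, used_tb=15.0,
                        provisioned_tb=30.0, growth_tb_per_day=0.1)
    print(example.name, overcommit_ratio(example),
          days_until_full(example), needs_order(example))
```

The point of the check is that an overcommitted pool is only a problem if the projected exhaustion date falls inside the purchasing window, so the comparison is against lead time rather than against a fixed fullness percentage.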
We went into deduplication cautiously, but now we use it almost everywhere. We see about 20–25% space savings with Exchange 2010, 50–70% space savings with VMware, and 30% with CIFS shares.
Table 2) Broome-Tioga deduplication savings.
|Data Type|Average Savings|
|Microsoft Exchange 2010|20–25%|
|VMware|50–70%|
|CIFS shares|30%|
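To show what the deduplication savings quoted above mean in terms of physical disk, here is a minimal Python sketch. The savings ranges come from the figures in this article; the function names and the workload sizes in the example are hypothetical and do not describe our actual data mix.

```python
# Fraction of space saved by deduplication, as (low, high) ranges.
# These are the figures quoted in the article.
SAVINGS = {
    "exchange_2010": (0.20, 0.25),
    "vmware": (0.50, 0.70),
    "cifs": (0.30, 0.30),
}


def physical_range_tb(logical_tb: float, data_type: str) -> tuple[float, float]:
    """Return (best case, worst case) physical TB needed for the logical data."""
    low_saving, high_saving = SAVINGS[data_type]
    best = logical_tb * (1 - high_saving)   # highest savings, least disk
    worst = logical_tb * (1 - low_saving)   # lowest savings, most disk
    return best, worst


def total_physical_range_tb(workloads: dict[str, float]) -> tuple[float, float]:
    """Sum the best-case and worst-case physical needs across data types."""
    best_total = 0.0
    worst_total = 0.0
    for data_type, logical_tb in workloads.items():
        best, worst = physical_range_tb(logical_tb, data_type)
        best_total += best
        worst_total += worst
    return best_total, worst_total


if __name__ == "__main__":
    # Hypothetical mix of logical data, in TB.
    mix = {"exchange_2010": 10.0, "vmware": 40.0, "cifs": 50.0}
    best, worst = total_physical_range_tb(mix)
    print(f"Physical storage needed: {best:.1f} to {worst:.1f} TB")
```

Working with a range rather than a single average keeps the estimate honest, since the savings for VMware in particular vary widely.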
We also use NetApp FlexClone technology to save storage space in our development environment and for DR testing. FlexClone allows developers to create a clone of a database in seconds and without consuming a lot of additional disk space. This means that they can test more often and more thoroughly. For DR, in conjunction with SRM we can use FlexClone to clone replicated volumes without interrupting ongoing replication and then use those volumes to do complete testing of our DR plans, again while only requiring incremental amounts of disk space.
With NetApp we are able to deliver enterprise-level performance and features to districts that wouldn’t otherwise have the staff to manage the complexities that come with most of the competing technologies. There’s no need to bring in a new NAS system for this or Fibre Channel storage for that. All our storage and data protection needs are flexibly met by NetApp. There’s no way that we could manage all our storage needs with just two people—who have many other responsibilities besides storage—without NetApp. NetApp helps us make the right choices for the right reasons.