End-to-End Quality of Service: Cisco, VMware, and NetApp Team to Enhance Multi-Tenant Environments


Building shared infrastructure has always been something of a challenge. If you look at a typical corporate data center design you find that important applications either have their own dedicated infrastructure or that shared elements have been overengineered to far exceed requirements. Either approach underutilizes resources and wastes your IT budget.

The problem is that no one really knows how infrastructure components such as servers, networks, and storage will behave as additional load is added. Will a resource become a bottleneck, decreasing the performance of an important application unexpectedly? If so, how can you quickly identify the source of such bottlenecks?

The current interest in cloud computing has made understanding all aspects of multi-tenant environments—infrastructures in which all resources are shared—even more critical. In fact, many companies hesitate to build cloud infrastructure or contract for cloud services because of fears about security and quality of service (QoS).

Cisco has teamed with VMware and NetApp to design and test a secure, multi-tenant cloud architecture that can deliver on what we see as four pillars of secure multi-tenancy:

  • Secure separation. One tenant must not be able to access another tenant's virtual machine (VM), network, or storage resources under any circumstance. Each tenant must be securely separated.
  • Service assurance. Compute, network, and storage performance must be isolated and guaranteed during normal operations as well as when failures have occurred or certain tenants are generating abnormal loads.
  • Availability. The infrastructure must ensure that required compute, network, and storage resources remain available in the face of possible failures.
  • Management. The ability to rapidly provision, manage, and monitor all resources is essential.

In this article I describe the unique architecture the three companies have designed to address these pillars of multi-tenancy. I go on to discuss our efforts around the second pillar—service assurance—in more detail.

A recently released design guide provides full details of a Cisco validated design that uses technology from all three companies to address all four pillars described above. A companion article in this issue of Tech OnTap describes one element of the architecture, NetApp® MultiStore®, in more detail.

Architecture Overview

A block-level overview of the architecture is shown in Figure 1. At all layers, key software and hardware components are designed to provide security, quality of service, availability, and ease of management.

Figure 1) End-to-end block diagram.

Compute Layer
At the compute layer, VMware® vSphere™ and vCenter™ Server software provide a robust server virtualization environment that allows server resources to be dynamically allocated to multiple guest operating systems running within virtual machines.

VMware vShield Zones provides security within the compute layer. This is a centrally managed, stateful, distributed virtual firewall bundled with vSphere 4.0 that takes advantage of ESX host proximity and virtual network visibility to create security zones. vShield Zones integrates into VMware vCenter and leverages virtual inventory information, such as vNICs, port groups, clusters, and VLANs, to simplify firewall rule management and trust zone provisioning. Security policies created this way follow VMs as they move with VMotion™ and are unaffected by IP address changes and network renumbering.

The Cisco Unified Computing System™ (UCS) is a next-generation data center platform that unites compute, server network access, storage access, and virtualization into a cohesive system. UCS integrates a low-latency, lossless 10-Gigabit Ethernet network fabric with enterprise-class, x86-architecture servers. The system is an integrated, scalable, multichassis platform in which all resources participate in a unified management domain.

Network Layer
The network layer provides secure network connectivity between the compute layer and the storage layer as well as connections to external networks and clients. Key components include:

  • Cisco Nexus 7000, which provides Ethernet (LAN) connectivity to external networks
  • Cisco Nexus 5000, which interfaces with both FC storage and the Cisco Nexus 7000
  • Cisco Nexus 1000V, a software switch that runs within the VMware kernel to deliver Cisco VN-Link services for tight integration between the server and network environment, allowing policies to move with a virtual machine during live migration
  • Cisco MDS 9124, a Fibre Channel switch that provides SAN connectivity to allow SAN boot for VMware ESX running on UCS

Storage Layer
The storage layer consists of NetApp unified storage systems capable of simultaneously providing SAN connectivity (for SAN boot) and NFS connectivity for the running VMware environment. NetApp storage can also meet the specialized storage needs of any running application. Running the VMware environment over Ethernet provides a greatly simplified management environment that reduces costs.

NetApp MultiStore software provides a level of security and isolation for shared storage comparable to physically isolated storage arrays. MultiStore lets you create multiple completely isolated logical partitions on a single storage system, so you can share storage without compromising privacy. Individual storage containers can be migrated independently and transparently between storage systems.

Tenant Provisioning
When a tenant is provisioned using this architecture, the resulting environment is equipped with:

  • One or more virtual machines or vApps
  • One or more virtual storage controllers (vFiler units)
  • One or more VLANs to interconnect and access these resources

Together, these entities form a logical partition. The tenant cannot violate the boundaries of this partition. In addition to security, we also want to be sure that activities in one tenant partition do not interfere indirectly with activities in another tenant partition.
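
As a rough illustration of the partition described above, the following Python sketch models a tenant as a set of virtual machines, vFiler units, and VLANs, and checks that no two tenants share any of them. The class, field, and function names are hypothetical and do not correspond to any Cisco, VMware, or NetApp API.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class TenantPartition:
    """Hypothetical model of one tenant's logical partition."""
    name: str
    vms: frozenset = field(default_factory=frozenset)
    vfilers: frozenset = field(default_factory=frozenset)
    vlans: frozenset = field(default_factory=frozenset)


def shared_resources(a: TenantPartition, b: TenantPartition) -> set:
    """Return every resource identifier that appears in both partitions."""
    return set((a.vms & b.vms) | (a.vfilers & b.vfilers) | (a.vlans & b.vlans))


def partitions_are_isolated(tenants) -> bool:
    """True when no pair of tenants shares a VM, vFiler unit, or VLAN."""
    return all(not shared_resources(a, b) for a, b in combinations(tenants, 2))


tenant_a = TenantPartition("A", frozenset({"vm-a1"}), frozenset({"vfiler-a"}), frozenset({101}))
tenant_b = TenantPartition("B", frozenset({"vm-b1"}), frozenset({"vfiler-b"}), frozenset({102}))
print(partitions_are_isolated([tenant_a, tenant_b]))
```

A check like this only covers the inventory view of separation; the actual enforcement in the design comes from the products at each layer.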

End-to-End QoS

Very few projects tackle end-to-end quality of service. In most cases, a QoS mechanism is enabled in one layer in the hope that downstream or upstream layers will also be throttled as a result. Unfortunately, different applications have different characteristics—some may be compute intensive, some network intensive, and others I/O intensive. Simply limiting I/O does little or nothing to control the CPU utilization of a CPU-intensive application. It’s impossible to fully guarantee QoS without appropriate mechanisms at all three layers. Our team set out to design such a system.

Companies such as Amazon and Google have built multi-tenant or “cloud” offerings using proprietary software that took years and hundreds of developers to create in house. Our approach was to use commercially available technology from Cisco, NetApp, and VMware to achieve similar results.

One design principle we applied in all layers is that when resources are not being utilized, high-value applications should be allowed to utilize those available resources if desired. This can allow an application to respond to an unforeseen circumstance. However, when contention occurs, all tenants must be guaranteed the level of service they have contracted for.
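
The principle above can be sketched in a few lines of Python. This is a minimal illustration, not the mechanism any of the products use: each tenant first receives the smaller of its demand and its guarantee, and any idle capacity is then lent out in proportion to the guarantees. The function name, the tenant names, and the numbers are all assumptions, and the sketch assumes positive guarantees that sum to no more than the capacity.

```python
def allocate(capacity, guaranteed, demand):
    """Grant each tenant min(demand, guarantee), then share idle capacity.

    Idle capacity is handed out in proportion to each tenant's guarantee,
    capped by unmet demand. All arguments use the same arbitrary unit.
    """
    grant = {t: min(demand[t], guaranteed[t]) for t in guaranteed}
    spare = capacity - sum(grant.values())
    # Repeat until spare capacity is used up or nobody wants more.
    while spare > 1e-9:
        hungry = [t for t in grant if demand[t] - grant[t] > 1e-9]
        if not hungry:
            break
        weight = sum(guaranteed[t] for t in hungry)
        handed_out = 0.0
        for t in hungry:
            share = spare * guaranteed[t] / weight
            extra = min(share, demand[t] - grant[t])
            grant[t] += extra
            handed_out += extra
        spare -= handed_out
    return grant


guarantees = {"gold": 60, "silver": 40}
# Silver is mostly idle, so gold may burst above its guarantee.
print(allocate(100, guarantees, {"gold": 90, "silver": 10}))
# Both tenants are busy, so each falls back to its guarantee.
print(allocate(100, guarantees, {"gold": 90, "silver": 80}))
```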

Another design principle is to set the class of service as close to the application as possible, map that value into a policy definition, and make sure that the policy is applied uniformly across all layers in accordance with the unique qualities of each layer. We used three mechanisms in each layer to help deliver QoS:

Table 1) QoS mechanisms.

Compute
  • Expandable Reservations
  • Dynamic Resource Scheduler
  • UCS QoS System Classes for Resource Reservation and Limit

Network
  • QoS: Bandwidth Control
  • QoS: Rate Limiting

Storage
  • Storage Reservations
  • Thin Provisioning
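
One way to picture the second design principle, mapping a class of service into a single policy that every layer applies, is a small lookup table. The sketch below is purely illustrative: the class names, field names, and values are assumptions rather than settings from the validated design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantPolicy:
    """Hypothetical per-layer settings derived from one class of service."""
    compute_shares: int      # relative CPU/memory shares in the resource pool
    network_cos: int         # CoS value marked at the virtual switch (0-7)
    storage_priority: str    # relative priority at the storage layer


# Illustrative values only; a real deployment would choose its own.
POLICIES = {
    "gold": TenantPolicy(compute_shares=8000, network_cos=5, storage_priority="high"),
    "silver": TenantPolicy(compute_shares=4000, network_cos=2, storage_priority="medium"),
    "bronze": TenantPolicy(compute_shares=2000, network_cos=1, storage_priority="low"),
}


def policy_for(service_class: str) -> TenantPolicy:
    """Look up the single policy that every layer applies for this class."""
    try:
        return POLICIES[service_class]
    except KeyError:
        raise ValueError(f"unknown service class: {service_class!r}") from None


print(policy_for("gold"))
```

Keeping the mapping in one place means a tenant's class is decided once and each layer translates it into its own mechanism.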

Compute Layer
At the server-virtualization level, VMware vSphere provides many capabilities to ensure fair use, especially of CPU and memory resources. A vSphere resource pool is a logical abstraction for flexible management of resources. Resource pools can be grouped into hierarchies and used to hierarchically partition available CPU and memory. By correctly configuring resource pool attributes for reservations, limits, shares, and expandable reservations, you can achieve very fine-grained control and grant priority to one tenant over another in situations in which resources are in contention.
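
To make the idea of an expandable reservation more concrete, here is a simplified Python model of admission control in a resource pool hierarchy. It is a sketch under stated assumptions, not the vSphere implementation: the class and method names are hypothetical, only CPU reservations are modeled, and limits and shares are ignored. A pool with an expandable reservation may borrow unreserved capacity from its parent; a pool without one may not.

```python
class ResourcePool:
    """Hypothetical, simplified model of a resource pool's CPU reservation."""

    def __init__(self, name, reservation_mhz, expandable=False, parent=None):
        self.name = name
        self.reservation_mhz = reservation_mhz
        self.expandable = expandable
        self.parent = parent
        self.reserved_mhz = 0  # capacity already promised to children and VMs
        if parent is not None:
            if parent.unreserved() < reservation_mhz:
                raise ValueError("parent cannot back this reservation")
            parent.reserved_mhz += reservation_mhz

    def unreserved(self):
        return self.reservation_mhz - self.reserved_mhz

    def admit(self, vm_reservation_mhz):
        """Try to reserve capacity for a VM; borrow from ancestors if allowed."""
        local = min(self.unreserved(), vm_reservation_mhz)
        shortfall = vm_reservation_mhz - local
        if shortfall > 0:
            if not self.expandable or self.parent is None:
                return False
            if not self.parent.admit(shortfall):
                return False
        self.reserved_mhz += local
        return True


root = ResourcePool("cluster", 10000)
gold = ResourcePool("gold-tenant", 6000, expandable=True, parent=root)
silver = ResourcePool("silver-tenant", 3000, parent=root)
print(gold.admit(6500))    # borrows the 500 MHz shortfall from the parent
print(silver.admit(3500))  # refused: the pool is not expandable
```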

VMware Distributed Resource Scheduler (DRS) allows you to create clusters containing multiple VMware servers. It continuously monitors utilization across resource pools and intelligently allocates available resources among virtual machines. DRS can be fully automated at the cluster level so infrastructure and tenant virtual machine loads are evenly load balanced across all of the ESX servers in a cluster.

At the hardware level, Cisco UCS uses Data Center Ethernet (DCE) to handle all traffic inside a Cisco UCS system. This industry-standard enhancement to Ethernet divides the bandwidth of the Ethernet pipe into eight virtual lanes. System classes determine how the DCE bandwidth in these virtual lanes is allocated across the entire Cisco UCS system. Each system class reserves a specific segment of the bandwidth for a specific type of traffic. This provides a level of traffic management, even in an oversubscribed system.
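
The arithmetic behind reserving a segment of bandwidth per system class can be sketched as follows. This assumes that each class is given a relative weight and that its share of the link is its weight divided by the sum of all weights; the weights below and the function name are illustrative assumptions, not values from the design guide.

```python
def bandwidth_by_class(weights, link_gbps=10.0):
    """Split link bandwidth among system classes in proportion to weight."""
    if len(weights) > 8:
        raise ValueError("at most eight virtual lanes are available")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one class needs a positive weight")
    return {name: link_gbps * w / total for name, w in weights.items()}


# Hypothetical weights for four classes sharing a 10-Gigabit link.
shares = bandwidth_by_class({"platinum": 4, "gold": 3, "silver": 2, "best-effort": 1})
for name, gbps in shares.items():
    print(f"{name}: {gbps:.1f} Gbps")
```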

Network Layer
At the network level, traffic is segmented according to the Class of Service (CoS) already assigned by the Nexus 1000V and honored or policed by the UCS system. There are two distinct methods to provide steady-state performance protection:

  • Queuing allows networking devices to schedule packet delivery based on classification criteria. Because devices can differentiate which packets should be delivered preferentially, important applications see better response times when oversubscription occurs. Queuing only occurs when assigned bandwidth is fully utilized by all service classes.
  • Bandwidth control gives network devices an appropriate number of buffers per queue so that certain classes of traffic do not overutilize bandwidth. This gives other queues a fair chance to serve the needs of the remaining classes. Bandwidth control goes hand in hand with queuing: queuing determines which packets are delivered first, while bandwidth control determines how much data can be sent per queue.
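
The interplay between queuing and bandwidth control can be illustrated with deficit round robin, a well-known scheduling technique. It is used here only as a stand-in; the article does not state which scheduler the switches use. Each class has its own queue, and a per-class quantum (hypothetical values below) bounds how many bytes that queue may send per round.

```python
from collections import deque


def drr_schedule(queues, quantum, rounds):
    """Deficit round robin over per-class queues.

    queues maps class name -> deque of packet sizes in bytes;
    quantum maps class name -> bytes credited to that class per round.
    Returns the (class, size) pairs in the order they are sent.
    """
    deficit = {name: 0 for name in queues}
    sent = []
    for _ in range(rounds):
        for name, q in queues.items():
            if not q:
                deficit[name] = 0  # idle queues do not bank credit
                continue
            deficit[name] += quantum[name]
            while q and q[0] <= deficit[name]:
                size = q.popleft()
                deficit[name] -= size
                sent.append((name, size))
            if not q:
                deficit[name] = 0
    return sent


queues = {"gold": deque([500] * 4), "bronze": deque([500] * 4)}
order = drr_schedule(queues, {"gold": 1000, "bronze": 500}, rounds=2)
print(order)
```

With these numbers the gold class sends twice as many bytes per round as the bronze class, while bronze still makes steady progress.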

A set of policy controls can be enabled so that any unpredictable change in traffic pattern is handled either by a soft policy, which allows applications to burst above the service commitment for some time, or by a hard policy, which drops the excess traffic or caps the rate of transmission. This capability can also be used to define a service level such that noncritical services can be kept at a certain traffic level or the lowest service-level traffic can be capped so that it cannot impact higher-end tenant services.
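
A token bucket is one common way to express both kinds of policy, and the sketch below uses it purely for illustration; it is not taken from the Nexus 1000V configuration. The committed rate sets the long-term cap, and the bucket depth sets how far a sender may burst above it. A deep bucket behaves like the soft policy, and a bucket only one packet deep behaves like the hard one. The class name and the numbers are assumptions, and time is passed in explicitly so the example is deterministic.

```python
class TokenBucket:
    """Minimal token-bucket policer: rate in bits/s, burst depth in bits."""

    def __init__(self, rate_bps, burst_bits):
        self.rate_bps = rate_bps
        self.burst_bits = burst_bits
        self.tokens = burst_bits  # start with a full bucket
        self.last = 0.0

    def allow(self, now, packet_bits):
        """Return True if the packet conforms; otherwise it would be dropped."""
        elapsed = now - self.last
        self.last = now
        # Refill at the committed rate, never beyond the bucket depth.
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False


policer = TokenBucket(rate_bps=1000, burst_bits=2000)
print(policer.allow(0.0, 1500))  # fits within the initial burst allowance
print(policer.allow(0.0, 1000))  # exceeds the remaining tokens
print(policer.allow(1.0, 1000))  # one second of refill makes room again
```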

Policing and rate limiting are used to define such protection levels. These tools are applied as close to the edge of the network as possible to stop excess traffic from entering the network. In this design, the Nexus 1000V performs the policing and rate-limiting function for three types of traffic:

  • VMotion. VMware traditionally recommends a dedicated Gigabit interface for VMotion traffic. In our design, VMotion traffic is carried on a dedicated, nonroutable VMkernel port. The VMotion traffic from each blade server is kept at 1 Gbps to reflect the traditional environment. This limit can be raised or lowered based on requirements, but it should not be configured such that the resulting traffic rate impacts more critical traffic.
  • Differentiated transactional and storage services. In a multi-tenant design, various methods are employed to generate differentiated services. For example, a "priority" queue is used for the most critical services and "no-drop" is used for traffic that cannot be dropped but can sustain some delay. Rate limiting is used for fixed-rate services, in which each application class or service is capped at a certain level.
  • Management. The management VLAN is enabled with rate limiting to cap the traffic at 1 Gbps.

Storage Layer
As described above, NetApp MultiStore software provides secure isolation for multi-tenant environments. (MultiStore is described in more detail in a companion article in this issue.)

In the storage layer, delivering QoS is a function of controlling storage system cache and CPU utilization as well as ensuring that workloads are spread across an adequate number of spindles. NetApp developed FlexShare to control workload prioritization. FlexShare allows you to tune three independent parameters for each storage volume or each vFiler unit in a MultiStore configuration so you can prioritize one tenant partition over another. (FlexShare is described in more detail in a previous Tech OnTap article.) Both MultiStore and FlexShare have been available for the NetApp Data ONTAP® operating environment for many years.

NetApp thin provisioning provides tenants with a level of "storage on demand." Raw capacity is treated as a shared resource and is only consumed as needed. When deploying thin-provisioned resources in a multi-tenant configuration, you should configure policies for volume autogrow, Snapshot™ autodelete, and fractional reserve. Volume autogrow allows a volume to grow in defined increments up to a predefined threshold. Snapshot autodelete is an automated method for deleting the oldest Snapshot copies when a volume is nearly full. Fractional reserve allows the percentage of space reservation to be modified based on the importance of the associated data.

When these features are used concurrently, important tenants can be given priority to grow a volume as needed with space reserved from the shared pool. Conversely, lower-level tenants require additional administrator intervention to accommodate requests for additional storage.
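
The following Python sketch shows how autogrow and Snapshot autodelete might combine for tenants of different importance. It is a simplified model, not Data ONTAP behavior: the function name, the parameters, the 95% threshold, and the choice to try growing before deleting Snapshot copies are all assumptions made for illustration.

```python
def relieve_space_pressure(size_gb, used_gb, snapshots_gb, max_size_gb,
                           increment_gb, full_threshold=0.95):
    """Return (new_size_gb, new_used_gb, remaining_snapshots_gb).

    snapshots_gb lists the space held by each Snapshot copy, oldest first.
    """
    if size_gb <= 0 or increment_gb <= 0:
        raise ValueError("size_gb and increment_gb must be positive")
    snapshots = list(snapshots_gb)
    # Step 1: autogrow in fixed increments up to the configured maximum.
    while used_gb / size_gb >= full_threshold and size_gb < max_size_gb:
        size_gb = min(size_gb + increment_gb, max_size_gb)
    # Step 2: if still nearly full, delete the oldest Snapshot copies.
    while used_gb / size_gb >= full_threshold and snapshots:
        used_gb -= snapshots.pop(0)
    return size_gb, used_gb, snapshots


# An important tenant is allowed to grow from the shared pool.
print(relieve_space_pressure(100, 97, [5, 3, 2], max_size_gb=110, increment_gb=10))
# A lower-level tenant has no growth headroom, so old Snapshot copies go first.
print(relieve_space_pressure(100, 97, [5, 3, 2], max_size_gb=100, increment_gb=10))
```

In this model the difference between tenants is simply how much headroom `max_size_gb` leaves above the current size; once a lower-level tenant runs out of Snapshot copies to delete, an administrator has to step in.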


Cisco, VMware, and NetApp have teamed to define and test a secure, multi-tenant cloud architecture capable of delivering not just the necessary security, but also quality of service, availability, and advanced management.

This article introduced our end-to-end approach to QoS. You can read more about QoS or the other pillars of multi-tenancy in our recently released design guide, which describes the elements of the architecture in detail along with recommendations for correct configuration.


Chris Naddeo
Technical Marketing Engineer for UCS
Cisco Systems

Chris joined Cisco to focus on customer enablement and the design of optimal storage architectures for Cisco’s Unified Computing System. He has an extensive storage background, including one year spent at NetApp as a Consulting Systems Engineer for Oracle and Data ONTAP GX as well as nine years at Veritas, where he served as a product manager for Veritas storage software.

