Apache CloudStack / Citrix CloudPlatform Introduction - Part 2 of 3

Originally published 6/3/14.

 

This is part 2 of 3 in the Apache CloudStack / Citrix CloudPlatform introduction series. In this post we will go into further detail about the Apache CloudStack / Citrix CloudPlatform architecture.

 

The full series can be found here:

Apache CloudStack / Citrix CloudPlatform Introduction - Part 1 of 3 - Introduction

Apache CloudStack / Citrix CloudPlatform Introduction - Part 2 of 3 - Apache CloudStack 101

Apache CloudStack / Citrix CloudPlatform Introduction - Part 3 of 3 - NetApp Integration

 

Apache CloudStack / Citrix CloudPlatform Hierarchy

Apache CloudStack infrastructure has a well-defined hierarchy and it introduces some new concepts for managing infrastructure at scale: Zones, Pods, Clusters, and Hosts.

 

Zones

A zone is comparable to a datacenter in VMware and sits at the top of the Apache CloudStack hierarchy. A zone can be either a physical or a logical boundary, usually drawn according to resources and function. There are two types of zones: basic and advanced. Advanced zones support advanced networking capabilities such as creating public networks or physical / logical network separation through VLANs. For on-premises or private clouds, a basic zone should provide enough capability while staying simple. Hybrid clouds that integrate with other clouds require advanced zones and the network capabilities they support.

 

Pods

A pod can be thought of as a rack, or as a grouping of IT resources within a zone. Converged infrastructure such as FlexPod (which includes Cisco servers/switches and NetApp storage in a rack) would be an example of a pod. You do not have to create a pod per rack (or per set of racks) - it is simply a resource organization structure within an Apache CloudStack zone.

 

Clusters

A cluster is a grouping of compute resources. Clusters in Apache CloudStack are bound to a specific hypervisor. A cluster can be VMware, Hyper-V, Xen, KVM, Bare Metal, OVM, or Linux Containers (LXC). For example, a XenServer farm for VDI services in a particular region could be treated as a single cluster.

 

Hosts

A host is a server providing compute resources. It is the physical server running the hypervisor software, i.e. the place where virtual machines live and run.
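To make the hierarchy concrete, here is a minimal sketch that models zones, pods, clusters, and hosts as nested Python data classes. This is purely illustrative: the class names, field names, and example names such as "zone-1" or "xen-01" are assumptions made for this sketch and are not part of the Apache CloudStack API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str  # a physical server running hypervisor software


@dataclass
class Cluster:
    name: str
    hypervisor: str  # a cluster is bound to one hypervisor type
    hosts: List[Host] = field(default_factory=list)


@dataclass
class Pod:
    name: str  # e.g. a rack or another grouping of resources
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    network_type: str  # "Basic" or "Advanced"
    pods: List[Pod] = field(default_factory=list)

    def all_hosts(self) -> List[Host]:
        """Walk zone -> pod -> cluster -> host and collect every host."""
        return [h for p in self.pods for c in p.clusters for h in c.hosts]


# Hypothetical example: one advanced zone with a single pod and two clusters.
zone = Zone(
    name="zone-1",
    network_type="Advanced",
    pods=[
        Pod(
            name="pod-1",
            clusters=[
                Cluster("vdi-cluster", "XenServer", [Host("xen-01"), Host("xen-02")]),
                Cluster("kvm-cluster", "KVM", [Host("kvm-01")]),
            ],
        )
    ],
)

host_names = [h.name for h in zone.all_hosts()]
print(host_names)
```

Each level only knows about the level directly below it, which mirrors how a host belongs to a cluster, a cluster to a pod, and a pod to a zone.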

 

Storage Overview

There are two types of storage in Apache CloudStack: primary storage and secondary storage. Apache CloudStack primary storage supports NFS and iSCSI natively. You can use the PreSetup option to configure other types of storage such as FC; in this case, however, the storage must be managed through the hypervisor. Secondary storage supports NFS, iSCSI, SMB/CIFS, S3, or Swift.

 

Primary Storage

Primary storage holds guest VMs and their data. It contains volumes, of which there are two types: root volumes and data volumes. Each VM is stored on a root volume, and any additional storage provided to a VM takes the form of data volumes. You can configure more than one primary storage per cluster. When a zone has more than one primary storage configured, storage tags can be used so that the appropriate storage is chosen based on the service offering when provisioning guest VMs. Like a host, primary storage is bound to a cluster.
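The idea behind storage tags can be sketched in a few lines of Python. The sketch assumes a simple matching rule: a primary storage is eligible when it carries every tag the offering asks for. The class names, pool names, and tag values are all hypothetical, and the real allocator in Apache CloudStack considers more than tags (capacity, for example).

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class PrimaryStorage:
    name: str
    tags: FrozenSet[str]


@dataclass(frozen=True)
class ServiceOffering:
    name: str
    storage_tags: FrozenSet[str]


def eligible_storage(
    offering: ServiceOffering, pools: List[PrimaryStorage]
) -> List[PrimaryStorage]:
    """Return the pools that carry every tag the offering requires.

    Assumed rule for this sketch: the offering's tags must be a subset
    of the pool's tags. An offering without tags matches every pool.
    """
    return [p for p in pools if offering.storage_tags <= p.tags]


# Hypothetical pools and offerings.
pools = [
    PrimaryStorage("netapp-ssd", frozenset({"ssd", "gold"})),
    PrimaryStorage("netapp-sata", frozenset({"sata"})),
]
gold = ServiceOffering("gold-vm", frozenset({"gold"}))
untagged = ServiceOffering("any-vm", frozenset())

gold_matches = [p.name for p in eligible_storage(gold, pools)]
untagged_matches = [p.name for p in eligible_storage(untagged, pools)]
print(gold_matches)
print(untagged_matches)
```

With this rule, a VM provisioned from the "gold" offering can only land on the tagged pool, while an untagged offering may land on either.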

 

NetApp Value for Apache CloudStack primary storage

A NetApp volume on a NetApp storage controller maps directly to Apache CloudStack primary storage. Through the NetApp Virtual Storage Console (VSC), NetApp brings thin provisioning and de-duplication into Apache CloudStack and also provides management capabilities for the underlying storage. This is something you simply don't get by default, where storage usually appears as a white box within Apache CloudStack. White-box storage can be very dangerous: underlying problems in the storage layer can be missed or go unnoticed if the proper infrastructure to monitor those storage resources is not in place.

 

Secondary Storage

Secondary storage is used for VM templates and ISO images. It is bound to the zone, and there can only be a single secondary storage per zone. When a guest VM is created, it is always provisioned from a template or ISO that exists on secondary storage.

 

NetApp Value for Apache CloudStack secondary storage

A NetApp volume on a NetApp storage controller maps directly to Apache CloudStack secondary storage. NetApp provides a lot of value to Apache CloudStack secondary storage in the form of thin provisioning and data de-duplication. We can expect very high data de-duplication rates; upwards of 90% is possible on Apache CloudStack secondary storage. Since VM templates can consume a lot of space, this can add up to significant savings in storage cost. NetApp storage is also ideal for backing up templates using NetApp SnapVault, if this is desired. By using NetApp SnapMirror, we can protect ourselves against a disaster scenario by mirroring templates to a remote DR site.
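To put the de-duplication rate into perspective, the arithmetic can be sketched as follows. The 2000 GB of templates is a made-up figure for illustration, and the function name is hypothetical; only the 90% rate comes from the text above.

```python
def physical_usage_gb(logical_gb: float, dedup_rate: float) -> float:
    """Space actually consumed after de-duplication.

    dedup_rate is the fraction of data removed, e.g. 0.9 for 90%.
    """
    if not 0.0 <= dedup_rate < 1.0:
        raise ValueError("dedup_rate must be in the range [0, 1)")
    return logical_gb * (1.0 - dedup_rate)


# Hypothetical example: 2000 GB of templates at a 90% de-duplication rate.
logical = 2000.0
physical = physical_usage_gb(logical, 0.9)
saved = logical - physical
print(f"physical: {physical:.0f} GB, saved: {saved:.0f} GB")
```

In other words, at a 90% rate only about one tenth of the template data has to be stored physically.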

 

Apache CloudStack Snapshots

A snapshot can be created for any volume on Apache CloudStack primary storage. Since all storage in Apache CloudStack is a volume, root as well as data volumes can be snapshotted. Apache CloudStack snapshots are not space efficient; they are simply a thick copy of the original data. Snapshots can be scheduled, and when taking a snapshot we have the option of quiescing the VM. Quiescing of databases, or federated multi-application integration, is not currently in place but will hopefully be added soon, via a plugin or natively in Apache CloudStack.

 

NetApp value for Apache CloudStack snapshots

NetApp Virtual Storage Console (VSC) for Apache CloudStack enables efficient snapshots for CloudStack. After installing VSC for Apache CloudStack, snapshots of the root or data volumes sitting on primary storage are taken through a file clone on NetApp storage. They therefore bypass the inefficient file system copy that Apache CloudStack uses by default. File clones can be thought of as incremental-forever, file-based backups. Sounds too good to be true? This is a technology from NetApp called SIS Clone (single instance storage clone), and it has been available since Data ONTAP 7.3.2 (several years). To create file-based copies, Data ONTAP, the operating system running on NetApp storage, uses the same pointer reference technology as NetApp Snapshots (which are backups of a whole NetApp volume). These copies consume close to zero space upon creation, as they are pointers just like snapshots. You can think of SIS Clone as a granular file snapshot. Since we are backing up VMs, which are files and can be very large, this greatly enhances the default behaviour of Apache CloudStack which, as mentioned, just does a full, thick copy. Other storage vendors struggle incredibly here to provide anything useful, as many of their snapshots impact performance negatively. NetApp Snapshots do not impact performance negatively, regardless of whether you have just a few or a couple hundred! This is a huge advantage, and I would even argue that the built-in thick copy is simply not an option - it is too slow and way too inefficient to work at scale.
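The difference between a thick copy and a pointer-based clone can be illustrated with a toy block store in Python. This is only a conceptual model under simplified assumptions; it is not how Data ONTAP or SIS Clone is implemented, and every name in it is hypothetical.

```python
from typing import Dict, List


class BlockStore:
    """Toy block store: files are lists of pointers to physical blocks."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}  # block id -> data
        self._next_id = 0

    def write(self, data: bytes) -> int:
        """Store one new physical block and return its id."""
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        return block_id

    def used_blocks(self) -> int:
        return len(self.blocks)


def thick_copy(store: BlockStore, file_blocks: List[int]) -> List[int]:
    """Duplicate every block, as a full (thick) copy does."""
    return [store.write(store.blocks[b]) for b in file_blocks]


def pointer_clone(file_blocks: List[int]) -> List[int]:
    """Copy only the list of block pointers; no new blocks are written."""
    return list(file_blocks)


store = BlockStore()
vm_disk = [store.write(b"x" * 4096) for _ in range(1000)]
before = store.used_blocks()

thick = thick_copy(store, vm_disk)
after_thick = store.used_blocks()

clone = pointer_clone(vm_disk)
after_clone = store.used_blocks()

# Changing one block of the original after the clone writes one new block;
# the clone keeps pointing at the old, unchanged data.
vm_disk[0] = store.write(b"y" * 4096)
after_write = store.used_blocks()

print(before, after_thick, after_clone, after_write)
```

In this model the thick copy doubles the number of blocks in use, the clone adds none, and space is only consumed later for blocks that change after the clone was taken.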

 

Below is a graphical representation of the Apache CloudStack infrastructure we have thus far discussed:

 

[Figure: cloudstack_zone_hierarchy.png]

 

In the next post in this series we will dive into the NetApp Virtual Storage Console (VSC) for Apache CloudStack.

 

Stay Tuned!