Hi, I have a customer who is interested in using OpenStack with E-Series. However, I have been unable to find much information or documentation on how they integrate and any synergies we may have. Any insight would be extremely helpful.
Thanks for reaching out. I'm currently working on the Icehouse Deployment and Operations Guide which will hopefully cover this topic in more detail - this should be on the communities site within the next two weeks.
Until then, I can point you to the Havana-based Deployment and Ops Guide v2.3 (available on this site). The section on Swift integration explains how E-Series can be used for a more efficient and scalable Swift deployment than the reference implementation. In addition, we now have Cinder support for E-Series in the Icehouse release, with upstream configuration documentation available at http://docs.openstack.org/icehouse/config-reference/content/eseries-iscsi.html
Let me know if you have any other specific questions.
Bob, thanks so much for the response. Is there any way you could elaborate on the following, "The deployment of OpenStack Object Storage (aka Swift) can be substantially enhanced when deployed on NetApp’s E-series unique Dynamic Disk Pools (DDP) technology." I've read that DDP can substantially reduce the amount of storage required, but don't understand how.
Swift uses a consistent hashing ring to place data and protects it by storing multiple replicas, typically three. This means that roughly 3x the capacity of a given object is consumed to store and protect it within a single site (and more again when replicating to a second site and beyond). Swift deployments tend to be very heavily capacity optimized and employ the densest, highest-capacity disks available. Traditional RAID parity schemes applied to these disks expose you to risk during very lengthy rebuilds after a failure. Given the scale of typical object storage deployments and the number of individual disks involved, it wouldn't be uncommon for certain RAID implementations to run permanently in a degraded state when you consider the typical mean time between failures (MTBF) of commodity disks. The unique qualities of E/EF-Series Dynamic Disk Pools (DDP), however, significantly shorten these rebuild times and allow the capacity efficiency advantages of a parity scheme to be employed. I've heard various figures associated with the improvement, but a common rule of thumb seems to be that a rebuild takes about 5% of the time it otherwise would. The resulting parity protection overhead is ~0.28x the size of the object itself, so about 1.28x the object size is consumed in total instead of 3x. The only modification to Swift required is a tuning parameter that adjusts the ring-builder logic. Have a look at the Deployment & Operations Guide posted here for full instructions.
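To put rough numbers on the capacity difference, here is a minimal back-of-the-envelope sketch. The 3x and ~0.28 figures are the ones quoted above; the 100 TB workload and the function name are made up purely for illustration, and a real sizing exercise should follow the Deployment & Operations Guide.

```python
# Back-of-the-envelope comparison of raw capacity needed for Swift objects.
# Assumptions (illustrative only): 100 TB of objects, 3 full replicas for a
# stock Swift deployment, and ~0.28x parity overhead on top of a single copy
# when the data sits on DDP. Neither factor is measured here.

def raw_capacity_tb(usable_tb: float, overhead_factor: float) -> float:
    """Return the raw disk capacity consumed to store usable_tb of objects."""
    return usable_tb * overhead_factor


REPLICATION_FACTOR = 3.0   # three full copies of every object
DDP_FACTOR = 1.0 + 0.28    # one copy plus ~0.28x parity overhead

usable_tb = 100.0          # hypothetical amount of object data

replicated_tb = raw_capacity_tb(usable_tb, REPLICATION_FACTOR)
ddp_tb = raw_capacity_tb(usable_tb, DDP_FACTOR)
savings = 1.0 - ddp_tb / replicated_tb

print(f"3x replication: {replicated_tb:.0f} TB raw")
print(f"DDP parity:     {ddp_tb:.0f} TB raw")
print(f"Reduction:      {savings:.0%}")
```

Under those assumptions, 100 TB of objects needs roughly 300 TB of raw disk with three replicas versus roughly 128 TB on DDP, a reduction of a little over half.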
Beyond the obvious advantages associated with the reduction in disks deployed (e.g. lower power and cooling requirements, less physical space, et cetera), there are some additional, subtler improvements: 1) the reduction in additional copies significantly reduces ongoing replication traffic within the Swift cluster (which has been observed to be a common limiting factor on the scale that can be achieved); 2) Swift becomes immediately consistent within a single site.