Do you use NFS or CIFS to access data stored in large repositories? Better watch out – there is a new kid in town!
Traditionally, large amounts of unstructured data (or Big Data) have been stored as files in file systems. To retrieve data, you had to know the file share and the directory (and subdirectories), and have at least a rough idea of the file name and extension. Increasingly, this just doesn’t work anymore – today IT departments already manage content repositories that store hundreds of millions or billions of files, often across many locations. As the amount and complexity of data stored in enterprises grows, it becomes increasingly important to find a better way to store, manage and retrieve this data.
A key way to solve this problem is to use technology and standards that were developed specifically to provide a single namespace for billions of data sets, spanning multiple locations and even managed services that reside off-premises.
On the technology side, NetApp just released StorageGRID 9.0 (http://www.netapp.com/us/company/news/news-rel-20120809-203086.html). StorageGRID was developed from the ground up to support large, distributed content repositories – managing billions of data sets and petabytes of capacity across hundreds of sites in a single namespace. With this technology, you know what data you have in your repository and you can control where this data is stored (locations, tiers, etc.).
On the standards side there is CDMI (http://www.snia.org/cdmi), the Cloud Data Management Interface. CDMI is a standard developed by SNIA (http://www.snia.org), the Storage Networking Industry Association, with heavy involvement from a number of leading storage vendors, including NetApp. CDMI not only introduces a standard way to ingest data into and retrieve data from a large-scale repository, it also enables applications to easily manage this repository and control where the data sits.
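To make the ingest-and-retrieve idea a bit more concrete, here is a minimal sketch of what CDMI requests might look like over HTTP. It only builds the requests and does not send them. The endpoint URL, container name, object name, and metadata are hypothetical, and the specification version (1.0.2) is an assumption; the content type and version header follow the published CDMI specification, but check which version your repository actually supports.

```python
import json
import urllib.request

# Assumption: the target repository speaks CDMI 1.0.2.
CDMI_VERSION = "1.0.2"
CDMI_OBJECT = "application/cdmi-object"


def cdmi_put_request(base_url, container, name, text, metadata=None):
    """Build (but do not send) a request that stores a text data object."""
    body = json.dumps({
        "mimetype": "text/plain",
        "metadata": metadata or {},  # user metadata travels with the object
        "value": text,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{container}/{name}",
        data=body,
        method="PUT",
        headers={
            "Content-Type": CDMI_OBJECT,
            "Accept": CDMI_OBJECT,
            "X-CDMI-Specification-Version": CDMI_VERSION,
        },
    )


def cdmi_get_request(base_url, container, name):
    """Build (but do not send) a request that reads a data object back."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{container}/{name}",
        method="GET",
        headers={
            "Accept": CDMI_OBJECT,
            "X-CDMI-Specification-Version": CDMI_VERSION,
        },
    )


# Hypothetical endpoint and names, for illustration only.
BASE = "https://repository.example.com/cdmi"
put_req = cdmi_put_request(BASE, "reports", "q3.txt", "hello",
                           {"department": "finance"})
get_req = cdmi_get_request(BASE, "reports", "q3.txt")

print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

Sending either request is then a matter of passing it to `urllib.request.urlopen`, along with whatever authentication the repository requires. Note that the object is addressed by a URL within one namespace, and its metadata is carried in the same request, rather than being implied by a file share and directory path.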
CDMI has arrived in the real world
NetApp StorageGRID already supported NFS and CIFS, as well as a RESTful HTTP API (http://en.wikipedia.org/wiki/Representational_state_transfer). So why is NetApp adding support for CDMI? It’s very simple – we believe that standards are important and that ultimately our customers will benefit from an ecosystem of solutions built on standards. A number of companies are already working on CDMI support or have announced it, so while adoption is still at an early stage, the momentum is clearly there.
CDMI is the new NFS
When it comes to creating and managing large, distributed content repositories, it quickly becomes clear that NFS and CIFS are not ideally suited for this use case. This is where CDMI shines, especially with an object-based storage architecture behind it that was built to support multi-petabyte environments with billions of data sets across hundreds of sites, and that accommodates retention policies that can extend to “forever”. NetApp’s Distributed Content Repository solution based on StorageGRID and E-Series storage systems fits precisely into this space.
Find out more about our Distributed Content Repository solution in the solution brief here: http://media.netapp.com/documents/ds-3339.pdf
Read the Big Content white paper here: http://media.netapp.com/documents/wp-7161-0512.pdf
Watch me talk about Big Content: http://www.youtube.com/watch?v=96g98Gb_rWE
What are your thoughts? Have you implemented object-based storage and want to share your experience? Go ahead and leave your comments below.