The Evolving Data Management Challenge: Introducing StorageGRID Webscale

Billions of mobile devices and applications are creating a massive amount of data every day. The advent of the Internet of Things, in which countless products and corporate assets will be connected and sharing data, will generate even greater volumes of information for enterprises. We asked Ingo Fuchs, senior manager of cloud solutions at NetApp, to explain what this growth of data means for business today and how it will affect data management and consumption going forward.


How is data management evolving today?


Ingo Fuchs: It’s clear that information is a fundamental driver of business success and competitive advantage today, and conversations around managing data have moved far beyond “just storage.” Data is a valuable corporate asset that needs to be retrieved efficiently and guarded carefully.


In addition, the cloud and the mobility of data are putting extra emphasis on new approaches to data management, not only because data is extremely valuable and personal, but also because data management is the toughest aspect of the hybrid cloud reality facing businesses today. Moving data across, within, or out of multiple clouds has never been a practical reality, simply because data is not easy to move.


Can you elaborate on the “toughest aspect” of the cloud in terms of data management?


IF: The Internet of Things (IoT) and the growth of unstructured data such as texts, Facebook posts, tweets, and videos not only drive massive data growth; they also change how data is stored and accessed. As data is created and consumed across users, locations, and devices (as opposed to in a traditional data center setup), IT departments need to reevaluate how they manage large amounts of data so that they can classify it and use it more effectively. This leads to multisite datastores that bring data closer to the workloads, applications, and users.


The value of data typically changes over time, as does the cost of storing data in a particular location or storage technology. With object-enabled data management, organizations can now establish granular and dynamic data management policies that determine how unstructured data is stored and protected, taking into consideration a wide range of performance, durability, availability, geo-location, and longevity requirements.


Object-based storage enables organizations to manage data as objects. This differs from other storage architectures such as file systems, which manage data as a file hierarchy. An object usually includes the data, metadata, and a unique identifier. The architecture can be deployed at the device level, system level, and interface level. It offers several key advantages, including potential cost savings and greater storage flexibility.
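
To make the model concrete, here is a minimal, purely illustrative sketch in Python (a toy in-memory store, not a NetApp API): each object bundles its data with metadata and is addressed by a unique identifier in a flat namespace rather than by a path in a directory tree.

    import uuid

    class ToyObjectStore:
        """Toy in-memory object store: a flat namespace keyed by unique IDs."""

        def __init__(self):
            self._objects = {}

        def put(self, data: bytes, metadata: dict) -> str:
            # An object couples its data with descriptive metadata and is
            # assigned a unique identifier instead of a place in a hierarchy.
            object_id = str(uuid.uuid4())
            self._objects[object_id] = {"data": data, "metadata": metadata}
            return object_id

        def get(self, object_id: str) -> dict:
            # Retrieval is by identifier (or a metadata search), not by
            # walking directories.
            return self._objects[object_id]

    store = ToyObjectStore()
    oid = store.put(b"...image bytes...", {"type": "MRI scan", "patient": "anon-123"})
    print(store.get(oid)["metadata"])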


NetApp announced StorageGRID Webscale today. How is it relevant to this conversation?


IF: Enterprises have been storing large amounts of unstructured data as files for years. Retrieving that data has meant knowing the file share and the directory and having at least a rough idea of the file name and extension. That approach is no longer practical as the amount and variety of data types stored in enterprises grow. For example, most social media data is unstructured. With the coming of the Internet of Things, companies will be gathering and managing all sorts of data from sensors and other sources. Object-based storage is better suited to handle this onslaught of information.

With NetApp® StorageGRID® Webscale, organizations can store massive amounts of data in a single, elastic content store. It is a software-defined object storage solution designed for the hybrid cloud and built for large archives, media repositories, and web datastores. Because it supports standard protocols, organizations can run their applications on premises or in the cloud.
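
To show what "standard protocols" means in practice, here is a hedged sketch using the S3 API through Python's boto3 library; the endpoint URL, bucket name, credentials, and metadata keys below are placeholders for illustration, not actual product values.

    import boto3

    # Placeholder endpoint and credentials; a real deployment supplies its
    # own gateway address and keys. The application code itself is plain S3,
    # whether the content store runs on premises or in the cloud.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://grid.example.com:8082",   # hypothetical endpoint
        aws_access_key_id="YOUR_ACCESS_KEY",
        aws_secret_access_key="YOUR_SECRET_KEY",
    )

    # Store an object together with user-defined metadata.
    with open("claim-000123.pdf", "rb") as f:
        s3.put_object(
            Bucket="claims-archive",                    # hypothetical bucket
            Key="2014/claim-000123.pdf",
            Body=f,
            Metadata={"policy-id": "P-8841", "retention": "7y"},
        )

    # Retrieve it by key; no file share or directory knowledge is required.
    obj = s3.get_object(Bucket="claims-archive", Key="2014/claim-000123.pdf")
    print(obj["Metadata"])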


The StorageGRID Webscale policy engine automates data placement according to site-based performance and availability requirements and optimizes for cost as data ages. In addition, the StorageGRID Webscale data durability framework safeguards data integrity and accessibility.
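
As a rough sketch of what age-aware placement rules look like (generic Python, not StorageGRID Webscale's actual policy syntax; the tier and site names are invented), consider a rule set that an engine of this sort could evaluate per object:

    from datetime import timedelta

    # Illustrative rule set only: hot data stays on fast storage at the
    # ingest site, aging data is replicated to a second site on capacity
    # storage, and cold data drops to a single archival copy.
    PLACEMENT_RULES = [
        {"max_age": timedelta(days=30),  "copies": 2, "tier": "performance", "sites": ["site-a"]},
        {"max_age": timedelta(days=365), "copies": 2, "tier": "capacity",    "sites": ["site-a", "site-b"]},
        {"max_age": None,                "copies": 1, "tier": "archive",     "sites": ["site-b"]},
    ]

    def placement_for(object_age: timedelta) -> dict:
        """Return the first rule whose age window covers the object."""
        for rule in PLACEMENT_RULES:
            if rule["max_age"] is None or object_age <= rule["max_age"]:
                return rule
        return PLACEMENT_RULES[-1]

    print(placement_for(timedelta(days=90))["tier"])   # -> capacity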


What are some customer environments where the product is applicable?


IF: There are a number of verticals where this technology would be valuable.

One is finance. Financial institutions must archive copies of mortgage applications, related documents, and financial transactions, and they process millions of customer check and deposit images every day. Similarly, insurance companies must manage large amounts of policy data and retain copies of insurance claims and related photos for many years to support litigation.


In media, entertainment, and sports, digital studios, news agencies, publishers, and broadcasters need to store, collaborate on, and easily retrieve extremely large libraries of film footage, broadcast programming, and newer web and mobile streaming video and audio files, whether those assets were stored today or years ago.


Healthcare organizations must store large patient content files, including patient records, CT and MRI scans, and other imaging files and diagnostic reports, for time frames as long as the patient's lifetime or even longer. Data integrity and security are vital because patient data often needs to be retained for decades.


Does StorageGRID Webscale expand NetApp’s cloud portfolio?


IF: NetApp StorageGRID Webscale is built for the hybrid cloud and expands NetApp's hybrid cloud portfolio. The rate at which data is moving into cloud environments shows no sign of slowing, and with the growth of mobile device use and the Internet of Things, that growth will surely continue. Some features that are particularly relevant for the cloud include:

  • Flexible usage model to run on premises or hosted by a service provider
  • Ability to transform cloud applications to on-premises applications
  • Support for standard protocols, including S3 and CDMI
  • Scalability to meet enterprise and web scale requirements
  • Dynamic policy engine for on-demand data placement across cost/performance tiers and locations
  • Real-time audit for SLO monitoring and compliance
  • Data durability framework that keeps data correct and accessible