Turning Big Data into Smart Data - Is Hadoop Right for You?

"We are drowning in information but starved for knowledge." John Naisbitt, author


Everyone seems to be jumping on the Hadoop bandwagon. It’s so easy to get caught up in the frenzy! According to Wikibon, an open source research and advisory organization based in Marlborough, Massachusetts, the big data market will grow from $5 billion in 2012 to $53.4 billion in 2017. IDC predicts that big data and analytics will be the CIO issue of the year in 2012. But how do you know whether Hadoop is right for you? Do you actually have a challenge or problem that Hadoop could solve? Could your organization benefit from a Hadoop implementation? The truth is that Hadoop is entering the enterprise in a big way, enabling businesses to realize the full value of their ever-growing data in measurable and meaningful ways.

So again, is Hadoop right for your organization?

The answer depends on whether you have a big data problem. Do you know if you do?

You have a big data problem if your organization is not able to capture, store, process and analyze data efficiently and effectively. If it’s taking too much time and energy to process the data, then the output may be too expensive and too late to make a meaningful contribution to the business. If the output is not enabling revenue, or resulting in key business insights and better decision making, why go to the trouble of processing the data?

To further assess whether your company has a big data problem, ask yourself the following questions:

Is my data coming from new and multiple sources? Traditionally, data sources have been CRM, ERP, and transactional databases, but lately businesses are dealing with new data sources such as mobile applications, machine-generated data, web logs, and social media. Legacy ETL systems have not kept up with the growing number of data sources, and many are not prepared to.

Is my data in multiple formats (structured and unstructured), arriving at high volume and velocity? According to IDC, 80% of the world’s data is unstructured, and most businesses don’t have the capability or resources to analyze it. Do you know how much of your data is unstructured? Can you quantify the amount of data collected on a daily basis? Weekly? Monthly? What about 3, 6, or 12 months from now?
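If you want a rough feel for the "3, 6, 12 months from now" question, a few lines of Python can compound today's daily volume forward. This is only a back-of-the-envelope sketch: the function name, the starting volume of 200 GB per day, and the 10% monthly growth rate are all hypothetical placeholders, and it assumes steady compound growth. Plug in your own measurements.

```python
def project_daily_volume(current_gb_per_day, monthly_growth_rate, months):
    """Compound the current daily volume forward by a number of months.

    Assumes a constant month-over-month growth rate (an illustrative
    simplification; real data growth is rarely this smooth).
    """
    return current_gb_per_day * (1 + monthly_growth_rate) ** months


# Hypothetical inputs: replace with figures measured in your environment.
current_gb_per_day = 200.0   # data collected per day today, in GB
monthly_growth_rate = 0.10   # 10% growth per month

for months in (3, 6, 12):
    projected = project_daily_volume(current_gb_per_day, monthly_growth_rate, months)
    print(f"In {months:>2} months: about {projected:,.0f} GB/day")
```

Even a crude projection like this makes it easier to judge whether your current systems will still cope a year from now.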

Is my data of high value? The data is of high value if it’s rich in key metrics such as performance counters, system and environmental information, locality, and shopping transactions. Access to these facts is key to enabling revenue and improving customer satisfaction.

Is my data time-sensitive? How long before my data becomes stale? If the data has a short shelf life, it’s important to process it right away. Hadoop may be able to help with real-time and near-real-time analytics. See figure 2 below for an example of how Hadoop is enabling key business processes in the enterprise.

Is my data supporting key business processes? In addition to the core business processes, are there any new business processes that rely on the timely processing of this data?

Is there a service-level agreement (SLA) in place to process the data? In other words, what are the possible business implications of not processing and analyzing the data within the agreed-upon time? Is there a need to revise the current SLAs in view of the new data challenges (volume, speed, variety)?
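The six questions above can be treated as a simple yes/no checklist. Here is a minimal sketch of how you might tally the answers. The function name is hypothetical, the question wording is abbreviated from the list above, and the cutoff of "more than half" is an assumption about where to draw the line, not a formal rule.

```python
QUESTIONS = [
    "Is my data coming from new and multiple sources?",
    "Is my data in multiple formats at high volume and velocity?",
    "Is my data of high value?",
    "Is my data time-sensitive?",
    "Is my data supporting key business processes?",
    "Is there an SLA in place to process the data?",
]


def has_big_data_problem(answers):
    """Return True when more than half of the answers are yes.

    `answers` is a list of booleans, one per question, in order.
    The "more than half" threshold is an assumed cutoff.
    """
    if len(answers) != len(QUESTIONS):
        raise ValueError("Provide exactly one answer per question.")
    return sum(answers) > len(QUESTIONS) / 2


# Hypothetical answers for an example organization.
answers = [True, True, False, True, True, False]
for question, answer in zip(QUESTIONS, answers):
    print(f"{'yes' if answer else 'no':>3}  {question}")
print("Likely big data problem:", has_big_data_problem(answers))
```

A tally like this is only a conversation starter; the weight you give each question will depend on your business.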

If you answered yes to most of these questions, it’s clear that you do in fact have a big data problem and that Hadoop might be the right solution for you.  At NetApp, “We’ve adopted Hadoop to process 24 billion records generated by our install base. The biggest payoff is that it provides fresh insight into the products and features that NetApp customers favor, and will help the company improve its offerings,” comments Cynthia Stoddard, NetApp CIO.

Watch for my next blog post, where I’ll talk about how the NetApp Open Solution for Hadoop (NOSH) could be the right solution for your business.