By Dennis Duckworth
Many companies these days find themselves on the defensive when it comes to Big Data. Some don’t really know what Big Data is. Others have an understanding of what it is but they aren’t sure whether they have a Big Data problem. Still others know what it is, and have identified a Big Data problem to deal with, but they just don’t know how.
Folks in IT are especially likely to be put on the defensive regarding Big Data because the term gets thrown at them in new “requests” – maybe from the VP of Sales, who just read an article about it in the New York Times and is afraid that his competitors might already be “doing it” and gaining an advantage, or maybe from the CMO, who wants a 360° view of his customers and just heard in a vendor webinar that Big Data is the way to get it.
As they say, sometimes the best defense is a good offense. IBM has found that an effective way to deal with Big Data proactively is to divide the analytics environment into separate pieces, or zones, based on the characteristics of the data and the analytics it needs. One version of an analytics zone architecture is shown in the diagram below, with the main zones being: Real-time data processing and analytics zone; Operational data zone; Landing, Exploration, and Archive data zone; Enterprise data warehouse and Data mart zone; and Deep analytics data zone.
Breaking up your analytics environment in this way lets you break your huge Big Data problem into more manageable chunks – you can see the individual trees instead of the huge forest and you can prune just those trees that need attention rather than attempting to clear-cut the entire forest.
As an example, in response to the CMO’s request for a more complete view of your company’s customers, you may want to add the ability to analyze the unstructured data that you are now getting from Twitter or Facebook. You don’t necessarily need to mess with your Enterprise data warehouse or your tactical data marts. They may be working just fine, continuing to process all your highly valuable structured data with proper enterprise-class integration, data governance, and security, and supporting the tens, hundreds, or thousands of BI reports your organization needs every day, week, or month in order to run smoothly, just as they have for the past few years. But you may want to consider adding a Landing, Exploration, and Archive data zone, or expanding the one you already have. There is some very useful information in those social media feeds but there is also a lot of junk – you wouldn’t want to try to convert all of that incoming social data into structured data and put it into your Enterprise data warehouse. Rather, you would likely want to put it into a landing area where you could explore it and uncover the valuable nuggets that you might then extract into a data warehouse.
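To make the landing-area idea a little more concrete, here is a minimal Python sketch of the "explore first, extract later" pattern. It is an illustration only, not an IBM product API: the function name `find_brand_mentions`, the post fields `user` and `text`, and the sample posts are all assumptions made up for this example. The raw posts stay in the landing area as they arrived, and only the ones that mention a brand are pulled out as structured rows that could later be loaded into a warehouse.

```python
import re


def find_brand_mentions(raw_posts, brand):
    """Return structured rows for the posts that mention `brand`.

    The raw posts are left untouched; only the matches are extracted.
    """
    # Whole-word, case-insensitive match on the brand name.
    pattern = re.compile(rf"\b{re.escape(brand)}\b", re.IGNORECASE)
    return [
        {"user": post["user"], "text": post["text"]}
        for post in raw_posts
        if pattern.search(post["text"])
    ]


# Hypothetical raw posts sitting in the landing area.
landing_area = [
    {"user": "ann", "text": "My Acme blender broke after a week"},
    {"user": "bob", "text": "Great weather today"},
    {"user": "cy", "text": "Love the new ACME mixer!"},
]

for row in find_brand_mentions(landing_area, "acme"):
    print(row["user"], "->", row["text"])
```

A real exploration step would likely involve far richer text analytics than a keyword match, but the shape is the same: the junk stays in the landing area, and only the nuggets move on.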
Or maybe you are in the manufacturing business and would like to implement a more proactive servicing system for some of the heavy machinery in your plant. Right now, it is policy to shut down the machines and service them on a regular schedule, but you’ve noticed that some of the machines are fine and could go many more weeks without service, while others fail before the scheduled service because they needed more immediate attention. Both cases result in lost productivity. You could implement a system that reads the data coming off the many sensors on and in the machines and runs predictive models to alert you when a machine is likely to need maintenance. Putting that data into a data warehouse and running analytics on it there might work, but you likely don’t want to store all that repetitive sensor data, particularly when the data is within normal specifications and ranges. Rather, your needs might be better served by adding a Real-time data processing and analytics zone that processes the data as it flows from the sensors, rather than landing it first and then deciding what to do.
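As a rough illustration of processing sensor data in flight instead of landing it first, the Python sketch below keeps only a small rolling window per machine and emits an alert when the average drifts out of range. Everything specific here is an assumption for the example: the `Reading` type, the temperature field, the 20 to 80 °C "normal" range, and the window size are hypothetical, and a simple rolling mean stands in for the predictive models mentioned above.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Reading:
    machine_id: str
    temperature_c: float


# Hypothetical normal operating range, in degrees Celsius.
NORMAL_RANGE = (20.0, 80.0)


def maintenance_alerts(readings, window=3):
    """Yield a machine id whenever the rolling mean of its last
    `window` readings falls outside NORMAL_RANGE.

    Only `window` values per machine are held in memory; in-range
    readings are never stored anywhere else.
    """
    recent = {}
    for reading in readings:
        buf = recent.setdefault(reading.machine_id, deque(maxlen=window))
        buf.append(reading.temperature_c)
        if len(buf) == window:
            mean = sum(buf) / window
            if not (NORMAL_RANGE[0] <= mean <= NORMAL_RANGE[1]):
                yield reading.machine_id


# Hypothetical stream: press-1 is running hot, lathe-2 is fine.
stream = [
    Reading("press-1", 70.0), Reading("lathe-2", 50.0),
    Reading("press-1", 85.0), Reading("lathe-2", 52.0),
    Reading("press-1", 90.0), Reading("lathe-2", 51.0),
]

for machine in maintenance_alerts(stream):
    print("Schedule service for", machine)
```

Because `maintenance_alerts` is a generator, it can consume an unbounded feed one reading at a time, which is the point of a real-time zone: decisions are made as the data flows by, and the repetitive in-range readings never need to be warehoused.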
Go on the offensive with Big Data. IBM helps customers with situations like this every day — we have a comprehensive set of products in our Big Data & Analytics portfolio to help address needs in all of these analytics zones and I invite you to explore them further here: http://www-01.ibm.com/software/data/bigdata/
And for a cool poster of IBM’s Big Data and Analytics “Zone” architecture, you can go to: http://public.dhe.ibm.com/software/data/sw-library/bda/zone/lib/pdf/28433_ArchPoster_Wht_Mar_2014_v4.pdf
About Dennis Duckworth
Dennis Duckworth, Program Director of Product Marketing for Data Management & Data Warehousing, has been in the data game for quite a while, doing everything from Lisp programming in artificial intelligence to managing a sales territory for a database company. He has a passion for helping companies and people get real value out of cool technology. Dennis came to IBM through its acquisition of Netezza, where he was Director of Competitive and Market Intelligence. He holds a degree in Electrical Engineering from Stanford University but has spent most of his life on the East Coast. When not working, Dennis enjoys sailing off his backyard on Buzzards Bay and he is relentless in his pursuit of wine enlightenment.