The Logical Data Warehouse: Two Easy Pieces (DW+Hadoop)

By Dennis Duckworth

In some of our recent blogs, we have described our Data Warehouse Point of View and our Zone Architecture for Big Data. We developed these from our experiences with our customers, seeing what worked (and what didn’t). Our aim is to encourage those who are just starting out on their analytics journeys, or those who are disappointed by the performance or rigidity of their existing data warehouse environments, to at least consider the advantages of separating data (and the corresponding analytics) into different zones based on the characteristics of both. We have been using the term Data Warehouse Modernization to describe the renovation of traditional monolithic data warehouses (along with other data silos) into hybrid, integrated, or logical data warehouse models.

In a sort of modernization of our own, we have reexamined how we go to market with our data warehouse and data management products to see how we might make it easier for our customers to implement the best practices that we actively promote. With the recent release of our latest data warehouse appliance, the PureData System for Analytics (PDA) N3001 (codename Mako), we had the chance to make some changes. Now, for example, every PDA appliance we ship (every configuration, from the smallest, the “Mako-mini” 2 server rack-mountable appliance, all the way up to our largest, our 8-rack system) includes license entitlements for other IBM software products that we firmly believe can help our customers create a modern, flexible, high-performance logical data warehouse environment. One of those entitlements is for IBM InfoSphere BigInsights for Hadoop.

Studies are bearing out our view that the logical data warehouse is a critical contributor to analytic success for enterprises. In the recently released 2014 IBM Institute for Business Value analytics study, companies were analyzed and categorized by the extent and effectiveness of their analytics. Those in the top category, the “front runners”, use data to the highest benefit. They have been successful in “blending” their traditional business intelligence infrastructures with big data technologies to create agility and flexibility in the way they ingest, manage and use data. Quite interestingly, and consistent with our guidance in these blogs, almost all of the front runners (92 percent) have an integrated (or hybrid) data warehouse and, as part of that, they are 10 times more likely than other organizations to have a big data landing platform. In practice, they have implemented what we have called zone architecture to allow them to collect and analyze a wider variety of data, empowering their employees to make full use of their traditional data and new types of data together.

Our customers are also providing proof that data warehouse modernization works. How are these customers using BigInsights and these big data landing platforms? Many are creating what we have been calling data reservoirs. As you may recall from our blogs here and from the hundreds, if not thousands, of other posts on the topic, Hadoop is finding a home in the enterprise as the preferred technology for data reservoirs. These are landing areas for all the data you think may be useful in your company, whether it is structured, unstructured, or semi-structured. Some more specific examples: One of our customers is using BigInsights in combination with the PureData System for Analytics to help it convert users of its free cloud service into customers of its paid service, using predictive analytics on user behavior (structured and unstructured data) to target them more accurately with offers. Another, a telco, is using BigInsights with PDA along with InfoSphere Streams to get a 360° view of its customers and to enable it to react in real time to customer satisfaction issues. (The InfoSphere Streams entitlement with PDA will be the topic of a future blog.)

The BigInsights entitlement that comes with the N3001 PureData System for Analytics is for 5 virtual nodes which, by our calculations, gives you the ability to manage about 100TB of data. So this is not a useless little demo version: this license gives you the ability to create and use a full-blown Hadoop cluster with all of the advantages that BigInsights has to offer, including Big SQL (SQL access to the data in BigInsights), BigSheets (Excel-like spreadsheet exploration of the data), the text analytics accelerator, Big R (which allows you to explore, visualize, transform, and model big data using familiar R syntax), and a long list of other features and capabilities. You get all of this (and much more) with every N3001 PureData System for Analytics. With software entitlements like this, we allow you to practice what we preach: modernize your data management environment by putting data and the corresponding analytics on the proper platform.
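
For readers who like to see the arithmetic, here is a rough back-of-envelope sketch of what the 5-node, 100TB figure implies. It is not the calculation behind the number quoted above, which this post does not spell out. The even split across nodes and the replication factor of 3 (the usual HDFS default) are assumptions made purely for illustration, and the function names are hypothetical.

```python
# Back-of-envelope sizing for the entitlement figures quoted in the post.
# Only the two constants below come from the post; the rest is assumed.

MANAGED_TB = 100    # "about 100TB of data"
VIRTUAL_NODES = 5   # entitlement covers 5 virtual nodes


def per_node_tb(managed_tb: float, nodes: int) -> float:
    """Managed data per node, assuming an even split (an assumption)."""
    return managed_tb / nodes


def raw_tb_needed(managed_tb: float, replication: int = 3) -> float:
    """Raw disk needed if every block is stored `replication` times.

    3 is the common HDFS default; your cluster may be configured differently.
    """
    return managed_tb * replication


print(per_node_tb(MANAGED_TB, VIRTUAL_NODES))  # 20.0
print(raw_tb_needed(MANAGED_TB))               # 300
```

Under those assumptions, each virtual node would look after roughly 20TB of managed data. Whether the 100TB figure refers to raw or usable capacity is not stated here, so treat the replication line as a planning aid rather than a specification.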

About Dennis Duckworth

Dennis Duckworth, Program Director of Product Marketing for Data Management & Data Warehousing, has been in the data game for quite a while, doing everything from Lisp programming in artificial intelligence to managing a sales territory for a database company. He has a passion for helping companies and people get real value out of cool technology. Dennis came to IBM through its acquisition of Netezza, where he was Director of Competitive and Market Intelligence. He holds a degree in Electrical Engineering from Stanford University but has spent most of his life on the East Coast. When not working, Dennis enjoys sailing off his backyard on Buzzards Bay, and he is relentless in his pursuit of wine enlightenment. You can follow Dennis on Twitter.
