IBM DB2 Analytics Accelerator: OLTP and OLAP in the same system at last! (Part one)

By Isaac Moreno Navarro

With the advent of the data warehouse, whether the same infrastructure can serve both online transaction processing (OLTP) and online analytical processing (OLAP) has become a controversial subject. You will find that different database vendors have different points of view.

Let’s explore this topic by starting with a brief explanation of OLTP and OLAP.

Online transaction processing (OLTP)

This is the traditional way databases have worked: small queries are placed against a low number of rows to retrieve information. For instance, when you buy an airplane ticket, you check the open seats on a given flight and the price. To achieve this, the database serves several small queries in a short period of time, and the data involved in the answer is a small subset of the data stored.
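
To make this concrete, here is a minimal sketch of what such a transactional query might look like. The table and column names (flights, seats and so on) are hypothetical, invented purely for illustration:

    -- Look up the open seats and the current fare for one specific flight.
    -- This touches a handful of rows, no matter how large the tables grow.
    SELECT s.seat_no, f.fare
      FROM flights f
      JOIN seats s
        ON s.flight_id = f.flight_id
     WHERE f.flight_no = 'XX123'
       AND f.flight_date = '2014-07-01'
       AND s.status = 'OPEN';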

Online analytical processing (OLAP)

In the case of OLAP, you send queries that need to process a huge amount of data in comparison to the total size of the database. You send fewer queries to your system, but much heavier ones. In our airline example, if you worked in marketing you might want to know how many people between 20 and 35 years old traveled from New York to Madrid during 2014, grouped by fare and by how far in advance they purchased their tickets.
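
Expressed in SQL against the same hypothetical schema (adding equally hypothetical tickets and passengers tables), that marketing question becomes a single heavy query that scans and aggregates a large share of the stored data:

    -- Count passengers aged roughly 20 to 35 who flew New York to Madrid
    -- in 2014, grouped by fare and by how far in advance they purchased.
    SELECT t.fare,
           DAYS(f.flight_date) - DAYS(t.purchase_date) AS days_in_advance,
           COUNT(*) AS passengers
      FROM tickets t
      JOIN flights f    ON f.flight_id = t.flight_id
      JOIN passengers p ON p.passenger_id = t.passenger_id
     WHERE f.origin = 'NYC'
       AND f.destination = 'MAD'
       AND f.flight_date BETWEEN '2014-01-01' AND '2014-12-31'
       AND YEAR(f.flight_date) - YEAR(p.birth_date) BETWEEN 20 AND 35
     GROUP BY t.fare, DAYS(f.flight_date) - DAYS(t.purchase_date);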

So, we have different uses for the same set of data. In this scenario, some database vendors offer a one-size-fits-all solution, suggesting that a single machine is able to address such different workloads at the same time.

The IBM hybrid approach to transactional processing and analytics

Through IBM DB2 Analytics Accelerator, IBM supports a different approach. The DB2 Analytics Accelerator, together with DB2 for z/OS, forms a self-managing, hybrid, workload-optimized database management system that runs each query workload in the most efficient way. With this approach, you avoid the headaches involved in configuring a database designed for OLTP that is also trying to serve OLAP workloads. This is the main reason why many data warehouses are difficult to manage, expensive to maintain and require many people to tune them, while still leaving end users frustrated.

IBM DB2 Analytics Accelerator turns DB2 for z/OS into a universal database management system, capable of handling both transactional and analytical workloads.

IBM DB2 Analytics Accelerator

IBM proposes the use of a hybrid environment where each query workload is executed in the environment best suited to it, for maximum speed and cost efficiency. This hybrid infrastructure blends the best attributes of symmetric multiprocessing (SMP), leveraging DB2 for z/OS, with the best attributes of the hardware-accelerated massively parallel processing (MPP) architecture delivered by Netezza technology.

The hybrid environment is enabled by the addition of the DB2 Analytics Accelerator, a high-performance appliance that integrates the z Systems infrastructure with PureData System for Analytics, powered by IBM Netezza technology.

And yes, it is only for DB2 for z/OS, at least for now.

In essence, this solution works as if you had simply added another access path, one specialized for processing analytic queries, to your mainframe. Because it behaves just like an additional access path, query processing happens transparently: users and applications can send the very same DB2 query requests to the system, unchanged.

The interface for the end user does not change at all. And when I talk about an “additional access path”, what I really mean is that we add the DB2 Analytics Accelerator to complement DB2 for z/OS, which is built for transactional workloads. The Accelerator provides a cost-effective, high-speed query engine to run complex analytics workloads. Therefore, IBM DB2 Analytics Accelerator turns DB2 for z/OS into a universal database management system, capable of handling both transactional and analytical workloads.
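
In DB2 for z/OS, this routing is governed by the CURRENT QUERY ACCELERATION special register (acceleration behavior can also be set at the subsystem level). Here is a minimal sketch of how a session might opt in, with the SELECT standing in for any ordinary application query:

    -- Let DB2 consider rerouting eligible queries to the Accelerator,
    -- falling back to native DB2 processing if acceleration fails.
    SET CURRENT QUERY ACCELERATION = ENABLE WITH FAILBACK;

    -- The application SQL itself stays exactly the same; DB2 decides,
    -- query by query, whether to run it natively or on the Accelerator.
    SELECT fare, COUNT(*) AS tickets_sold
      FROM tickets
     GROUP BY fare;

    -- Forcing everything back to native DB2 processing is just as simple:
    SET CURRENT QUERY ACCELERATION = NONE;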

Diagram of DB2 for z/OS and DB2 Analytics Accelerator for z/OS

 

As you can see, there are two different machines, each solving different workload needs while working as one system. As part of its unique design, the DB2 Analytics Accelerator includes breakthrough technologies to reroute complex, data-intensive queries to the integrated IBM PureData System for Analytics appliance. But the key point is that nothing changes for the end user compared to their traditional DB2 database, except that the analytical queries suddenly run much faster than before, while MIPS consumption decreases. So, although you have two machines, they appear to end users as a single system. And from the administrators' point of view, there is no added complexity either.

DB2 Analytics Accelerator includes breakthrough technologies to reroute complex, data-intensive queries to the integrated IBM PureData System for Analytics appliance. But the key point is that nothing changes for the end user compared to their traditional DB2 database, except that the analytical queries suddenly run much faster than before . . .

In the words of one of our customers: “We are surprised how easy it is to manage such a system, it is really ‘plug-and-play’ and the results are awesome”.

This concludes my introduction. In the following posts I’ll explain how DB2 Analytics Accelerator works, as well as real-life experiences with it. To learn more, visit the DB2 Analytics Accelerator page on ibm.com.

Meanwhile, you can leave your comments, share your experience or join me in a conversation on Twitter.

See additional posts

IBM DB2 Analytics Accelerator: OLTP and OLAP in the same system at last! (part two)

About Isaac

Isaac is a data warehouse and Big Data technical pre-sales professional for IBM, covering customers in Spain and Portugal, with a special focus on PureData for Analytics. He joined IBM in 2011, through the Netezza acquisition. Before that, he held several positions in pre-sales and professional services at companies such as Oracle, Sun Microsystems and Netezza, as well as at other Spanish companies. In the years before joining IBM, he acquired diverse experience with different software tools (databases, identity management products, geographical information systems, manufacturing systems…) in a very diverse set of projects. He also holds a Master of Science degree in Computer Science.

Making faster decisions at the point of engagement with IBM PureData System for Operational Analytics

By Rahul Agarwal

The need for operational analytics
Today, businesses across the world face challenges dealing with the increasing cost and complexity of IT, as they cope with the growing volume, velocity and diversity of information. However, organizations realize that they must capitalize on this information through the smart use of analytics to meet emerging challenges and uncover new business opportunities.

… analytics needs to change from a predominantly back-office activity for a handful of experts to something that can provide pervasive, predictive, near-real-time information for front-line decision makers.

One thing that is becoming increasingly clear is that analytics is most valuable when it empowers individuals throughout the organization. Therefore, analytics needs to change from a predominantly back-office activity for a handful of experts to something that can provide pervasive, predictive, near-real-time information for front-line decision makers.

Low-latency analytics on transactional data, or operational analytics, provides actionable insight at the point of engagement, giving organizations the opportunity to deliver impactful and engaging services faster than their competition. So what should one look for in an operational analytics system?

Technical capabilities
A high percentage of queries to operational analytics systems, often up to 80%, are interactive lookups focused on data about a specific customer, account or patient. To deliver the correct information as rapidly as possible, systems must be optimized for the right balance of analytics performance and operational query throughput.
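
For illustration, such an interactive lookup might look like the sketch below (the orders table and its columns are hypothetical). Unlike a deep analytic scan, it must return in moments, even while heavier queries run alongside it:

    -- One customer's ten most recent orders: a short, highly selective
    -- operational query that runs concurrently with analytic workloads.
    SELECT order_id, order_date, amount
      FROM orders
     WHERE customer_id = 102938
     ORDER BY order_date DESC
     FETCH FIRST 10 ROWS ONLY;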

… systems must be optimized for the right balance of analytics performance and operational query throughput.

IT requirements
To maximize the benefits of operational analytics, one needs a solution that quickly delivers value, performance, scale and efficiency, while reducing the need for IT experts to design, integrate and maintain IT systems. In addition, one should look for a system that comes with deep levels of optimization to achieve the desired scale, performance and service quality, since assembling the right skills to optimize these systems is a costly and often difficult endeavor.

Flexibility
The ideal system should provide analytic capabilities that deliver a rapid and compelling return on investment now, and it must grow to meet new demands so that it remains as relevant and powerful in the future as it is today. In addition, the system should have the flexibility to meet these demands without disrupting the free flow of decision-support intelligence to the individuals and applications driving the business.

IBM PureData System for Operational Analytics
The IBM PureData System for Operational Analytics helps organizations meet these complex requirements with an expert integrated data system that is designed and optimized specifically for the demands of an operational analytics workload.

Built on IBM POWER Systems servers with IBM System Storage and powered by IBM DB2 software, the system is a complete solution for operational analytics that provides both the simplicity of an appliance and the flexibility of a custom solution. The system has recently been refreshed with the latest technology to help customers make faster, fact-based decisions, and now offers:

  • Accelerated performance, with the help of new, more powerful servers that leverage POWER8 technology and improved tiered storage, which uses spinning disks for ‘cool’ data and IBM FlashSystem™ storage for ‘hot’, frequently accessed data.
  • Enhanced scalability that allows the system to grow to peta-scale capacity. In addition, nodes of the refreshed system can be added to the previous generation of PureData System for Operational Analytics, providing better protection for your technology investment.
  • A reduced data center footprint as a result of increased hardware density.

So explore the benefits and use cases of PureData System for Operational Analytics by visiting our website, ibm.com/software/data/puredata/operationalanalytics, and by connecting with IBM experts.

About Rahul Agarwal

Rahul Agarwal is a member of the worldwide product marketing team at IBM that focuses on data warehouse and database technology. Before joining IBM, Rahul held a variety of business management, product marketing and other roles at companies including HCL Technologies and HP. Rahul studied at the Indian Institute of Management, Kozhikode and holds a bachelor of engineering (electronics) degree from the University of Pune, India. Rahul’s Twitter handle: @rahulag80


 

Why Simple, by Design, is Still Better

By Rich Hughes

“Simplicity is the ultimate sophistication” is the succinct design dictum of perhaps the greatest designer of all time, Leonardo da Vinci. A proficient aviation engineer took the simplicity standard to another level by coining the phrase “Keep it simple, stupid”. Lockheed engineer Kelly Johnson invented the KISS principle because his project responsibilities demanded ease of use in flight operations: his 1950s-era aircraft, once created, had to be maintained by mechanics of average skill, using an elementary tool kit, while working under combat conditions. In the 1970s, Johnson’s simplicity-driven aircraft engineering methodology inspired programmers and became a leading design philosophy for the emerging software industry.

In a May 2014 article entitled Simple is STILL Better, Mike Kearney describes data warehouse implementation using the KISS software design philosophy. Kearney draws a distinction between the first generation of data warehouses, built on general-purpose relational database management systems (RDBMS), and the second-generation data warehouse, most prominently represented by the PureData System for Analytics family.

The earlier RDBMS platforms started around 1980, while the latter systems, powered by Netezza technology, came to the market in the early 2000s. Data warehouses built on 1980s-era, general-purpose RDBMS have historically suffered from not being able to easily handle the heavy I/O demanded by users accessing large data volumes. Conversely, the PureData System for Analytics was specifically designed to overcome the I/O problem, provide data warehouse users with fast access, and deliver the results in a simple-to-use platform.

Think of an NFL football stadium built long ago and later refitted for the modern game. Chicago’s Soldier Field was erected in 1924 at a cost of $14,000,000 and seated about 74,000 fans. Soldier Field’s exterior remained while the interior was gutted during extensive renovations that brought the stadium up to NFL standards, but reduced the seating to 63,000. The team was forced to find a temporary home during the 2001-03 renovation, and Bears fans travelled the 140 miles to Champaign, IL, for the 2002 Chicago ‘home’ games.

First-generation RDBMS data warehouses need extensive renovation, not unlike Soldier Field, to adapt to data warehouse demands. The PureData System for Analytics, built from the ground up, resembles the new, fan-friendly home of the San Francisco 49ers. Both Levi’s Stadium and the PureData System for Analytics were purpose-built around the design principles of easy user access and manageable operations.

Kearney’s article, which can be viewed at this link, highlights speed delivered with simplicity as the combined reason for the success of data warehouses powered by purpose-built Netezza technology. The burden for general-purpose platforms refitted for data warehousing is the not-so-hidden complexity that drives up ownership costs and reduces business value. These necessary but non-value-adding administrative expenses limit the potential of data warehouses based on earlier technologies. On the other hand, the “… IBM PureData System shields DBAs from the complications of data management and business users enjoy immediate access to their data as soon as their new system is installed”.

Several success stories validate the inherent advantages of speed delivered with simplicity:

“With the IBM PureData System for Analytics, we can reduce the time to analyze complex GIS data from days to minutes — a more than 98 percent improvement.” 

  • Steve Trammell, Strategic Alliances Marketing Manager, Corporate Alliances and IT Marketing at Esri

“Premier Healthcare Alliance has benefited from PureData System for Analytics in four ways. Its simpler administration, faster query response time, faster load times, as well as in database analytics”

  • Todd Wilkes, Vice President for Enterprise Solution Development for the Premier Healthcare Alliance.

 

There might be nostalgic appeal in watching football in historic but retrofitted venues, just as there is inertia in standing pat with first-generation data warehouse technology. But for the best business payback and time-to-value, the simply fast PureData System for Analytics is designed for your success.

About Rich Hughes

Rich Hughes has worked in a variety of Information Technology, Data Warehousing, and Big Data jobs, and has been with IBM since 2004. Hughes earned a Bachelor’s degree from Kansas University and a Master’s degree in Computer Science from Kansas State University. Writing about the original Dream Team, Hughes authored a book on the 1936 US Olympic basketball team, a squad composed of oil refinery laborers and film industry stage hands. Hughes is an IBM Marketing Program Manager for Data Warehousing, and you can follow him on Twitter at @rhughes134.

IBM Insight 2014: Data Warehousing and Analytics in the Era of Big Data

By Wendy Lucas

Rapidly evolving technology, demanding business requirements and a world where insight drives competitive advantage are continuing to change the definition of the traditional data warehouse.  Where does today’s data warehouse fit into the information management landscape?  Is your data warehouse evolving to fit the needs of big data and analytics?

We all know the world is changing.  Technology allows us to consume more data and generate new insight.  As technology allows IT to deliver more to an organization, businesses need to quickly generate insight from information to accelerate informed decision-making and meet new user demands for mobility and self-service.  Rushing to meet these demands can easily generate complexity within traditional data warehouse architectures.   Is your architecture getting too complex?

Yet even as IT solutions grow in complexity, organizations are still challenged to add more capability and answer the following questions:

  • How should my architecture change to handle big data?
  • How do I deliver faster insight for my users?
  • How do I simplify operations so my resources can spend more time innovating?
  • How do I add new data or capability without impacting performance?

Meeting these challenges requires solutions that provide simplicity in terms of implementation and administration, and seamlessly integrate with other information sources to deliver on the value in big data analytics.   And let’s not forget performance.  Great performance is a given.  It’s expected.  Solutions must have the unique combination of speed and simplicity to deliver business results and provide a platform that is agile and flexible to adapt to rapidly changing business needs. 

IBM’s Data Warehousing Solutions

IBM solutions for data warehousing are developed to simplify and accelerate the delivery of insights for business analytics, helping you meet the challenges of data warehousing.   It is IBM’s point of view that the data warehouse plays a fundamental role in providing the data foundation for analytics, but that the data warehouse must be modernized to take advantage of new data sources and systems of engagement.   We define the modern data warehouse as the means by which organizations can intelligently store and quickly access data, and deliver data to where it provides the most value.

“Through 2020, over 90% of big data implementations will augment, not replace, existing data warehouses.”[1]

Data Warehouse modernization builds on your existing foundation.  It does not require a rip and replace.  We see clients modernize their environments by adding new data sources or capacity, increasing usage or analytic capability, accelerating performance for faster time to insight, and by exploiting new technology innovations such as Hadoop, in-memory and columnar technology to boost performance of analytics.  IBM offers solutions to meet each of these entry points to data warehouse modernization. 

Data Warehousing and Analytics at Insight 2014

The Data Warehousing and Analytics track at the IBM Insight 2014 conference is dedicated to helping you turn these information challenges into opportunities.   More than 40 Elective Sessions, 8 hands-on Labs and Expert Exchanges will feature key topics such as:

  • Data Warehouse architecture and modernization
  • Data Warehouse best practices
  • Integrating data warehouse and Hadoop
  • Accelerating time-to-value and speed to insight
  • Client sessions with experiences from various industries including consumer products, financial services, healthcare, insurance and retail
  • Product sessions featuring DB2 with BLU Acceleration, IBM PureData System for Analytics (powered by Netezza), IBM DB2 Analytics Accelerator, and Informix Warehouse

Sunday, October 26 is a full day dedicated to PureData-Netezza enthusiasts: technical tips and best practices for the PureData System for Analytics, powered by Netezza technology. Start off your week with this comprehensive, immersive trip through Enzee Universe, followed by a cocktail reception. Register today and add these two sessions to your schedule:

  • IWS-6951 Enzee Universe Part 1 – Technical Sessions & Best Practices
  • IWS-6952 Enzee Universe Part 2 – Business and Product Strategy

Be sure to catch these spotlight/keynote sessions:

  • IWM-4857 The State of Data Warehousing in the Big Data World
  • IWM-4859 Driving Analytics with Common Architectural Patterns

And here are just a few examples of other informative electives:

  • IWS-4681 IBM DB2 Analytics Accelerator:  trends and directions
  • IWA-5303 Designing an integrated Big Data and DW landscape with IBM industry models
  • IWS-5338 Why University of Toronto is loving BLU – faster, smaller and simpler in only a few hours
  • IWS-5571 Case study with Dick’s Sporting Goods on Oracle to IBM PureData System for Analytics migration
  • IWM-6069 Comparing the Total Cost of Ownership of PureData for Analytics to its competitors
  • IWS-6295 DB2 with BLU Acceleration: the deep dive on DB2’s new super fast, super easy engine for analytics
  • IWS-6326 What’s new with IBM PureData System for Analytics, powered by Netezza

… and many others. In addition to elective sessions, there are Expo Hall exhibits, demos and meetings with subject matter experts. Visit the Insight website for more information, and then build your agenda using the Agenda Preview Tool.

[1] “The State of Data Warehousing in 2014”, June 19, 2014, Gartner.

About Wendy Lucas

Wendy Lucas is a Program Director for IBM Data Warehouse Marketing. Wendy has over 20 years of experience in data warehousing and business intelligence solutions, including 12 years at IBM. She has helped clients in a variety of roles, including application development, management consulting, project management, technical sales management and marketing. Wendy holds a Bachelor of Science in Computer Science from Capital University and you can follow her on Twitter at @wlucas001.

Modernizing the Data Warehouse Client Experience

By Nancy Hensley

The world of Data Warehousing continues to evolve. Not only are users clamoring for analytics that run as fast as you can think, but they also want better consumability and improved accessibility. Basically, consumers of the data warehouse want superfast analytics that are accessible, on demand, and highly simplified.

Ten years ago we all would have fallen off our chairs at that request. Data warehouses were big, complex and expensive. We were so busy managing them that it was tough to keep the consumers of the warehouse happy. The challenges of the traditional warehouse stemmed from time to value, the ability to expand analytic services, speed and cost. Because of those challenges, most organizations limited data warehouse services to business analysts, data scientists and a few savvy departments. But that wasn’t the goal when we started the journey of data warehousing.

The idea of analytics for everyone was always the goal of data warehousing; however, complexity slowed us down dramatically. The operational side of our clients needed real-time performance with very low latency, while many business units wanted complex reporting and analysis. The combination of the two workloads made the architecture complex, and made both service levels and performance difficult to manage.

Time to value for data warehousing got a significant boost with the introduction of the appliance, and we were able to deliver services in days as opposed to months. Complex reporting and in-database performance improved, and in-memory technology allowed for analysis as fast as you can think. All that said, while the data warehouse appliance continues to provide great value and performance, the demand for analytics continues to grow at a rate that IT staff and budgets usually cannot keep up with. The real value of the data warehouse is only realized when you can deliver business intelligence more pervasively across your business and to all touchpoints in your operations. The traditional data warehouse model is challenged in its ability to scale to support this goal.

Self-service has also become an increasingly important requirement: the ability to seize the moment and do what-if analysis, examine the competitive landscape, or understand brand sentiment in real time. These short-term requests are difficult to accommodate in the on-premises world, where hardware needs to be procured and the performance of current analytics cannot be compromised to meet the short-term but important needs of self-service.

Enter the Cloud in the world of Data Warehousing

As Data Warehousing moves to the cloud, we see yet another evolution and a step towards simplifying the delivery of business analytics for our clients. IBM has been focused on capturing the enterprise data warehouse experience in the cloud by combining the best of our on-premises technology in a highly consumable cloud offering supporting both cloud and hybrid ground-to-cloud deployments. Simple, powerful, agile and, yes, secure!

Now, not only does self-service become possible, it becomes simple. Analytic portability between on-premises and cloud environments is now possible, so you can develop and analyze in the cloud and move on premises, or the reverse. And you do not need to sacrifice performance for simplicity, because IBM’s solution allows you to leverage in-database capabilities with the power of in-memory in a secure, agile solution.

Sound too good to be true?

Attend the Data Warehouse Spotlight to hear about these shifts in data warehouse services in the cloud, along with the latest improvements to our on-premises solutions. The session will cover our latest Data Warehousing Cloud solutions, data warehouse appliances and super-fast in-memory technology.

Join us for the Insight session #4857 and find out how IBM is changing the client experience for data warehouse services. We will cover exciting new announcements.

More details about the session:

About Nancy

Nancy Hensley has been in the data warehousing and BI industry for over 19 years. Nancy worked in the early days of enterprise data warehousing, spatial warehousing and executive reporting as a customer in a Fortune 50 company and joined IBM in 1999. In 2004, Nancy led the team that brought the first IBM data warehouse appliance to market. From her position leading the data warehouse architect team in the field, Nancy moved into the development organization focusing on data warehouse solutions and database technology. Today Nancy works in product marketing and strategy for IBM data warehouse solutions. You can follow Nancy on Twitter @nancykoppdw.

The Relative Merits of IBM and Teradata Data Warehouses

By Rich Hughes

A recent report concludes that IBM® PureData™ System for Analytics (PDA) has distinct advantages over a comparable Teradata Data Warehouse appliance. The International Technology Group (ITG) determined the three-year cost of ownership of Teradata’s 2750 to be 150% higher than that of IBM’s PDA N200x. ITG surveyed 17 Teradata and 21 IBM customers, and matched their production data warehouses by workload, system size, and industry. As an example, both Teradata and IBM customers in this study were categorized into Telecom, Digital Media, Financial Services, and Retail industry sectors. ITG then further built their survey profiles based on data volumes, number of users, and workload similarities within each industry category, thus promoting reasonable comparisons based on real-world observations.

Starting with bottom-line cost of ownership, the initial system expenditures for each customer industry profile were in the same ballpark. That means that IBM PDA’s lower costs were found in deployment, support, and personnel over the systems’ lifetime. In the application deployment area, IBM’s PDA delivered the lion’s share of applications in 20 days or less. As the chart below shows, and by stark contrast, none of the Teradata customers deployed as quickly, with the average Teradata deployment taking more than three months. [1]

[Chart: application deployment times, IBM PDA versus Teradata customers]

Getting an analytics application deployed months earlier speaks to time-to-value and attaining more business objectives. The simpler-to-use IBM® PureData™ System for Analytics appliance proved less expensive to operate because fewer full-time employees are needed to manage the system. The Teradata Data Warehouse appliance requires more administration training to understand how to achieve performance goals. In the ITG study, the 17 Teradata organizations averaged 1.7 full-time employees to oversee the Teradata database, as compared to less than 0.5 full-time employees, on average, to tend to IBM’s PDA. The primary reason for this significant gap is that IBM PDA end users are able to directly access their analytic appliance as a self-service data warehouse.

One of the ITG study participants using both Teradata and IBM’s PDA summed up the differences this way: “…we don’t have to build indexes… users write directly to the system, they don’t need to go through a DBA…we work with complete data sets instead of having everything aggregated and summarized first…we don’t have to use data models. In comparison with Teradata systems, performance-tuning overhead was said to be virtually non-existent.” [1]

ITG determined that IBM customers realized additional benefits versus Teradata because of the ease with which end users deploy applications, creating “… the potential for closer business alignment than conventional data warehouse techniques.” [1] In a world where vendors make significant claims about performance and total cost, it is hard to argue with real customer experience. To learn more about ITG’s findings, the full version of the report can be found at http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=WH&infotype=SA&appname=SWGE_WA_UZ_USEN&htmlfid=WAL12377USEN&attachment=WAL12377USEN.PDF#loaded

About Rich Hughes

Rich Hughes has worked in a variety of Information Technology, Data Warehousing, and Big Data jobs, and has been with IBM since 2004. Hughes earned a Bachelor’s degree from Kansas University and a Master’s degree in Computer Science from Kansas State University. Writing about the original Dream Team, Hughes authored a book on the 1936 US Olympic basketball team, a squad composed of oil refinery laborers and film industry stage hands. Hughes is an IBM Marketing Program Manager for Data Warehousing, and you can follow him on Twitter at @rhughes134.

Big Data: Playing a Zone Offense

By Dennis Duckworth

Many companies these days find themselves on the defensive when it comes to Big Data. Some don’t really know what Big Data is. Others have an understanding of what it is but they aren’t sure whether they have a Big Data problem. Still others know what it is, and have identified a Big Data problem to deal with, but they just don’t know how.

Folks in IT are especially likely to be put on the defensive about Big Data because the term gets thrown at them in new “requests” – maybe from the VP of Sales, who just read an article about it in the New York Times and is afraid that his competitors might already be “doing it” and gaining an advantage, or maybe from the CMO, who wants a 360° view of his customers and just heard in a vendor webinar that Big Data is the way to get it.

As they say, sometimes the best defense is a good offense. IBM has found that an effective way to deal with Big Data proactively is to divide the environment for handling it, along with the corresponding analytics, into different pieces or zones based on the characteristics of the data and the analytics needed. One version of an analytics zone architecture is shown in the diagram below, with the main zones being:

  • Real-time data processing and analytics zone
  • Operational data zone
  • Landing, Exploration, and Archive data zone
  • Enterprise data warehouse and Data mart zone
  • Deep analytics data zone

[Diagram: IBM Big Data and Analytics zone architecture]

When you divide your analytics environment this way, you can break your huge Big Data problem into more manageable chunks: you can see the individual trees instead of the huge forest, and you can prune just those trees that need attention rather than attempting to clear-cut the entire forest.

As an example, in response to the request from the CMO asking for a more complete view of your company’s customers, you may want to add the ability to analyze unstructured data that you are now getting from Twitter or Facebook. You don’t necessarily need to mess with your Enterprise data warehouse or your tactical data marts – they may be working just fine, continuing to process all your highly valuable structured data with proper enterprise-class integration, data governance and security, supporting the tens/hundreds/thousands of BI reports your organization needs every day/week/month in order to run smoothly, just as they have for the past few years. But you may want to consider adding a Landing, Exploration, and Archive data zone, or adding to it if you already have one. There is some very useful information in those social media feeds, but there is also a lot of junk – you wouldn’t want to convert all of that incoming social data into structured data and put it into your Enterprise data warehouse. Rather, you would likely want to put it into a landing area where you can explore it and uncover the valuable nuggets that you then might extract into a data warehouse.
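
As a sketch of that refinement step (every table and column name here is hypothetical), the landing zone keeps the raw feed while only the distilled signal moves into the warehouse:

    -- Aggregate raw social mentions in the landing zone and load only
    -- the per-customer weekly summary into the warehouse.
    INSERT INTO dw.customer_sentiment
           (customer_id, week_start, avg_sentiment, mention_count)
    SELECT customer_id,
           week_start,
           AVG(sentiment_score),
           COUNT(*)
      FROM landing.social_mentions
     WHERE sentiment_score IS NOT NULL
     GROUP BY customer_id, week_start;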

Or maybe you are in the manufacturing business and would like to implement a more proactive servicing system for some of the heavy machinery in your plant. Right now, it is policy to shut down the machines and service them on a regular schedule, but you’ve noticed that some of the machines are fine and could go many more weeks without service, while others fail before the scheduled service because they needed more immediate attention; both cases result in lost productivity. You could implement a system that reads the data coming off the many sensors on and in the machines and runs predictive models to alert you when a machine is likely to need maintenance. Putting that data into a data warehouse and running analytics on it there might work, but you likely don’t want to store all that repetitive sensor data, particularly when the data is within normal specifications and ranges. Rather, your needs might be better served by adding a Real-time data processing and analytics zone to process the data as it flows from the sensors, rather than landing it first and then deciding what to do.
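
The filtering logic itself is simple; a real implementation would apply it in a stream-processing engine as the readings arrive, but the hypothetical SQL below (all names invented for illustration) shows the idea of persisting only what is worth acting on:

    -- Keep only readings outside each sensor's normal operating range;
    -- repetitive in-spec readings are discarded instead of stored.
    INSERT INTO dw.sensor_alerts (machine_id, sensor_id, reading_ts, value)
    SELECT r.machine_id, r.sensor_id, r.reading_ts, r.value
      FROM landing.sensor_readings r
      JOIN ref.sensor_limits l
        ON l.sensor_id = r.sensor_id
     WHERE r.value NOT BETWEEN l.min_ok AND l.max_ok;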

Go on the offensive with Big Data. IBM helps customers with situations like this every day — we have a comprehensive set of products in our Big Data & Analytics portfolio to help address needs in all of these analytics zones and I invite you to explore them further here: http://www-01.ibm.com/software/data/bigdata/

And for a cool poster of IBM’s Big Data and Analytics “Zone” architecture, you can go to: http://public.dhe.ibm.com/software/data/sw-library/bda/zone/lib/pdf/28433_ArchPoster_Wht_Mar_2014_v4.pdf

About Dennis Duckworth

Dennis Duckworth, Program Director of Product Marketing for Data Management & Data Warehousing, has been in the data game for quite a while, doing everything from Lisp programming in artificial intelligence to managing a sales territory for a database company. He has a passion for helping companies and people get real value out of cool technology. Dennis came to IBM through its acquisition of Netezza, where he was Director of Competitive and Market Intelligence. He holds a degree in Electrical Engineering from Stanford University but has spent most of his life on the East Coast. When not working, Dennis enjoys sailing off his backyard on Buzzards Bay, and he is relentless in his pursuit of wine enlightenment.