Why Are Customers Architecting Hybrid Data Warehouses?

By Mona Patel

As a leader in IT, you may be incentivized or mandated to explore cloud and big data solutions that transform rigid data warehousing environments into agile ones matching how the business really wants to operate. The following questions probably come to mind:

  • How do I integrate new analytic capabilities and data sets into my current on-premises data warehouse environment?
  • How do I deliver self-service solutions to accelerate the analytic process?
  • How do I leverage commodity hardware to lower costs?

For these questions, and more, organizations are architecting hybrid data warehouses. In fact, organizations moving toward hybrid are referred to as ‘Best In Class’ in The Aberdeen Group’s latest research: “Best In Class focus on hybridity, both in their data infrastructure and with their analytical tools as well. Given the substantial investments companies have made in their IT environment, a hybrid approach allows them to utilize these investments to the best of their ability while [exploring] more flexible and scalable cloud-based solutions as well.” To hear more about these ‘Best In Class’ organizations, watch the 45-minute webcast.

How do you get to this hybrid data warehouse architecture with the least risk and most reward? IBM dashDB delivers the most flexible cloud database services to extend and integrate with your current analytics and data warehouse environment, addressing the challenges of leveraging new sources of customer, product, and operational insights to build new applications, products, and business models.

To help our clients evaluate hybrid data warehouse solutions, Harvard Research Group (HRG) provides an assessment of IBM dashDB. In this paper, HRG highlights product functionality, as well as three use cases in Healthcare, Oil and Gas, and Financial Services. Security, Performance, High Availability, In-Database Analytics, and more are covered in the paper to ensure that future architecture enhancements optimize IT rather than add new skills, complexities, and integration costs. After reading this paper, you will find that dashDB enables IT to respond rapidly to the needs of the business, keep systems running smoothly, and achieve faster ROI.

To learn more about dashDB, check out the video below:


About Mona,

Mona Patel is currently the Portfolio Marketing Manager for IBM dashDB, the future of data warehousing. With over 20 years of analyzing data at The Department of Water and Power, Air Touch Communications, Oracle, and MicroStrategy, Mona decided to grow her career at IBM, a leader in data warehousing and analytics. Mona received her Bachelor of Science degree in Electrical Engineering from UCLA.


Start Small and Move Fast: The Hybrid Data Warehouse

by Mona Patel

In the world of cutting-edge big data analytics, the same obstacles to gaining meaningful insight still exist: ease of getting data in and getting data out. To address these long-standing issues, the utmost flexibility is needed, especially when layered with the agile needs of the business.

Why spend millions of dollars replacing your data and analytics environment with the latest technology promise to address these issues, when you can leverage existing investments, resources, and skills to achieve the same, and sometimes better, insight?

Consider a hybrid data warehouse.  This approach allows you to start small and move fast. It provides the best of both worlds – flexibility and agility without breaking the bank.  You can RAPIDLY serve up quality data managed by your data warehouse, blended with newer data sources and data types in the cloud, and apply integrated analytics such as Spark or R – all without additional IT resources and expertise.  How is this possible?  IBM dashDB.

Read Aberdeen’s latest report on The Hybrid Data Warehouse.


Watch Aberdeen Group’s Webcast on The Hybrid Data Warehouse.

Let me give you an example.  We live in a digital world, with organizations now very interested in improving customer data capture across mobile, web, IoT, social media, and more for newer insights.  A telecommunications client was facing heavy competition and wanted to quickly deliver unique mobile services for an upcoming event in order to acquire new customers by collecting and analyzing mobile and social media data.  Taking a hybrid data warehouse approach, the client was able to start small and move fast, uncovering new mobile service options.

Customer information generated from these newer data sources was blended with existing customer data managed in the data warehouse to deliver newer insights. IBM dashDB provided a high-performing, public cloud data warehouse service that was up and running in minutes. Automatic transformation of unstructured geospatial data into structured data, in-memory columnar processing, in-database geospatial analytics, integration with Tableau, and pricing were some of the key reasons IBM dashDB was chosen.

This brings me back to my first point – you don’t have to spend millions of dollars to capitalize on getting data in and getting data out. For example, clients like the one described above took advantage of Cloudant JSON document store integration, enabling them to rapidly get data into IBM dashDB with ease, with no ETL processing required. Automatic schema discovery loads and replicates unstructured JSON documents that capture IoT, web, and mobile-based data into a structured format. Getting data or information out was simple, as IBM dashDB provides in-database analytics and the use of familiar, integrated SQL-based tools such as Cognos, Watson Analytics, Tableau, and MicroStrategy. I can only conclude that IBM dashDB is a great example of how a highly compatible cloud database can extend or modernize your on-premises data warehouse into a hybrid one to meet time-sensitive business initiatives.
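To make the idea of schema discovery concrete, here is a minimal Python sketch of how nested JSON documents can be flattened into rows with a shared set of columns. It is an illustration only, not the mechanism dashDB or Cloudant actually uses: the function names (`flatten`, `discover_schema`), the underscore naming rule, and the sample documents are all assumptions made for this example.

```python
import json


def flatten(doc, prefix=""):
    """Flatten a nested dict into {column_name: value}, joining keys with '_'."""
    row = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested objects become additional columns (hypothetical naming rule).
            row.update(flatten(value, prefix=f"{name}_"))
        elif isinstance(value, list):
            # Lists are kept as JSON text so each row stays a flat record.
            row[name] = json.dumps(value)
        else:
            row[name] = value
    return row


def discover_schema(docs):
    """Return (columns, table): the union of discovered columns and one tuple per document."""
    rows = [flatten(d) for d in docs]
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    # Documents that lack a column get None, much like NULL in a relational table.
    table = [tuple(row.get(c) for c in columns) for row in rows]
    return columns, table


# Hypothetical mobile and social events, invented for illustration.
docs = [
    {"device": {"id": "m-17", "os": "android"}, "event": "checkin", "lat": 34.05, "lon": -118.24},
    {"device": {"id": "m-42"}, "event": "post", "tags": ["concert", "vip"]},
]
columns, table = discover_schema(docs)
print(columns)
for record in table:
    print(record)
```

Once documents are in this tabular shape, they can be loaded into ordinary tables and queried with SQL alongside existing warehouse data, which is the point the paragraph above makes.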

What exactly is a hybrid data warehouse?  A hybrid data warehouse introduces technologies that extend the traditional data warehouse to provide key functionality required to meet new combinations of data, analytics and location, while addressing the following IT challenges:

  • Deliver new analytic services and data sets to meet time-sensitive business initiatives
  • Manage escalating costs due to massive growth in new data sources, analytic capabilities, and users
  • Achieve data warehouse elasticity and agility for ALL business data


Still not convinced of the power of a hybrid data warehouse? Hear what Aberdeen Group’s expert Michael Lock has to say in this 30-minute webcast.


IBM Fluid Query: Extending Insights Across More Data Stores

by Rich Hughes

Since its announcement in March 2015, IBM Fluid Query has opened the door to better business insights for IBM PureData System for Analytics clients. Our clients have wanted and needed access across a wide variety of data stores, including Apache Hadoop, whose unstructured stores are one of the key reasons for the massive growth in data volumes. There is also valuable data in other types of stores, including relational databases that are often “systems of record” and “systems of insight”. Plus, Apache Spark is entering the picture as an up-and-coming engine for real-time analytics and machine learning.

IBM is pleased to announce IBM Fluid Query 1.5 to provide seamless integration with these additional data stores—making it even easier to get deeper insights from even more data.

IBM Fluid Query 1.5 – What is it?

IBM Fluid Query 1.5 provides access to data in other data stores from IBM PureData System for Analytics appliances. Starting with Fluid Query 1.0, users were able to query and quickly move data between Hadoop and IBM PureData System for Analytics appliances. This capability covered IBM BigInsights for Apache Hadoop, Cloudera, and Hortonworks.

Now with Fluid Query 1.5, we add the ability to reach into even more data stores including Spark and such popular relational database management systems as:

  • DB2 for Linux, UNIX and Windows
  • dashDB
  • PureData System for Operational Analytics
  • Oracle Database
  • Other PureData System for Analytics implementations

Fluid Query is able to direct queries from PureData System for Analytics database tables to all of these different data sources and get just the results back—thus creating a powerful analytic capability.

IBM Fluid Query Benefits

IBM Fluid Query offers two key benefits. First, it makes practical use of data stores and lets users access them with their existing SQL skills. Workbench tools yield productivity gains as SQL remains the query language of choice when PureData System for Analytics and Hadoop schemas logically merge. IBM Fluid Query provides the physical bridge over which a query is pushed efficiently to where the data needed for that query resides, whether in the same data warehouse, another data warehouse, a relational or transactional database, Hadoop, or Spark.

Second, IBM Fluid Query enables archiving and capacity management on PureData-based data warehouses. With Fluid Query, users gain:

  • better exploitation of Hadoop as a “Day 0” archive that is queryable with conventional SQL;
  • capabilities to make use of data in a Spark in-memory analytics engine;
  • the ability to easily combine hot data from PureData with colder data from Hadoop;
  • data warehouse resource management with the ability to archive colder data from PureData to Hadoop to relieve resources on the data warehouse.

Managing your share of Big Data Growth

The design point for Fluid Query is that the query is moved to the data instead of bringing massive data volumes to the query. This is a best-of-breed approach where tasks are performed on the platform best suited for that workload.
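The "move the query to the data" idea can be sketched in a few lines of Python. This is not Fluid Query itself or its API: two in-memory SQLite databases stand in for a warehouse holding hot data and an archive holding colder data, and the table name, sample rows, and `federated_totals` helper are assumptions made for illustration. The same aggregate is pushed to each store, and only the small per-region results travel back to be combined.

```python
import sqlite3


def make_store(rows):
    """Create an in-memory database standing in for one data store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


# Hypothetical hot data (warehouse) and colder data (archive).
warehouse = make_store([("east", 2015, 120.0), ("west", 2015, 80.0), ("east", 2015, 30.0)])
archive = make_store([("east", 2012, 50.0), ("west", 2013, 70.0), ("west", 2012, 10.0)])

# The aggregate runs inside each store; raw rows never leave it.
PUSHED_QUERY = "SELECT region, SUM(amount) FROM sales GROUP BY region"


def federated_totals(stores):
    """Push the same query to every store and merge the per-region subtotals."""
    totals = {}
    for store in stores:
        for region, subtotal in store.execute(PUSHED_QUERY):
            totals[region] = totals.get(region, 0.0) + subtotal
    return totals


totals = federated_totals([warehouse, archive])
print(totals)
```

In this toy case each store returns two rows instead of three, which is trivial, but the same pattern is what keeps data movement small when the stores hold billions of rows.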

For example, use the PureData System for Analytics data warehouse for production quality analytics where performance is critical to the success of your business, while simultaneously using Hadoop or Spark to discover the inherent value of those full-volume data sources. Or, create powerful analytic combinations across data in other operational systems or analytics warehouses with PureData stores without having to move and integrate data before analyzing it.

IBM Fluid Query 1.5 is now generally available as a software addition for PureData System for Analytics clients. If you want to understand how to take advantage of IBM® Fluid Query 1.5, check out these resources:

About Rich,

Rich Hughes is an IBM Marketing Program Manager for Data Warehousing. Hughes has worked in a variety of Information Technology, Data Warehousing, and Big Data jobs, and has been with IBM since 2004. Hughes earned a Bachelor’s degree from Kansas University, and a Master’s degree in Computer Science from Kansas State University. Writing about the original Dream Team, Hughes authored a book on the 1936 US Olympic basketball team, a squad composed of oil refinery laborers and film industry stage hands. You can follow him on Twitter: @rhughes134

dashDB is Here, and This Changes Everything for Data Warehousing on the Cloud

By Nancy Hensley

I remember the days of actually selling the idea of a data warehouse to organizations, trying to convince them to leverage data to provide some insight to their business. Back then, it was a nice-to-have and we would celebrate every new customer in the “terabyte club”.

Now, I can’t help but laugh at that, because the little backup device on my desk is a terabyte. The fact is, things have changed. We have generated more data in the last few years than ever before and this data can be gold for our businesses. Gone are the days of convincing organizations that leveraging data is important; today it’s critical.

Data warehousing has seen a lot of disruption over the last decade, but we did not get to the intersection of knowledge and opportunity fast enough. The architecture got too complex, we spent all our time managing performance, and our businesses were losing the race.

Yes, it’s a race. We have to spot trends faster, capitalize on opportunities before the competition, grow faster, optimize more, offer more services, be easy to do business with and be the best in class. To get there, you need analytics and you need them faster than ever before. Sometimes you just need the data warehouse infrastructure out of your way.

Something had to change to support the new business climate.

The good news is that something has changed. IBM has announced our latest disruption to change data warehousing for the better...

Meet dashDB

dashDB is a data warehousing and analytics-as-a-service offering on the cloud. dashDB offers robust analytics at incredibly fast processing speeds, all in a cloud-easy format that lets you load and analyze data extremely quickly.

Yup, it’s THAT EASY to win the race.

At the intersection of cloud technologies and analytics, dashDB represents the sweet spot for both IT and line of business professionals looking for a competitive edge in the data:

  • For IT professionals, dashDB helps quickly deliver solutions that the business needs without having to spend time managing the infrastructure to serve a new request. dashDB works as part of the data warehousing strategy, no matter your starting point, from extending your on-premises warehouse to starting something completely new.
  • For line of business professionals, dashDB offers something you really need…self-service. That’s right, be the master of your own data kingdom. dashDB lets you load your data and get started with analytics in a couple of hours, taking the infrastructure out of your way. Yup, that’s right, out of your way. Imagine the possibilities! Got an idea? Go get it. Want a deeper understanding of your customers? No problem! Want to better predict challenges before they happen? We give you the keys to your crystal ball with best in class predictive capabilities.
  • For data science professionals, you can load data and work with queries, models and other analytics techniques without the hassle of CAPEX and dependencies on staff. dashDB includes R support, comes with an ecosystem of partners who provide specialized analytics capabilities and is Watson Analytics ready.

In short, it is pretty cool. So what is the technology behind the scenes that makes these fast, easy analytics possible? dashDB combines robust in-database analytic processing with in-memory technology that delivers a performance boost and the enterprise-class SoftLayer cloud infrastructure. dashDB is available on the Bluemix platform to work with a wide variety of data types, and there is a specialized version that works with Cloudant JSON data stores.

What are you waiting for? Get growing with dashDB at www.dashDB.com. You can start using dashDB on a freemium basis. Experience the future of data analytics: deep analytics, fast processing and cloud-easy.

Advanced Security in an Insecure Data World

By Rich Hughes

“Customer data may be at risk” is an all too familiar corporate acknowledgement these days, a communication event no enterprise wants to face. Target lost personal information from at least 40,000,000 customers, stolen by thieves in late 2013. This was followed and exceeded by Home Depot’s announcement last month that 56,000,000 customer bank cards used at the retailer’s 1,900 stores had been compromised. Yes, Virginia, even your recent ice cream treat transaction at Dairy Queen has found its way into hackers’ hands. Most importantly, data breaches like these disrupt the trusted bond between a retailer and its customers, and as a consequence, top and bottom line numbers are negatively impacted.

Addressing security concerns for data warehouses, the IBM® PureData™ System for Analytics N3001 was announced for General Availability on October 17, 2014. The N3001 appliance family brings advanced security to your data in this insecure world. Building on the appliance simplicity model, the N3001 stores all data on self-encrypting drives (SEDs), providing security without impacting performance. The protection provided by the SED implementation supports the industries requiring the strictest security compliance: health care, government, and the financial sectors. The system also uses strong authentication, based on the industry-standard Kerberos protocol, to prevent threats from unauthorized access.

The N3001 Self Encrypting Drive protects your data-at-rest.  Both temporary data and user data tables are encrypted, and then this security level is bolstered by a key management scheme.

How does this work? The SEDs are unlocked when the IBM® PureData™ System for Analytics N3001 ships to your data center. While SED encryption is the first security level, an Advanced Encryption Standard (AES)-compliant 256-bit key needs to be created to cover all N3001 disks, both on the host and in the Snippet Processing Unit compartments. This second security tier, the AES 256-bit key, can be initialized at any point after your data is loaded into the appliance.

The key management utility allows flexibility to update and rotate keys depending on the frequency of change dictated by your security policies.  This keyed approach is analogous to a password one uses to protect the disk data on a personal computer.  The Kerberos authentication, SED drives, and AES key management come as standard issue with the IBM® PureData™ System for Analytics N3001.
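As a rough illustration of what a rotation policy for a 256-bit key involves, the following Python sketch generates a random 32-byte key and decides when it is due for replacement. It is not the N3001 key management utility and performs no disk encryption; the record layout, function names, and the 90-day interval are all assumptions chosen for the example.

```python
import secrets
from datetime import date, timedelta

KEY_BYTES = 32  # 32 bytes = 256 bits, the key size discussed above


def new_key(created):
    """Create a hypothetical key record: random key material plus its creation date."""
    return {"key": secrets.token_bytes(KEY_BYTES), "created": created}


def rotation_due(record, today, max_age_days):
    """Return True once the key is at least max_age_days old."""
    return today - record["created"] >= timedelta(days=max_age_days)


def rotate_if_due(record, today, max_age_days):
    """Return a fresh key record when the policy requires it, else the existing one."""
    if rotation_due(record, today, max_age_days):
        return new_key(today)
    return record


# Example policy (an assumption): rotate every 90 days.
current = new_key(date(2014, 10, 17))
current = rotate_if_due(current, date(2015, 2, 1), max_age_days=90)
print(current["created"], len(current["key"]) * 8, "bits")
```

A real deployment would also have to re-protect data under the new key and store key material securely; the sketch covers only the scheduling decision that a security policy would drive.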

IBM’s InfoSphere Data Privacy for Security for Data Warehousing is a separately priced option that organizations should consider when dealing with compliance challenges. This package will enforce separation of duties, and will report incidents covering user behavior tracked by an audit trail. Additionally, a business glossary provides the organization with the ability to define and document sensitive data, along with the agreed upon access levels for the appropriate groups. Data masking and making data fields anonymous, yet viewable by privileged user groups, is also important functionality that comes with the InfoSphere Data Privacy for Security for Data Warehousing package.

The IBM® PureData™ System for Analytics N3001 features advanced security based on hardware and software improvements.  When coupled with IBM’s InfoSphere Data Privacy for Security for Data Warehousing (which monitors data going in and out of your data warehouse), you can rest assured your corporation’s sensitive information is protected from unwanted intruders.

More information on the IBM® PureData™ System for Analytics N3001 family can be viewed at this LINK. There are numerous sessions at the upcoming IBM Insight 2014 conference (October 26-30) that highlight the speed, simplicity, and security message as seen in many successful data warehouses powered by Netezza technology. The IBM® PureData™ System for Analytics N3001 is again changing the game for data warehouse appliances.

About Rich Hughes,

Rich Hughes is an IBM Marketing Program Manager for Data Warehousing. Hughes has worked in a variety of Information Technology, Data Warehousing, and Big Data jobs, and has been with IBM since 2004. Hughes earned a Bachelor’s degree from Kansas University, and a Master’s degree in Computer Science from Kansas State University. Writing about the original Dream Team, Hughes authored a book on the 1936 US Olympic basketball team, a squad composed of oil refinery laborers and film industry stage hands. You can follow him on Twitter: @rhughes134

Big things come in small packages

By Wendy Lucas

It may seem like it’s too early to talk about the holiday season.  We haven’t even hit Halloween yet, but some stores have already put out their holiday decorations and retail wares.   It’s the season of wonder for many children.  Younger children are excited by lots of gifts and the bigger the gift, the better.  As we get older, we learn to appreciate the sentiment behind the gift and quickly learn that it isn’t the quantity or size of the gift that matters.

Great value coming in a small package can be seen with IBM’s new PureData System for Analytics N3001-001, also affectionately called the “Mini Appliance.” Don’t be fooled by its name. This mini appliance packs a big punch in terms of its value for data warehousing as well as surrounding capabilities for a complete big data and business intelligence solution.

Why did IBM add a mini appliance to the family?

The value of Netezza data warehouse appliances, and later the PureData System for Analytics, has been realized by organizations for over a decade.  These appliances are designed and built with integrated hardware and software components that are very quickly deployed, put in use without tuning and used to very quickly gain insight from information.   Organizations who deal with vast amounts of information benefit from its simplicity, speed, scalability and built-in analytics.

But who are we to say what is “large” in terms of data? It’s pretty easy to see how a large multi-national bank that deals with hundreds of terabytes of information would benefit from an enterprise-class appliance to help it manage all that data and get value from that information, but what about small and mid-sized organizations? Organizations dealing with small numbers of terabytes have the same need to use analytics to drive competitive advantage in their industries. Small and mid-sized organizations may not have the same level of budget or IT resources as larger enterprises, but what if they could have access to an appliance that is purpose-built to make advanced analytics simpler, faster, more accessible and more affordable? What if they could take advantage of the same simplicity, speed and smarts?

And we shouldn’t limit our thinking to small and mid-sized organizations in the context of standalone companies. There are also departments within larger enterprises that need a simple and fast analytic solution, or IT organizations that need this same value in a test or development system. For these reasons, the Mini Appliance makes a great addition to the PureData System for Analytics family.

Introducing the PureData System for Analytics Mini Appliance

The PureData System for Analytics Mini Appliance changes the dynamics for mid-sized organizations looking to take advantage of a high performance data warehouse appliance. The Mini is a production-ready, rack-mountable appliance that can handle up to 16 TB of user data (assuming 4x compression). It has all the same benefits as its larger PureData System for Analytics siblings, with full-function Netezza Platform Software (NPS) 7.2. It is built, tested and packaged at the factory and comes ready to install in an existing data center rack. The Mini has the same ease of use, with load-and-go simplicity for customers to get up and running in hours.
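The capacity figure above depends on the compression assumption, so it is worth seeing the arithmetic. The short Python sketch below relates user data to the space it occupies after compression; the roughly 4 TB figure is only what the stated numbers imply (16 TB at 4x), not a published hardware specification, and the function names are made up for this example.

```python
def compressed_size_tb(user_data_tb, compression_ratio):
    """Space needed to hold user_data_tb of user data at the given compression ratio."""
    return user_data_tb / compression_ratio


def user_capacity_tb(compressed_space_tb, compression_ratio):
    """User data that fits in compressed_space_tb at the given compression ratio."""
    return compressed_space_tb * compression_ratio


# 16 TB of user data at the assumed 4x compression occupies about 4 TB.
implied_space = compressed_size_tb(16, 4)

# If your data only compresses 2x, the same space holds about 8 TB of user data.
capacity_at_2x = user_capacity_tb(implied_space, 2)

print(implied_space, capacity_at_2x)
```

Actual compression ratios vary with the data, so it is sensible to size against a measured ratio from a sample of your own tables rather than the 4x assumption.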

You might think that the “mini” version of anything is somehow limited in its features.  Not in this case.  The Mini Appliance provides enterprise grade, highly available, fully redundant components with integrated support including “call home” capability for automatic detection and reporting of any hardware issues.

The PureData System for Analytics N3001 product line now includes software entitlements for big data and business intelligence adding further value to the appliance within the information management ecosystem.  This means the appliance comes with extra software (at no extra cost!) that can be installed by the customer to provide business intelligence capabilities through the use of Cognos Business Intelligence software, data integration through the use of InfoSphere DataStage, Hadoop data services through InfoSphere BigInsights and real-time streaming analytics with InfoSphere Streams.  These starter kits for big data and business intelligence are also included with the Mini Appliance.  Yes, that’s right.  You get all the core value of the data warehouse appliance itself, plus the ability to integrate and load data as well as build reports and analytic applications, all in one solution.

There is so much value in these extra software entitlements, there will be more blogs written on just that topic alone, but in the meantime, read more on the PureData System for Analytics webpage or check out the solution brief describing these great software additions.

Fast value, out of the gate

Initial experience with the Mini Appliance has been great.  Here are just a few comments from partners who have used the appliance.

“Small to midsize customers who are invested or considering investing in software-based data warehouse solutions should really look at the PureData Mini Appliance. The appliance offers significant performance improvements over other software-based data warehouses, in a simple and self-contained environment, with a fraction of the operational maintenance costs.”

― Liam O’Heir, Vice President of Sales, New England Systems

“We had the new PureData System for Analytics up and running, delivering results in 24 hours.  The appliance is simple and value is recognized quickly without the need to worry about indexing and/or tuning.”

― Michael Schuckman, Director, Big Data and Analytics, Micro Strategies


“Mid-market companies have similar big data needs as large enterprises.  The PureData Mini Appliance enables ‘big company capability’ with best-in-class performance in a mid-market package.”

― John Lucas, Director of Solutions Delivery, Avnet Services

Just in time for the holidays

If a data warehouse appliance with extended capabilities for big data and business intelligence is not on your holiday wish list yet, maybe it should be?  Big things really do come in small packages.

For more information, watch this video, read the data sheet or visit ibm.com/software/data/puredata/analytics/.

About Wendy,

Wendy Lucas is a Program Director for IBM Data Warehouse Marketing. Wendy has over 20 years of experience in data warehousing and business intelligence solutions, including 12 years at IBM. She has helped clients in a variety of roles, including application development, management consulting, project management, technical sales management and marketing. Wendy holds a Bachelor of Science in Computer Science from Capital University and you can follow her on Twitter at @wlucas001

Self-Service Analytics, Data Warehousing, and Information Management in the Cloud

By James Kobielus

As we approach IBM Insight 2014, we call your attention to IBM Watson Analytics. Announced last month, Watson Analytics will go into public beta in mid-November, not long after Insight.

Whether or not you plan to attend this year’s event in Las Vegas, we invite you to participate in the upcoming public beta, which we strongly believe you’ll find transformative. What IBM has done is to reinvent and thereby democratize the business-analytics experience for the cloud era.

With Watson Analytics, which you can try for yourself at Insight, IBM has put the power of sophisticated visual, predictive and cognitive analytics directly into the hands of any user, even the least technically inclined. With a “freemium” option that will be a permanent element of the service upon full launch, you will be able to gain no-cost, on-demand, self-service access to sophisticated analytical capabilities. Marketing, sales, operations, finance and HR professionals can gain answers they need from all types of data–without needing to enlist a professional data scientist in the effort.

Watson Analytics’ built-in capabilities for advanced data management ensure that data is accessible rapidly and that large volumes of data are handled with ease, utilizing an embedded cloud data warehouse that incorporates IBM’s sophisticated DB2 with BLU Acceleration in-memory/columnar technology. In addition, embedded data refinery services enable business people, without any reliance on IT, to quickly find relevant, easily consumable raw data and transform that into relevant and actionable information.

As an added incentive for attending Insight, IBM will make further announcements that extend the value of Watson Analytics and of the sophisticated cloud data-warehousing and data-refinement services that power this supremely accessible and useful analytic experience. With this forthcoming announcement on cloud data warehousing, IBM continues to change the experience of using analytics today for our clients. We are making it easier for clients to be data-driven organizations and take advantage of new opportunities faster.

We hope to meet you at Insight!

About James, 

James Kobielus is IBM Senior Program Director, Product Marketing, Big Data Analytics solutions. He is an industry veteran, a popular speaker and social media participant, and a thought leader in big data, Hadoop, enterprise data warehousing, advanced analytics, business intelligence, data management, and next best action technologies. Follow James on Twitter: @jameskobielus