Three session guides get you started with data warehousing at IBM Insight at World of Watson

Join us October 24 to 27, 2016 in Las Vegas!

by Cindy Russell, IBM Data Warehouse marketing

IBM Insight has been the premier data management and analytics event for IBM analytics technologies, and 2016 is no exception. This year, IBM Insight is being hosted along with World of Watson and runs from October 24 to 27, 2016 at the Mandalay Bay in Las Vegas, Nevada. It includes 1,500 sessions across a range of technologies and features keynotes by IBM President and CEO, Ginni Rometty; Senior Vice President of IBM Analytics, Bob Picciano; and other IBM Analytics and industry leaders. Every year, we include a little fun as well, and this year the band is Imagine Dragons.

IBM data warehousing sessions will be available across the event as well as in the PureData System for Analytics Enzee Universe (Sunday, October 23).  Below are product-specific quick reference guides that enable you to see at a glance key sessions and activities, then plan your schedule.  Print these guides and take them with you or put the links to them on your phone for reference during the conference.

This year, the Expo floor is called the Cognitive Concourse, and we are located in the Monetizing Data section, Cognitive Cuisine experience area.  We’ll take you on a tour across our data warehousing products and will have some fun as we do it, so please stop by.  There is also a demo room where you can see live demos and engage with our technical experts, as well as a series of hands-on labs that let you experience our products directly.

The IBM Insight at World of Watson main web page is located here.  You can register and then use the agenda builder to create your personalized schedule.

IBM PureData System for Analytics session reference guide

Please find the session quick reference guide for PureData System for Analytics here:

Enzee Universe is a full day of dedicated PureData System for Analytics / Netezza sessions that is held on Sunday, October 23, 2016.  To register for Enzee Universe, select sessions 3459 and 3461 in the agenda builder tool.  This event is open to any full conference pass holder.

During the regular conference, there are also more than 35 PureData, Netezza and IBM DB2 Analytics Accelerator for z/OS (IDAA) technical sessions across all the conference tracks, as well as hands-on labs. Several sessions are being presented by IBM clients, so you can see how they put PureData System for Analytics to use. Click the link above to see the details.

IBM dashDB Family session reference guide

Please find the session quick reference guide for the dashDB family here:

There are more than 40 sessions for dashDB, including a “Meet the Family” session that will help you become familiar with new products in this family of modern data management and data warehousing tools. There is also a “Birds of a Feather” panel discussion on Hybrid Data Warehousing, and one that describes some key use cases for dashDB. You can also see a demo, take in a short theatre session or try out a hands-on lab.

IBM BigInsights, Hadoop and Spark session reference guide

Please find the session quick reference guide for BigInsights, Hadoop and Spark topics here:

There are more than 65 sessions related to IBM BigInsights, Hadoop and Spark, with several hands-on labs and theatre sessions. Topics range from an Introduction to Data Science to Using Spark for Customer Intelligence Analytics, hybrid cloud data lakes and client stories of how they use these technologies.

Overall, it is an exciting time to be in the data warehousing and analytics space.  This conference represents a great opportunity to build depth on IBM products you already use, learn new data warehousing products, and look across IBM to learn completely new ways to employ analytics—from Watson to Internet of Things and much more.  I hope to see you there.


IBM Fluid Query 1.7 is Here!

by Doug Dailey

IBM Fluid Query offers a wide range of capabilities to help your business adapt to a hybrid data architecture. More importantly, it helps you bridge “data silos” for deeper insights that leverage more data. Fluid Query is a standard entitlement included with the Netezza Platform Software suite for PureData System for Analytics (formerly Netezza). Fluid Query release 1.7 is now available, and you can learn more about its features below.

Why should you consider Fluid Query?

It offers many possible ways to solve business problems. Here are a few ideas:
• Discover and explore “Day Zero” data landing in your Hadoop environment
• Query data from multiple cross-enterprise repositories to understand relationships
• Access structured data from common sources like Oracle, SQL Server, MySQL, and PostgreSQL
• Query historical data on Hadoop via Hive, BigInsights Big SQL or Impala
• Derive relationships between data residing on Hadoop, the cloud and on-premises
• Offload colder data from PureData System for Analytics to Hadoop to free capacity
• Drive business continuity through a low-fidelity disaster recovery solution on Hadoop
• Back up your database or a subset of data to Hadoop in an immutable format
• Incrementally feed analytics side-cars residing on Hadoop with dimensional data

By far, the most prominent uses of Fluid Query for a data warehouse administrator are warehouse augmentation, capacity relief and replicating analytics side-cars for analysts and scientists.

New: Hadoop connector support for Hadoop file formats to increase flexibility

IBM Fluid Query 1.7 ushers in greater flexibility for Hadoop users with support for popular file formats typically used with HDFS. These include popular data storage formats like AVRO, Parquet, ORC and RC that are often used to manage big data in a Hadoop environment.

Choosing the best format and compression mode can result in drastic differences in performance and storage on disk. A file format that doesn’t support flexible schema evolution can result in a processing penalty when making simple changes to a table. Let’s just say that if you live in the Hadoop domain, you know exactly what I am speaking of. For instance, if you want to use AVRO, do your tools have readers and writers that are compatible? If you are using Impala, do you know that it doesn’t support ORC, or that Hortonworks and Hive-Stinger don’t play well with Parquet? Double-check your needs and tool sets before diving into these popular format types.
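
One way to make that double-check repeatable is to write the caveats down and test them against your own tool list. The following is only a minimal sketch: the two entries are the caveats mentioned in this post (they may not hold for newer releases), and the names `KNOWN_CAVEATS` and `check_format` are hypothetical helpers, not part of Fluid Query or any Hadoop distribution.

```python
# Caveats as stated in this post; verify them against current vendor documentation.
KNOWN_CAVEATS = {
    ("Impala", "ORC"): "Impala does not support ORC",
    ("Hive-Stinger", "Parquet"): "Hive-Stinger does not play well with Parquet",
}


def check_format(tools, file_format):
    """Return the recorded caveats that apply to file_format for the given tools."""
    return [
        note
        for (tool, fmt), note in KNOWN_CAVEATS.items()
        if fmt == file_format and tool in tools
    ]


# Hypothetical tool set: which caveats apply if we standardize on ORC?
issues = check_format(["Impala", "Hive-Stinger"], "ORC")
print(issues)
```

An empty list only means that no caveat has been recorded for that combination, not that the combination has been verified.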

By providing support for these popular formats,  Fluid Query allows you to import, store, and access this data through local tools and utilities on HDFS. But here is where it gets interesting in Fluid Query 1.7: you can also query data in these formats through the Hadoop connector provided with IBM Fluid Query, without any change to your SQL!

New: Robust connector templates

In addition, Fluid Query 1.7 now makes available a more robust set of connector templates that are designed to help you jump start use of Fluid Query. You may recall we provided support for a generic connector in our prior release that allows you to configure and connect to any structured data store via JDBC. We are offering pre-defined templates with the 1.7 release so you can get up and running more quickly. In cases where there are differences in user data type mapping, we also provide mapping files to simplify access.  If you have your own favorite database, you can use our generic connector, along with any of the provided templates as a basis for building a new connector for your specific needs. There are templates for Oracle, Teradata, SQL Server, MySQL, PostgreSQL, Informix, and MapR for Hive.

Again, the primary focus for Fluid Query is to deliver open data access across your ecosystem. Whether the data resides on disk, in-memory, in the Cloud or on Hadoop, we strive to enable your business to be open for data. We recognize that you are up against significant challenges in meeting demands of the business and marketplace, with one of the top priorities around access and federation.

New: Data movement advances

Moving data is not the best choice. Businesses spend quite a bit of effort ingesting data, staging the data, scrubbing, prepping and scoring the data for consumption by business users. This is a costly process. As we move closer and closer to virtualization, the goal is to move the smallest amount of data possible, while you access and query only the data you need. So not only is access paramount, but your knowledge of the data in your environment is crucial to using it efficiently.

Fluid Query does offer data movement capability through what we call Fast Data Movement. Focusing on the pipe between PDA and Hadoop, we offer a high speed transfer tool that allows you to transfer data between these two environments very efficiently and securely. You have control over the security, compression, format and where clause (DB, table, filtered data). A key benefit is our ability to transfer data in our proprietary binary format. This enables orders of magnitude performance over Sqoop, when you do have to move data.
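
To illustrate the idea of moving only filtered data, here is a minimal sketch that uses SQLite and gzip from the Python standard library as stand-ins. It is not the Fast Data Movement tool and does not use its binary format; the `sales` table, the `export_filtered` helper and the cutoff year are all hypothetical and exist only to show a where clause and compression applied before anything leaves the source system.

```python
import csv
import gzip
import io
import sqlite3


def export_filtered(conn, table, where_clause, params=()):
    """Return gzip-compressed CSV bytes for the rows of table matching where_clause."""
    # The table name and where clause are trusted, hard-coded inputs in this sketch.
    cursor = conn.execute(f"SELECT * FROM {table} WHERE {where_clause}", params)
    header = [col[0] for col in cursor.description]
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(cursor)
    return gzip.compress(buffer.getvalue().encode("utf-8"))


# Stand-in for the warehouse: an in-memory database with a small table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, sale_year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 2012, 10.0), (2, 2013, 20.0), (3, 2016, 30.0)],
)

# Move only the "cold" rows (before 2015) instead of the whole table.
payload = export_filtered(conn, "sales", "sale_year < ?", (2015,))
rows = gzip.decompress(payload).decode("utf-8").splitlines()
print(len(rows) - 1, "rows exported")
```

The point of the sketch is the order of operations: filter first, then compress, then transfer, so that the pipe only carries the data you actually need on the other side.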

Fluid Query 1.7 also offers some additional benefits:
• Kerberos support for our generic database connector
• Support for BigInsights Big SQL during import (automatically synchronizes Hive and Big SQL on import)
• Varchar and String mapping improvements
• Import of nz.fq.table parameter now supports a combination of multiple schemas and tables
• Improved date handling
• Improved validation for NPS and Hadoop environment (connectors and import/export)
• Support for BigInsights 4.1 and Cloudera 5.5.1
• A new Best Practices User Guide, plus two new Tutorials

You can download Fluid Query 1.7 from IBM’s Fix Central, or from the Netezza Developer Network as non-warranted software for use with the Netezza Emulator.


Take a test drive today!

About Doug
Doug has over 20 years of combined technical and management experience in the software industry, with emphasis in customer service and more recently product management. He is currently part of a highly motivated product management team that is both inspired by and passionate about the IBM PureData System for Analytics product portfolio.

Virtual Enzee webcast roundup for 2016

By Cindy Russell

The first Virtual Enzee webcast of 2016 is scheduled for January 29th!  I will be updating this blog during 2016 so you have a handy resource to find out what sessions are upcoming and also listen to the replays on demand.

  1. Unifying Data Access across the Logical Data Warehouse with IBM Fluid Query
    IBM Fluid Query helps bring your enterprise into focus and eliminate some of the traditional barriers that exist between disparate data in your enterprise. In this session, we’ll review some common user stories for using Fluid Query through the lens of PureData System for Analytics/Netezza and BigInsights. Register here:

  2. Tame Spatial Queries with Netezza In-Database Analytics
    Attend this Virtual Enzee to learn how Netezza supports spatial data types and queries, how it can shorten complex spatial analytic projects and how it integrates with and complements existing geospatial platforms and solutions. Register:

  3. Accelerating Open-Source R with IBM PureData System for Analytics (Netezza), January 29, 2016 at 11AM ET
    R is increasingly becoming the platform and programming language of choice for many data scientists. Learn how you can leverage Open-Source R on your IBM PureData System for Analytics/Netezza appliances! Register here:


Build skills for 2016 and Beyond: Data Warehousing and Analytics Top 10 Resources

by Cindy Russell, IBM Data Warehouse Marketing

Skills are always an essential consideration in technical careers and it is important for data warehousing professionals to expand their knowledge to handle the proliferation of data types and volumes in 2016 and beyond.

These are my “top 10” resource picks that you may want to explore. I am choosing these because of their popularity and also because they represent new technologies you may face in 2016 as you modernize your data warehouse and extend it beyond its traditional realm to meet new analytics needs.

  1. Gartner Magic Quadrant for Data Warehouse and Data Management Solutions for Analytics – I am recommending this report because it provides an overview of the trends, issues and marketplace leaders in data warehousing. It calls out the need for the Logical Data Warehouse, which is a key element of a modernization strategy. I believe the Logical Data Warehouse will be of increasing importance to your operations in the coming months. Read a summary of the report.
  2. Logical Data Warehouse – Due to the massive and rapid growth of data volumes and types, a single centralized data warehouse cannot meet all of the new needs for analytics by itself. The data warehouse now becomes part of a Logical Data Warehouse in which a set of “fit for purpose” stores are used to house a range of data. This blog by Wendy Lucas was published in 2014, but is still a good primer on the concept if you need one.
  3. IBM Fluid Query information and entitlement for PureData clients – In 2015, we released a series of “agile” announcements of IBM Fluid Query. This is a tool that PureData System for Analytics clients can use to query more data sources for deeper insights. This tool is a key element when you have a Logical Data Warehouse where data stores include Hadoop, databases, other data warehouses and more. PureData clients can take advantage of this technology as part of the entitlements. Start learning with our blog series and webcast.
  4. dashDB, data warehousing on the cloud – dashDB was launched in 2014 as the IBM fully managed data warehouse in the cloud. Some initial use cases could be: setting up self-service data science sandboxes, establishing test environments or cost-effectively housing data that is already external, such as social media feeds. dashDB is based on the Netezza and BLU Acceleration in-memory computing technologies. If you have workloads you want to place on the cloud, dashDB is a good solution. This webcast and a TDWI Checklist for cloud get you started.
  5. Hadoop and Big SQL – Hadoop is a scalable, cost-effective, open source file system that can store a range of structured or unstructured data as part of a Logical Data Warehouse. It can also be used to help you manage capacity on the data warehouse, for example as a queryable historical archive. Read this blog by our expert to learn the basics. IBM provides a free open source distribution, IBM Open Platform with Apache Hadoop. For those looking to augment the IBM Open Platform, IBM BigInsights adds enterprise-grade features including visualization, exploration and advanced analytics. Within the family is an implementation that includes Big SQL—enabling you to use familiar SQL skills to query data in Hadoop. Explore the above content options, then get started with a no charge trial.
  6. Apache Spark – IBM announced a major commitment to Apache Spark in June 2015 and has already made available a series of Spark-based products and cloud services. You will be seeing more of Spark across the IBM Analytics portfolio, so it is a good technology to learn. Apache Spark is an open source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that require low latency processing that a typical Map Reduce program cannot provide, Spark is the alternative. It performs at speeds up to 100 times faster than Map Reduce for iterative algorithms or interactive data mining. Spark provides in-memory cluster computing for speed, and supports the Java, Scala, and Python APIs for ease of development. I recommend this no charge Big Data University course on Spark fundamentals.
  7. Update to IBM Netezza Analytics software – For those of you who are PureData System for Analytics clients, there is an update to the Netezza Analytics software. Doug Dailey is one of our experts in this area, and he created an announcement blog to help you understand what new capabilities you can leverage.
  8. Virtual Enzee on demand webcasts – IBM offers webcasts on topics related to data warehousing and PureData System for Analytics. Browse the “Virtual Enzee” webcast library to stay up to date on PureData through these on demand webcasts.
  9. Learn Cognos Analytics for user self-service applications – Some of our clients use Cognos BI in conjunction with their data warehouses for super-fast reporting. Cognos Analytics was announced at IBM Insight as a guided, self-service capability that provides a personal approach to analytics. As your users are demanding more insights, self-service may be a sound solution to some of their needs. Browse the blog and web site to learn more.
  10. IBMGo on demand keynotes from IBM Insight – If you were unable to attend IBM Insight 2015, IBMGo brings some of the main sessions to you! It is a great way to learn about the bigger IBM Analytics solutions and points of view. Start here.


What’s new: Netezza Platform Software and INZA software for PureData Systems for Analytics

by Doug Dailey

The IBM PureData Systems for Analytics team has just released a new set of enhancements over current software versions of Netezza Platform Software (NPS), INZA and IBM Fluid Query. These include enhanced  integration, security, real-time analytics for z Systems and usability features, all included in our latest software suite that has been posted on Fix Central.

There will be something here for everyone, whether you are looking to increase security, gain more leverage with DB2 Analytics Accelerator for z/OS*, improve your day-to-day experience or integrate PureData System (Netezza technology) into a Logical Data Warehouse. This post covers the new capabilities and enhancements in NPS 7.2.1 and INZA 3.2.1 software. For more information on Fluid Query, refer to my IBM Fluid Query 1.6 post.

Strengthening end-to-end security for PureData and DB2 Analytics Accelerator for z/OS

With the advent of self-encrypted disk drives in our N3001 model, we laid the groundwork for securing data at rest. Not only do you have state of the art disk encryption keys by Seagate and Hitachi at work from a hardware standpoint, but you also have added peace of mind through a second tier of security that protects host drives and those drives associated with the Snippet Processing Unit (SPU). A local keystore with flexible CLI on the N3001 system enabled you to protect your most valuable assets. This release adds support for KMIP, which now allows third-party and IBM key management software to back up, store and manage host and SPU keys on your system. Additional attention was paid to hardening the host systems for the DB2 Analytics Accelerator powered by PureData.

Speaking of DB2 Analytics Accelerator, this release of NPS provides key functionality recently added in DB2 Analytics Accelerator version 5.1, which incorporates Netezza Analytics as a core component to help accelerate predictive analytics applications (e.g., SPSS) for tasks such as data mining and in-database modeling. By extending support for the mainframe EBCDIC encoding to INZA software, along with new sets of procedures, you can run real-time analytics on DB2 Analytics Accelerator and establish work areas for data scientists. In-database transformation supports IBM DataStage balanced optimization and ETL/ELT consolidation processing.

This optimized, integrated appliance has been hardened not only to support the self-encrypting drives available through PureData Systems for Analytics N3001 systems, but also to protect data in motion by encrypting network traffic with the mainframe and by using FIPS-enabled RHEL, LFTP and secure VPN. Updated performance around continuous load operations better supports enterprise clients running highly concurrent trickle-feed loads under heavy processing of simultaneous mixed workloads, to ensure faster data synchronization and time to value (TTV) for insights. EBCDIC support for Netezza Analytics provides the ability to execute sophisticated in-database algorithms on DB2 Analytics Accelerator that allow micro-analytics across transactional, historical and real-time data. NPS software now supports the following algorithms: Decision Tree, Regression Tree, Naïve Bayes, K-means Clustering and Two-Step Clustering.


Making life easier through an improved User Experience

If these aren’t enough, we also targeted some areas to improve overall user experience by providing tooling and support that will make life easier for DBAs, system administrators and application developers:

  • Improved throughput and consistency for trickle-feed and highly concurrent smaller load operations.
  • nzload enhancements reduce TTV and shorten ETL activities: recordDelimiter, newline, timestamp, merge, datedelim, timedelim, and monitor.
  • New merge capability improves RI and positions Oracle migrations to PureData System.
  • nzSQL for Windows greatly improves usability for managing PureData System from the Windows desktop environment.
  • nzSQL support for external remote tables allows users to run load/unload operations from Linux clients to/from a remote file rather than host-only loads.
  • PureData will natively support Microsoft .NET and open a new range of possibilities for partner solutions.
  • JDBC support for JDK 1.7 in both NPS and INZA software ensures support for the latest Hadoop distributions and also for Fluid Query.
  • New 64-bit BNR connectors are now certified for the latest versions of Tivoli, Netbackup and EMC.
  • PureData improves uptime by reducing requirements to stop and start NPS when user connections are exceeded.
  • ODBC support is now available for comments through DSN, odbc.ini and connection string (single, multi, inline, nested comments), as well as support for the LIMIT clause.

SQL enhancements

We’ve incorporated support for newer Client Kit OS versions and platforms with this release. This includes Windows 8, Windows 2012 R2, Ubuntu, and a completely new Power PC RHEL client for Little Endian. Support for Power on Little Endian positions PureData Systems for IBM BigInsights and the IBM Open Platform. We have also included additional SQL support for:

  • Support for DROP TABLE IF EXISTS
  • Single slice support for JOINS with multi-column distribution keys
  • SQL push-down of NULL aware
  • New table-based Zone Maps
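
As an illustration of why DROP TABLE IF EXISTS matters for rerunnable scripts, here is a minimal sketch. It runs against SQLite from the Python standard library as a stand-in, not against NPS, so it only demonstrates the behavior of the statement itself; the `staging_orders` table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Without IF EXISTS, dropping a missing table raises an error that a
# rerunnable script would have to catch.
try:
    conn.execute("DROP TABLE staging_orders")
    plain_drop_failed = False
except sqlite3.OperationalError:
    plain_drop_failed = True

# With IF EXISTS, the same statement is safe whether or not the table exists.
conn.execute("DROP TABLE IF EXISTS staging_orders")
conn.execute("CREATE TABLE staging_orders (order_id INTEGER, amount REAL)")
conn.execute("DROP TABLE IF EXISTS staging_orders")

# Confirm that the table is gone after the final drop.
remaining = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'staging_orders'"
).fetchone()[0]
print(plain_drop_failed, remaining)
```

In load or ETL scripts that recreate staging tables on every run, this removes the need for a separate existence check before each drop.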

Client download of these new releases

NPS 7.2.1 and INZA 3.2.1 software is available at no charge to existing PureData clients. It can be easily downloaded from IBM Support Fix Central. Note that business partners and prospective clients can download and explore these new releases on Netezza Developer Network (additional information below).


Packaging and distribution

From a packaging perspective, we refreshed IBM Netezza Platform Developer Software on IBM’s Passport Advantage to this latest NPS 7.2.1 release so that the software suite stays current.

Supported Appliances
  • N3001
  • N2002
  • N2001
  • N100x
  • C1000

Supported Software
  • Netezza Platform Software v7.2.1
  • Netezza Client Kits v7.2.1
  • Netezza SQL Extension Toolkit v7.2.1
  • Netezza Analytics v3.2.1
  • IBM Fluid Query v1.6
  • Netezza Performance Portal v2.1.1
  • IBM Netezza Platform Development Software v7.2.1

For the Netezza Developer Network we continue to expand the ability to easily pick up and work with non-warranted products for basic evaluation by refreshing the Netezza Emulator to NPS 7.2.1 with INZA 3.2.1. You will find a refresh of our non-warranted version of Fluid Query 1.6 and the complete set of Client Kits that support NPS 7.2.1.


Feel free to download and play with these as a prelude to PureData Systems for Analytics purchase or as a quick way to validate new software functionality with your application. We maintain our commitment to business partners working with our systems by maintaining the latest systems and software for you to access. Bring your application or solution and work to certify, qualify and validate them.

For additional information on Fluid Query 1.6, refer to my what’s new post.

* DB2 Analytics Accelerator for z/OS is a high-performance appliance that integrates the IBM z Systems infrastructure with IBM PureData™ System for Analytics, powered by IBM Netezza technology. The solution transforms your mainframe into a highly-efficient transactional and analytics processing environment. This enables clients to exploit z Systems data where it originates.

About Doug
Doug has over 20 years of combined technical and management experience in the software industry, with emphasis in customer service and more recently product management. He is currently part of a highly motivated product management team that is both inspired by and passionate about the IBM PureData System for Analytics product portfolio.

Get smart on IBM Data Warehousing at IBM Insight 2015

A quick reference guide to IBM Data Warehousing sessions for BLU Acceleration in-memory database, PureData System for Analytics and new IBM Fluid Query

by Cindy Russell, IBM Data Warehouse Marketing

IBM Insight is always educational and fun, and this year is no exception. Many IBM technical experts and IBM clients will be presenting on a range of topics. This is an excellent opportunity to learn more about IBM products you already use, as well as products and technologies that you don’t. Here is a summary view of some keynotes, breakout sessions and events to consider as you plan your schedule.

I have included my “editor’s pick” sessions in boldface type. You can use the Insight session tool to find more detail on the sessions that interest you. And for those of you who have already registered for Insight, build your agenda now by logging in. Please note that session schedules are subject to change.

General sessions and Data Warehousing overview sessions

  • Data Management Keynote is Monday October 26, from 1 – 2 PM in the Mandalay Ballroom

Monday breakout sessions

  • DDW-3353: The Evolution of Data Warehousing is “Logical”
  • DDW-3361: Shifts in Data Warehousing and Enabling Self Service to Drive More Agile Analytics
  • DDW-2267: Gartner Perspective on Big Data and Enterprise Analytics

Tuesday breakout sessions

  • DDW-3983: Ford is Changing the Way the World Moves With IBM Big Data and Analytics

Thursday breakout sessions

  • DDW-2675: Which Analytic Model is Right for My Data? A Comparison of Modern Warehouse Architectures
  • DDW-2659: The Data Reservoir: More Than Storage, Optimizing Your Data for Insight
  • DDW-2739: Operational Analytics at the Speed of Thought: The Modern Enterprise Data Warehouse
  • DDW-1951: Model Driven Approaches to Consistently Managing and Governing the Logical Data Warehouse

Expo Hall events and demos

In addition to breakout sessions, there will be some informal talks and opportunities to connect with our experts in the Expo Hall. Here are the sessions that apply to IBM Data Warehousing products. For Expo Hall hours click here.

Monday events and information

  • Expo Hall booth number: 860
  • VAL-4125: AMA: How IBM Fluid Query Solves Your Complex Big Data and Analytics Problems
  • VAL-4126: 20m Talk: How IBM Fluid Query Solves Your Complex Big Data and Analytics Problems
  • Demo room: FE-06 Fluid Query, DCM-15 IBM PureData for Analytics (INZA), DCM-17 IBM Industry Data Model, DCM-19 IBM DB2 with BLU Acceleration

Tuesday events and information

  • Expo Hall booth number: 860
  • DDW-4031: Meet the Experts IBM DB2 BLU and dashDB
  • Demo room: FE-06 Fluid Query, DCM-15 IBM PureData for Analytics (INZA), DCM-17 IBM Industry Data Model, DCM-19 IBM DB2 with BLU Acceleration

Wednesday events and information

  • Expo Hall booth number: 860
  • DDW-4013: IBM Fluid Query – Unifying Data Access Across the Logical Data Warehouse
  • DDW-4079: Meet the Experts: IBM PureData System for Analytics (Netezza)
  • Demo room: FE-06 Fluid Query, DCM-15 IBM PureData for Analytics (INZA), DCM-17 IBM Industry Data Model, DCM-19 IBM DB2 with BLU Acceleration

DB2 with BLU Acceleration in-memory database

BLU Acceleration is in-memory computing technology in DB2 for Linux, UNIX and Windows. If you are experiencing slow reporting on data in structured databases, then BLU Acceleration can help you deliver results much more quickly. Clients report that queries that used to take hours now process in seconds using BLU Acceleration technology!

Here are some sessions to consider:

Monday breakout sessions

  • DDW-2619: What’s New in BLU Acceleration Tips and Insights on the Latest In Memory Columnar Technologies

Tuesday breakout sessions

  • DDW-1202: Implementing a Data Warehouse and BI Solution with DB2 BLU Acceleration, InfoSphere and Cognos
  • DDW-3916: Revitalize your Data Warehouse: Taking Advantage of the Latest Technologies (client presentation from Blue Cross and Blue Shield of Tennessee)
  • DDB-3593: Scaling Up BLU Acceleration with Consistent Performance in a High
  • DDB-2815: Advances in Analytics Using DB2 with BLU Acceleration on Intel Architecture

Wednesday breakout sessions

  • DDW-1647: A Comparison Between DB2 with BLU Acceleration and Other In Memory Databases
  • DDW-2436: POWER Systems Running DB2 with BLU Acceleration: Delivering Top Performance

Thursday breakout sessions

  • DDW-3665: Wall Street Success Stories of DB2 with BLU Acceleration
  • DDW-2469: How DB2 with BLU Acceleration Helps a Bank Make Money: A Real World Data Analytics Case Study
  • DDW-2972: Apache Spark and DB2 with BLU Acceleration: Making ‘People Flow’ in Cities Measurable and Analyzable

PureData System for Analytics and IBM Fluid Query

PureData System for Analytics is a data warehousing appliance that delivers data services to today’s demanding analytic applications. It offers built-in expertise, as well as integrated hardware, software and storage capabilities specifically for high performance data workloads. It simplifies procurement, installation and management so you can focus on other high-value projects. IBM Fluid Query is a new addition to PureData that lets you analyze more data sources such as Hadoop and many others for deeper insights. You can also download the Enzee Conference Guide for a list of ALL sessions with PureData/Netezza content here:

Monday breakout sessions

  • DDW-3366: PureData for Analytics/Netezza Data Warehouse Appliance – Overview and Update
  • DDW-2663: IBM Fluid Query The “Power” Behind the Data Reservoir/Logical Data Warehouse
  • DDW-3500: How a Digital Media Firm Uses PureData System for Analytics, Cognos, SPSS to Hone Creative Marketing

Tuesday breakout sessions

  • DDW-1909: One Query Drives It All Fluid Hadoop in the Unified Data Warehouse
  • DDW-1150: Performance Optimization With IBM PureData System for Analytics, powered by Netezza
  • DDW-1216: Mattel’s Big Data Ecosystem Journey: Beginning The Integration of Unstructured Data
  • DDW-3588: BB&T and Netezza: Practical and Best Practices for Building an Analytics Platform
  • DDW-3094: N3001 001 Mini Appliance The Most Affordable PureData System for Analytics

Wednesday breakout sessions

  • DDW-1164: Business Outcomes and Implementation Strategy for Enterprise Data Warehouse in Healthcare
  • DDW-2150: Integrating BigInsights and PureData System for Analytics With Query Federation and Data Movement
  • DDW-1213: IBM PureData System for Analytics Successfully Changed How the Blackhawk Network Leverages Data
  • DDW-3073: Werner Implements Netezza and Information Server to Enable Smarter Decision Making
  • DDW-1723: Improving PureData System for Analytics Performance at Kimberly Clark
  • DDW-2145: Experian Case Study Conversion From SQL Server to Netezza
  • DDW-2109: How IBM Fluid Query Solves Your Complex Big Data and Analytics Problems

Thursday breakout sessions

  • DDW-3094: N3001-001 Mini Appliance The Most Affordable PureData System for Analytics
  • DDW-3369: Insight Into Your PureData System for Analytics Appliance Using IBM Netezza Performance Portal Tool
  • DDW-2515: Realizing Solutions with IBM PureData System for Analytics
  • DDW-1708: PureData System for Analytics for regulatory reports, what you need on the top of the top technology

Enzee Universe

Don’t miss Enzee Universe on Sunday, October 23.  Enzee Universe is a conference within the Insight conference dedicated to a full day of PureData – Netezza technology.  This event is free for all registered Insight attendees. Just add sessions 3967 and 3968 to your agenda!

  • DDW-3967: Enzee Universe Part 1 Technical Sessions and Best Practices
  • DDW-3968: Enzee Universe Part 2 Business Update and Product Strategy


IBM dashDB is a fully managed cloud data warehouse service. It offers massive scalability and performance through its MPP architecture, and is compatible with a wide range of business intelligence toolsets and analytics. dashDB’s integrated, in-database analytics let you quickly realize more value from your data. dashDB includes aspects of the Netezza and BLU Acceleration technologies.

Use the session tool to search on the dashDB keyword.

IBM DB2 Analytics Accelerator for z/OS

IBM DB2 Analytics Accelerator for z/OS is a high-performance appliance that integrates the IBM z Systems infrastructure with IBM PureData System for Analytics, powered by IBM Netezza technology. The solution transforms your mainframe into a highly efficient transactional and analytics processing environment.

Learn more about the sessions for this product here.


How to get the most out of your PureData System for Analytics using Hadoop as a cost-efficient extension

By Ralf Goetz

Today’s requirements for collecting huge amounts of data are different from several years back when only relational databases satisfied the need for a system of record.

Now, new data formats need to be acquired, stored and processed in a convenient and flexible way. Customers need to integrate different systems and platforms to unify data access and acquisition without losing control and security.

The logical data warehouse

Increasingly, relational databases and Hadoop platforms together form the core of a Logical Data Warehouse, in which each system handles the workload it handles best. We call this using “fit for purpose” stores.

An analytical data warehouse appliance such as PureData System for Analytics is often at the core of this Logical Data Warehouse and it is efficient in many ways. It can host and process several terabytes of valuable, high-quality data enabling lightning fast analytics at scale. And it has been possible (with some effort) to move bulk data between Hadoop and relational databases using Sqoop – an open source component of Hadoop. But there was no way to query both systems using SQL – a huge disadvantage.

Two options for combining relational database and Hadoop

Why move bulk data between different systems or run cross-systems analytical queries? Well, there are several use cases for this scenario but I will only highlight two of them based on a typical business scenario in analytics.

The task: an analyst needs to find out how the stock level of the company’s products will develop throughout the year. This stock level is being updated very frequently and produces lots of data in the current data warehouse system implemented on PureData System for Analytics. Therefore the data cannot be kept in the system for more than a year (hot data). A report on this hot data indicates that the stock level is much too high and needs to be adjusted to keep stock costs low. This would normally trigger immediate sales activities (e.g. a marketing and/or sales campaign with lower prices).

“We need a report, which could analyze all stock levels for all products for the last 10+ years!”

Yet a historical report analyzing all stock levels for all products over the last 10+ years would have shown that a high stock level at this time of year is a good thing, because a high season is approaching. The company would therefore be able to sell most of its products and meet market demand. But how can the company provide such a report with so much data?
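
To make the difference between the two reports concrete, here is a minimal Python sketch. All numbers, names and the 10% tolerance are invented for illustration and are not taken from any real system. It compares the current stock level against a baseline built from hot data only, and against a baseline built from the same month in earlier years.

```python
from statistics import mean

# Hypothetical stock levels in units; every figure here is invented for illustration.
current_level = 12_000
hot_year_average = 8_000  # average over the last 12 months (hot data only)
same_month_history = [11_500, 12_400, 11_900, 12_800, 12_100]  # same month, earlier years


def is_overstocked(level, baseline, tolerance=0.10):
    """Return True when level exceeds the baseline by more than the tolerance."""
    return level > baseline * (1 + tolerance)


# Report based on hot data only: the level looks far too high.
hot_only_verdict = is_overstocked(current_level, hot_year_average)

# Report that includes the historical data: the level is normal for the season.
seasonal_verdict = is_overstocked(current_level, mean(same_month_history))

print("Hot data only says overstocked:", hot_only_verdict)
print("With history says overstocked:", seasonal_verdict)
```

With these assumed figures, the hot-data report flags the stock as too high, while the seasonal baseline does not. That is the situation the analyst in the scenario faces.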


The company has two options to satisfy its needs:

  1. Replace the existing analytical data warehouse appliance with a newer and bigger one (this comes at a cost and has been covered in another blog post), or
  2. Use an existing Hadoop cluster as a cheap storage and processing extension for the data warehouse appliance (note that a new, yet-to-be-implemented Hadoop cluster would probably cost more than a bigger PureData box as measured by Total Cost of Ownership).

Option 2 would require a mature, flexible integration interface between Hadoop and PureData. Sqoop alone would not be able to handle this, because the scenario requires more than bulk data movement between Hadoop and PureData.

IBM Fluid Query for seamless cross-platform data access using standard SQL

These requirements are only two of the reasons why IBM introduced IBM Fluid Query in March 2015 as a no-charge extension for PureData System for Analytics. Fluid Query enables both bulk data movement between Hadoop and PureData, in either direction, and operational SQL query federation. With Fluid Query, data residing in Hadoop distributions from Cloudera, Hortonworks and IBM BigInsights for Apache Hadoop can be combined with the data residing in PureData using standard SQL syntax.

“Move and query all data, find the value in the data and integrate only if needed.”

This enables users to seamlessly query older, cooler data together with hot data, without the complexity of data integration. It also supports a more exploratory approach: move and query all data, find the value in the data and integrate only if needed.
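
The idea of one SQL statement spanning a hot store and a cooler archive can be sketched with Python’s built-in sqlite3 module. This is only an analogy: it is not Fluid Query syntax, and SQLite’s ATTACH DATABASE merely stands in for a federation layer. The table name stock_levels, the schema name archive and all rows are hypothetical.

```python
import sqlite3

# The main in-memory database stands in for the warehouse appliance (hot data).
conn = sqlite3.connect(":memory:")
# A second, separate in-memory database stands in for the Hadoop-side archive (cold data).
conn.execute("ATTACH DATABASE ':memory:' AS archive")

# Same hypothetical table layout in both stores.
ddl = "CREATE TABLE {schema}.stock_levels (product TEXT, year INTEGER, units INTEGER)"
conn.execute(ddl.format(schema="main"))
conn.execute(ddl.format(schema="archive"))

# Invented sample rows: the current year is hot, earlier years are archived.
conn.executemany(
    "INSERT INTO main.stock_levels VALUES (?, ?, ?)",
    [("widget", 2015, 12000)],
)
conn.executemany(
    "INSERT INTO archive.stock_levels VALUES (?, ?, ?)",
    [("widget", 2013, 11900), ("widget", 2014, 12800)],
)

# One standard SQL statement reads from both stores at once.
query = """
SELECT product, year, units FROM main.stock_levels
UNION ALL
SELECT product, year, units FROM archive.stock_levels
ORDER BY year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The point of the sketch is that the analyst writes ordinary SQL and does not have to copy the archived rows into the hot store first. How Fluid Query exposes remote Hadoop tables is described in its own documentation and is not shown here.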

IBM Fluid Query can be downloaded and installed as a free add-on for PureData System for Analytics.

Try it out today. IBM Fluid Query is available for PureData System for Analytics, and clients can download and install the software and get started right away with these new capabilities.  Download it here on Fix Central. For more information and documentation links, Doug Dailey’s “Getting Started with Fluid Query” blog is highly recommended reading.  Update: Learn about Fluid Query 1.5, announced July 2015.

IBM Fluid Query Minimum System Requirements

About Ralf
Ralf is an Expert Level Certified IT Specialist in the IBM Software Group. Ralf joined IBM through the Netezza acquisition in early 2011. For several years, he led the Informatica tech-sales team in the DACH region and the Mahindra Satyam BI competency team in Germany. He then became a technical pre-sales representative for Netezza and later for PureData System for Analytics. Ralf still focuses on PDA but also supports the technical sales of all IBM Big Data products. Ralf holds a master’s degree in computer science.

Do you want to learn more about Big Data and modern data warehousing?