by Rich Hughes
Since its announcement in March 2015, IBM Fluid Query has opened the door to better business insights for IBM PureData System for Analytics clients. Our clients have wanted and needed access to a wide variety of data stores, including Apache Hadoop, whose unstructured stores are one of the key reasons for the massive growth in data volumes. There is also valuable data in other types of stores, including relational databases that are often “systems of record” and “systems of insight”. Plus, Apache Spark is entering the picture as an up-and-coming engine for real-time analytics and machine learning.
IBM is pleased to announce IBM Fluid Query 1.5 to provide seamless integration with these additional data stores—making it even easier to get deeper insights from even more data.
IBM Fluid Query 1.5 – What is it?
IBM Fluid Query 1.5 provides access to data in other data stores from IBM PureData System for Analytics appliances. Starting with Fluid Query 1.0, users were able to query and quickly move data between Hadoop and IBM PureData System for Analytics appliances. This capability covered IBM BigInsights for Apache™ Hadoop, Cloudera, and Hortonworks.
Now with Fluid Query 1.5, we add the ability to reach into even more data stores including Spark and such popular relational database management systems as:
- DB2 for Linux, UNIX and Windows
- PureData System for Operational Analytics
- Oracle Database
- Other PureData System for Analytics implementations
Fluid Query is able to direct queries from PureData System for Analytics database tables to all of these different data sources and get just the results back—thus creating a powerful analytic capability.
IBM Fluid Query Benefits
IBM Fluid Query offers two key benefits. First, it makes practical use of data stores and lets users access them with their existing SQL skills. Workbench tools yield productivity gains as SQL remains the query language of choice when PureData System for Analytics and Hadoop schemas logically merge. IBM Fluid Query provides the physical bridge over which a query is pushed efficiently to where the data needed for that query resides, whether in the same data warehouse, another data warehouse, a relational or transactional database, Hadoop or Spark.
Second, IBM Fluid Query enables archiving and capacity management on PureData-based data warehouses. With Fluid Query, users gain:
- better exploitation of Hadoop as a “Day 0” archive that is queryable with conventional SQL;
- capabilities to make use of data in a Spark in-memory analytics engine;
- the ability to easily combine hot data from PureData with colder data from Hadoop;
- data warehouse resource management with the ability to archive colder data from PureData to Hadoop to relieve resources on the data warehouse.
Managing your share of Big Data Growth
The design point for Fluid Query is that the query is moved to the data instead of bringing massive data volumes to the query. This is a best-of-breed approach where tasks are performed on the platform best suited for that workload.
For example, use the PureData System for Analytics data warehouse for production-quality analytics where performance is critical to the success of your business, while simultaneously using Hadoop or Spark to discover the inherent value of those full-volume data sources. Or, create powerful analytic combinations across data in other operational systems or analytics warehouses with PureData stores without having to move and integrate data before analyzing it.
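The idea of moving the query to the data, rather than the data to the query, can be illustrated with a toy sketch. This is not Fluid Query code and does not use any Fluid Query API: an in-memory SQLite database stands in for a remote data store, and the table name `sales` and all function names are hypothetical, chosen only for illustration. The sketch compares how many rows cross the "bridge" when the aggregation runs locally versus when it is pushed to the store.

```python
import sqlite3


def make_remote_store(n_rows=1000):
    """Build an in-memory SQLite database standing in for a remote data store (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    rows = [("east" if i % 2 == 0 else "west", i) for i in range(n_rows)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return conn


def pull_then_aggregate(remote):
    """Move the data to the query: fetch every row, then aggregate locally."""
    rows = remote.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals, len(rows)


def push_down_aggregate(remote):
    """Move the query to the data: the store aggregates and returns only the results."""
    rows = remote.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


remote = make_remote_store()
pulled_totals, pulled_rows = pull_then_aggregate(remote)
pushed_totals, pushed_rows = push_down_aggregate(remote)

# Both approaches give the same answer; they differ in how many rows are transferred.
print("rows transferred when pulling:", pulled_rows)
print("rows transferred when pushing down:", pushed_rows)
```

In this sketch both approaches produce identical totals, but the pushed-down query returns one row per region instead of every row in the table, which is the property the design point relies on as data volumes grow.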
IBM Fluid Query 1.5 is now generally available as a software addition for PureData System for Analytics clients. If you want to understand how to take advantage of IBM® Fluid Query 1.5, check out these resources:
- IBM Fluid Query 1.5 solution brief
- IBM Fluid Query 1.5 overview video (2.5 mins)
- Virtual Enzee – The Logical Data Warehouse, Hadoop and PureData System for Analytics (based on Fluid Query 1.0; on demand webcast)
- Getting started with Fluid Query blog
- Learn about Fluid Query 1.0 blog
- Live chat August 4th on the new Fluid Query 1.5
Rich Hughes is an IBM Marketing Program Manager for Data Warehousing. Hughes has worked in a variety of Information Technology, Data Warehousing, and Big Data jobs, and has been with IBM since 2004. Hughes earned a Bachelor’s degree from Kansas University, and a Master’s degree in Computer Science from Kansas State University. Writing about the original Dream Team, Hughes authored a book on the 1936 US Olympic basketball team, a squad composed of oil refinery laborers and film industry stage hands. You can follow him on Twitter: @rhughes134