One Cloud Data Warehouse, Three Ways

by Mona Patel

There’s something very satisfying about using a single, cloud database solution to solve many business problems.  This is exactly what BPM Northwest experiences with IBM dashDB when delivering Data and Analytics solutions to clients worldwide.

The exciting success with dashDB compelled BPM Northwest to share implementations and best practices with IDC.

In the webcast, they team up to discuss the value and realities of moving analytical workloads to the cloud. They also discuss challenges around governance, data integration, and skills, which organizations face as they move to seize the opportunities of a cloud data warehouse.

In the webcast, you will hear three ways that you can utilize IBM dashDB:

  • New applications, with some integration with on-premises systems
  • Self-service, business-driven sandbox
  • Migrating existing data warehouse workloads

After watching the webcast, consider how the IBM dashDB use cases discussed apply to your challenges and whether a hybrid data warehouse is the right solution for you.

Want to give IBM dashDB on Bluemix a try?  Before you sign up for a free trial, take a tutorial tour on the IBM dashDB YouTube channel to learn how to load data from your desktop, enterprise, and internet data sources, and then see how to run simple to complex SQL queries with your favorite BI tool or with integrated R/RStudio. You can also watch how IBM dashDB integrates with other value-added Bluemix services such as Dataworks Lift and Watson Analytics so that you can bring together all relevant data sources for newer insights.

About Mona,

Mona Patel is currently the Portfolio Marketing Manager for IBM dashDB, the future of data warehousing.  With over 20 years of experience analyzing data at The Department of Water and Power, Air Touch Communications, Oracle, and MicroStrategy, Mona decided to grow her career at IBM, a leader in data warehousing and analytics.  Mona received her Bachelor of Science degree in Electrical Engineering from UCLA.

Things you need to know when switching from Oracle database to Netezza (Part 3)

by Andrey Vykhodtsev

In my previous two posts I covered the differences in architecture between IBM PureData System for Analytics and Oracle Database, as well as differences in SQL. (See below for links.) In this post, I am going to cover another important topic: additional structures that speed up data access.

Partitions, Indexes, Materialized Views

Oracle Database relies on indexes, partitions, and materialized views for performance. In Oracle, indexes are designed to speed up point searches or range searches that touch a very small percentage of the data. Because of the B-Tree index structure, if a query touches a large percentage of the data, using the index will be much slower than a full scan of the whole table. If you have run into this problem, you have probably decided to use partitioning. In Oracle, partitioning is a paid feature that is available only with certain editions. You also have materialized views, which let you store the results of complex queries on disk for later reuse. These structures are designed with general-purpose use (analytical processing plus transactional processing) in mind, and they can be complex and unwieldy to maintain.
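To see why an index stops paying off as a query touches more of the table, here is a back-of-the-envelope cost sketch in Python. The cost units, the 10x penalty for a random read, the index depth, and the table size are all illustrative assumptions, not Oracle measurements.

```python
# Illustrative cost model only: every constant below is an assumption.
SEQ_READ_COST = 1.0      # cost of reading one block during a sequential scan
RANDOM_READ_COST = 10.0  # assumed cost of one random block read via the index
INDEX_DEPTH = 3          # assumed number of B-Tree levels to descend


def full_scan_cost(total_rows: int, rows_per_block: int) -> float:
    """Read every block of the table once, sequentially."""
    total_blocks = -(-total_rows // rows_per_block)  # ceiling division
    return total_blocks * SEQ_READ_COST


def index_scan_cost(matching_rows: int) -> float:
    """Descend the index once, then do one random read per matching row."""
    return (INDEX_DEPTH + matching_rows) * RANDOM_READ_COST


TOTAL_ROWS = 1_000_000   # hypothetical table size
ROWS_PER_BLOCK = 100     # hypothetical block density

scan = full_scan_cost(TOTAL_ROWS, ROWS_PER_BLOCK)
for rows in (100, 1_000, 10_000, 100_000):
    share = rows / TOTAL_ROWS
    print(f"{share:>7.2%} of rows: index={index_scan_cost(rows):>11,.0f}  full scan={scan:,.0f}")
```

Under these assumptions the index wins comfortably when 100 rows (0.01%) match, and loses by roughly a factor of 100 when 100,000 rows (10%) match. The exact crossover depends entirely on the constants chosen.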

By contrast, with PureData you have fewer worries. The trade-off, as I said in my first post, is that PureData is not a general-purpose system, but rather an analytical-processing system.

PureData uses ZoneMaps instead of indexes. In essence, a ZoneMap is just a table of the minimum and maximum values, recorded for each extent of storage, for all columns of certain types. ZoneMaps are extremely compact, and you don’t need to create or maintain them. But this is not all: ZoneMap filtering takes place at the hardware level. (Remember the mention of FPGAs, Field Programmable Gate Arrays, in my first post?) The system does not scan data that a particular query does not need, so I/O is greatly reduced. ZoneMaps are also taken into account when you update or delete data based on a condition.
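The min/max idea can be sketched in a few lines of Python. This is only a toy model of the concept: the `Zone` class, the function names, and the four-row extent size are hypothetical and chosen for readability, and the real system does this filtering in hardware rather than in application code.

```python
from dataclasses import dataclass
from typing import List, Tuple

EXTENT_ROWS = 4  # hypothetical extent size, tiny so the example is readable


@dataclass
class Zone:
    lo: int          # minimum value stored in the extent
    hi: int          # maximum value stored in the extent
    rows: List[int]  # the extent's values for one column


def build_zone_map(values: List[int], extent_rows: int = EXTENT_ROWS) -> List[Zone]:
    """Split a column into fixed-size extents and record min/max for each."""
    zones = []
    for start in range(0, len(values), extent_rows):
        chunk = values[start:start + extent_rows]
        zones.append(Zone(min(chunk), max(chunk), chunk))
    return zones


def range_scan(zones: List[Zone], lo: int, hi: int) -> Tuple[List[int], int]:
    """Return matching values and the number of extents actually read."""
    hits, extents_read = [], 0
    for zone in zones:
        # Skip the extent when its min/max interval cannot overlap the predicate.
        if zone.hi < lo or zone.lo > hi:
            continue
        extents_read += 1
        hits.extend(v for v in zone.rows if lo <= v <= hi)
    return hits, extents_read


day_numbers = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]  # loaded in date order
zones = build_zone_map(day_numbers)
matches, extents_read = range_scan(zones, 5, 6)
print(matches, extents_read, len(zones))  # [5, 5, 6, 6] 1 3
```

Only one of the three extents is read for the predicate `5 <= day <= 6`; the other two are ruled out from their min/max values alone.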

Because of ZoneMaps, you don’t need to partition your data. ZoneMaps take advantage of the natural ordering of data. For example, if you insert data daily, the rows are stored in date order, so the ZoneMap ranges on the date field barely overlap. Range searches on this field will be extremely fast.
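A small experiment makes the effect of natural ordering visible. The sketch below is an assumption-laden toy (365 days, 100 rows per day, 100-row extents, a hypothetical `extents_read` helper): it counts how many extents a one-week range query has to read when rows sit in load order versus when the same rows are shuffled.

```python
import random
from typing import List


def extents_read(values: List[int], lo: int, hi: int, extent_rows: int = 100) -> int:
    """Count extents whose min/max range overlaps the predicate lo..hi."""
    count = 0
    for start in range(0, len(values), extent_rows):
        chunk = values[start:start + extent_rows]
        if max(chunk) >= lo and min(chunk) <= hi:
            count += 1
    return count


# Hypothetical load: 365 days, 100 rows per day, appended one day at a time.
in_load_order = [day for day in range(365) for _ in range(100)]

shuffled = in_load_order[:]
random.Random(0).shuffle(shuffled)  # same rows, arrival order destroyed

week_ordered = extents_read(in_load_order, 100, 106)
week_shuffled = extents_read(shuffled, 100, 106)
print(week_ordered, week_shuffled)
```

In load order the query reads exactly the 7 extents that hold the requested week. After shuffling, almost every extent spans nearly the whole year, so the min/max values can rule out practically nothing.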

In addition to ZoneMaps, there are a couple of other techniques you can use to optimize query access to a particular table. The first is the CBT, or Clustered Base Table. This is not a separate structure that needs to be maintained, but rather an internal table organization method. If you define a table as a CBT, you can specify up to four fields on which searches will be extremely fast.
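One way to picture why organizing a table on several fields helps is to order the rows along a space-filling curve, so that min/max ranges stay narrow on every organizing column at once. The sketch below uses a Z-order (bit-interleaving) key purely as an illustrative stand-in; this post does not describe how the appliance orders rows internally, and the `store_id`/`day` columns and the 16-row extents are made up.

```python
from typing import List, Tuple

Row = Tuple[int, int]  # (store_id, day): hypothetical organizing columns


def z_order_key(x: int, y: int, bits: int = 4) -> int:
    """Interleave the bits of x and y so nearby (x, y) pairs get nearby keys."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def extents_read(rows: List[Row], column: int, lo: int, hi: int,
                 extent_rows: int = 16) -> int:
    """Count extents whose min/max on `column` overlaps the range lo..hi."""
    count = 0
    for start in range(0, len(rows), extent_rows):
        vals = [r[column] for r in rows[start:start + extent_rows]]
        if max(vals) >= lo and min(vals) <= hi:
            count += 1
    return count


# 16 stores x 16 days, stored two different ways.
by_store = sorted((s, d) for s in range(16) for d in range(16))
clustered = sorted(by_store, key=lambda r: z_order_key(r[0], r[1]))

for name, rows in (("sorted by store", by_store), ("clustered", clustered)):
    print(name,
          "| store 0..3:", extents_read(rows, 0, 0, 3),
          "| day 0..3:", extents_read(rows, 1, 0, 3))
```

With a plain sort on `store_id`, a range query on store reads 4 of the 16 extents but a range query on day reads all 16. With the clustered order, both queries read 4 extents, because each extent covers a small block of stores and a small block of days.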

The only additional structure that PureData has is called a “Materialized View”, but the concept is a bit different from the one in Oracle. In PureData, a materialized view is a subset of columns from one table that can be sorted differently from the base table, which speeds up access on the sorted columns. Because materialized views are ZoneMapped, they have some of the properties of indexes, but they are not actually indexes. Materialized views might be needed if you have “tactical queries”, that is, queries that require fast and frequent access to small portions of data. Otherwise, you don’t usually need them.
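As a rough mental model, such a view behaves like a thin, re-sorted copy of one or two columns that points back at the base rows. The Python sketch below assumes exactly that: the table contents, the function names, and the use of binary search (standing in for ZoneMap filtering on the sorted view) are all illustrative choices, not the appliance's implementation.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

# Base table in load (date) order: (order_date, customer_id, amount).
# Hypothetical data for illustration.
base_table = [
    ("2015-01-01", 42, 10.0),
    ("2015-01-01", 7, 25.5),
    ("2015-01-02", 19, 8.0),
    ("2015-01-02", 42, 99.9),
    ("2015-01-03", 7, 12.0),
]


def build_sorted_view(table: list, column: int) -> List[Tuple[int, int]]:
    """Project one column plus the base-row position, sorted on that column."""
    return sorted((row[column], pos) for pos, row in enumerate(table))


def lookup(view: List[Tuple[int, int]], table: list, key: int) -> list:
    """Find `key` in the sorted view, then fetch the matching base rows."""
    keys = [k for k, _ in view]
    start, end = bisect_left(keys, key), bisect_right(keys, key)
    return [table[pos] for _, pos in view[start:end]]


view_by_customer = build_sorted_view(base_table, column=1)
print(lookup(view_by_customer, base_table, 42))
# [('2015-01-01', 42, 10.0), ('2015-01-02', 42, 99.9)]
```

The base table stays in date order, which keeps date-range queries fast, while the narrow view serves the occasional lookup by customer without scanning everything.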

In Conclusion

As you can see, it is much simpler to maintain efficient data access in PureData. Instead of making you create and maintain indexes for a subset of columns on each table, PureData automatically creates ZoneMaps for you. I know from experience what a nightmare index maintenance in a large data warehouse can be. Partitioning is another technique that is not needed in PureData. Instead of indexes and partitions, we use much simpler structures that are automatically maintained and applied at the hardware level (in the FPGA), at the speed of streaming data.
In my next posts, I am going to cover a few more topics that you need to be aware of when migrating from Oracle to PDA. Please stay tuned, and follow me on Twitter: @vykhand

Other posts in this series

About Andrey,
Andrey Vykhodtsev is a Big Data Technical Sales Expert covering the Central and Eastern Europe region at IBM. He has more than 12 years of experience in data warehousing and analytics, and has worked as a senior data warehouse developer, analyst, architect, and consultant in multiple industries, including the financial sector and telecommunications.

Leveraging In-Memory Computing For Fast Insights

By Louis T Cherian,

It is common knowledge that an in-memory database is fast, but what if you had an even faster solution?
Think of a next generation in-memory database, which is

  • Faster, with speed of thought analytics to get insights
  • Simpler, with reduced complexity and improved performance
  • Agile, with multiple deployment options and low risk for migration
  • Competitive, by delivering products to market much faster

We are talking about the combination of innovations that makes IBM BLU Acceleration the next generation in-memory solution.

So, what really goes into making IBM BLU Acceleration the next generation in-memory solution?

  • The in-chip analytics allows the data to flow through the CPU very quickly, making it faster than “conventional” in-memory solutions
  • With actionable compression, one can perform a broad range of operations on data, while it is still compressed
  • With data skipping, any data that does not need to be touched to answer a query is skipped over, which results in dramatic performance improvements
  • With shadow tables, arguably the most notable feature in the DB2 10.5 “Cancun Release”, one can run all operational reports on transactional data as it is captured
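To make the idea of operating on data while it is still compressed more concrete, here is a minimal Python sketch of an order-preserving dictionary encoding. It is a simplified stand-in chosen for illustration, not a description of how BLU Acceleration encodes data; the city values and the function name are made up.

```python
from typing import Dict, List


def build_dictionary(values: List[str]) -> Dict[str, int]:
    """Assign integer codes in sorted order so code order matches value order."""
    return {value: code for code, value in enumerate(sorted(set(values)))}


cities = ["Oslo", "Lima", "Rome", "Lima", "Oslo", "Cairo"]
codes_by_value = build_dictionary(cities)
encoded = [codes_by_value[city] for city in cities]  # the "compressed" column

# Predicate: "Lima" <= city <= "Oslo", evaluated on the codes alone.
# Only the two literals are translated; the column is never decoded.
lo, hi = codes_by_value["Lima"], codes_by_value["Oslo"]
matching_positions = [i for i, code in enumerate(encoded) if lo <= code <= hi]

print(encoded)             # [2, 1, 3, 1, 2, 0]
print(matching_positions)  # [0, 1, 3, 4]
```

Because the codes sort the same way as the original strings, equality and range comparisons give the same answer on the small integers as they would on the full values.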

To know more about leveraging in-memory computing for fast insights with IBM BLU Acceleration, watch this video: http://bit.ly/1BZq1lo

For more information, visit: http://www-01.ibm.com/software/data/data-management-and-data-warehousing/dw.html

About Louis T. Cherian,

Louis T. Cherian is currently a member of the worldwide product marketing team at IBM that focuses on data warehouse and database technology. Before this role, Louis held a variety of product marketing roles within IBM and, before joining IBM, at Tata Consultancy Services.  Louis earned his PGDBM from Xavier Institute of Management and Entrepreneurship, and also has an engineering degree in computer science from VTU Bangalore.

IBM’s Point of View on Data Warehouse Modernization  

By Louis T. Cherian,

The world of Data Warehousing continues to evolve, with an unimaginable amount of data being produced each moment and advancement of technologies that allow us to consume this data.  This provides new capabilities for organizations to make better informed business decisions, faster.
To take advantage of this opportunity in today’s era of Big Data and the Internet of Things, our customers really need to have a solid Data Warehouse modernization strategy. Organizations should look to optimize with new technology and capabilities like:

  • In-memory databases to speed analytics
  • Hadoop to analyze unstructured data and enhance existing analytics
  • Data warehouse appliances with improved capabilities and performance

If you want to understand more about the importance of Data Warehouse Modernization, the video below answers questions like:

  • What is changing in the world of Data Warehousing?
  • Why should customers act now and what should they do?
  • What is the need for companies to modernize their Data Warehouse?
  • How are IBM Data Warehousing Solutions able to address the need of Data Warehouse Modernization?

Watch this video by the IBM Data Warehousing team to learn more about the breadth and depth of IBM Data Warehouse solutions. For more information, you can visit our website.


Safety Insurance Company Gains a Better View of its Data And its Customers with IBM PureData System for Analytics and Cognos

By Louis T. Cherian,

The success of a firm in the highly competitive insurance industry depends not only on its ability to win new customers, but also on retaining its most valuable customers. One way to do this is to offer these customers the most suitable policies at the best rates. So how can a company the size of Safety Insurance, which has been in existence since 1979, identify its most valuable customers?

And how could they maintain consistency when offering multi-policy incentives, given that they deal in dozens of types of policies for millions of policyholders? Moreover, with customer data fragmented across numerous policy systems, the actuaries were spending all their time building new databases instead of analyzing data, and they ended up with multiple versions of the truth, which made it difficult for the business to make informed decisions.

This is where the combination of IBM PureData System for Analytics and Cognos opened up a whole new world of analytics possibilities that enables Safety Insurance to run their business more wisely and more efficiently.

How did they do it?

  • Switching to a powerful analytics solution

Safety Insurance teamed up with New England Systems (an IBM Business Partner) and decided to deploy the IBM PureData System for Analytics to provide a high-performance data warehouse platform that would unite data from all of its policy and claims systems. They also implemented IBM Cognos Business Intelligence to provide sophisticated automated analysis and reporting tools.

  • Accelerating delivery of mission-critical information

Harnessing the immense computing power of IBM PureData System for Analytics enables Safety Insurance to generate reports in a fraction of the time previously needed. Moreover, automating report generation enables actuaries to focus on their actual job, which is analyzing figures rather than building and compiling them. Automation also standardizes the reporting process, which improves consistency and reduces the company’s reliance on a particular analyst’s individual knowledge.

  • Identifying and retaining high-value customers

Providing a “single view” of the customer across all types of insurance gives a new level of insight into customer relationships and total customer value. By revealing how valuable a particular policyholder is to the overall business, the company will be able to provide more comprehensive service, better combinations of products, and consistent application of multi-policy discounts.

To know more about this success story, watch this video where Christopher Morkunas (Data Governance Manager) from Safety Insurance Company talks about how they were able to gain a better view of their existing data and its products with the combination of IBM PureData System for Analytics and Cognos.


What Should You Look For In Your Cloud Data Warehouse?

By Rahul Agarwal,

The business benefits of cloud computing are well documented; according to an IBM study, organizations using cloud computing gain a competitive advantage over their peers and can generate two times more revenue and profit.[1]

But is the cloud the right place for data warehousing, which has traditionally been deployed on-premises (requiring a significant investment in hardware infrastructure)? A study by the Aberdeen Group finds that organizations are increasingly using cloud-based analytics to gain advantages such as four times faster business intelligence deployment and 50% more users actively engaged with analytics.[2]

So what parameters should you look for in your cloud data warehouse?

Simplicity

Your data warehouse on the cloud should let you focus on your data and your business problems, not the business of data warehousing (including tuning, planning and integration). It should be simple to set up; ideally providing ‘load-and-go’ simplicity. In addition, it should provide the ability to easily ingest data from a myriad of sources including structured, semi-structured (think JSON) and unstructured.

Speed

Speed-driven data and analytics practices are quickly emerging as a key source of competitive advantage for companies across the world.[3] Hence, it is extremely important for you to try to minimize the time it takes to convert raw data that exists in your enterprise into actionable insight. Today a number of high performance technologies like in-memory computing and in-database analytic capabilities provide the ability to analyze data with high speed and precision. By running the analytics in the database where the data resides you will gain huge efficiencies. When you couple in-memory technology with analytics, you are able to get answers to your business questions as fast as you can think of the next question to ask – no waiting for analytic results to run.

Interoperability with business intelligence tools

Your cloud data warehouse should provide you the ability to write and execute your own analytic queries, or leverage other analytic and BI capabilities provided by tools such as Cognos, Looker, Aginity Workbench, Tableau, and others.  Integration with such tools will help you better visualize and interact with your data, enabling a richer business intelligence experience.

Security

Your cloud data warehouse should be designed to keep your data secure with the same rigor that has come to be expected from an on-premise data warehouse. Any security/data breach can put your business operations at risk and has the potential not only to damage your company’s reputation but also its top and bottom line.

Introducing dashDB

dashDB is a fully managed data warehouse in the cloud that meets all of these criteria for simplicity, security and analytics in an instant. dashDB is simple to get up and running and helps you to build the capability to deliver answers to your business questions easily and as fast as you can think.

To learn more and get started with the freemium, check out www.dashdb.com.

About Rahul Agarwal

Rahul Agarwal is a member of the worldwide product marketing team at IBM that focuses on data warehouse and database technology. Rahul has held a variety of business management, product marketing, and other roles in other companies including HCL Technologies and HP before joining IBM.  Rahul studied at the Indian Institute of Management, Kozhikode and holds a bachelor of engineering (electronics) degree from the University of Pune, India. Rahul’s Twitter handle: @rahulag80

[1] http://www-03.ibm.com/press/us/en/pressrelease/42304.wss

[2] https://www14.software.ibm.com/webapp/iwm/web/signup.do?source=sw-infomgt&S_PKG=ov26256

[3] http://www-935.ibm.com/services/us/gbs/thoughtleadership/2014analytics/

Data Warehousing – No assembly required

By Wendy Lucas,

In my last blog, I wrote about how big things come in small packages when talking about the great value that comes in the new PureData System for Analytics Mini Appliance.  I must be in the holiday spirit early because I’m going to stick with the holiday theme for this discussion.

Did you ever receive a gift that had multiple components to it, maybe one that required a bunch of assembly before you got to truly enjoy the gift?   I’m not talking about Lincoln Logs (do they still sell those?) or Legos where the assembly is half the fun.

I’m talking about things like a child’s bicycle that comes with the frame, handle bar, wheels, tires, kickstand, seat, nuts and bolts as a bunch of parts inside a box.

What is more exciting? Receiving a box of parts or receiving the shiny red bicycle already assembled and ready to take for an immediate ride?

In this world where we require instant satisfaction and immediate results, we don’t have time to assemble the bike. Do your system administrators have time to custom build a solution of hardware and software for your data warehouse?  Forget about that hardware and software being truly designed, integrated and optimized for analytic workloads.  What value are your users getting while the IT staff are doing that?  Do your DBAs have enough time to tune the system for every new data source that’s added or every new report requirement that one of your users needs?  We live in a world that demands agile response to changing requirements and immediate results.

Simple is still better for faster deployment

In this very complex world, simple solutions are better.  Just like the child preferring the bike that is already assembled and ready to go, the IBM PureData System for Analytics, powered by Netezza technology, has been delivering on the promise of simplicity and speed for over a decade.  Don’t just take my word for it.  In recent studies, International Technology Group compared the costs and time to value of PureData with those of both Teradata and Oracle.[i]   They researched customers deploying all three solutions and had some notable findings.  While over 75% of PureData customers deployed their appliances in under 3 weeks, not a single Teradata customer deployed in that same time frame and only one Oracle customer achieved that window.

Simple is still better for lower costs

Not only is the data warehouse appliance simple to deploy but it is architected for speed with minimal tuning or administration.  The same studies found that Teradata has 3.8x and Oracle 3.5x higher deployment costs than PureData System for Analytics and use more DBA resources to maintain the system.

Simple is still better, and now even more secure

The PureData System for Analytics N3001 series that was just announced has the same speed and simplicity as its predecessors, but adds improved performance, self-encrypting drives, and big data and business intelligence starter kits.  The self-encrypting drives encrypt all user and temp data for added security without any performance overhead or incremental cost to the appliance.

For more anecdotal examples of why simple is still better, watch this video or you can read this white paper or visit ibm.com/software/data/puredata/analytics/ for more information.

[i] ITG: Comparing Costs and Time to Value with Teradata Data Warehouse Appliance, May 2014.

ITG: Comparing Costs and Time to Value with Oracle Exadata Database Machine X3, June 2014.

About Wendy,

Wendy Lucas is a Program Director for IBM Data Warehouse Marketing. Wendy has over 20 years of experience in data warehousing and business intelligence solutions, including 12 years at IBM. She has helped clients in a variety of roles, including application development, management consulting, project management, technical sales management and marketing. Wendy holds a Bachelor of Science in Computer Science from Capital University, and you can follow her on Twitter at @wlucas001.