What Should You Look For In Your Cloud Data Warehouse?

By Rahul Agarwal

The business benefits of cloud computing are well documented; according to an IBM study, organizations using cloud computing gain a competitive advantage over their peers and can generate two times more revenue and profit.[1]

But is the cloud the right place for data warehousing, which has traditionally been deployed on-premise (requiring a significant investment in hardware infrastructure)? A study by the Aberdeen Group finds that organizations are increasingly using cloud-based analytics to gain advantages such as four times faster business intelligence deployment and 50% more users actively engaged with analytics.[2]

So what parameters should you look for in your cloud data warehouse?


Your data warehouse on the cloud should let you focus on your data and your business problems, not the business of data warehousing (including tuning, planning and integration). It should be simple to set up, ideally providing ‘load-and-go’ simplicity. In addition, it should make it easy to ingest data from a myriad of sources, whether that data is structured, semi-structured (think JSON) or unstructured.
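To make the idea of ingesting semi-structured data concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in for a warehouse. The table name `events`, the field names, and the helper `load_json_lines` are hypothetical and chosen only for illustration; a real cloud data warehouse would have its own loading tools.

```python
import json
import sqlite3


def load_json_lines(conn, lines):
    """Load newline-delimited JSON records into a simple table.

    Known fields become columns; anything under the optional "extra"
    key is kept as a JSON string so no information is lost.
    The schema here is a hypothetical example, not a product API.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(user_id TEXT, action TEXT, payload TEXT)"
    )
    rows = []
    for line in lines:
        record = json.loads(line)
        rows.append(
            (
                record.get("user_id"),
                record.get("action"),
                json.dumps(record.get("extra", {})),
            )
        )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
    sample = [
        '{"user_id": "u1", "action": "click", "extra": {"page": "home"}}',
        '{"user_id": "u2", "action": "view"}',
    ]
    print(load_json_lines(conn, sample), "records loaded")
```

The point of the sketch is that records with differing shapes can land in one table without a schema change for every new field.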


Speed-driven data and analytics practices are quickly emerging as a key source of competitive advantage for companies across the world.[3] Hence, it is extremely important to minimize the time it takes to convert the raw data in your enterprise into actionable insight. Today, a number of high-performance technologies, such as in-memory computing and in-database analytics, make it possible to analyze data with high speed and precision. By running the analytics in the database where the data resides, you avoid moving data out to a separate analytics tier and gain significant efficiencies. When you couple in-memory technology with analytics, you can get answers to your business questions as fast as you can think of the next question to ask, with no waiting for analytic results to run.
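The difference between running analytics where the data resides and pulling the data out first can be sketched as follows. This is an illustration only: it uses Python's standard-library sqlite3 module with an in-memory database as a stand-in for a warehouse, and the `sales` table and function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory table to stand in for warehouse data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 100.0), ("east", 50.0), ("west", 75.0)],
    )
    return conn


def totals_in_database(conn):
    """In-database approach: the engine aggregates, and only one
    summary row per region crosses the connection."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


def totals_in_client(conn):
    """Client-side approach: every row is transferred and then
    aggregated in application code."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


if __name__ == "__main__":
    conn = build_demo_db()
    print(totals_in_database(conn))
    print(totals_in_client(conn))
```

Both functions return the same totals. With three rows the cost is identical, but on a large table the first approach transfers a handful of summary rows while the second transfers the entire table.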

Interoperability with business intelligence tools

Your cloud data warehouse should give you the ability to write and execute your own analytic queries, or to leverage the analytic and BI capabilities provided by tools such as Cognos, Looker, Aginity Workbench, Tableau, and others. Integration with such tools will help you better visualize and interact with your data, enabling a richer business intelligence experience.


Your cloud data warehouse should be designed to keep your data secure with the same rigor that has come to be expected from an on-premise data warehouse. Any security or data breach can put your business operations at risk and has the potential to damage not only your company’s reputation but also its top and bottom lines.

Introducing dashDB

dashDB is a fully managed data warehouse in the cloud that meets all of these criteria for simplicity, security and analytics in an instant. dashDB is simple to get up and running and helps you build the capability to deliver answers to your business questions easily and as fast as you can think.

To learn more and get started with the freemium plan, check out www.dashdb.com.

About Rahul Agarwal

Rahul Agarwal is a member of the worldwide product marketing team at IBM that focuses on data warehouse and database technology. Before joining IBM, Rahul held a variety of business management, product marketing, and other roles at companies including HCL Technologies and HP. Rahul studied at the Indian Institute of Management, Kozhikode and holds a bachelor of engineering (electronics) degree from the University of Pune, India. Rahul’s Twitter handle: @rahulag80

[1] http://www-03.ibm.com/press/us/en/pressrelease/42304.wss

[2] https://www14.software.ibm.com/webapp/iwm/web/signup.do?source=sw-infomgt&S_PKG=ov26256

[3] http://www-935.ibm.com/services/us/gbs/thoughtleadership/2014analytics/

Data Warehouse Modernization: Vetting Forrester’s Return-on-Investment Calculations

By James Kobielus

It’s well known that, in a prior life, I was an industry analyst focusing on the data warehousing (DW) market. So I think I have a good mental radar for identifying high-quality, data-driven DW research when I see it.

When you’re researching the potential return on investment (ROI) for a DW, you have to be rigorously quantitative, precise, and comprehensive in your approach. Enterprises often place the DW at the very heart of their big data and analytics strategies. Solid ROI metrics must support DW projects of any scope, and the range of competing alternatives demands a decision-support framework that facilitates apples-to-apples comparisons. Many DW projects involve modernizations at various levels, so ROI calculations must be adept at characterizing the potential bottom-line impact of new technologies, platforms, tools, and practices.

Sure, anybody can pull ROI estimates out of thin air, but finding metrics that can help you make DW investments with confidence can prove tricky. In that regard, I’ve long felt that Forrester Consulting’s Total Economic Impact (TEI) methodology is the best ROI calculation framework for information-technology (IT) investments of any sort. Grounded in Forrester’s extensive survey- and interview-based research, its TEI studies incorporate fine-grained benefit, cost, risk, and flexibility variables into an underlying spreadsheet-based model. The drivers, use cases, assumptions, formulas, and data intrinsic to Forrester’s ROI calculations are totally transparent, so you can vet them for yourself. On any specific TEI use case under scrutiny, Forrester projects the resulting ROI analysis over a risk-adjusted 5-year horizon from the point of view of a typical “composite organization” that uses the technology in question.

Back in my Forrester days, I sweated these details when constructing a now-outdated TEI study of the DW appliance market. So naturally I was very curious when, over the holiday season, IBM made available a new Forrester TEI covering our entire Information Management (IM) solution portfolio, but with a core focus on DW.

On my first pass through the report, I noticed the sorts of high-level rollup numbers that usually figure into most marketing collateral or blogs on these kinds of studies. Specifically, Figure 1 states a 5-year risk-adjusted return of 148% and total benefits (present value) of $31.2 million for the typical composite organization. Still being an analyst at heart, I drilled more deeply into the study itself to determine what exactly it refers to.

The first thing you see, from Figure 2, is that, among the three use cases in this Forrester TEI, “DW modernization” accounts for around $5m of the benefits, with “security intelligence extension” a little over $3m and a whopping ~$23m from “enhanced 360-degree view of the customer.” Clearly, all of those are essentially DW-related returns.

When vetting a TEI, it’s best to single out the specific use case of interest. In my case, I focused on Forrester’s DW modernization use case, which estimates the quantitative bottom line from cost reductions and value enhancements due to more efficient storage and processing, speedier performance, and agile analytics. These are in line with the chief DW modernization drivers cited in Figure 6, which were derived from Forrester’s in-depth decision-maker interviews.

In terms of concrete decision support for DW professionals evaluating modernization initiatives, the real payoff from this study is on pages 27-30. These spell out the full assumptions for the use case, including scope of solutions included, size of the composite organization’s IT budget, percentage of that budget allocated to data and storage, number and growth of terabytes of DW storage, percent reductions in storage cost, number of staff using big data analytics, and so on.

Pay close attention to the solution scope under DW modernization. Forrester took the right approach by not limiting their analysis to DWs in the older, much more limited sense of premises-based analytic databases specializing only in structured, at-rest data for operational business intelligence. As they state on page 27, they included the broader sweep of big-data analytics, information integration, and governance solutions in IBM’s IM solution portfolio.

If they’d gone with a traditional DW scope, such as the one this former analyst included in his 2010 study, Forrester would have ignored the substantial evolution that this marketplace has experienced in this decade. If Forrester had stuck with that scope in this latest study, it would probably have limited its TEI to IBM PureData for Analytics, IBM DB2 with BLU Acceleration, and IBM Digital Analytics Accelerator for System Z. But it did the right thing this time around (reflecting what our customers are doing) by including our Hadoop, streaming, discovery, and InfoSphere IIG offerings in the scope of a hybridized, cloud-focused DW infrastructure.

To see how far mainstream DW solutions have advanced into cloud-centric hybrid architectures, check out this blog I published a few months ago on the new IBM dashDB. I’m assuming that Forrester’s exclusion of dashDB, as well as Watson Analytics and DataWorks, from this recent study was due principally to their need to lock down their project’s scope many months ago before these specific solutions were launched.

For enterprise analytics and IT professionals, the DW modernization ROI that you calculate for your own situation depends on the assumptions you make and how you adjust the Forrester TEI model’s parameters to align with those. The beauty of the Forrester TEI methodology is that its model can be easily customized and use cases easily extended to do justice to the complex range of technologies in DW modernization initiatives. Depending on the project and your requirements, DW modernization may include various blends of new technologies (e.g., Hadoop, in-memory), new topologies (e.g., hybrid, distributed, and zone architectures), new sources (e.g., machine, social, & mobile data), new form factors (e.g., cloud, appliance), new tooling (e.g., governance, curation, archiving), new development frameworks (e.g., MapReduce), and new scaling and performance approaches (e.g., consolidation, compression, scale-out).
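To show how sensitive an ROI figure is to the assumptions behind it, here is a minimal Python sketch of a risk-adjusted, present-value ROI calculation. It is not Forrester's model. It assumes ROI is defined as net benefits divided by costs, both in present-value terms, and it applies risk as a simple multiplier on each cash flow. The function names and every number in the usage example are hypothetical.

```python
def present_value(cash_flows, discount_rate):
    """Discount year-end cash flows (year 1, 2, ...) back to today."""
    return sum(
        amount / (1 + discount_rate) ** year
        for year, amount in enumerate(cash_flows, start=1)
    )


def risk_adjusted_roi(
    benefits,
    costs,
    discount_rate,
    benefit_risk_factor=1.0,
    cost_risk_factor=1.0,
):
    """Return ROI as a fraction: (PV benefits - PV costs) / PV costs.

    Risk adjustment here is an assumed simplification: benefits are
    scaled down and costs scaled up by constant factors before
    discounting.
    """
    pv_benefits = present_value(
        [b * benefit_risk_factor for b in benefits], discount_rate
    )
    pv_costs = present_value(
        [c * cost_risk_factor for c in costs], discount_rate
    )
    return (pv_benefits - pv_costs) / pv_costs


if __name__ == "__main__":
    # Hypothetical five-year figures in millions, for illustration only.
    benefits = [4.0, 6.0, 8.0, 9.0, 10.0]
    costs = [5.0, 2.0, 2.0, 2.0, 2.0]
    for haircut in (1.0, 0.9, 0.8):
        roi = risk_adjusted_roi(
            benefits, costs, discount_rate=0.10, benefit_risk_factor=haircut
        )
        print(f"benefit risk factor {haircut:.1f}: ROI {roi:.0%}")
```

Changing the discount rate, the risk factors, or the year-by-year estimates shifts the result, which is why the assumptions deserve as much scrutiny as the headline percentage.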

If I have any quibble with the latest Forrester TEI, it’s with their apparent exclusion of traditional DW use cases, such as operational BI (the focus of our Cognos portfolio), from their scope. Also, Forrester doesn’t give the newer DW use cases, such as in-database analytics for statistical modeling and data science (the focus of our SPSS portfolio), as much emphasis as I’d wish.

But those are just scoping issues that can be easily addressed if Forrester ever chooses to take this TEI analysis in those directions in coming years.

About James

James Kobielus is IBM Senior Program Director, Product Marketing, Big Data Analytics solutions. He is an industry veteran, a popular speaker and social media participant, and a thought leader in big data, Hadoop, enterprise data warehousing, advanced analytics, business intelligence, data management, and next best action technologies. Follow James on Twitter: @jameskobielus

Hybrid Data Warehousing – The Best of All Worlds

By Wendy Lucas

When it comes to data warehousing, organizations are progressing along the maturity curve at their own individual pace.  Today, most organizations have some form of warehouse and business intelligence in place, or recognize the need for it and the benefits it can drive.  But we all know that technology doesn’t stand still.  And so, you are now faced with a new step in your progression towards data warehouse maturity – the move to cloud.

Building Momentum

Cloud applications started with a fairly narrow focus.  A few years ago, you may have viewed the cloud as a viable platform for mobile applications or just a way to keep your contacts synchronized between your devices (by the way, that is still my favorite cloud use case).  IT organizations have begun looking to the cloud as a way to cut costs, but the strong momentum behind cloud adoption indicates there is more to it than that!

According to a recent IBM Tech Trend study, cloud adoption is up 92% since 2012. The same study shows that organizations identified as pacesetters are 10x more likely to increase workforce efficiency with the cloud, are 5x more likely to enhance communication and collaboration, and report a 4x better customer experience. Pick your research outlet and you will find similar statistics.

One of Forrester’s top cloud computing predictions for 2015 is that “hybrid cloud management gets real” in terms of having the tools to allow you to manage across multiple on-premise and cloud platforms.

I believe that cloud use cases are the driving factor behind the growth and momentum of cloud technologies. Data warehousing on the cloud is no exception, where the general need is to deliver analytics to the organization faster.  Let’s explore specific data warehouse use cases.

Use case 1: development, testing, prototyping and sandboxing

A safe place to start might be establishing a cloud environment for warehouse development and testing. Do you need the ability to test key functions like ETL processes or analytic applications without the need to set up more costly infrastructure on-premise? Why not consider testing in the cloud? Perhaps you need an environment in which to do quick prototyping or sandboxing? Whether the environment is temporary or persistent, a cloud data warehouse instance can be quickly stood up and used for prototyping and sandboxing at minimal cost.

Use case 2: do more with less when you are at capacity

Organizations are also considering cloud as a way to expand capacity of their existing data warehouse.  In the context of the logical data warehouse, data assets can reside on the cloud to serve up specific types of data to specific applications.

Use case 3: self-service analytics

Organizations can use the cloud as a data layer for self-service business intelligence and analytic capability, especially for applications that need data that’s already in the cloud. For example, your marketing organization may need to analyze unstructured social media data.

Both IT and the line of business can benefit from these and other use cases. IT organizations are able to reduce infrastructure costs and simplify budgets by shifting from a capital expense model to an operational expense model. Perhaps most importantly, the flexibility and agility of a cloud option provides faster time to insight for end users who need insight immediately.

What should I move to the cloud?

If cloud is so great, why not move everything to the cloud? The reality is that some applications will remain on-premise for some time to come (or forever). Systems that require large amounts of on-premise or sensitive data, or that generate large volumes of data, may not be easily moved to the cloud. It may make sense to leave these in the on-premise data warehouse systems that have matured over decades and are fulfilling the needs of those applications quite well. But as discussed above, you may not want to incur capital expenditure or longer deployment times for things like data marts, development and test environments, or analytics for data already in the cloud. These represent ideal opportunities to use a cloud data warehouse.

There isn’t a one-size-fits-all answer, which is why hybrid environments make the most sense. A hybrid environment can provide the best of all worlds: the ability to keep your large, on-premise warehouses in place, comply with security and regulatory reporting requirements, and fulfill the needs of traditional reporting and analysis. All of this is done while continuing to reduce costs and increase flexibility and speed of deployment for new applications in the cloud. As with most things, it’s best to pick the right tool for the job.

What tools can help me get there?

IBM data warehouse solutions offer the breadth and depth of capabilities required to effectively support a hybrid environment. On cloud, IBM dashDB is our exciting new data warehouse and analytics as a service. It concluded the beta program for Cloudant and the enterprise plan on December 18th and is now generally available. It pulls together the lightning-fast performance of DB2 with BLU Acceleration and market-leading in-database analytic capabilities from Netezza. Think of it as the fastest data warehouse and analytic platform combined with the flexibility and agility of the cloud. dashDB will continue to evolve in a way that preserves analytic and application portability between the cloud and on-premise systems. Most importantly, as you modernize with a hybrid cloud approach, the enterprise plan is available to support you at scale.

And of course, on premises, IBM offers DB2 with BLU Acceleration as a software-only solution or the IBM PureData System for Analytics as a ready-to-go data warehouse appliance. Putting these pieces together, you can support your hybrid data warehousing needs with proven technologies that offer the best of all worlds.

For more information, please visit dashDB.com.

About Wendy

Wendy Lucas is a Program Director for IBM Data Warehouse Marketing. Wendy has over 20 years of experience in data warehousing and business intelligence solutions, including 12 years at IBM. She has helped clients in a variety of roles, including application development, management consulting, project management, technical sales management and marketing. Wendy holds a Bachelor of Science in Computer Science from Capital University. Follow Wendy on Twitter: @wlucas001