Why Are Customers Architecting Hybrid Data Warehouses?

By Mona Patel

As a leader in IT, you may be incentivized or mandated to explore cloud and big data solutions that transform rigid data warehousing environments into agile ones that match how the business really wants to operate. The following questions likely come to mind:

  • How do I integrate new analytic capabilities and data sets into my current on-premises data warehouse environment?
  • How do I deliver self-service solutions to accelerate the analytic process?
  • How do I leverage commodity hardware to lower costs?

For these questions, and more, organizations are architecting hybrid data warehouses. In fact, organizations moving towards hybrid are referred to as ‘Best In Class’ in The Aberdeen Group’s latest research: “Best In Class focus on hybridity, both in their data infrastructure and with their analytical tools as well. Given the substantial investments companies have made in their IT environment, a hybrid approach allows them to utilize these investments to the best of their ability while exploring more flexible and scalable cloud-based solutions as well.” To hear more about these ‘Best In Class’ organizations, watch the 45-minute webcast.

How do you get to this hybrid data warehouse architecture with the least risk and the most reward? IBM dashDB delivers highly flexible cloud database services that extend and integrate with your current analytics and data warehouse environment, addressing the challenges of leveraging new sources of customer, product, and operational insight to build new applications, products, and business models.

To help our clients evaluate hybrid data warehouse solutions, Harvard Research Group (HRG) provides an assessment of IBM dashDB. In this paper, HRG highlights product functionality as well as three use cases in Healthcare, Oil and Gas, and Financial Services. Security, performance, high availability, in-database analytics, and more are covered in the paper to ensure future architecture enhancements optimize IT rather than add new skill requirements, complexity, and integration costs. After reading this paper, you will find that dashDB enables IT to respond rapidly to the needs of the business, keep systems running smoothly, and achieve faster ROI.

To learn more about dashDB, check out the video below:


About Mona

Mona Patel is currently the Portfolio Marketing Manager for IBM dashDB, the future of data warehousing. With over 20 years of analyzing data at the Department of Water and Power, AirTouch Communications, Oracle, and MicroStrategy, Mona decided to grow her career at IBM, a leader in data warehousing and analytics. Mona received her Bachelor of Science degree in Electrical Engineering from UCLA.


What Should You Look For In Your Cloud Data Warehouse?

By Rahul Agarwal

The business benefits of cloud computing are well documented; according to an IBM study, organizations using cloud computing gain a competitive advantage over their peers and can generate two times more revenue and profit.[1]

But is the cloud the right place for data warehousing, which has traditionally been deployed on-premises and requires a significant investment in hardware infrastructure? A study by the Aberdeen Group finds that organizations are increasingly using cloud-based analytics to gain advantages such as business intelligence deployment times that are four times faster and 50% more users actively engaged with analytics.[2]

So what capabilities should you look for in your cloud data warehouse?

Simplicity

Your data warehouse in the cloud should let you focus on your data and your business problems, not the business of data warehousing (including tuning, planning, and integration). It should be simple to set up, ideally providing ‘load-and-go’ simplicity. In addition, it should easily ingest data from a myriad of sources, including structured, semi-structured (think JSON), and unstructured data, as sketched below.
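
To make the ‘load-and-go’ idea concrete, here is a minimal Python sketch of flattening semi-structured JSON into tabular rows ready for a bulk load. The file names and field names are hypothetical; a managed warehouse such as dashDB would typically accept the resulting CSV directly through its load interface.

```python
import csv
import json

# Hypothetical input: one JSON object per line, e.g.
# {"id": 1, "customer": {"name": "Acme", "region": "West"}, "total": 99.5}
with open("orders.json") as src, open("orders.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    writer.writerow(["id", "customer_name", "customer_region", "total"])
    for line in src:
        record = json.loads(line)
        writer.writerow([
            record["id"],
            record["customer"]["name"],    # flatten nested objects into columns
            record["customer"]["region"],
            record.get("total", 0.0),      # tolerate missing fields
        ])
```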

Speed

Speed-driven data and analytics practices are quickly emerging as a key source of competitive advantage for companies across the world.[3] Hence, it is extremely important to minimize the time it takes to convert the raw data in your enterprise into actionable insight. Today, high-performance technologies such as in-memory computing and in-database analytics make it possible to analyze data with both speed and precision. By running analytics in the database where the data resides, you avoid moving data out to a separate engine and gain huge efficiencies. When you couple in-memory technology with analytics, you can get answers to your business questions as fast as you can think of the next question to ask, with no waiting for analytic results.
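
To illustrate the in-database principle, here is a small sketch using IBM’s ibm_db Python driver; the sales table, its columns, and the connection details are placeholders, not part of any specific dashDB schema. The point is that the aggregation is expressed in SQL so it runs where the data lives, and only a handful of result rows cross the network.

```python
import ibm_db  # IBM's Python driver: pip install ibm_db

# Placeholder connection details for your own instance.
conn = ibm_db.connect(
    "DATABASE=BLUDB;HOSTNAME=<host>;PORT=50000;PROTOCOL=TCPIP;"
    "UID=<user>;PWD=<password>;", "", "")

# Push the aggregation down to the warehouse instead of fetching raw rows.
sql = ("SELECT region, SUM(revenue) AS total_revenue "
       "FROM sales GROUP BY region ORDER BY total_revenue DESC")
stmt = ibm_db.exec_immediate(conn, sql)

row = ibm_db.fetch_assoc(stmt)  # returns a dict per row, or False when done
while row:
    print(row["REGION"], row["TOTAL_REVENUE"])
    row = ibm_db.fetch_assoc(stmt)

ibm_db.close(conn)
```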

Interoperability with business intelligence tools

Your cloud data warehouse should let you write and execute your own analytic queries, or leverage the analytic and BI capabilities of tools such as Cognos, Looker, Aginity Workbench, Tableau, and others. Integration with such tools helps you better visualize and interact with your data, enabling a richer business intelligence experience.
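
For ad-hoc queries outside a BI tool, the same connectivity works from a notebook. Below is a minimal sketch, assuming the ibm_db_sa SQLAlchemy dialect and a hypothetical sales table; BI tools such as Cognos, Looker, or Tableau would point at the same host, port, and database over ODBC or JDBC.

```python
import pandas as pd
from sqlalchemy import create_engine  # pip install sqlalchemy ibm_db_sa

# Placeholder credentials for your own instance.
engine = create_engine("db2+ibm_db://<user>:<password>@<host>:50000/BLUDB")

# Execute an analytic query in the warehouse and pull back the small
# result set as a DataFrame for local visualization.
monthly = pd.read_sql(
    "SELECT sale_month, SUM(revenue) AS revenue "
    "FROM sales GROUP BY sale_month ORDER BY sale_month",
    engine,
)
print(monthly.head())
```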

Security

Your cloud data warehouse should be designed to keep your data secure with the same rigor that has come to be expected of an on-premises data warehouse. Any security or data breach can put your business operations at risk and has the potential not only to damage your company’s reputation but also to hurt its top and bottom lines.

Introducing dashDB

dashDB is a fully managed data warehouse in the cloud that meets all of these criteria: simplicity, speed, interoperability with BI tools, and security. dashDB is simple to get up and running and helps you deliver answers to your business questions easily, as fast as you can think.

To learn more and get started with the freemium offering, check out www.dashdb.com.

About Rahul Agarwal

Rahul Agarwal is a member of the worldwide product marketing team at IBM that focuses on data warehouse and database technology. Before joining IBM, Rahul held a variety of business management, product marketing, and other roles at companies including HCL Technologies and HP. Rahul studied at the Indian Institute of Management, Kozhikode and holds a bachelor of engineering (electronics) degree from the University of Pune, India. Follow Rahul on Twitter: @rahulag80

[1] http://www-03.ibm.com/press/us/en/pressrelease/42304.wss

[2] https://www14.software.ibm.com/webapp/iwm/web/signup.do?source=sw-infomgt&S_PKG=ov26256

[3] http://www-935.ibm.com/services/us/gbs/thoughtleadership/2014analytics/

Data Warehouse Modernization: Vetting Forrester’s Return-on-Investment Calculations

By James Kobielus

It’s well known that, in a prior life, I was an industry analyst focusing on the data warehousing (DW) market. So I think I have a good mental radar for identifying high-quality, data-driven DW research when I see it.

When you’re researching the potential return on investment (ROI) for a DW, you have to be rigorously quantitative, precise, and comprehensive in your approach. Enterprises often place the DW at the very heart of their big data and analytics strategies. Solid ROI metrics must support DW projects of any scope, and the range of competing alternatives demands a decision-support framework that facilitates apples-to-apples comparisons. Many DW projects involve modernizations at various levels, so ROI calculations must be adept at characterizing the potential bottom-line impact of new technologies, platforms, tools, and practices.

Sure, anybody can pull ROI estimates out of thin air, but finding metrics that can help you make DW investments with confidence can prove tricky. In that regard, I’ve long felt that Forrester Consulting’s Total Economic Impact (TEI) methodology is the best ROI calculation framework for information-technology (IT) investments of any sort. Grounded in Forrester’s extensive survey- and interview-based research, its TEI studies incorporate fine-grained benefit, cost, risk, and flexibility variables into an underlying spreadsheet-based model. The drivers, use cases, assumptions, formulas, and data intrinsic to Forrester’s ROI calculations are totally transparent, so you can vet them for yourself. On any specific TEI use case under scrutiny, Forrester projects the resulting ROI analysis over a risk-adjusted 5-year horizon from the point of view of a typical “composite organization” that uses the technology in question.
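
For readers who want to vet such numbers themselves, the core of any TEI-style calculation reduces to a risk-adjusted net present value and an ROI over the analysis horizon. The sketch below uses made-up cash flows, not Forrester’s figures; only the formulas are standard.

```python
# TEI-style ROI with illustrative, made-up annual cash flows (in $M).
benefits = [4.0, 6.0, 7.0, 8.0, 9.0]  # hypothetical benefits, years 1-5
costs = [5.0, 2.0, 2.0, 2.0, 2.0]     # hypothetical costs, years 1-5
discount_rate = 0.10                   # assumed cost of capital
risk_adjustment = 0.90                 # haircut benefits by 10% for risk

def present_value(flows, rate):
    """Discount a series of annual cash flows back to year 0."""
    return sum(f / (1 + rate) ** (year + 1) for year, f in enumerate(flows))

pv_benefits = present_value([b * risk_adjustment for b in benefits], discount_rate)
pv_costs = present_value(costs, discount_rate)

roi = (pv_benefits - pv_costs) / pv_costs
print(f"PV benefits ${pv_benefits:.1f}M, PV costs ${pv_costs:.1f}M, ROI {roi:.0%}")
```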

Back in my Forrester days, I sweated these details when constructing a now-outdated TEI study of the DW appliance market. So naturally I was very curious when, over the holiday season, IBM made available a new Forrester TEI covering our entire Information Management (IM) solution portfolio, but with a core focus on DW.

On my first pass through the report, I noticed the sorts of high-level rollup numbers that usually figure into marketing collateral and blogs about these kinds of studies. Specifically, Figure 1 states a 5-year risk-adjusted return of 148% and total benefits (present value) of $31.2 million for the typical composite organization. Still being an analyst at heart, I drilled more deeply into the study itself to determine exactly what those numbers refer to.

The first thing you see, from Figure 2, is that, among the three use cases in this Forrester TEI, “DW modernization” accounts for around $5m of the benefits, with “security intelligence extension” contributing a little over $3m and a whopping ~$23m coming from “enhanced 360-degree view of the customer.” Clearly, all of those are essentially DW-related returns.

When vetting a TEI, it’s best to single out the specific use case of interest. In my case, I focused on Forrester’s DW modernization use case, which estimates the quantitative bottom line from cost reductions and value enhancements due to more efficient storage and processing, speedier performance, and agile analytics. These are in line with the chief DW modernization drivers cited in Figure 6, which were derived from Forrester’s in-depth decision-maker interviews.

In terms of concrete decision support for DW professionals evaluating modernization initiatives, the real payoff from this study is on pages 27-30. These spell out the full assumptions for the use case: the scope of solutions included, the size of the composite organization’s IT budget, the percentage of that budget allocated to data and storage, the number and growth of terabytes of DW storage, the percent reduction in storage cost, the number of staff using big data analytics, and so on.
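
As a toy illustration of how those assumptions drive the bottom line, the sketch below parameterizes just the storage-cost-reduction piece. Every number is a placeholder to be replaced with your own environment’s values, not a figure from the study.

```python
# Hypothetical inputs mirroring the kinds of assumptions on pages 27-30.
cost_per_tb = 3_000            # annual fully loaded cost per terabyte ($)
dw_terabytes = 200             # current DW storage footprint
tb_growth = 0.30               # annual growth in DW terabytes
storage_cost_reduction = 0.40  # saving from modernization (e.g., compression)

# Project baseline storage spend and modernization savings over 5 years.
for year in range(1, 6):
    tbs = dw_terabytes * (1 + tb_growth) ** (year - 1)
    baseline = tbs * cost_per_tb
    savings = baseline * storage_cost_reduction
    print(f"Year {year}: {tbs:,.0f} TB, baseline ${baseline:,.0f}, "
          f"savings ${savings:,.0f}")
```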

Pay close attention to the solution scope under DW modernization. Forrester took the right approach by not limiting their analysis to DWs in the older, much more limited sense of premises-based analytic databases specializing only in structured, at-rest data for operational business intelligence. As they state on page 27, they included the broader sweep of big-data analytics, information integration, and governance solutions in IBM’s IM solution portfolio.

If they’d gone with a traditional DW scope, such as the one this former analyst included in his 2010 study, Forrester would have ignored the substantial evolution that this marketplace has experienced in this decade. If Forrester had stuck with that scope in this latest study, it would probably have limited its TEI to IBM PureData for Analytics, IBM DB2 with BLU Acceleration, and the IBM DB2 Analytics Accelerator for System z. But it did the right thing this time around (reflecting what our customers are doing) by including our Hadoop, streaming, discovery, and InfoSphere IIG offerings in the scope of a hybridized, cloud-focused DW infrastructure.

To see how far mainstream DW solutions have advanced into cloud-centric hybrid architectures, check out this blog I published a few months ago on the new IBM dashDB. I’m assuming that Forrester’s exclusion of dashDB, as well as Watson Analytics and DataWorks, from this recent study was due principally to their need to lock down their project’s scope many months ago before these specific solutions were launched.

For enterprise analytics and IT professionals, the DW modernization ROI that you calculate for your own situation depends on the assumptions you make and how you adjust the Forrester TEI model’s parameters to align with those. The beauty of the Forrester TEI methodology is that its model can be easily customized and use cases easily extended to do justice to the complex range of technologies in DW modernization initiatives. Depending on the project and your requirements, DW modernization may include various blends of new technologies (e.g., Hadoop, in-memory), new topologies (e.g., hybrid, distributed, and zone architectures), new sources (e.g., machine, social, and mobile data), new form factors (e.g., cloud, appliance), new tooling (e.g., governance, curation, archiving), new development frameworks (e.g., MapReduce), and new scaling and performance approaches (e.g., consolidation, compression, scale-out).

If I have any quibble with the latest Forrester TEI, it’s with their apparent exclusion of traditional DW use cases, such as operational BI (the focus of our Cognos portfolio), from their scope. Also, Forrester doesn’t give the newer DW use cases, such as in-database analytics for statistical modeling and data science (the focus of our SPSS portfolio), as much emphasis as I’d wish.

But those are just scoping issues that can be easily addressed if Forrester ever chooses to take this TEI analysis in those directions in coming years.

About James

James Kobielus is IBM Senior Program Director, Product Marketing, Big Data Analytics solutions. He is an industry veteran, a popular speaker and social media participant, and a thought leader in big data, Hadoop, enterprise data warehousing, advanced analytics, business intelligence, data management, and next-best-action technologies. Follow James on Twitter: @jameskobielus

What the Future Holds for the Database Administrator (DBA)

By Rich Hughes

Scanning the archives as far back as 2000 reveals articles speculating on the future of the DBA. With mounting operational costs attributed to the day-to-day maintenance of data warehouses, even 15 years ago this was a fair question to ask. The overhead of creating indexes and tuning individual queries, on top of the necessary nurturing of the infrastructure, had many organizations looking for more cost-effective alternatives.

The data warehouse appliance was born of the motivation to fix the I/O bottleneck that traditionally handicapped data warehouses, and of the design goals of reduced administration and easy data access for users. Netezza built the original data warehouse appliance, which, by brilliantly combining hardware and software, brought the query processing much closer to the data. This breakthrough paved the way for lower administrative costs and forced others in the data warehouse market to think of additional ways to solve the I/O problem.

To be sure, Netezza’s disruptive technology of no indexing, great performance, and ease of administration left many DBAs feeling threatened. But what was really threatened was the frustrating, never-ending search for data warehouse performance via indexing. Netezza DBAs got their nights and weekends back, and they made themselves more valuable to their organizations by using the time that no-indexing saved to get closer to the business. Higher-level skills taken on by DBAs included data stewardship and data modeling, and in this freer development environment, advanced analytics took root. In the data warehouse appliance world, much more DBA emphasis was placed on the business applications because the infrastructure was designed to run, for the most part, unassisted.

Fast-forward to the present day, where the relentless pursuit of IT cost efficiencies while providing more business value continues. Disruptive technologies have been invented in the past decade to fill this demand, such as the Hadoop ecosystem and the maturing cloud computing environment. Hardware advances have pushed in-memory computing forward, solid-state drives are phasing out spinning-disk storage, and 128-bit CPUs and operating systems are on the drawing board. Databases like IBM’s dashDB have benefitted by incorporating several of these newer hardware and software advances.

So, 15 years into the new millennium, what’s a DBA to do? Embrace change, and realize there is plenty of good news and much data to administer. While the cloud’s infrastructure and platform services will decrease on-premises DBA work over time, the added complexity will demand new solutions for determining the right mixture of on-premises, off-premises, and hybrid platforms. Juggling the organization’s data warehouse workload requires different approaches if the cloud’s elasticity and cheaper off-hour rates are to be leveraged.

Capacity planning and data retention take on new meaning in a world where it is now possible to store and access everything: what is the return value of all that information? The DBA will be involved in cataloging the many new data sources, as well as getting a handle on the unstructured data produced by the Internet of Things. When to move data, whether to persist it, and how new data interacts with existing schemas are all good questions for the thoughtful DBA to consider. And that is just on the ingest side of the ledger. Who gets access, what the security levels are, how applications can be rapidly developed, how to re-use SQL in a NoSQL world, and how best to federate all this wonderful data are worthwhile areas of study.

In summary, the role of the Database Administrator has always been evolving, forced by technology advances and rising business demands. The role has required, and will continue to require, general knowledge of several IT disciplines, with the opportunity to specialize. Historically, the DBA who keeps current can go deeper in a particular technology, a move that benefits both their career and their organization’s needs. The DBA can logically move into an architect or Data Scientist position, the higher-level skill sets for today’s world. What has not changed is the demand to deliver reliable, affordable, and valuable information.

About Rich Hughes

Rich Hughes is an IBM Marketing Program Manager for Data Warehousing. Hughes has worked in a variety of Information Technology, Data Warehousing, and Big Data jobs, and has been with IBM since 2004. Hughes earned a Bachelor’s degree from Kansas University and a Master’s degree in Computer Science from Kansas State University. Writing about the original Dream Team, Hughes authored a book on the 1936 US Olympic basketball team, a squad composed of oil refinery laborers and film industry stage hands. You can follow him on Twitter: @rhughes134