Why Simple, by Design, is Still Better

By Rich Hughes

“Simplicity is the ultimate sophistication” is a succinct design dictum attributed to perhaps the greatest designer of all time, Leonardo da Vinci. An aviation engineer took the simplicity standard to another level with the design phrase “Keep it simple, stupid.” Lockheed engineer Kelly Johnson coined the KISS principle because his project responsibilities demanded ease of use in flight operations: his 1950s-era aircraft, once built, had to be maintained by mechanics of average skill, using an elementary tool kit, while working under combat conditions. In the 1970s, Johnson’s simplicity-driven engineering methodology inspired programmers, and KISS became a leading design philosophy for the emerging software industry.

In a May 2014 article entitled “Simple is STILL Better,” Mike Kearney describes data warehouse implementation using the KISS software design philosophy. Kearney draws a distinction between the first generation of data warehouses, built on general-purpose Relational Database Management Systems (RDBMS), and the second generation, most prominently represented by the IBM PureData System for Analytics family.

The earlier RDBMS platforms date from around 1980, while the latter systems, powered by Netezza technology, came to market in the early 2000s. Data warehouses built on 1980s-era, general-purpose RDBMS have historically struggled to handle the heavy I/O demanded by users accessing large data volumes. The PureData System for Analytics, by contrast, was specifically designed to overcome the I/O problem, give data warehouse users fast access, and deliver the results in a simple-to-use platform.

Think of an NFL football stadium built long ago and later refitted for the modern game. Chicago’s Soldier Field was erected in 1924 at a cost of $14,000,000 and seated about 74,000 fans. Soldier Field’s exterior remained while the interior was gutted during extensive renovations that brought the stadium up to NFL standards but reduced seating to 63,000. The team was forced to find a temporary home during the 2001-03 renovation, and Bears fans traveled the 140 miles to Champaign, IL, for the 2002 Chicago ‘home’ games.

First-generation RDBMS data warehouses need extensive renovation, not unlike Soldier Field, to adapt to data warehouse demands. The PureData System for Analytics, built from the ground up, resembles the new, fan-friendly home of the San Francisco 49ers. Both Levi’s Stadium and the PureData System for Analytics were purpose-built around the design principles of easy user access and manageable operations.

Kearney’s article highlights speed delivered with simplicity as the combined reason for the success of data warehouses powered by purpose-built Netezza technology. The burden for general-purpose platforms refitted for data warehousing is the not-so-hidden complexity that drives up ownership costs and reduces business value. These necessary but non-value-adding administrative expenses limit the potential of data warehouses based on earlier technologies. By contrast, the “… IBM PureData System shields DBAs from the complications of data management and business users enjoy immediate access to their data as soon as their new system is installed”.

Several success stories validate the inherent advantages of speed delivered with simplicity:

“With the IBM PureData System for Analytics, we can reduce the time to analyze complex GIS data from days to minutes — a more than 98 percent improvement.” 

  • Steve Trammell, Strategic Alliances Marketing Manager, Corporate Alliances and IT Marketing at Esri

“Premier Healthcare Alliance has benefited from PureData System for Analytics in four ways: its simpler administration, faster query response time, faster load times, as well as in-database analytics.”

  • Todd Wilkes, Vice President for Enterprise Solution Development for the Premier Healthcare Alliance.

 

There might be nostalgic appeal in watching football at historic but retrofitted venues, just as there is inertia in standing pat with first-generation data warehouse technology. But for the best business payback and time-to-value, the simply fast PureData System for Analytics is designed for your success.

About Rich Hughes

Rich Hughes has worked in a variety of Information Technology, Data Warehousing, and Big Data jobs, and has been with IBM since 2004. Hughes earned a Bachelor’s degree from Kansas University, and a Master’s degree in Computer Science from Kansas State University. Writing about the original Dream Team, Hughes authored a book on the 1936 US Olympic basketball team, a squad composed of oil refinery laborers and film industry stage hands. Hughes is an IBM Marketing Program Manager for Data Warehousing, and you can follow him on Twitter at @rhughes134.

IBM Insight 2014: Data Warehousing and Analytics in the Era of Big Data

By Wendy Lucas

Rapidly evolving technology, demanding business requirements and a world where insight drives competitive advantage are continuing to change the definition of the traditional data warehouse.  Where does today’s data warehouse fit into the information management landscape?  Is your data warehouse evolving to fit the needs of big data and analytics?

We all know the world is changing.  Technology allows us to consume more data and generate new insight.  As technology allows IT to deliver more to an organization, businesses need to quickly generate insight from information to accelerate informed decision-making and meet new user demands for mobility and self-service.  Rushing to meet these demands can easily generate complexity within traditional data warehouse architectures.   Is your architecture getting too complex?

Yet even as IT solutions grow in complexity, organizations are still challenged with adding more capability and answering the following questions:

  • How should my architecture change to handle big data?
  • How do I deliver faster insight for my users?
  • How do I simplify operations so my resources can spend more time innovating?
  • How do I add new data or capability without impacting performance?

Meeting these challenges requires solutions that provide simplicity in terms of implementation and administration, and seamlessly integrate with other information sources to deliver on the value in big data analytics.   And let’s not forget performance.  Great performance is a given.  It’s expected.  Solutions must have the unique combination of speed and simplicity to deliver business results and provide a platform that is agile and flexible to adapt to rapidly changing business needs. 

IBM’s Data Warehousing Solutions

IBM solutions for data warehousing are developed to simplify and accelerate the delivery of insights for business analytics, helping you meet the challenges of data warehousing.   It is IBM’s point of view that the data warehouse plays a fundamental role in providing the data foundation for analytics, but that the data warehouse must be modernized to take advantage of new data sources and systems of engagement.   We define the modern data warehouse as the means by which organizations can intelligently store and quickly access data, and deliver data to where it provides the most value.

“Through 2020, over 90% of big data implementations will augment, not replace, existing data warehouses.”[1]

Data Warehouse modernization builds on your existing foundation.  It does not require a rip and replace.  We see clients modernize their environments by adding new data sources or capacity, increasing usage or analytic capability, accelerating performance for faster time to insight, and by exploiting new technology innovations such as Hadoop, in-memory and columnar technology to boost performance of analytics.  IBM offers solutions to meet each of these entry points to data warehouse modernization. 

Data Warehousing and Analytics at Insight 2014

The Data Warehousing and Analytics track at the IBM Insight 2014 conference is dedicated to helping you turn these information challenges into opportunities.   More than 40 Elective Sessions, 8 hands-on Labs and Expert Exchanges will feature key topics such as:

  • Data Warehouse architecture and modernization
  • Data Warehouse best practices
  • Integrating data warehouse and Hadoop
  • Accelerating time-to-value and speed to insight
  • Client sessions with experiences from various industries including consumer products, financial services, healthcare, insurance and retail
  • Product sessions featuring DB2 with BLU Acceleration, IBM PureData System for Analytics (powered by Netezza), IBM DB2 Analytics Accelerator, and Informix Warehouse

Sunday, October 26 is a full day dedicated to PureData-Netezza enthusiasts.  Don’t miss this full day dedicated to discussing technical tips and best practices for the PureData System for Analytics, powered by Netezza technology.  Start off your week with this comprehensive, immersive trip through Enzee Universe, followed by a cocktail reception.  Register today and add these two sessions to your schedule:

  • IWS-6951 Enzee Universe Part 1 – Technical Sessions & Best Practices
  • IWS-6952 Enzee Universe Part 2 – Business and Product Strategy

Be sure to catch these spotlight/keynote sessions:

  • IWM-4857 The State of Data Warehousing in the Big Data World
  • IWM-4859 Driving Analytics with Common Architectural Patterns

And here are just a few examples of other informative electives:

  • IWS-4681 IBM DB2 Analytics Accelerator:  trends and directions
  • IWA-5303 Designing an integrated Big Data and DW landscape with IBM industry models
  • IWS-5338 Why University of Toronto is loving BLU – faster, smaller and simpler in only a few hours
  • IWS-5571 Case study with Dick’s Sporting Goods on Oracle to IBM PureData System for Analytics migration
  • IWM-6069 Comparing the Total Cost of Ownership of PureData for Analytics to its competitors
  • IWS-6295 DB2 with BLU Acceleration: the deep dive on DB2’s new super fast, super easy engine for analytics
  • IWS-6326 What’s new with IBM PureData System for Analytics, powered by Netezza

… and many others.  In addition to elective sessions, there are Expo Hall exhibits, demos and meetings with subject matter experts.  Visit the Insight website for more information, and then build your agenda using the Agenda Preview Tool.

[1] “The State of Data Warehousing in 2014”, June 19, 2014, Gartner.

About Wendy Lucas

Wendy Lucas is a Program Director for IBM Data Warehouse Marketing. Wendy has over 20 years of experience in data warehousing and business intelligence solutions, including 12 years at IBM. She has helped clients in a variety of roles, including application development, management consulting, project management, technical sales management and marketing. Wendy holds a Bachelor of Science in Computer Science from Capital University and you can follow her on Twitter at @wlucas001.

Modernizing the Data Warehouse Client Experience

By Nancy Hensley

The world of Data Warehousing continues to evolve.  Not only are users clamoring for analytics that run as fast as you can think, but they also want better consumability and improved accessibility.  Basically, consumers of the data warehouse want superfast analytics that are accessible, on demand, and highly simplified.

Ten years ago we all would have fallen off our chairs at that request.  Data warehouses were big, complex and expensive.  We were so busy managing them that it was tough to keep the consumers of the warehouse happy.  The challenges of the traditional warehouse stemmed from time to value, the ability to expand analytic services, speed and cost.  Because of those challenges, most organizations limited data warehouse services to business analysts, data scientists and a few savvy departments.  But that wasn’t the goal when we started the journey of data warehousing.

Analytics for everyone was always the goal of data warehousing; however, complexity slowed us down dramatically.  The operational side of our clients’ businesses needed real-time performance with very low latency, while many business units wanted complex reporting and analysis.  The combination of the two workloads made the architecture complex, and made both service levels and performance difficult to manage.

Time to value for data warehousing got a significant boost with the introduction of the appliance, and we were able to deliver services in days as opposed to months.  Complex reporting and in-database performance improved, and in-memory technology allowed for analysis as fast as you can think.  Even so, while the data warehouse appliance continues to provide great value and performance, the demand for analytics continues to grow at a rate that IT staff and budgets usually cannot keep up with.  The real value of the data warehouse is only realized when you can deliver business intelligence more pervasively across your business and to all touchpoints in your operations.  The traditional data warehouse model is challenged in its ability to scale to support this goal.

Self-service has also become an increasingly important requirement: the ability to seize the moment and run what-if analysis, examine the competitive landscape, understand brand sentiment in real time, or satisfy other short-term requests that are difficult to accommodate in the on-premises world.  Hardware must be procured, and the performance of current analytics cannot be compromised, to meet these short-term but important self-service needs.

Enter the Cloud in the world of Data Warehousing

As Data Warehousing moves to the cloud we see yet another evolution, and a step toward simplifying the delivery of business analytics for our clients.  IBM has focused on capturing the enterprise data warehouse experience in the cloud by combining the best of our on-premises technology in a highly consumable cloud offering supporting both cloud and hybrid ground-to-cloud deployments.  Simple, powerful, agile and yes, secure!

Now self-service becomes not only possible but simple.  Analytic portability between on-premises and cloud means you can develop and analyze in the cloud and then move on-premises, or the reverse.  And you do not need to sacrifice performance for simplicity, because IBM’s solution lets you leverage in-database capabilities with the power of in-memory in a secure, agile solution.

Sound too good to be true?

Attend the Data Warehouse Spotlight to hear about these shifts toward data warehouse services in the cloud, along with the latest improvements to our on-premises solutions.  The session will cover our latest Data Warehousing Cloud solutions, data warehouse appliances and super-fast in-memory technology.

Join us for Insight session #4857 and find out how IBM is changing the client experience for data warehouse services.  We will cover exciting new announcements.


About Nancy

Nancy Hensley has been in the data warehousing and BI industry for over 19 years. Nancy worked in the early days of enterprise data warehousing, spatial warehousing and executive reporting as a customer in a Fortune 50 company and joined IBM in 1999. In 2004, Nancy led the team that brought the first IBM data warehouse appliance to market. From her position leading the data warehouse architect team in the field, Nancy moved into the development organization focusing on data warehouse solutions and database technology. Today Nancy works in product marketing and strategy for IBM data warehouse solutions. You can follow Nancy on Twitter @nancykoppdw.

Don’t Be Distracted By The Noise – Stick to the Facts When Comparing Analytic Appliances

By Wendy Lucas

Vendor comparisons of analytic appliances will rage on, long after the life span of this blog.  However, I would suggest you not be distracted by comparisons that aren’t rooted in fact or aren’t core to the value a data warehouse should provide.  Don’t be distracted by the noise!  Stick to the things that matter: how well is the data warehouse serving end users, how fast is the time to value, is it easy to use and maintain, and what about its total cost of ownership?

In a recent paper, Teradata has challenged the IBM PureData System for Analytics on a number of points.  Let’s review just a few of them.

Concurrency

Teradata claims the PureData System for Analytics has a limit of “63 concurrent transactions per system” and that Teradata can handle “up to millions of concurrent queries and transactions on a single system.”  What they don’t mention is that the Teradata 1XXX and 2XXX series systems have two throttles that are turned on by default, limiting the system to 52 concurrent requests, system wide.  A better question might be: is concurrency as important as throughput and meeting service levels?  If your system doesn’t respond to queries as quickly, queries build up and more of them run concurrently, hence the importance of concurrency on a Teradata system.
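The point about queries building up can be made precise with Little’s Law from queueing theory: the average number of requests in flight equals the arrival rate multiplied by the average response time. A minimal sketch, using illustrative numbers rather than measured figures from either vendor:

```python
def queries_in_flight(arrival_rate_per_sec: float, avg_response_sec: float) -> float:
    """Little's Law (L = lambda * W): average number of queries
    running concurrently on a system in steady state."""
    return arrival_rate_per_sec * avg_response_sec

# The same workload (5 queries/sec) on two hypothetical systems:
fast_system = queries_in_flight(5.0, avg_response_sec=2.0)   # 10 concurrent queries
slow_system = queries_in_flight(5.0, avg_response_sec=30.0)  # 150 concurrent queries
print(fast_system, slow_system)
```

In other words, a system that answers each query faster needs far less concurrency headroom to sustain the same throughput, which is why a raw concurrency limit says little on its own.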

Tuning

Teradata will discuss key tasks that DBAs need to perform on a regular basis, including “making tuning a priority.”  If DBAs are focused on tuning, they are likely reacting to problem queries after the fact.  Do you really want your end users experiencing poor performance while DBAs work their magic?  Most PureData System for Analytics customers bought their first system because of its performance, low total cost of ownership, and ease of use.  They bought their next, larger system, or a second, third, or fourth, because other organizations and groups in the business became envious of how fast the PureData System for Analytics could provide actionable insight, without tuning, to the parts of the business using it.

Total Cost of Ownership

What we don’t hear about in Teradata material is total cost of ownership.  Let’s take a look.  In a recent report, the International Technology Group (ITG) determined the three-year cost of ownership of Teradata’s 2750 to be 150% higher than that of IBM’s PureData System for Analytics N200x.  This was based on surveys of actual customer experience, comparing total cost of acquisition and the cost of resources to maintain the system.  In the ITG study, the 17 Teradata organizations averaged 1.7 full-time employees to oversee the Teradata database, compared to fewer than 0.5 full-time employees, on average, to tend IBM’s PureData System for Analytics.

Stay focused on what matters.  Don’t be distracted!  For more details, check out the paper by Dwaine Snow of IBM, “Why customers are migrating from Teradata to IBM PureData System for Analytics,” or follow similar topics on his blog at http://www.dwainesnow.com/Site/Blog/Blog.html

And for more information and exciting product announcements, join us at Enzee Universe and the Data Warehousing track at the IBM Insight conference in Vegas!

About Wendy Lucas

Wendy Lucas is a Program Director for IBM Data Warehouse Marketing. Wendy has over 20 years of experience in data warehousing and business intelligence solutions, including 12 years at IBM. She has helped clients in a variety of roles, including application development, management consulting, project management, technical sales management and marketing. Wendy holds a Bachelor of Science in Computer Science from Capital University and you can follow her on Twitter at @wlucas001.

Data Warehousing – A History of Disruptive Technology

Summary of a blog post by Adam Ronthal, Technical Product Marketing & Strategy, Big Data, Cloud, and Appliances, IBM

Cloud is changing the economics of deploying warehousing and analytic environments while introducing new levels of agility. 

Disruptive technologies fundamentally change the way we think about doing things — either because they represent a shift in efficiency, a shift in economics, or ideally, both.  This is what Netezza did in 2003, with its first true data warehouse appliance that dramatically reduced complexity.  Other vendors rapidly followed suit, and the industry was changed forever.  This disruptive appliance technology defined the “new normal” for data warehousing.

And now, Cloud represents the next wave in disruptive technology because it again dramatically reduces complexity in a variety of ways that will fundamentally change the economics of data warehousing. 

Learn more about cloud-based data warehousing in Adam’s full blog post.

About Adam

Adam Ronthal has more than 20 years of experience in technical operations, system administration, and data warehousing and analytics. An IBMer, he is currently involved in Big Data & Cloud Strategy. You can follow Adam on Twitter at @ARonthal.

 

Why in-memory?

Summary of a blog post by Amit Patel, IBM Program Director, Data Warehouse Solutions Marketing

Memory is the new disk. Disk is the new tape. And all of this opens fascinating new possibilities.   

The chain is only as strong as its weakest link. For system performance, the weak link is disk storage. While there have been tremendous improvements in microprocessors, system design, and software, disk storage remains the bottleneck.  Disks need to spin and physics only allows them to spin so fast. For analytic systems, this means that complex queries running against large data sets on disk can take too long to complete. Businesses cannot get the answers they need at a moment’s notice.

This is where in-memory computing—which relies on main memory for data storage—comes in. Main memory is much faster than disk storage.  Memory is now the new disk, and disk is the new tape. This means that in-memory computing opens up fascinating new possibilities for analytics where answers can be delivered near instantaneously. Hours and days become minutes and seconds in getting results.
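The scale of that shift is easy to illustrate. The latencies below are rough, commonly cited orders of magnitude, assumed for illustration rather than taken from any benchmark:

```python
# Order-of-magnitude access latencies, in seconds (illustrative assumptions)
DISK_SEEK = 10e-3     # ~10 ms for a random disk seek
DRAM_ACCESS = 100e-9  # ~100 ns for a main-memory access

print(f"DRAM is ~{DISK_SEEK / DRAM_ACCESS:,.0f}x faster per random access")

# A query touching one million randomly located records:
records = 1_000_000
print(f"disk:   {records * DISK_SEEK / 3600:.1f} hours")
print(f"memory: {records * DRAM_ACCESS:.1f} seconds")
```

Under these assumptions a workload that spends hours chasing records on disk completes in well under a second from memory, which is the "hours and days become minutes and seconds" effect described above.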

Three key shifts allow in-memory computing to make business sense: processing capabilities have improved, DRAM prices have fallen, and software for in-memory computing is now available in a fast, easy-to-use format.  All of this opens up entirely new possibilities for businesses to gain a competitive edge with their data.  Businesses that don’t take advantage of it risk falling behind the competition.

Read Amit Patel’s full post.

The Relative Merits of IBM and Teradata Data Warehouses

By Rich Hughes

A recent report concludes that the IBM® PureData™ System for Analytics (PDA) has distinct advantages over a comparable Teradata data warehouse appliance.  The International Technology Group (ITG) determined the three-year cost of ownership of Teradata’s 2750 to be 150% higher than that of IBM’s PDA N200x.  ITG surveyed 17 Teradata and 21 IBM customers, and matched their production data warehouses by workload, system size, and industry.  For example, both Teradata and IBM customers in the study were categorized into Telecom, Digital Media, Financial Services, and Retail industry sectors.  ITG then built survey profiles based on data volumes, number of users, and workload similarities within each industry category, thus promoting reasonable comparisons based on real-world observations.

Starting with bottom-line cost of ownership, the initial system expenditures for each customer industry profile were in the same ballpark.  That means IBM’s PDA cost advantages were found in deployment, support, and personnel costs over the systems’ lifetimes.  In the area of application deployment, IBM’s PDA customers delivered the lion’s share of their applications in 20 days or less.  As the chart below shows, and by stark contrast, none of the Teradata customers deployed as quickly, with the average Teradata deployment taking more than three months.[1]

[Chart: application deployment times for IBM PDA customers versus Teradata customers]

Getting an analytics application deployed months earlier speaks to time to value, and to attaining more business objectives.  The simpler-to-use IBM® PureData™ System for Analytics appliance proved less expensive to operate because fewer full-time employees are needed to manage the system.  The Teradata data warehouse appliance requires more administration training to understand how to achieve performance goals.  In the ITG study, the 17 Teradata organizations averaged 1.7 full-time employees to oversee the Teradata database, compared to fewer than 0.5 full-time employees, on average, to tend IBM’s PDA.  The primary reason for this significant gap is that IBM’s PDA end users are able to directly access their analytic appliance as a self-service data warehouse.

One ITG study participant using both Teradata and IBM’s PDA summed up the differences this way: “…we don’t have to build indexes… users write directly to the system, they don’t need to go through a DBA…we work with complete data sets instead of having everything aggregated and summarized first…we don’t have to use data models.”  In comparison with Teradata systems, performance-tuning overhead was said to be virtually non-existent.[1]  ITG determined that IBM customers realized additional benefits versus Teradata because of the ease with which end users deploy applications, creating “… the potential for closer business alignment than conventional data warehouse techniques.”[1]

In a world where vendors make significant claims about performance and total cost, it is hard to argue with real customer experience.  To learn more about ITG’s findings, the full report can be found at http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=WH&infotype=SA&appname=SWGE_WA_UZ_USEN&htmlfid=WAL12377USEN&attachment=WAL12377USEN.PDF
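The cost figures in the report are easy to put in concrete terms. In the sketch below, the FTE counts and the 150% gap come from the ITG report cited above; the fully loaded annual cost per administrator is an assumed, illustrative figure, not from the report:

```python
# Staffing figures from the ITG survey
TERADATA_FTE = 1.7  # average FTEs overseeing the Teradata database
IBM_PDA_FTE = 0.5   # "fewer than 0.5" FTEs; 0.5 used here as an upper bound

# Assumed fully loaded annual cost per administrator (illustrative only)
ANNUAL_ADMIN_COST = 150_000
YEARS = 3

teradata_staffing = TERADATA_FTE * ANNUAL_ADMIN_COST * YEARS
pda_staffing = IBM_PDA_FTE * ANNUAL_ADMIN_COST * YEARS
print(f"3-year staffing gap: ${teradata_staffing - pda_staffing:,.0f}")

# Note that "150% more expensive" means 2.5x the cost, not 1.5x:
relative_tco = 1 + 1.50
print(f"Teradata 3-year TCO relative to PDA: {relative_tco:.1f}x")
```

Even at this modest assumed salary, the staffing difference alone exceeds half a million dollars over three years, before the acquisition and support cost gaps the report also measured.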

About Rich Hughes

Rich Hughes has worked in a variety of Information Technology, Data Warehousing, and Big Data jobs, and has been with IBM since 2004. Hughes earned a Bachelor’s degree from Kansas University, and a Master’s degree in Computer Science from Kansas State University. Writing about the original Dream Team, Hughes authored a book on the 1936 US Olympic basketball team, a squad composed of oil refinery laborers and film industry stage hands. Hughes is an IBM Marketing Program Manager for Data Warehousing, and you can follow him on Twitter at @rhughes134.