dashDB is Here, and This Changes Everything for Data Warehousing on the Cloud

By Nancy Hensley

data warehousing, analytics, dashDB, cloud, DaaS

I remember the days of actually selling the idea of a data warehouse to organizations, trying to convince them to leverage data to provide some insight into their business. Back then, it was a nice-to-have, and we would celebrate every new customer in the “terabyte club”.

Now, I can’t help but laugh at that, because the little backup device on my desk is a terabyte. The fact is, things have changed. We have generated more data in the last few years than ever before and this data can be gold for our businesses. Gone are the days of convincing organizations that leveraging data is important, today it’s critical.

Data warehousing has seen a lot of disruption over the last decade, but we did not get to the intersection of knowledge and opportunity fast enough. The architecture got too complex, we spent all our time managing performance, and our businesses were losing the race.

Yes, it’s a race. We have to spot trends faster, capitalize on opportunities before the competition, grow faster, optimize more, offer more services, be easy to do business with and be best in class. To get there, you need analytics, and you need them faster than ever before. Sometimes you just need the data warehouse infrastructure out of your way.

Something had to change to support the new business climate.

The good news is that something has changed. IBM has announced our latest disruption to change data warehousing for the better.

Meet dashDB
dashDB is a data warehousing and analytics as-a-service offering on the cloud. dashDB offers robust analytics at incredibly fast processing speeds, all in a cloud-easy format that lets you load and analyze data extremely quickly.

Yup, it’s THAT EASY to win the race.

At the intersection of cloud technologies and analytics, dashDB represents the sweet spot for both IT and line of business professionals looking for a competitive edge in their data:

  • For IT professionals, dashDB helps quickly deliver the solutions the business needs without having to spend time managing the infrastructure behind each new request. dashDB works as part of your data warehousing strategy, no matter your starting point, from extending your on-premises warehouse to starting something completely new.
  • For line of business professionals, dashDB offers something you really need…self-service. That’s right, be the master of your own data kingdom. dashDB lets you load your data and get started with analytics in a couple of hours, taking the infrastructure out of your way. Yup, that’s right, out of your way. Imagine the possibilities! Got an idea? Go get it. Want a deeper understanding of your customers? No problem! Want to better predict challenges before they happen? We give you the keys to your crystal ball with best-in-class predictive capabilities.
  • For data science professionals, you can load data and work with queries, models and other analytics techniques without the hassle of CAPEX and dependencies on staff. dashDB includes R support, comes with an ecosystem of partners who provide specialized analytics capabilities and is Watson Analytics ready.

In short, it is pretty cool. So what is the technology behind the scenes that makes these fast, easy analytics possible? dashDB combines robust in-database analytic processing, in-memory technology that delivers a performance boost, and the enterprise-class SoftLayer cloud infrastructure. dashDB is available on the Bluemix platform to work with a wide variety of data types, and there is a specialized version that works with Cloudant JSON data stores.

What are you waiting for? Get growing with dashDB at www.dashDB.com. You can start using dashDB with the freemium plan. Experience the future of data analytics: deep analytics, fast processing and cloud-easy.

Advanced Security in an Insecure Data World

By Rich Hughes

“Customer data may be at risk” is an all too familiar corporate acknowledgement these days, a communication event no enterprise wants to face. Target lost personal information from at least 40,000,000 customers, stolen by thieves in late 2013. This was followed and exceeded by Home Depot’s announcement last month that 56,000,000 customer bank cards used at the retailer’s 1,900 stores had been compromised. Yes, Virginia, even your recent ice cream treat transaction at Dairy Queen has found its way into hackers’ hands. Most importantly, data breaches like these disrupt the trusted bond between a retailer and their customers, and as a consequence, top- and bottom-line numbers are negatively impacted.

Addressing security concerns for data warehouses, the IBM® PureData™ System for Analytics N3001 was announced for general availability on October 17, 2014. The N3001 appliance family brings advanced security to your data in this insecure world. Building on the appliance simplicity model, all data is stored on self-encrypting disk (SED) drives, providing security without impacting performance. The protection provided by the SED implementation supports the industries requiring the strictest security compliance: health care, government, and the financial sector. The system also uses strong authentication, based on the industry-standard Kerberos protocol, to prevent threats from unauthorized access.

The N3001’s self-encrypting drives protect your data at rest. Both temporary data and user data tables are encrypted, and this security level is then bolstered by a key management scheme.

How does this work? The SED disk drives are unlocked when the IBM® PureData™ System for Analytics N3001 ships to your data center. While the SED disk encryption is the first security level, an Advanced Encryption Standard (AES)-compliant, 256-bit key must then be created to cover all N3001 disks, both on the host and in the Snippet Processing Unit compartments. This second security tier, the AES 256-bit key, can be initialized at any point after your data is loaded into the appliance.

The key management utility allows flexibility to update and rotate keys depending on the frequency of change dictated by your security policies.  This keyed approach is analogous to a password one uses to protect the disk data on a personal computer.  The Kerberos authentication, SED drives, and AES key management come as standard issue with the IBM® PureData™ System for Analytics N3001.
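To make the two-tier scheme concrete, here is a toy sketch of key wrapping and rotation. XOR stands in for real AES purely for illustration (XOR is not encryption), and the variable names are our own, not the appliance's internals. The point it demonstrates: rotating the master key only re-wraps the small media key, while the bulk data on disk stays untouched.

```python
# Conceptual sketch of two-tier key management: bulk data is encrypted
# under a fixed media key, and only the wrapped copy of that media key
# is re-encrypted when the master (AES-style) key rotates.
import os

def xor_bytes(data, key):
    """Toy 'cipher': XOR data with a repeating key (NOT real encryption)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

media_key = os.urandom(32)       # encrypts all data on the disk
master_key_v1 = os.urandom(32)   # the rotatable 256-bit key

data = b"sensitive customer record"
encrypted_data = xor_bytes(data, media_key)        # written once to disk
wrapped_key = xor_bytes(media_key, master_key_v1)  # only this is stored

# Rotating the master key re-wraps the 32-byte media key; the bulk data
# on disk is never touched.
master_key_v2 = os.urandom(32)
media_key_unwrapped = xor_bytes(wrapped_key, master_key_v1)
wrapped_key = xor_bytes(media_key_unwrapped, master_key_v2)

# Reads still decrypt with the same media key.
assert xor_bytes(encrypted_data, media_key_unwrapped) == data
```

This is why key rotation on a SED is fast regardless of how much data is stored: the expensive bulk encryption happens once, under a key that never leaves the drive.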

IBM’s InfoSphere Data Privacy for Security for Data Warehousing is a separately priced option that organizations should consider when dealing with compliance challenges. This package enforces separation of duties and reports incidents, covering user behavior tracked by an audit trail. Additionally, a business glossary gives the organization the ability to define and document sensitive data, along with the agreed-upon access levels for the appropriate groups. Data masking, which renders data fields anonymous yet viewable by privileged user groups, is also important functionality that comes with the InfoSphere Data Privacy for Security for Data Warehousing package.

The IBM® PureData™ System for Analytics N3001 features advanced security based on hardware and software improvements.  When coupled with IBM’s InfoSphere Data Privacy for Security for Data Warehousing (which monitors data going in and out of your data warehouse), you can rest assured your corporation’s sensitive information is protected from unwanted intruders.

More information on the IBM® PureData™ System for Analytics N3001 family can be viewed at this LINK.  There are numerous sessions at the upcoming IBM Insights 2014 Conference (October 26-30) which highlight the speed, simplicity, and security message as seen in many successful data warehouses powered by Netezza technology.  The IBM® PureData™ System for Analytics N3001 is again changing the game for data warehouse appliances.

About Rich Hughes

Rich Hughes is an IBM Marketing Program Manager for Data Warehousing.  Hughes has worked in a variety of Information Technology, Data Warehousing, and Big Data jobs, and has been with IBM since 2004.  Hughes earned a Bachelor’s degree from Kansas University, and a Master’s degree in Computer Science from Kansas State University.  Writing about the original Dream Team, Hughes authored a book on the 1936 US Olympic basketball team, a squad composed of oil refinery laborers and film industry stage hands. You can follow him on Twitter at @rhughes134.

Big things come in small packages

By Wendy Lucas

It may seem like it’s too early to talk about the holiday season.  We haven’t even hit Halloween yet, but some stores have already put out their holiday decorations and retail wares.   It’s the season of wonder for many children.  Younger children are excited by lots of gifts and the bigger the gift, the better.  As we get older, we learn to appreciate the sentiment behind the gift and quickly learn that it isn’t the quantity or size of the gift that matters.

Great value coming in a small package can be seen with IBM’s new PureData System for Analytics N3001-001, also affectionately called the “Mini Appliance.”  Don’t be fooled by its name.  This mini appliance packs a big punch in terms of its value for data warehousing, as well as surrounding capabilities for a complete big data and business intelligence solution.

Why did IBM add a mini appliance to the family?

The value of Netezza data warehouse appliances, and later the PureData System for Analytics, has been realized by organizations for over a decade.  These appliances are designed and built with integrated hardware and software components, so they can be deployed quickly, put to use without tuning, and used to gain insight from information very quickly.  Organizations that deal with vast amounts of information benefit from their simplicity, speed, scalability and built-in analytics.

But who are we to say what is “large” in terms of data?   It’s pretty easy to see how a large multi-national bank that deals with hundreds of terabytes of information would benefit from an enterprise-class appliance to help manage all that data and get value from that information, but what about small and midsized organizations?  Organizations dealing with a few terabytes have the same need to use analytics to drive competitive advantage in their industries.   Small and mid-sized organizations may not have the same level of budget or IT resources as larger enterprises, but what if they could have access to an appliance that is purpose-built to make advanced analytics simpler, faster, more accessible and more affordable?  What if they could take advantage of the same simplicity, speed and smarts?

And we shouldn’t limit our thinking to small and mid-sized organizations as stand-alone companies.  There are also departments within larger enterprises that need a simple and fast analytic solution, or IT organizations that need this same value in a test or development system.   For these reasons, the Mini Appliance makes a great addition to the PureData System for Analytics family.

Introducing the PureData System for Analytics Mini Appliance

The PureData System for Analytics Mini Appliance changes the dynamics for mid-sized organizations looking to take advantage of a high performance data warehouse appliance.   The Mini is a production-ready, rack-mountable appliance that can handle up to 16 TB of user data (assuming 4x compression).  It has all the same benefits of its larger PureData System for Analytics siblings, with full-function Netezza Platform Software (NPS) 7.2.  It is built, tested and packaged at the factory and comes ready to install in an existing data center rack.  The Mini has the same ease of use, with load-and-go simplicity for customers to get up and running in hours.

You might think that the “mini” version of anything is somehow limited in its features.  Not in this case.  The Mini Appliance provides enterprise grade, highly available, fully redundant components with integrated support including “call home” capability for automatic detection and reporting of any hardware issues.

The PureData System for Analytics N3001 product line now includes software entitlements for big data and business intelligence adding further value to the appliance within the information management ecosystem.  This means the appliance comes with extra software (at no extra cost!) that can be installed by the customer to provide business intelligence capabilities through the use of Cognos Business Intelligence software, data integration through the use of InfoSphere DataStage, Hadoop data services through InfoSphere BigInsights and real-time streaming analytics with InfoSphere Streams.  These starter kits for big data and business intelligence are also included with the Mini Appliance.  Yes, that’s right.  You get all the core value of the data warehouse appliance itself, plus the ability to integrate and load data as well as build reports and analytic applications, all in one solution.

There is so much value in these extra software entitlements, there will be more blogs written on just that topic alone, but in the meantime, read more on the PureData System for Analytics webpage or check out the solution brief describing these great software additions.

Fast value, out of the gate

Initial experience with the Mini Appliance has been great.  Here are just a few comments from partners who have used the appliance.

“Small to midsize customers who are invested or considering investing in software-based data warehouse solutions should really look at the PureData Mini Appliance. The appliance offers significant performance improvements over other software-based data warehouses, in a simple and self-contained environment, with a fraction of the operational maintenance costs.”

― Liam O’Heir, Vice President of Sales, New England Systems

 “We had the new PureData System for Analytics up and running, delivering results in 24 hours.  The appliance is simple and value is recognized quickly without the need to worry about indexing and/or tuning.”

― Michael Schuckman, Director, Big Data and Analytics, Micro Strategies

 

“Mid-market companies have similar big data needs as large enterprises.  The PureData Mini Appliance enables ‘big company capability’ with best-in-class performance in a mid-market package.”

– John Lucas, Director of Solutions Delivery, Avnet Services

Just in time for the holidays

If a data warehouse appliance with extended capabilities for big data and business intelligence is not on your holiday wish list yet, maybe it should be.  Big things really do come in small packages.

For more information, watch this video, read the data sheet or visit ibm.com/software/data/puredata/analytics/.

About Wendy

Wendy Lucas is a Program Director for IBM Data Warehouse Marketing. Wendy has over 20 years of experience in data warehousing and business intelligence solutions, including 12 years at IBM. She has helped clients in a variety of roles, including application development, management consulting, project management, technical sales management and marketing. Wendy holds a Bachelor of Science in Computer Science from Capital University and you can follow her on Twitter at @wlucas001

Self-Service Analytics, Data Warehousing, and Information Management in the Cloud

By James Kobielus

As we approach IBM Insight 2014, we call your attention to IBM Watson Analytics. Announced last month, Watson Analytics will go into public beta in mid-November, not long after Insight.

Whether or not you plan to attend this year’s event in Las Vegas, we invite you to participate in the upcoming public beta, which we strongly believe you’ll find transformative. What IBM has done is to reinvent and thereby democratize the business-analytics experience for the cloud era.

With Watson Analytics, which you can try for yourself at Insight, IBM has put the power of sophisticated visual, predictive and cognitive analytics directly into the hands of any user, even the least technically inclined. With a “freemium” option that will be a permanent element of the service upon full launch, you will be able to gain no-cost, on-demand, self-service access to sophisticated analytical capabilities. Marketing, sales, operations, finance and HR professionals can get the answers they need from all types of data, without needing to enlist a professional data scientist in the effort.

Watson Analytics’ built-in capabilities for advanced data management ensure that data is accessible rapidly and that large volumes of data are handled with ease, utilizing an embedded cloud data warehouse that incorporates IBM’s sophisticated DB2 with BLU Acceleration in-memory/columnar technology. In addition, embedded data refinery services enable business people, without any reliance on IT, to quickly find relevant, easily consumable raw data and transform that into relevant and actionable information.

As an added incentive for attending Insight, IBM will make further announcements that extend the value of Watson Analytics and of the sophisticated cloud data-warehousing and data-refinement services that power this supremely accessible and useful analytic experience. With this forthcoming announcement on cloud data warehousing, IBM continues to change the experience of using analytics today for our clients. We are making it easier for clients to be data-driven organizations and take advantage of new opportunities faster.

We hope to meet you at Insight!

About James

James Kobielus is IBM Senior Program Director, Product Marketing, Big Data Analytics solutions. He is an industry veteran, a popular speaker and social media participant, and a thought leader in big data, Hadoop, enterprise data warehousing, advanced analytics, business intelligence, data management, and next best action technologies. Follow James on Twitter: @jameskobielus


Saying Goodbye to Messy Data with Data Warehousing and Analytics with Cloudant and dashDB

December 18, 2014 update: dashDB on Cloudant has concluded its current beta program and is now officially available!  Read the details.

By Alan Hoffman and Adam Ronthal

At Cloudant, we aim to help you get more from your live data. Our JSON document store provides a simple and intuitive operational data store for powering live applications. It’s flexible and scalable and has many different APIs for finding and managing data.

We are excited to announce the integration between Cloudant and dashDB (beta). dashDB is the new IBM hosted data warehousing solution for the Cloud. Starting today, you can easily replicate data from Cloudant to dashDB for deeper offline reporting and analytics.

This post walks through how quickly and easily you can create a data warehouse from one or more of your Cloudant databases. Below, we’ll show you how the warehousing functionality and the new Schema Discovery Process work. And we’ll follow up with a couple of examples of what you can do with your newly warehoused data.

Finally, we should say that this new feature is very much in beta. That is, we are deliberately releasing a feature that is not fully baked, with the hope of gathering feedback from early users. The feature has restricted functionality and no guarantees of data integrity. Just the same, there is a great deal of awesome stuff in here, but it’s by no means production ready.

Schema Discovery Process

Before getting too deep into the tutorial, we should pause briefly to discuss what we call the Schema Discovery Process (SDP). This is where the magic happens. dashDB is built on a relational database, where the data is stored in structured relational tables. Cloudant stores JSON documents, where all the data is encapsulated in a single record. To move data between these two systems, we need to be able to translate our JSON docs into a schema (or set of tables) that dashDB understands.

This is exactly what the SDP does. It scans your Cloudant database and intuits the implicit structures in your data. It then creates the proper schema in dashDB and copies the data over. Is it perfect? Of course not. But it does work, especially for relatively simple and homogeneous Cloudant databases. The SDP can help you discover how your data is organized, and that can power a whole suite of functionality within Cloudant. (More on this later.)
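As a rough mental model only (the real SDP is more sophisticated, and its internals aren't shown here), schema discovery over JSON documents can be sketched in a few lines of Python. The field names and the dotted-column flattening rule below are illustrative assumptions, not the actual SDP behavior:

```python
# Sketch: scan JSON docs, infer a flat relational schema, emit rows.

def flatten(doc, prefix=""):
    """Collapse nested objects into dotted column names."""
    row = {}
    for key, val in doc.items():
        col = f"{prefix}{key}"
        if isinstance(val, dict):
            row.update(flatten(val, prefix=col + "."))
        else:
            row[col] = val
    return row

def infer_schema(docs):
    """Union of flattened field names -> Python type first seen in the data."""
    schema = {}
    for doc in docs:
        for col, val in flatten(doc).items():
            schema.setdefault(col, type(val))
    return schema

# Hypothetical taxi-trip documents, loosely inspired by the tutorial data.
docs = [
    {"_id": "t1", "fare": {"amount": 12.5, "tip": 2.0}, "passengers": 1},
    {"_id": "t2", "fare": {"amount": 7.0, "tip": 0.0}, "passengers": 3},
]
schema = infer_schema(docs)        # {'_id': str, 'fare.amount': float, ...}
rows = [flatten(d) for d in docs]  # relational rows ready to load
```

The discovered `schema` maps each column to a type, which is the shape of information a relational target like dashDB needs before any rows can be copied over.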

Prerequisites

● You’ll need a Cloudant account. You access your account through the Cloudant dashboard. If you don’t have an account, go to http://cloudant.com and sign up. Go ahead, we’ll wait.

● You’ll need some data as well. For this post we are going to use a small set of New York City taxi data. If you want to replicate this to your account, you can find the data set at https://examples.cloudant.com/nyctaxi.

Getting started

Log into your Cloudant account and you will land on the ‘Databases’ tab. Notice that there is a new tab on the main left-hand menu labeled ‘Warehouse.’


If you click on the ‘Warehouse’ tab, you’ll find an overview with information about what you can do with this new functionality and how to use it. You should peruse the information accessed from that tab at your leisure. But if you click on the ‘New Warehouse’ button at the top, you will see a simple panel for setting up your first dashDB warehouse.


First, you’ll want to select the data you want to move to the warehouse in dashDB. You can add up to 10 Cloudant databases to your warehouse. The type-ahead search box should help you find databases even if you have hundreds. After selecting your databases, click “Create Warehouse” and the SDP magic is kicked off.


 

A brief aside on limitations

While the SDP is doing its thing, we should talk about the restrictions of this beta offering.  There are a couple of big gotchas that you need to be aware of.  The dashDB beta is primarily a tool for development, testing new workflows for your data, and helping you figure out whether this functionality works with your application.  First and foremost is the limitation on the number of warehouses you can have at one time: during the beta, you can have only one.  You can include any number of Cloudant databases in this one warehouse, but there is a 1GB (compressed) size limit, which corresponds to about 10GB of Cloudant data.  If you try to create a second warehouse, you will be forced to delete the first one before continuing.  Second, you cannot update a warehouse.  You have the option to ‘rescan’ a warehouse, which essentially deletes and re-creates the warehouse from scratch.  Finally, you cannot connect to an existing dashDB warehouse or account.  You have to launch dashDB from Cloudant to take advantage of this integration.

On we go

When you click the ‘Create Warehouse’ button, the SDP begins to scan your data and build a schema for dashDB. The SDP communicates with Cloudant via the _warehouser database, which is automatically created when you run your first job. You will see a document in that database that holds all of the metadata about the warehousing job. For those of you familiar with the _replicator database, this process works in a similar fashion. You can check the status and other metadata for your job with this document.
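The exact fields of a `_warehouser` job document aren't spelled out above, so the document shape in this sketch is a hypothetical one, loosely modeled on `_replicator` documents, just to illustrate checking a job's status in code:

```python
# Parse a (hypothetical) _warehouser job document and check its state.
import json

warehouser_doc = json.loads("""
{
  "_id": "warehouse-job-1",
  "source_databases": ["nyctaxi"],
  "warehouse_state": "running"
}
""")

def job_finished(doc):
    # Anything other than a terminal state counts as still in progress.
    return doc.get("warehouse_state") in ("completed", "error")

print(job_finished(warehouser_doc))  # still running, so not finished
```

In practice you would fetch this document over Cloudant's standard HTTP document API, the same way you would inspect a `_replicator` document.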


When the SDP has finished doing its thing, you’ll be notified on the dashboard and your brand-new warehouse is ready for you to work with. When you click the warehouse name, it launches the dashDB console in a new window. You are ‘leaving’ Cloudant at this point and entering dashDB. From here, you will be able to analyze your data in the rows, columns, and tables that the SDP has created for you.

Exploring dashDB

The dashDB interface allows you to work with your data and get insights quickly.


The first thing you’ll probably want to do is inspect your data, which you can do by clicking ‘Inspect Data.’ Take a look at the table or tables that were created, the data types assigned to each column, and the contents of the tables. (The database also includes some pre-loaded sample data, so don’t be surprised if there is more there than your Cloudant data!)  Do a quick inspection to make sure the data is what you expect it to be and that it was properly brought over from its JSON origins.

We should also mention the ‘overflow table.’ When you drill into your warehouse you should see a table with the _overflow suffix. You’ll find two types of records there. First are any documents that don’t exactly fit into the schema built by the SDP. Non-conforming documents are put here in their entirety. The second type of record is an error record. If any piece of the data transfer pipeline fails for a particular document, we’ll put that document’s ID along with the error message in this table. During the beta phase for dashDB and when you are first putting your data into a warehouse, you should keep an eye on this table. Having many entries here indicates that something has gone wrong.
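The routing logic behind the overflow table can be sketched simply. The matching rule below (an exact field-name comparison against the discovered schema) is a simplifying assumption for illustration; the real pipeline is more forgiving:

```python
# Sketch: documents that conform to the discovered schema go to the main
# table; everything else lands whole in the _overflow table.

def route(doc, schema_columns):
    """Return ('table', doc) if the doc conforms, else ('overflow', doc)."""
    if set(doc) == set(schema_columns):
        return ("table", doc)
    return ("overflow", doc)

# Hypothetical discovered schema and incoming documents.
schema_columns = ["_id", "pickup", "dropoff", "fare"]
docs = [
    {"_id": "a", "pickup": "midtown", "dropoff": "jfk", "fare": 9.5},
    {"_id": "b", "pickup": "soho", "surcharge": 0.5},  # unexpected field
]
routed = [route(d, schema_columns) for d in docs]
# routed[0] goes to the main table; routed[1] goes to overflow
```

A warehouse where most documents take the overflow path is the signal the paragraph above describes: the discovered schema and the actual data have drifted apart.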

With dashDB you can run SQL queries and view the results directly in the console. Most people will likely connect a separate BI tool like IBM Cognos, SPSS, Watson Analytics, or a third-party tool from one of our large ecosystem of supported partners. That said, it’s nice to be able to jump right in with standard SQL to get a feel for what dashDB enables! From the pull-down menu on the left, go to “Manage > Run Query.”


You can also perform other standard data warehousing tasks like loading data (either directly from the console or via a separate ETL tool like IBM InfoSphere DataStage). Loading other data is useful if you want to bring in other data sources to further your analysis of your Cloudant data. For example, with the NYC taxi dataset, you might want to bring in weather information to see if there is a correlation between rain and the average time of a trip. dashDB makes it easy to import data from external sources. To load data by using the console, go to “Manage > Load Data”.
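To show the kind of join this enables, here is a runnable sketch using SQLite as a local stand-in for dashDB (both speak standard SQL). The table and column names, and the tiny inline dataset, are made up for illustration:

```python
# Join taxi trips against loaded weather data to compare average trip
# length on rainy vs. dry days. SQLite stands in for dashDB here.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE taxi_trips (trip_date TEXT, trip_minutes REAL);
    CREATE TABLE weather    (obs_date TEXT, rained INTEGER);
    INSERT INTO taxi_trips VALUES ('2014-10-01', 14.0), ('2014-10-02', 22.5);
    INSERT INTO weather    VALUES ('2014-10-01', 0),    ('2014-10-02', 1);
""")

rows = con.execute("""
    SELECT w.rained, AVG(t.trip_minutes)
    FROM taxi_trips t
    JOIN weather w ON w.obs_date = t.trip_date
    GROUP BY w.rained
    ORDER BY w.rained
""").fetchall()
# rows -> [(0, 14.0), (1, 22.5)]: rainy-day trips averaged longer
```

The same `JOIN ... GROUP BY` pattern would run unchanged in the dashDB query console once the weather table is loaded alongside the taxi data.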

Finally, dashDB has tight integration with R and built-in, in-database analytic algorithms that allow you to perform everything from linear regression and k-means clustering, to geospatial analytics.

To get started with R, simply go to “Analyze > Develop R Scripts” from the pull-down menu at the top left. From there you can launch RStudio. You’ll have to provide your dashDB login credentials, but they are easily obtainable from the “Set Up > Connect Applications > Connection settings” menu in the console.


With RStudio, you can develop and import sample R scripts, and easily generate plots and visualizations of your data. You can download the sample R script for this data here (and the weather data it uses here), and walk through some analysis of the taxi trip data. From RStudio, simply upload the taxi_trip_demo.R file, and you can get started!


We’ve put together some initial analysis to show you how it’s done.


What next?

This is a beta product. We have a long list of features that we want to add before we are production ready. There are the obvious ones: giving you the ability to create more than one warehouse, raising the data volume cap on warehouses, improving performance, etc. We also plan to give you the ability to do incremental updates to warehouses and even schema evolution through the SDP. Speaking of which, we’d like to make the SDP more configurable. For now, the SDP is a coarse tool, but we’ll be adding more fine-grained control for advanced users.

Most importantly, we want to hear from you, the users of Cloudant and dashDB. What can we do to make Cloudant and dashDB work better for you? What features do you want or need? What use cases are you exploring? We would love to hear from you. Please email us, Alan and John (ahoffman@us.ibm.com and jjpark@ca.ibm.com) with all your feedback. We hope to hear from you soon.

Twitter Chat Announcement: Demystifying the Data Refinery on Oct 22nd from 1:00 pm – 2:00 pm ET

Organizations are storing large volumes of data in the hope of leveraging it for advanced analytics. Architectures that include data lakes, data reservoirs and data hubs are designed to help organizations manage growing volume, variety and velocity of data, and to make the data available for analysis by everyone in the organization. While a data lake can provide value, it alone isn’t sufficient for managing and analyzing disparate sources of data.

What are the additional capabilities needed to enable these centralized pools of data to deliver value? Is there a way to provide self-service access to everyone within the organization in a sustainable manner? What are the gaps that need to be addressed?  Join us in the #makedatawork Twitter chat to get answers to some of these questions and more.

Our special guests for the chat are R Ray Wang (@rwang0), Principal Analyst, Founder, and Chairman of Constellation Research, Inc.; David Corrigan (@dcorrigan), Director of Product Marketing for IBM InfoSphere; Paula Wiles Sigmon (@paulawilesigmon), Program Director of Product Marketing for IBM InfoSphere; and James Kobielus (@jameskobielus), IBM big data evangelist, speaker and writer. Twitter handle @IBM_InfoSphere will be moderating the chat.

You can follow along and join the discussion using the hashtag #makedatawork. Here are the questions we’ll be discussing, as well as reference articles to help inspire the conversation on Wednesday, October 22, 1:00 p.m. ET.

#makedatawork chat questions

  1. Can the new paradigms for storing data – data lakes, data reservoirs etc., replace traditional platforms for managing data from disparate sources?
  2. How is a data refinery different from a data reservoir or a data lake? Is it a marketing gimmick?
  3. Where does a data refinery fit into an existing enterprise information architecture?
  4. What are the critical capabilities for a data refinery?
  5. What are the additional capabilities and services needed to make data in a data lake clean, relevant, and accessible to all?
  6. As we evolve toward self-service data refinement for the business user, what will be the role of IT?
  7. What advice do you have for people and organizations trying to streamline access to clean, relevant data for business users?

Looking forward to your participation!

Regards,

Team IBM Data Warehousing