Build skills for 2016 and Beyond: Data Warehousing and Analytics Top 10 Resources

by Cindy Russell, IBM Data Warehouse Marketing

Skills are always an essential consideration in technical careers and it is important for data warehousing professionals to expand their knowledge to handle the proliferation of data types and volumes in 2016 and beyond.

These are my “top 10” resource picks that you may want to explore. I am choosing these because of their popularity and also because they represent new technologies you may face in 2016 as you modernize your data warehouse and extend it beyond its traditional realm to meet new analytics needs.

  1. Gartner Magic Quadrant for Data Warehouse and Data Management Solutions for Analytics – I am recommending this report because it provides an overview of the trends, issues and marketplace leaders in data warehousing. It calls out the need for the Logical Data Warehouse, which is a key element of a modernization strategy. I believe the Logical Data Warehouse will be of increasing importance to your operations in the coming months. Read a summary of the report.
  2. Logical Data Warehouse – Due to the massive and rapid growth of data volumes and types, a single centralized data warehouse cannot meet all of the new needs for analytics by itself. The data warehouse now becomes part of a Logical Data Warehouse in which a set of “fit for purpose” stores are used to house a range of data. This blog by Wendy Lucas was published in 2014, but is still a good primer on the concept if you need one.
  3. IBM Fluid Query information and entitlement for PureData clients – In 2015, we released a series of “agile” announcements of IBM Fluid Query. This is a tool that PureData System for Analytics clients can use to query more data sources for deeper insights. This tool is a key element when you have a Logical Data Warehouse where data stores include Hadoop, databases, other data warehouses and more. PureData clients can take advantage of this technology as part of the entitlements. Start learning with our blog series and webcast.
  4. dashDB, data warehousing on the cloud – dashDB was launched in 2014 as the IBM fully managed data warehouse in the cloud. Some initial use cases cloud be: setting up self-service data science sandboxes, establishing test environments or cost-effectively housing data that is already external, such as social media feeds. dashDB is based on the Netezza and BLU Acceleration in-memory computing technologies. If you have workloads you want to place on the cloud, dashDB is a good solution. This webcast and a TDWI Checklist for cloud get you started.
  5. Hadoop and Big SQL – Hadoop is a scalable, cost-effective, open source file system that can store a range of structured or unstructured data as part of a Logical Data Warehouse. It can also be used to help you manage capacity on the data warehouse, for example as a queryable historical archive. Read this blog by our expert to learn the basics. IBM provides a free open source distribution, IBM Open Platform with Apache Hadoop. For those looking to augment the IBM Open Platform, IBM BigInsights adds enterprise-grade features including visualization, exploration and advanced analytics. Within the family is an implementation that includes Big SQL—enabling you to use familiar SQL skills to query data in Hadoop. Explore the above content options, then get started with a no charge trial.
  6. Apache Spark –IBM announced a major commitment to Apache Spark in June, 2015 and has already made available a series of Spark-based products and cloud services. You will be seeing more of Spark across the IBM Analytics portfolio, so it is a good technology to learn. Apache Spark is an open source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that requires low latency processing that a typical Map Reduce program cannot provide, Spark is the alternative. It performs at speeds up to 100 times faster than Map Reduce for iterative algorithms or interactive data mining. Spark provides in-memory cluster computing for speed, and supports the Java, Scala, and Python APIs for ease of development. I recommend this no charge Big Data University course on Spark fundamentals.
  7. Update to IBM Netezza Analytics software – For those of you who are PureData System for Analytics clients, there is an update to the Netezza Analytics software. Doug Daily is one of our experts in this area, and he created an announcement blog to help you understand what new capabilities you can leverage.
  8. Virtual Enzee on demand webcasts – IBM offers webcasts on topics related to data warehousing and PureData System for Analytics. Browse the “Virtual Enzee” webcast library to stay up to date on PureData through these on demand webcasts.
  9. Learn Cognos Analytics for user self-service applications – Some of our clients use Cognos BI in conjunction with their data warehouses for super-fast reporting. Cognos Analytics was announced at IBM Insight as a guided, self-service capability that provides a personal approach to analytics. As your users are demanding more insights, self-service may be a sound solution to some of their needs. Browse the blog and web site to learn more.
  10. IBMGo on demand keynotes from IBM Insight – If you were unable to attend IBM Insight 2015, IBMGo brings some of the main sessions to you! It is a great way to learn about the bigger IBM Analytics solutions and points of view. Start here.

Tweet this blog