By David Birmingham
The PureData System for Analytics machine, powered by Netezza technology, has its own natural attraction. The more we use it, the more of our people want their hands on “that kind of power”. Sharing the machine across users, applications, solutions and environments can sometimes challenge the most savvy architect and the cleverest of our admins. With this kind of scale, we want to make sure the machine is not only firing on all cylinders, but also that it’s not firing any more cylinders than it has to!
In short, the greatest strength of the Netezza technology is its power, and the greatest challenge of the technology is: its power! Let’s face it, this kind of power can make a terrible data model look like a superstar, and can make inefficient queries look like the nectar of life. If a query in our “former technology” required six hours but now, running “as is” in Netezza, takes only six minutes, isn’t that awesome? But wait, what if it could run in three seconds with just a few tweaks here and there? Aren’t we then burning roughly 120 times the capacity we need to? And if the machine appears to be under stress now, wouldn’t we want to get some of that back?
When the solution was first deployed, the sky was blue and the birds were singing, so why does it now feel like it’s sinking into clay? Or, as Netezza’s competitors would claim: “The more data you load, the slower it gets.” This is only true if we’re running the machine in first gear. We see a lot of people who spend years in first gear before they start to really stress the machine. By that time, they may have wrapped a lot of stuff around first-gear data models and queries, and may have even standardized on first-gear approaches. Or worse, they have built whole frameworks that institutionalize first-gear performance for everyone.
What if we had a Ferrari and kept it in first gear as we ran around town for errands? When we took it out on the open highway, it would seem like it’s running out of power. In fact, the Netezza machine’s sheer muscle and automatic optimization can mask creeping inefficiencies for long periods of time, until we reach a “tipping point”. Things get slower, the machine is under stress and the cause seems inexplicable. We might think it was the last thing we installed, but in truth, that was just the proverbial last straw on the camel’s back. It’s a pretty strong camel after all. Now we want to know how to reel these issues in and get back to a healthier state.
The path to this end isn’t particularly difficult or daunting, but it sure seems that way when we have tens or hundreds of billions of rows, aggressive deadlines, fire-breathing users and fewer hours in the day to make-things-happen.
In this session at Insight 2014, we’ll take a deep dive into the machine’s capabilities, how it is able to reach stratospheric processing goals and why it might be missing those goals (lately anyhow) in your neck of the woods. Like a monster truck with sheer power in its metal, the machine can easily climb out of the clay, and it doesn’t need a tow truck (like a large-scale overhaul). All we really need to do is hit the clutch, shift gears and punch it. We’ll show you how to take a step back (hit the clutch) and re-factor the necessary solution parts without breaking anything (shift gears), and the rest is soon in the rear-view mirror.
Session Details :
- Track : Data Warehouse
- Sub track : Accelerate your Data Warehouse with Tools and Models
- Session # : IWA-4124
- Title : Performance tuning tips for the IBM PureData System for Analytics, powered by Netezza
- Abstract : The PureData System for Analytics, powered by Netezza has enormous power for data processing. However, inefficiencies can creep in at various stages, requiring deliberate and sometimes comprehensive troubleshooting. Learn more in this session.
- Speaker Info : David Birmingham (Brightlight Consulting)
- IBM Insight 2014 Link for Registration : http://www-01.ibm.com/software/events/insight/registration/
About David Birmingham
David Birmingham is a Senior Principal Consultant at Brightlight Consulting and a popular contributor to the Enzee community through the Enzee Universe, the Enzee Community and his IBM developerWorks blog “Netezza Underground”. He has authored two books on data warehousing and the Netezza technology: Netezza Underground (2008, 2014) and Netezza Transformation (2011). Follow him on Twitter @enzeevoice