Sunday, July 7, 2013

Columnstore Indexes: the best thing since cake (and I love cake!)

Most times with software upgrades comes a few "oh, that's great I think I may be able to utilize that" going through my mind. When I read and dove deeper into SQL Server 2012's new feature, Columnstore Indexes, my heart actually started to race. It takes a true love of data warehouses to feel this way, but it's as if SQL Server answered some of my questions and frustrations over the past few years. These frustrations grew around dealing with TBs worth of data and how normal row indexes (normal custered and non-clustered indexes are actual row indexes) just were not optimized for VLDB optimized querying. 

From msdn online, here is a summary of the difference: 

"An xVelocity memory optimized columnstore index, groups and stores data for each column and then joins all the columns to complete the whole index. This differs from traditional indexes which group and store data for each row and then join all the rows to complete the whole index."

The biggest difference is that data is grouped and stored one column at a time. The benefits are: because only the columns needed are read, this results in less disk reads, better compression, improved buffer pool usage (reducing I/O), and utilizes batch processing which reduces CPU.

Wah? Yes! Wah? I'll say it again, YES! And yes, it works with table partitions. Of course, there are logical limitations to it, like the most important being the table must be Read Only. But, for normal data warehouse architecture, data is not transactional and is loaded at certain times, most normally once or maybe a few times a day. So dropping and recreating indexes are something most data warehouse engineers are familiar with, in detail. Or you can always implement partitions and then switch, as well as other options to get around the read only requirement. There are other limitations, which I encourage you to explore, as well as some gotchas.

A fantastic, and thorough walk through of this new, exciting, feature in SQL Server 2012 can be found here: What's new in SQL Server 2012: Using the new Columnstore Index, it's a YouTube video by Kevin S. Goff.

My hat's off to the SQL Server development team, warms my heart that you addressed the growing needs of us data warehouse minions.