Thursday, May 22, 2014

Introducing What_To_Compress V2

Hello Dear Reader!  I'm about 32,000 feet in the air leaving St. Louis after a great Performance Tuning Workshop for Pragmatic Works.  While there Jason Strate (@StrateSQL | Blog) and I had a two day class, and I was able meet some great people and speak with the St Louis SSUG about the new features in SQL Server 2014.  It was a good trip, but I'm happy to be on the way home.

"So Balls", you say, "The blog title is Introducing What_To_Compress, what is that?"

Great question Dear Reader!  For quite a few years I've been presenting, blogging, and writing about Compression.  On the Resource Page I have a list of those presentations as well as the scripts I use.  I'd been thinking about putting together a script to give compression recommendations since I first did my deep dive at the PASS Summit on the subject back in 2011.

About a year ago I did just that.  I've tossed this script around to some co-workers, friends, SQL People, and MVP's and asked for feedback.  I'm finally at the point that I'm ready to release the first version.  So without further ado, here it is.

WHAT_TO_COMPRESS V2

First off this is not a stored procedure like Jason's sp_IndexAnalysis script.  I'll probably make it dynamic in the future, but that is a next release.  If you want to jump straight to the download click here to get What_To_CompressV2.sql. 


This takes the Best Practices that I've been preaching about and applies it to telling you what to compress.  It looks at every Index, every Table, by partition, and gathers the In_Row_Data, Row_OverFlow_Data, and Lob_Data counts.  It tells you the Percentage of COMPRESSIBLE and UNCOMPRESSIBLE data per table, what the Scan and Update patterns are for your tables & Indexes, and makes a recommendation on the level of compression you should use.

It also gives you my detailed reasoning behind the recommendation that I've given you.  For example:

"The Percentage of Scan and Seek operations is 0.00% and the average amount of Update operations is 0.00%.  Data that can be compressed makes up 100.00% of this table.  There is no workload for the current table.  Please wait for the usage statistics to become representative of a typical work load.  If this is a typical work load, this is an excellent candidate for Page Compression.  Test with sp_estimate_data_compression_savings.  Remember that it takes 5% of the tables size and moves it to tempDB.  Validate that you have enough room on your server to perform this operation before attempting."

"The Percentage of Scan and Seek operations is 0.60% and the average amount of Update operations is 99.00%.  Data that can be compressed makes up 100.00% of this table.  However Based on the workload of this server this table should not be compressed.  If you apply Row or Page Compression it will have a higher CPU cost because of the low Seek and Scan Ratio.  Test with sp_estimate_data_compression_savings.  Remember that it takes 5% of the tables size and moves it to tempDB.  Validate that you have enough room on your server to perform this operation before attempting."

"The amount of Uncompressible data in this table does not make it a match for compression.  Data that can be compressed makes up 17.12% of this table.  While data that cannot be compressed makes up 82.88% of this table."

There is one parameter within the script that allows you to set the number of Pages in a table that you want to consider for compression.  By default the number is set at 8 pages, but you can increase that if you would like.

"So Balls", you say, "This sounds great but isn't their a built in stored procedure that can estimate size compression already in SQL Server?"

Yes, there is Dear Reader.  The built in stored procedure has a few things that we should discuss.

SP_ESTIMATE_DATA_COMPRESSION_SAVINGS


The first thing you should know before you use a tool is how it works, and what it does.  You wouldn't normally use a nail gun to open a beer.  You could, but it's not the right tool for the job.

The way sp_estimate_data_compression_savings works is that it takes the table that you specify, moves 5% of it into tempdb applies the compression you specify, and then extrapolates that estimate out over the size of your entire table.  It does a nice job of taking fragmentation into account in order not to give you an inaccurate information.  The key phrase that defines my root concern is, *it takes 5% of your table and moves into tempdb*.  For small tables this probably isn't an issue.  For VLDBs that have very large tables, this is a pretty big deal.

There are some well meaning community scripts available on blogs and codeplex that take sp_estimate_data_compression_savings and wrap it in a cursor to estimate the space savings for each table.  They do this estimation for Row and Page compression, for every table in your database.

This step tells us the space savings, but their are other settings we should take into account.  We should look at those before we begin estimating compression savings across the board.  What should we look at first?


  1. Our Allocation Units.  Only IN_ROW_DATA compresses.  Tables with a lot of LOB data types may not see any advantage in compression.  Even if they slightly compress the over head on those tables can make queries less efficient.
  2. Do we read from our tables?  If we do a lot of scans, seeks, and lookups from our tables this could indicate whether Page or Row compression would give us the best performance.
  3. Do we update our tables often?  Notice I said update.  Not delete, not insert, update.  When we apply compression we remove all the extra white space from fixed length data making all data types, that can use compression, in affect variable length fields.  This can lead to increased Page Splits, specifically mid-Page Splits, aka LOP_DELETE_SPLITs.  For more on Page Splits and mid-Page Splits see my blog, How to Find Bad Page Splits.
  4. SP_ESTIMATE_DATA_COMPRESSION_SAVINGS doesn't look at any of these.  Why estimate all the different compression types without first identifying your table that are proper candidates and looking at what the overall size of those tables are.
  5. You have to have Enterprise Edition to run sp_estimate_data_compression_savings.  You can run What_To_Compress on Standard Edition.


I wouldn't avoid using sp_estimate_data_compression_savings.  However, it wouldn't be the first thing that I run when looking at what to compress.

FEEDBACK

Okay Dear Reader, I need your help.  Run this.  Let me know what you think, what it's missing, anything you can think of.  I'll try to get a v3 out in the next couple months based on feedback.  Most of all if it works, drop me a line!  I love hearing success stories on compression.  Send your emails to: bball@pragmaticworks.com.  And as always Dear Reader, Thank you for stopping by.


Thanks,

Brad



Wednesday, April 30, 2014

Decks & Demos Live

Hello Dear Reader!  Just a quick update.  I've been traveling a lot and I'm behind in posting some Decks & Demos.  Here are my decks from the Inside the Query Optimizer talk that I did recently at the Denver SSUG and also at SQL Saturday 283 Puerto Rico.  Click here for the Slide Deck and Click Here for the Demo Scripts!  They are also posted on the Resource Page.

Thank you so much to those who attended and the Organizers for having me!!

Thanks Again,

Brad

Friday, February 21, 2014

Up Coming Speaking Events

Hello Dear Reader!  We are 2 months into 2014 and the New Year is off to a busy start. This year already I've spoken to 3 User Groups and had 1 Workshop on Performance Tuning. This is only the beginning.  Between  now and May I've got 12 more presentations.  While I'm out and about I hope that I will get to see and meet up with a lot of SQL Family.

First Up Tampa!

SQL Saturday #273 is just around the corner.  I am one of a whole host of other great SQL Server Experts that will be on hand for the free training and free learning.  Stop by!  

February 22nd
SQL Saturday Tampa #273
Trimming Indexes Getting Your Database In Shape! (Click the link to go sign up)

Do you have the PASS Summit Blues?  We are 11 months away from the next great gathering of SQL Minds.  During this time of the year you watch your DVD's/Downloads/Streaming content from last.  You look at the presentations at the SQL Saturday's and start to wonder what will be planned for next year.  You miss your friends, you miss the atmosphere, you miss the Summit.

Well Pragmatic Works has the cure for the PASS Summit Blues.  All this month we have been doing our presentations from last year for Free, as part of our Training on the T's.  Next week you will get a chance to see my presentation from the Summit, Plus a little extra content as I take up not ONE, but TWO time slots.

Hope to see you there!

February 25th
Pragmatic Works Training on the T's
SQL Internals Deep Dive Part 1 (Click the link to go sign up)


February 27th
Pragmatic Works Training on the T's
SQL Internals Deep Dive Part 2  (Click the link to go sign up)

This March SQL Server MVP Jorge Segarra (@SQLChicken | Blog) and myself take our two man show on the road.  We did Tampa last month, next up Reston VA.  I'm hoping to catch up with some old friend's while in VA, and I'm sure you want to miss the two of us in action.  We still have seats, but they are going fast!  (Click the link to go sign up)

March 18th - 20th
Pragmatic Works
Performance Tuning Workshop  (Click the link to go sign up)
Reston, VA

Last year one of my favorite all time memories was SQL Saturday Boston.  I'd never been to Boston when I submitted and I'd always wanted to go.  Little did I realize that I would end up back in Boston 5 more times before the end of 2013.  I can't wait to go back and join the amazing speaker line up that Mike Hilwig(@MikeHilwig|Blog) has put together!

March 29th
SQL Saturday Boston #262
Trimming Your Indexes Getting Your Database In Shape  (Click the link to go sign up)

I'm in Portland Oregon for the first time ever.  As if that wasn't enough reason to show up, I'm presenting with none other than my colleague Roger Wolter(@RWolter50 | Blog).  Roger is a former PM for SQL Server for Microsoft.  He has forgotten more about computer science that I've ever known.  He's a nice guy and he's flat out brilliant.  He has worked on some of the largest and most complicated Service Broker installations in the world, he has vast in the field experience with SQL Server.

Roger and I are teaching the Performance Tuning Workshop not once, but twice this year.  I hope to see you there, because this Dear Reader is going to be a lot of fun.


April 8th
Pragmatic Works
Performance Tuning Workshop  (Click the link to go sign up)
Portland, OR

My good friends Jose Rivera(@SQLConqueror|Blog) and Guillero Caicedo(@SQLTitan|Blog) invited SQL MVP Jorge Segarra and myself to present a full day Pre-Con for SQL Saturday Puerto Rico.  I've gotten to know Jose and Guillero quite well from different SQL Saturday's and the PASS Summit.  This is a great honor as it will mark my 1st ever Pre-Con for a SQL Saturday!

Jorge and I will be pulling out all the stops to give you a full day of information and Performance Tuning knowledge that you can use.  I hope to see you there!

April 11th
SQL Saturday Puerto Rico #283  (Click the link to go sign up)
Full Day Performance Tuning Workshop

April 12th
SQL Saturday
Puerto Rico #283
TBD

Roger Wolter and I tackle SQL Server one more time in the Mile High City.  Denver is a beautiful town and I cannot wait to come out and visit will all the #SQLFamily out there.  We will also be presenting at the SQL Server User Group on the 29th.


April 29th-May 1st
Pragmatic Works 
Performance Tuning Workshop (Click the link to go sign up)
Denver, CO

April 29th 
Denver Colorado SSUG
TBD

Jorge and I will be taking our two man show over to SQL Saturday #298 Jacksonville.  When Devin Knight(@Knight_Devin | Blog) called me up and asked I was stoked!  This will be the 4th time that Jorge and I have presented together this year and I can't wait!  Come out and join us for some learning and fun!


May 9th
SQL Saturday Jacksonville #298
SQL Server Performance Tuning and Internals  (Click the link to go sign up)

May 10th
SQL Saturday Jacksonville #298
TBD

As you can see Dear Reader this year will be busy. There are more plans right now that we are working on that go all the way out to the end of the year.  More on that in the next couple of months!

As Always, Thanks for stopping by.

Thanks,

Brad