Tuesday, January 25, 2011

Vote For SQL Rally! And Why you should vote for Page and Row Compression.



I received an email on behalf of the Organizers of SQL Rally, from Mr. Kendal Van Dyke (Blog|Twitter), and it felt like a call to action.  It asked that we who have submitted a topic blog about it, post on Twitter, Facebook, and Linked-In.  It had some quotes from Andy Warren (Blog | Twitter), here are Andy’s word.

"Vote for me" is a start, but why vote for you? Show me why I need to learn the material and why I should pick you to teach it to me. Do you have experience? Credentials? Passion for the topic? Have you been paying your dues in our community and earned the chance to step up? What we’re trying to do here is put more of the marketing work on the speakers. It’s not enough to just submit an idea, you have to help us get people to want to attend, and that starts with convincing enough people that you’ve got the best content out of the options on your track!

This get’s a big (….check’s that the kids are in bed…) HELL YA from me!  I feel passionate about the topic I’m presenting on, and I would hope so.  I’ve spent quite a bit of time researching it, figuring out how it works, how and when to apply it, and most importantly WHY you would want to. 

So let’s start there why.
Page and Row Compression How, When, and Why

One of the great things about presenting is you get to meet a lot of brilliant people, Bill Ramos (Blog| Twitter) shared with me some insight he had received from the ever brilliant Sunil Agarwal, as to why you should use compression.  It was so good that it is my #1 reason.

1st Let’s talk space.  Sure hard drives aren’t as expensive as they used to be, unless we are talking Solid State Drives, so one question you may have is why do I care about compressing data and saving (*Cheap*) hard drive space.







Production

Well let’s say you have a Database, and let’s say this is your Production Database Server.  Well Ball’s”, you say, I can go out and get a 1 TB hard drive from Office Depot for $100.  Why do I care about compression?” 

We’ll Dear Reader I would say you probably didn’t buy your companies hard drives at Office Depot during black Friday.  You may be using a SAN and the hard drives you put in that SAN may not be the cheapest you could find, probably wanted a bit of Quality, a bit of not super expensive.

I would wager to say that you probably don’t have just one Database Instance at your company.



                        
Production          Staging                 Test                    Dev      
                 
You could have this Database Instance in Multiple Environments.  Don’t forget your offsite Disaster Recovery/COOP (Continuation Of Operations Planning). 

So multiple servers, separate hardware cost, perhaps more than one SAN, the time and effort of SAN Administrators, or outside vendors like EMC, the hard drive space is starting to add up.  And as business look to get a control of or reduce their data footprint Compression can help.

2nd Reduced Memory Footprint, Compression happens at the Record and Page levels.  

The structures that we know and have become familiar with are physically changed.  When compressed pages are read off of the hard drive they are stored into Memory, the buffer pool, in a compressed state.

This is a very important distinction, so I want to repeat it.  Compressed Records and Pages are still stored in a compressed state in memory.  

So why is this important?  

There are 2 ways to read records in SQL Server, one is off of the Hard Disk and is a Physical Read.  When a Page is read off of the disk it is then loaded into the buffer pool before SQL can use it.  This means that we are able to load more records onto Compressed Pages, which translates into being able to load more data into the buffer pool for SQL to read.  This would normally only be possible by adding more RAM to a server and increasing the amount of Memory that SQL Server can utilize.  Loading pages into the buffer pool allows SQL to get the information quicker is called a Logical Read.

Think of it like the human memory.  

I have a conversation with you 1st thing in the morning about compression, we stop to go grab some coffee, and then we have meetings and work to do, 5 hours later I start to have the same conversation and you have to search your mind to pick up where we left off.  That is a Physical Read, and it took a little more effort.

Now let's try again.

I have a conversation with you 1st thing in the morning about compression, we stop to go grab some coffee, and then come back 5 minutes later and keep having the conversation.  The conversation is fresh in your mind, you don't have to search around for it, it is still in your buffer pool.  

By applying compression properly we just expanded SQL Server's Memory without having to change our physical hardware at all.

3rd Trading out CPU Cycles for Physical and Logical Reads.  

In SQL Server you can have hardware bottlenecks of CPU, Memory, IO, and Network.  Most SQL Servers that I have seen in the Public Sector tend to have very little CPU overhead, in the private sector it can be a bit more mixed.  But in either setting it is not uncommon to find SQL Servers that have a low usage of their CPU’s.

Just a moment ago I wrote about how the internal structure of records and pages are changed by compression.  You read about how they are still compressed in memory, so you might wonder how it is that SQL accounts for all these different types of pages and records?  The simple answer is it doesn’t.  When the storage engine passes Data Pages from the buffer pool to the relational engine the pages are translated back to their original format.  The additional effort of reading compressed records, and de-compressing them for the relation engine accounts for some of the CPU overhead that Compression adds to a Server.

This is a balancing act and SQL Server accounts for it by shifting some of the load from our Memory and Physical Disks to our CPU’s, which in the majority of cases can handle it.  

DEMO

So Let’s do a little demo tease to give you an idea of how this can help you.  I’m using the AdventureWorks2008R2 Database for this example.  Here is a quick little script

SET STATISTICS IO ON
SET STATISTICS TIME ON

select
    *
from Sales.SalesOrderHeader soh
left join Sales.SalesOrderDetail sod
on soh.SalesOrderID= sod.SalesOrderID

Now we will execute this script twice and get the second set of statistics, so that way we are only looking at the Logical IO’s required for the query.

We give this a run and we see:


It took 1240 logical reads from table Sales.SalesOrderDetail, 686 Logcial Reads for table Sales.SalesOrderHeader, and a CPU time of 1014 ms to return 121,317 rows of data in 5105 ms.

Now we will apply Page Compression to the Sales.SalesOrderDetail table and look at the reduction of logical reads for that table, as well as any increase in CPU time.

/*
Rebuild our table with
Page Compression
*/
USE [AdventureWorks2008R2]
ALTER TABLE [Sales].[SalesOrderDetail] REBUILD
WITH (DATA_COMPRESSION = PAGE);

/*
Set Statistics On
and re-execute our
query
*/
SET STATISTICS IO ON
SET STATISTICS TIME ON

select
    *
from Sales.SalesOrderHeader soh
left join Sales.SalesOrderDetail sod
on soh.SalesOrderID= sod.SalesOrderID

Now Let’s take a look at our results.



And we see our Logical Reads for table Sales.SalesOrderDetail were reduced to 598 from 1240, A Reduction of almost 52%!  We didn’t change Sales.SalesOrderHeader and the Logical Reads remained at 686, our CPU time went from 1014 to 1716 ms, and our elapse time dropped from 5105 to 5079 ms.

“We’re What You Call Professionals”






Unlike the Myth Busters I do want you to try this at home, and probably at work as well.  I’ve busted my hump to research, learn, and apply this topic, set up demo’s showing how you can find your biggest tables, what the allocation unit make-up of the tables and indexes are (*this is important because only In-Row-Data compresses).  How to use DMV’s to track your Scan and Update usage per object, and how scan’s and updates could affect the type of compression you want to use.

So the order I want you to go in is:


1. Gather Baselines-You need to know what your hardware performance as well as your query performance is before you apply compression!
           
     2. Look at your table’s, which tables have the largest amount of records?


    3. Look at their internal structure, are they mainly comprised of In-Row-Data?


    4. What is their pattern of usage, are they updated frequently, is the majority of the usage scans because they are historical data?


    5. Only once you have found the candidates, and determined the type of compression you want to examine use sp_estimate_data_compression_savings

*Keep in mind the over head!  The way sp_estimate_data_compression_savings works is that it takes a 5% sample of the table you are scanning and moves it to the TempDB, performs the actual compression operation and “estimates” the overall compression savings based on that 5% sample, as well as displaying the sample’s compression rate.

This means that if you try to compress a 100 GB table you need to have 5 GB worth of TempDB data space that can be used by this operation.  If you have a maintenance window, this is the sort of heavy operation you would want to do in off hours, and be sure that your TempDB can handle the space it will need.

SQL Rally, Go VOTE!

There are a lot of great sessions that have been submitted for SQL Rally.  The event is a great deal for some really great training, the Pre-Con’s are top notch professionals that you will be seeing and/or have seen at the PASS Sumit.  There are an absolute plethora of topics that are being presented, and you, the SQL community, get to vote for what you want to see.  Go make it great!  Click Here to VOTE, and Click Here to see the sessions you are voting on, and Click Here for the SQL RALLY Website.

Thanks,
Brad

Saturday, January 15, 2011

SQL Saturday 62: Slide Deck & Presentation

Hello Dear Reader,
      Since SQL Saturday 62, I've been a little busy with a sick baby girl, and didn't have the time to get my Slide Deck online until now.  Click here to view the slide deck.  I want to thank you for your interest.
      The first presentation went well, I got some really good feedback from the some of the DBA's in the audience I'm looking forward to re-working a couple things.  I wanted to list my thoughts.

Presenting

I was a little nervous leading up to the presentation, as a kid I was always involved in drama, plays, and musicals.  I went to a Magnet High School for the performing arts in Atlanta, GA and my concentration was in Drama.  We moved from Atlanta to Asheville, NC and I continued to concentrate on my acting but I also joined the football team, wrestling team, and played soccer (indoor and out-door) as well.

Needless to say between acting and sports I'm used to being up in front of an audience, and to be honest I kind of like it.  So you would think that all of that would prepare me to go in front of the SQL Community and present. 


(This picture would be more accurate if the Hulk was renamed the Demo)



Demo SMASHED!!!!

I rehearsed my Demo's quite a few times.  The last thing I wanted to do was be caught unprepared, but somehow that is the one thing that happened. 

I tried to do a little too much in my demo's, my subject being compression I wanted to create a set of tables that were sizable, with around 15000 rows, and then apply compression.

Re-creating the tables was taking around 38 seconds when I was testing everything out over the last several days.  On the day of when I went to load up my table I knew there was an issue when I went to run the first insert and after 1 minute and 45 seconds it was still running.

So I did what any actor would do in the case where their props fail them, I improvised.  While I wasn't able to load the tables to the full extent that I wanted I was still able to insert data, and run the compression scripts against them.  What I wanted to convey I was still able to.  All of the info in my Slide Deck was still there, all of the knowledge I have on the subject was still there, I took a deep breath and just kept moving forward.


Always Have a Backup Plan

Since the failure of the Demo's my mind went into troubleshooting mode, you can't really stop your talk to throw up Perfmon or start analyzing your wait stats in the middle of the Presentation.  Well you could, but since the talk was on Compression and not wait stats it didn't seem like the right thing to do.  

But when I got home the first thing I did was reboot my laptop and try again.  This time I could let the demo run, needless to say 38 seconds had turned into 5 minutes and 50 seconds.  YOWZZA, I'm glad I stopped and kept speaking.  Looking at my wait stats I was ready to do some analyzing when by happy accident my battery fell out while the laptop was unplugged, after that reboot everything went back to normal.  But even at normal I don't want to get caught unprepared.

So I started building my backup plan.  There are a bunch of different ways I can present to get my point across, I would like to say I would isolate and fix the issue my laptop was having, but it is working swimmingly right now (YAY and BOOO all at the same time).

I was a Boy Scout at one point in life and I do remember the motto of "always be prepared", and it has not left me.  So I'm looking forward to the next round of presentations, because I will have worked up how I can get my demo's done at least 3 different ways so I don't get blindsided again.

Not sure when the next Page & Row Compression presentation will be but I'm already looking forward to it.

Thanks,

Brad

Monday, January 10, 2011

A House and a Dog! T-SQL Tuesday 14: Techie New Year’s Resolutions

So T-SQL Tuesday is back again already, the first of 2011, and the  theme is techie resolutions.  This topic is courtesy of the Jen 1/2 of the MidnightDBA (twitter)(blog). 

I really like the idea for this topic, because we all have a lot of personal goals when it comes to a new year, but this is all about techie resolutions and what you want to achieve as it relates to technology. 



You might be wondering how does a House and a Dog make the list of techie goodness?  Houses are great and the can be tech savvy, but a dog….. really a dog…..!?

Look at those eyes, they can be techie if they want to.  So here we go! 

1.       MCITP DBA SQL 2008
I’ve got my MCTS for SQL 2008, so one test to go on this one, this will be the first resolution I wrap up this year.
2.       MCITP DBA Developer SQL 2008
I will start this track as soon as I finish the MCITP for the SQL 2008 DBA, my birthday is in late January and I’ll be getting the 70-433 book  and a Prometric 2 Pack  to help with this.
3.       13 T-SQL Tuesdays
This will be a year long project, I did my very first T-SQL Tuesday last month a full year will put me at 13, so the goal is 13 T-SQL Tuesdays by the end of the year J!

4.       40 Blog Posts
I’m new to blogging, and it is not as easy as it looks when you are casually visiting a site.  Each post requires quite a bit of work.  I have 4 kids, I’m married, and work a full time job.  Needless to say the one thing I’m not lacking on is boredom.  The thing I like most about blogging it is forcing me to keep tight priorities and my goal is to do at least 40 blogs for the year.  That is less than 4 a month.  I’d like to do more but I’m of the under promise over deliver line of thinking.

5.       6 Presentations to the SQL Community
I don’t know how, but I missed most of the SQL Community, except for SQLServerCentral.com, while I was breaking into and becoming a DBA.  It wasn’t until 2009 while I was working in D.C. that our Vice President of Microsoft Technologies came to me and said, “Your really interested in SQL Server right?”  “Yes”, was my simple response to him.  I had already gotten my MCITP SQL DBA for 2005 and was working on my MCTS for 2008, so I thought "like" was an understatement.  You like a sandwhich, you like a joke, you like a movie, but I have a deep rooted desire to know SQL like I know comic books.

He said he was going to forward me this link he got for something call 24 hours of PASS.  Needless to say I was hooked.  Not only did I love presentations from folks like Brent Ozar (twitter) (blog) and Louis Davidson (twitter) (blog), but I thought about what I did every day and felt I had things I could share as well.
Working for the Office of the President I was a little leery to put myself out there, so when I left and I wanted to find a company that would encourage me to be active in the SQL Community, heck I didn’t need encouragement I just needed the thumbs up!  And yes I got the big thumbs up.
My first presentation will be at SQL Saturday 62 in Tampa this Saturday, January 15th, I’ll be presenting on Page and Row Compression How, When, and Why.  I’ve presented internally for my company and also for previous audiences.  But this will be my first venture into the SQL Community and I’m very excited.  I’m going to be giving a 30 minute presentation at OPASS on March 8th, Transparent Data Encryption the Lightning Round.  So that is 2 down, 4 to go.
6.       MCM Prep
This is a yearlong goal for me, I’m already watching all the MCM videos, taking notes, and buying books.  Come 2012 I want to tackle this head on, but this is an expensive proposition, especially when you’re a family of 4 (six including Mrs. Balls and myself).  Mrs. Balls is super awesome, very supportive, quite beautiful, and she said I could go for it as long as I get her a House and a Dog.  So believe it or not
7.       A House and a Dog are techie goals!
My kids will be so happy to hear this!  The kids have wanted a Dog for quite sometime and this move was all about putting down roots long term.  To be quite honest, if I only achieve one thing on this list, this is by far the most importiant.

So there it is, not quite a top 10, but 7 will do for this year for me.  So Dear Reader, thanks for stopping by and Happy New Year!

Thanks,

Brad