Thursday, February 17, 2011

Thanks You Magic Pass!

I was over at the Magic Pass SUG last night and had a blast!  I wanted to share a quick link to the slide deck, here, and the sample code from the final demo where we pulled together all of the information in the presentation.

Pulling It All Together 

To understand where you’re going you’ve got to know where you’ve been, in this way Compression is no different than any other technology.  You cannot just use it blindly; Compression can be a good thing if used properly.  So the final demo I do pulls together everything that I would like people to consider.  (I’m using the AdventureWorks2008R2 Database you can get that here.)

1.       Gather Baselines.

You know your database better than I do Dear Reader, and even though Compression can be a good thing you’ll never know it if you don’t gather baselines.  You should fire up Perfmon and gather at least one week, hopefully you’ve already got more than that, so you can see what your hardware activity looks like.  Apply T-SQL Traces to your system, so you know what the statements being executed look like and how much CPU and Logical Reads they use, as well as the duration.

2.       Find your tables with the largest Row and Data size

The Technology we are using is Compression, the idea being to shrink or compact things.  If you have a 1 page table, Compression is not going to help you.  This is a technology where size matters.

3.       Determine the Allocation Unit Makeup of the table

We cover this in depth in the presentation but Compression in 2008 & 2008 R2 only works on IN_ROW_DATA or HoBTs (Heaps or B-Trees).  So if you have a lot of LOBs, (XML, Text, Varchar(MAX), Varbinary(MAX), NVARCHAR(MAX), IMAGE), or ROW_OVERFLOW_DATA aka SLOBs your table may not be a candidate for Compression.

4.       Scans & Updates – What Kind of Activity do Your Tables Have?

If your Table has a lot of Scan activity but little Update Activity then you may want to Consider Page Compression.  If you have a lot of Update Activity and you are looking to use Compression then Row may be a better fit.  (This is where testing in a non-Production Environment will pay off, know which is best).

5.       Estimate Your Compression

You’ve done all this leg work, and you’ve got numbers to base your decision on.  Now Dear Reader let’s use sp_estimate_data_compression_savings to validate the numbers we’re looking at.  One piece of advice before we run it, the ways this works is a 5% sample of your table is copied into your TempDB and actually compressed (or uncompressed if you want to estimate the space used for uncompressing a compressed table), so before you run this on a 100 GB table make sure you’ve got an extra 5 GB worth of space to spare on your TempDB drive.

6.       Gather another Baseline

Now you find out what your savings are.  And if they are big in Duration and Logical I/O’s saved then make sure to tell your boss, and keep a log of how you just helped speed up your system.

And now here’s the script.

/*============================================================
--@SQLBalls: SQLBalls@gmail.com
--Page and Row Compression How, When, And Why
--

This Sample Code is provided for the purpose of illustration only and is not intended
to be used in a production environment.  THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE.

--Scripts For Updates,Scans, and Allocation Units can be
--obtained from
--Data Compression: Strategy, Capacity Planning and Best
--Practices
--http://msdn.microsoft.com/en-us/library/dd894051(SQL.100).aspx
==============================================================*/
/*
First let's switch to an existing
Database
*/
USE AdventureWorks2008R2;
GO
SET STATISTICS IO ON
SET STATISTICS TIME ON

/*
Let's use a Table Variable
and grab all of the sp_spaceused
numbers for each table in the database.
Last let's sort this by size so we can
test out the bigest table
*/
declare @myTable as Table(
                                  [Name] varchar(1000),
                                  [Rows] int,
                                  [Reserved] varchar(500),
                                  [Data] varchar(500),
                                  [Index_Size] varchar(500),
                                  [Unused] varchar(500)
                                  );

insert into @myTable
exec ('sp_msforeachtable @command1="sp_spaceused [?]"');


select * from @myTable
order by [rows] DESC, LEFT([reserved],(LEN([reserved])-3)) DESC;

GO

/*
So we found a table to inspect
let's use our handy script from
earlier to look at the allocation unit
make up of this table
*/
SELECT
     OBJECT_NAME(sp.object_id) ObjecName,
     si.name AS IndexName,
     sps.in_row_data_page_count as In_Row,
     sps.row_overflow_used_page_count AS Row_Over_Flow,
     sps.lob_reserved_page_count AS LOB_Data
FROM
     sys.dm_db_partition_stats sps
     JOIN sys.partitions sp
          ON sps.partition_id=sp.partition_id
     JOIN sys.indexes si
          ON sp.index_id=si.index_id AND sp.object_id = si.object_id
WHERE
     OBJECTPROPERTY(sp.object_id,'IsUserTable') =1
     AND OBJECT_NAME(sp.object_id)='Sales.SalesOrderDetail';
GO

/*
So now that we know what the Allocation Units
Look like how about the Scans and the Updates

Detect Scans
*/
SELECT
     so.name AS tableName,
     si.name AS indexName,
     ios.partition_number AS [Partition],
     ios.index_id AS indexID,
     si.type_desc AS indexType,
     (ios.range_scan_count *100.0/(ios.range_scan_count +
     ios.leaf_delete_count + ios.leaf_insert_count + ios.leaf_page_merge_count + ios.leaf_update_count + ios.singleton_lookup_count)) AS percentScan
FROM
     sys.dm_db_index_operational_stats(DB_ID(),NULL,NULL,NULL) ios
     JOIN sys.objects so
     ON so.object_id=ios.object_id
     JOIN sys.indexes si
     ON si.object_id = ios.object_id AND si.index_id = ios.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count +ios.leaf_page_merge_count + ios.singleton_lookup_count)!=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable')=1  
ORDER BY
     percentScan;
GO

/*
Check for Updates
*/
SELECT
     so.name,
     si.NAME AS [indexName],
     ios.partition_number AS [Partition],
     ios.index_id AS [IndexID],
     si.type_desc AS [IndexType],
     (ios.leaf_update_count *100/(ios.range_scan_count +
     ios.leaf_insert_count +
     ios.leaf_delete_count +
     ios.leaf_update_count +
     ios.leaf_page_merge_count +
     ios.singleton_lookup_count)) AS [Update_Percentage]
FROM
     sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) ios
     JOIN sys.objects so
     ON so.object_id = ios.object_id
     JOIN sys.indexes si
     ON ios.object_id=si.OBJECT_ID AND ios.index_id=si.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count + ios.leaf_page_merge_count + ios.singleton_lookup_count) !=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable') =1
ORDER BY
     [Update_Percentage] ASC;
GO
/*
Let's Do some Estimating
*/
sp_estimate_data_compression_savings 'Sales', 'SalesOrderDetail', NULL, NULL, ROW;
GO
sp_estimate_data_compression_savings 'Sales', 'SalesOrderDetail', NULL, NULL, PAGE;
GO

/*
Let's run a quick test
how long does an uncompressessed select take
*/

SELECT
     *
FROM
     Sales.SalesOrderDetail;
GO
    
/*
Now let's rebuild with Page
Compression
*/
ALTER TABLE Sales.SalesOrderDetail
REBUILD WITH (DATA_COMPRESSION =PAGE);
GO

/*
Let's run a quick test
how long does an compressessed select take
*/

SELECT
     *
FROM
     Sales.SalesOrderDetail;
GO


Thanks again for having me!

Brad

Wednesday, February 16, 2011

I'm Presenting At MAGICPass Tonight!

I'll be presenting at MAGICPass tonight, starting at 6:30!  The subject is Page and Row Compression How, When, and Why.  I blogged about this topic previously, and have presented this recently for SQL Saturday 62 and internally for my friends at Publix. 

If your interested in a preview of the topic Click Here.  I submitted this presentation to SQL Rally, and blogged about it for the voting process.  I got selected as an Alternate for the Rally, which I'm still really excited about this was the first big conference that I've submitted to, and my Peers did the voting, which doubles the honor.

I'll be posting the slide deck after the presentation as well as some of the T-SQL Demo's behind it.  There will be a couple pieces of code that I explore further. 

If you are in the Orlando Area and would like to stop by we would love to have you. To get information on the MAGICPAss Event Click Here to RSVP.

But Dear Reader, Wait there's more!

If you arrive to the event early around 5:00 pm then you'll get to see Andy Warren (Blog|Twitter).  Here is the abstract for Andy's presentation

Basics of Performance Tuning - Part 1 Speaker: Andy Warren
In part one of our three part class on performance tuning you'll learn how to use Profiler, the tool that ships with SQL Server that is used to monitor SQL statements sent to the server. We'll show you how to start a Profiler session, set up the most common events, columns, and filters, and how to create a template so you can easily use the same settings each time. We'll explain how to assess the cost of a query by looking at reads, writes, cpu time, and duration. We'll finish up by showing you how to set up a server side trace - think of it as Profiler without the UI. Knowing how to use Profiler is critical for a Dba, and incredibly useful for a developer.

If you know Andy, you know he is a great guy.  Andy is one of the Co-Founders of SQL Server Central, he currently sits on the Board of Directors for PASS, he help create, found, and has attended many SQL Saturday's, and has tirelessly devoted his time to the SQL Community.  Andy is a world class presenter and if you can make it in it will be well worth your time!

Well Dear Reader I hope to see you there!

Thanks,

Brad

Monday, February 14, 2011

MCM Video Series: You Should Be Watching These!

Back in November at PASS Microsoft Announced changes to the Microsoft Certified Master program.  Instead of 3 weeks of training and exam’s in Redmond and close to a $20,000 price tag, you can take a Knowledge Exam and a Lab exam at select Prometric Testing Centers for a price of $2,500 ($500 for the Knowledge exam, $2,000 for the Lab – provided that you pass both on your first try).
One of the things that Microsoft did as well was get the good folks over at SQL Skills to tape 70 different videos, some are 2 parts with a lecture and a demo, on a vast array of subjects to give you an idea of the scope of what you will need to know to pass these exams.
So you may be saying “Ball’s, I won’t be going for an MCM.  It would be nice, but I don’t have the time, the money, or [insert perfectly reasonable excuse of choice]. Why should I care about these videos?” Well, Dear Reader the reason you want to watch these MCM videos  is that they will make you better at what you do, even if you have no desire to pursue this Certification.
So let us discuss how these videos will make you better and why Certifications are important.
How These Videos Will Make You Better.
There are millions, billions, probably more than trillions of things I do not know.  The things I know the most about are my kids, comic books, and SQL Server.  Every time I learn something I get a little bit better.   Whether it’s how the best way to make one of my son’s/daughter’s smile, or that Action Comic’s #775 is the best Superman Story Ever , or the physical structure of a Data Record and how it is stored on a Page, learning these things make me better at things that I love to do.
Paul Randal (blog|twitter), the man himself, does an internal Deep Dive in Data Structures, Kimberly Tripp(blog|twitter)  covering Indexing Internals, Brent Ozar (blog|twitter)  on Virtualization, Bob Beauchemin (blog|twitter)  on Security and Encryption, that’s an awesome line up!  And normally you would pay top dollar in order to get this training, and Dear Reader it is available for FREE!
I’m watching this, the guys I’m working with are watching these video’s, and we’ve even had a couple lunch-in’s where we’ll get a conference room a projector and we’ll watch and discuss the topic that are covered.
Learning makes you better, and this is a golden opportunity to better yourself, even if you have no desire to get the MCM Certification chances are there is a video that covers a SQL topic that you are using daily, and it can probably teach you something that will help you do your daily grind in a better less grinding kind of way.
Why Certifications are Important
Personally I’m a big proponent of Certifications; I was in awe of them before I attempted to get one.  I always thought that just to have a Microsoft Certification you had some super reservoir of knowledge.  When I decided to go for my first certification I got the 70-431 book off of Amazon, I studied it cover to cover, made flash cards, did all of the examples, took the exam and…..failed. 
The crushing defeat lasted just long enough for me to leave the testing center and get into the car.  By the time I turned on the car, I knew exactly what I would do.  While it was still fresh in my mind I would take the items that I was weakest on and I’d make a study guide, to cover those topics like crazy.  I didn’t fail by much and I was already itching for another try.  One week later I took the exam again and got a 902, I was stoked!
I knew the material, I was able to use it at work, and I grew as a professional because I had pushed myself.  You can push yourself without going for the certification, but that piece of paper shows that you took the time to pursue something, and I believe that is important. 
I’ve had employers tell me that they did not value Certifications, but never in the interview.  I have had managers where arguing with them about why Certifications are important happened so often, I could have listed it on my resume as a hobby.  But the second a client wanted to know who the DBA on the project would be they were more than happy to point out that I was their “Microsoft Certified” DBA.  Whenever we were interviewing potential DBA candidates they were quick to mention my Certifications when they introduced me as their Sr. DBA’s. 
I’ve heard the criticisms about brain dumps and Certification Boot Camps, but the fact is if you’re not a DBA, if you’re a hiring manager with no DBA experience, you may not know how to sniff out the real ones from the fakes.  However, if you are a DBA and you studied and earned the Certifications then you know what the other person should know.
In an interview to me a Certification is not a free pass that you know what you’re doing; it is a bulls-eye saying BRING IT ON!  I want tough questions, I want technical questions, I want to make sure that the people in the room to feel confident that they are getting an experienced professional, and if I don’t know the answer to a question then I’m not too proud to ask because I like to learn, and I hope Dear Reader that you like to learn as well. 
So go get learning on what it takes to be a Master, and maybe get motivated to go become one.
Happy Learning!

Brad