Thursday, February 17, 2011

Thank You Magic Pass!

I was over at the Magic Pass SUG last night and had a blast!  I wanted to share a quick link to the slide deck, here, and the sample code from the final demo where we pulled together all of the information in the presentation.

Pulling It All Together 

To understand where you're going, you've got to know where you've been; in that way Compression is no different from any other technology.  You cannot just use it blindly; Compression can be a good thing when it's used properly.  So the final demo pulls together everything that I would like people to consider.  (I'm using the AdventureWorks2008R2 database, which you can get here.)

1. Gather Baselines

You know your database better than I do, Dear Reader, and even though Compression can be a good thing, you'll never know it if you don't gather baselines.  Fire up Perfmon and gather at least one week of data (hopefully you've already got more than that) so you can see what your hardware activity looks like.  Run T-SQL traces against your system so you know what the statements being executed look like and how much CPU, how many logical reads, and how much duration they rack up.  (There's a quick plan-cache snippet after this list if you need a starting point.)

2. Find Your Tables with the Largest Row and Data Size

The technology we are using is Compression; the idea is to shrink or compact things.  If you have a 1-page table, Compression is not going to help you.  This is a technology where size matters.

3. Determine the Allocation Unit Makeup of the Table

We cover this in depth in the presentation, but Compression in 2008 & 2008 R2 only works on IN_ROW_DATA in HoBTs (Heaps or B-Trees).  So if you have a lot of LOBs (XML, TEXT, VARCHAR(MAX), VARBINARY(MAX), NVARCHAR(MAX), IMAGE) or a lot of ROW_OVERFLOW_DATA, aka SLOBs, your table may not be a good candidate for Compression.  (The second snippet after this list shows one way to spot LOB columns.)

4.       Scans & Updates – What Kind of Activity do Your Tables Have?

If your table has a lot of scan activity but little update activity, then you may want to consider Page Compression.  If you have a lot of update activity and you are looking to use Compression, then Row may be a better fit.  (This is where testing in a non-production environment will pay off; know which is best before you deploy.)

5. Estimate Your Compression

You've done all this legwork, and you've got numbers to base your decision on.  Now, Dear Reader, let's use sp_estimate_data_compression_savings to validate the numbers we're looking at.  One piece of advice before we run it: the way this works is that a 5% sample of your table is copied into your TempDB and actually compressed (or uncompressed, if you want to estimate the space a compressed table would take uncompressed).  So before you run this on a 100 GB table, make sure you've got an extra 5 GB of space to spare on your TempDB drive.  (The last snippet after this list shows a quick way to check.)

6. Gather Another Baseline

Now you find out what your savings are.  If they are big in duration and logical I/Os saved, make sure to tell your boss, and keep a log of how you just helped speed up your system.
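A few extra snippets to help with the steps above.  For step 1, if you don't already have traces saved off, here's a minimal sketch of one way to snapshot statement-level CPU, logical reads, and duration from the plan cache with sys.dm_exec_query_stats.  Treat it as a supplement to Perfmon and your traces, not a replacement, since the numbers reset whenever a plan leaves the cache.

/*
Snapshot statement-level CPU, logical reads, and
duration from the plan cache. These numbers reset
when a plan leaves the cache, so this supplements
Perfmon and traces rather than replacing them.
*/
SELECT TOP 25
     SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
          ((CASE qs.statement_end_offset
               WHEN -1 THEN DATALENGTH(st.text)
               ELSE qs.statement_end_offset
            END - qs.statement_start_offset)/2)+1) AS statement_text,
     qs.execution_count,
     qs.total_worker_time AS total_cpu_microseconds,
     qs.total_logical_reads,
     qs.total_elapsed_time AS total_duration_microseconds
FROM
     sys.dm_exec_query_stats qs
     CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY
     qs.total_logical_reads DESC;
GO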
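For step 3, here's a rough sketch for spotting LOB columns in a table (I'm pointing it at Sales.SalesOrderDetail just as a placeholder; swap in your own table).  max_length = -1 flags the (MAX) types, and xml, text, ntext, and image are LOB types by definition.

/*
Spot the LOB columns in a table: max_length = -1
covers the (MAX) types, and xml/text/ntext/image
are LOBs by definition.
*/
SELECT
     c.name AS columnName,
     t.name AS typeName
FROM
     sys.columns c
     JOIN sys.types t
          ON c.user_type_id = t.user_type_id
WHERE
     c.object_id = OBJECT_ID('Sales.SalesOrderDetail')
     AND (c.max_length = -1
          OR t.name IN ('xml', 'text', 'ntext', 'image'));
GO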
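And for step 5, one quick way to eyeball the free space in TempDB before you kick off the estimate; sys.dm_db_file_space_usage reports page counts, and pages are 8 KB apiece.

/*
Check unallocated space in TempDB before running
sp_estimate_data_compression_savings -- you want
room for roughly 5% of the table you're sampling.
*/
SELECT
     SUM(unallocated_extent_page_count) * 8 / 1024.0 AS free_space_mb
FROM
     tempdb.sys.dm_db_file_space_usage;
GO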

And now here’s the script.

/*============================================================
--@SQLBalls: SQLBalls@gmail.com
--Page and Row Compression How, When, And Why
--

This Sample Code is provided for the purpose of illustration only and is not intended
to be used in a production environment.  THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE.

--Scripts For Updates,Scans, and Allocation Units can be
--obtained from
--Data Compression: Strategy, Capacity Planning and Best
--Practices
--http://msdn.microsoft.com/en-us/library/dd894051(SQL.100).aspx
==============================================================*/
/*
First let's switch to an existing
Database
*/
USE AdventureWorks2008R2;
GO
SET STATISTICS IO ON
SET STATISTICS TIME ON

/*
Let's use a Table Variable
and grab all of the sp_spaceused
numbers for each table in the database.
Lastly let's sort by size so we can
test out the biggest table
*/
declare @myTable as Table(
     [Name] varchar(1000),
     [Rows] int,
     [Reserved] varchar(500),
     [Data] varchar(500),
     [Index_Size] varchar(500),
     [Unused] varchar(500)
     );

insert into @myTable
exec sp_msforeachtable 'exec sp_spaceused ''?''';


select * from @myTable
order by [rows] DESC, CAST(LEFT([reserved], LEN([reserved])-3) AS bigint) DESC;

GO

/*
So we found a table to inspect;
let's use our handy script from
earlier to look at the allocation unit
makeup of this table
*/
SELECT
     OBJECT_NAME(sp.object_id) AS ObjectName,
     si.name AS IndexName,
     sps.in_row_data_page_count as In_Row,
     sps.row_overflow_used_page_count AS Row_Over_Flow,
     sps.lob_reserved_page_count AS LOB_Data
FROM
     sys.dm_db_partition_stats sps
     JOIN sys.partitions sp
          ON sps.partition_id=sp.partition_id
     JOIN sys.indexes si
          ON sp.index_id=si.index_id AND sp.object_id = si.object_id
WHERE
     OBJECTPROPERTY(sp.object_id,'IsUserTable') =1
     AND sp.object_id = OBJECT_ID('Sales.SalesOrderDetail');
GO

/*
So now that we know what the allocation units
look like, how about the scans and the updates?

Detect Scans
*/
SELECT
     so.name AS tableName,
     si.name AS indexName,
     ios.partition_number AS [Partition],
     ios.index_id AS indexID,
     si.type_desc AS indexType,
     (ios.range_scan_count *100.0/(ios.range_scan_count +
     ios.leaf_delete_count + ios.leaf_insert_count + ios.leaf_page_merge_count + ios.leaf_update_count + ios.singleton_lookup_count)) AS percentScan
FROM
     sys.dm_db_index_operational_stats(DB_ID(),NULL,NULL,NULL) ios
     JOIN sys.objects so
     ON so.object_id=ios.object_id
     JOIN sys.indexes si
     ON si.object_id = ios.object_id AND si.index_id = ios.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count +ios.leaf_page_merge_count + ios.singleton_lookup_count)!=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable')=1  
ORDER BY
     percentScan;
GO

/*
Check for Updates
*/
SELECT
     so.name,
     si.NAME AS [indexName],
     ios.partition_number AS [Partition],
     ios.index_id AS [IndexID],
     si.type_desc AS [IndexType],
     (ios.leaf_update_count *100.0/(ios.range_scan_count +
     ios.leaf_insert_count +
     ios.leaf_delete_count +
     ios.leaf_update_count +
     ios.leaf_page_merge_count +
     ios.singleton_lookup_count)) AS [Update_Percentage]
FROM
     sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) ios
     JOIN sys.objects so
     ON so.object_id = ios.object_id
     JOIN sys.indexes si
     ON ios.object_id=si.OBJECT_ID AND ios.index_id=si.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count + ios.leaf_page_merge_count + ios.singleton_lookup_count) !=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable') =1
ORDER BY
     [Update_Percentage] ASC;
GO
/*
Let's Do some Estimating
*/
EXEC sp_estimate_data_compression_savings 'Sales', 'SalesOrderDetail', NULL, NULL, 'ROW';
GO
EXEC sp_estimate_data_compression_savings 'Sales', 'SalesOrderDetail', NULL, NULL, 'PAGE';
GO

/*
Let's run a quick test:
how long does an uncompressed select take?
*/

SELECT
     *
FROM
     Sales.SalesOrderDetail;
GO
    
/*
Now let's rebuild with Page
Compression
*/
ALTER TABLE Sales.SalesOrderDetail
REBUILD WITH (DATA_COMPRESSION =PAGE);
GO

/*
Let's run a quick test:
how long does a compressed select take?
*/

SELECT
     *
FROM
     Sales.SalesOrderDetail;
GO
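If you also want to see the space you got back, a quick sp_spaceused call against the table will give you new reserved and data numbers to compare with the ones we grabbed up front.

/*
Confirm the space savings -- compare these numbers
to the sp_spaceused output we gathered at the start.
*/
EXEC sp_spaceused 'Sales.SalesOrderDetail';
GO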


Thanks again for having me!

Brad
