Monday, May 9, 2011

T-SQL Tuesday 18: Common Table Expressions AKA CTE’s

T-SQL SQL Tuesday is brought to us this week by Bob Pusateri (blog|@SQLBob).  So Bob what is the topic this month?

Have you ever solved or created a problem by using CTEs? Got a tip, trick, or something nifty to share? I’d love to see your posts about any of the above. Also don’t forget that T-SQL Tuesday is not limited to only T-SQL:
So we aren’t limited to CTE’s but I do enjoy them, so off we go!


HOT FUZZ FIZZ BUZZ


I first learned about Common Table Expressions while studying for the 70-432 exam back in 2007.  They were neat, they were cool.  They have a lot of power when it comes to recursion.  The first time I used them in any practical sense was in a job interview.  I was asked the Fizz Buzz question.

I remembered reading a blog Rob Boek (blog|@robboek) had written about this exact same question, read it here, Fizz Buzz by Rob Boek

I thought it was simple. I thought it was elegant.  It was a great example of Set Based logic.  I was trying to make a good impression, and so I re-created it.  I was honest about it, when asked why I came up with the solution that I had.  It led to a great conversation about how we study as IT professionals, and learn from others.

“So Balls”, you say “We’re going to talk Fizz Buzz?”

Nope but if you’re interested in CTE’s you should stop and go read it, were going to talk about one of my favorite topics Compression.

SIMPLIFY THE COMPLEX


Common Table Expressions can be used as an alias to complex queries.  The example I’m going to work with today is one from the White Paper on Compression, Data Compression: Strategy, Capacity Planning and Best Practices by Sanjay Mishra.  I use some of the Queries from this white paper, to detect Scan and Update Usage Patterns of Indexes, in my Presentation on Page & Row Compression.  If you’re in the Orlando Area you can come see that presentation this Friday at SQL Rally!

The queries look at the sys.dm_db_index_operational_stats DMF to gage the Scan and Update Patterns of Indexes.  This is important when you are determining a Compression Strategy. 

If you have a High Percentage of Updates then you will experience more overhead with Page Compression, than if you had a High Percentage of Scans.  A good example is an OLTP system vs. an OLAP system, Reference Data vs. Data that is constantly in motion.

The queries are not overly complicated, but getting the information you want requires a certain amount of massaging.  A CTE can help this process by providing a wrapper for a very complex query so you can work with the returning data in additional ways.

We’ll start out with our 2 original queries and then move to the CTE that I’ve put together.

Original Query to Detect Scans

SELECT
     so.name AS tableName,
     si.name AS indexName,
     ios.partition_number AS [Partition],
     ios.index_id AS indexID,
     si.type_desc AS indexType,
     (ios.range_scan_count *100.0/(ios.range_scan_count +
     ios.leaf_delete_count + ios.leaf_insert_count + ios.leaf_page_merge_count + ios.leaf_update_count + ios.singleton_lookup_count)) AS percentScan
FROM
     sys.dm_db_index_operational_stats(DB_ID(),NULL,NULL,NULL) ios
     JOIN sys.objects so
     ON so.object_id=ios.object_id
     JOIN sys.indexes si
     ON si.object_id = ios.object_id AND si.index_id = ios.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count +ios.leaf_page_merge_count + ios.singleton_lookup_count)!=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable')=1  
ORDER BY
     percentScan;
GO

Original Query to Detect Updates

SELECT
     so.name,
     si.NAME AS [indexName],
     ios.partition_number AS [Partition],
     ios.index_id AS [IndexID],
     si.type_desc AS [IndexType],
     (ios.leaf_update_count *100/(ios.range_scan_count +
     ios.leaf_insert_count +
     ios.leaf_delete_count +
     ios.leaf_update_count +
     ios.leaf_page_merge_count +
     ios.singleton_lookup_count)) AS [Update_Percentage]
FROM
     sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) ios
     JOIN sys.objects so
     ON so.object_id = ios.object_id
     JOIN sys.indexes si
     ON ios.object_id=si.OBJECT_ID AND ios.index_id=si.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count + ios.leaf_page_merge_count + ios.singleton_lookup_count) !=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable') =1
ORDER BY
     [Update_Percentage] ASC;
GO

Both of these queries will get you the information you need, but if you want to see the results side by side you would need to perform a Union.  Further manipulation would require you edit both queries, and could get complicated.  This is where our CTE can help simplify things.

We’ll start out with our expression where we set the CTE name, the Data to be returned, and then we define our query.  Final we’ll have a separate select were we can manipulate our CTE and return the data as we see fit.

Our CTE

WITH IndexUsage(tableName, IndexName, [Partition], IndexId, IndexType, Usage, Percentage)
AS (
SELECT
     (ss.name + '.'+ so.name)
     ,si.NAME
     ,ios.partition_number
     ,ios.index_id
     ,si.type_desc
     ,'UPDATES'
     ,(ios.leaf_update_count *100/(ios.range_scan_count +
     ios.leaf_insert_count +
     ios.leaf_delete_count +
     ios.leaf_update_count +
     ios.leaf_page_merge_count +
     ios.singleton_lookup_count))
FROM
     sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) ios
     JOIN sys.objects so
     ON so.object_id = ios.object_id
     JOIN sys.indexes si
     ON ios.object_id=si.OBJECT_ID AND ios.index_id=si.index_id
     JOIN sys.schemas ss
     ON so.schema_id=ss.schema_id
WHERE
     (
          ios.range_scan_count +
          ios.leaf_insert_count +
          ios.leaf_delete_count +
          ios.leaf_update_count +
          ios.leaf_page_merge_count +
          ios.singleton_lookup_count) !=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable') =1
UNION All
SELECT
     (ss.name + '.'+ so.name)
     ,si.name
     ,ios.partition_number
     ,ios.index_id
     ,si.type_desc
     ,'SCANS'
     ,(ios.range_scan_count *100.0/(ios.range_scan_count +
     ios.leaf_delete_count +
     ios.leaf_insert_count +
     ios.leaf_page_merge_count +
     ios.leaf_update_count +
     ios.singleton_lookup_count))
FROM
     sys.dm_db_index_operational_stats(DB_ID(),NULL,NULL,NULL) ios
     JOIN sys.objects so
     ON so.object_id=ios.object_id
     JOIN sys.indexes si
     ON si.object_id = ios.object_id AND si.index_id = ios.index_id
     JOIN sys.schemas ss
     ON so.schema_id=ss.schema_id
WHERE
          (ios.range_scan_count +
          ios.leaf_insert_count +
          ios.leaf_delete_count +
          ios.leaf_update_count +
          ios.leaf_page_merge_count +
          ios.singleton_lookup_count)!=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable')=1  
)

SELECT
     iu.tableName
     ,iu.IndexName
     ,iu.IndexType
     ,iu.Usage
     ,iu.Percentage
FROM
     IndexUsage iu
ORDER BY
     iu.tableName
     ,iu.IndexName
     ,iu.Percentage desc
   
   
Despite the complexity of the internal statements, out final select is simple and easy to change and manipulate.

There’s an old phrase, “Work Smarter Not Harder”.  Using CTE’s is a good way to achieve the smarter.  I hope you find this helpful.

Thanks,

Brad

2 comments: