Wednesday, May 11, 2011

SQL Rally: Pre Con Business Intelligence Workshop BI END to END


Hello and Greetings from SQL Rally in Sunny Orlando FL!  


Today was the day of Pre-Cons: four different tracks to choose from, and I went with the Business Intelligence Track.  All of the other tracks had One presenter, heck every pre-con I’ve ever been to has had One presenter, but not today.


Today was an end-to-end presentation covering SSIS, SSAS, SSRS, PowerPivot, SharePoint, and PerformancePoint.  Something that big requires not One, not Two, no Dear Reader not even Three, but Four Presenters.


Mike Davis (@MikeDavisSQL), SQL MVP Patrick Leblanc (@PatrickDBA), Devin Knight (@Knight_Devin), and the Big Man Himself, The Sultan of SSAS, The Conqueror of Cubes, SQL BI MVP Adam Jorgensen (@Adam_Jorgensen) delivered a presentation that was large in scope and in delivery.  All of the presenters work for Pragmatic Works, all are published authors, and all will be speaking during the main SQL Rally event.

Pragmatic Works is a company that offers Consulting, SQL Products, and Training.  They are very active in the SQL Community.  If you’ve ever been to a SQL Saturday then you have probably had the opportunity to hear one of them speak.  If you have visited or posted a BI question on BIDN you have probably had one of them assist you in some way.  They are good and gracious people, and it has been a pleasure interacting with them in the past and seeing them again today.


So with my buddy Gareth Swanepoel (@GarethSwan) showing the way to the Coffee, the experts were in place and we were Ready to Roll!

FUNDAMENTALS & SSIS

I thought it was very cool and original to have a rotating staff during a full-day Pre-Con; it kept things fresh and lively, and as an audience member it kept lulls from occurring.  Patrick started out the day by introducing himself and the other members of his team.  They quickly moved into a format where one person would be presenting and another would be up front as a co-presenter as well.

Devin quickly briefed us on the concepts and terminology that we would be using for the remainder of the workshop.  We needed to know the fundamentals:

Fact Tables – the center of our BI “Data Model”; as a rule they should contain columns that you can aggregate.

Dimension Tables – the support tables that provide the information we use to sort or filter data.  Think of the information in Dimension tables as similar to the data you would use in a WHERE clause in T-SQL.

Surrogate Keys – keys without a natural identifier.  Think of using an Integer Identity column in SQL.  Their usage in BI is important on two fronts: not only are they used as an identifier, they also insulate us, the users, from source data changes in Natural Keys.

Before the day was over we would cover Type 0, Type 1, and Type 2 Dimensions, Attributes, Members, Hierarchies, and Role Playing Dimensions.  We looked at Star Schemas and discussed what makes a Snowflake Schema: a BI Server in a VERY COLD room.
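To put those fundamentals into T-SQL terms, here is a minimal star-schema sketch of my own (the table and column names are hypothetical, not from the workshop): a Dimension table with an Integer Identity Surrogate Key, and a Fact table whose columns can be aggregated.

```sql
-- Dimension table: Surrogate Key plus the Natural (source) Key it insulates us from,
-- and attributes we would sort or filter by (the WHERE-clause data).
CREATE TABLE dbo.DimCustomer
(
     CustomerSK     INT IDENTITY(1,1) PRIMARY KEY, -- Surrogate Key
     CustomerAltKey NVARCHAR(20)  NOT NULL,        -- Natural Key from the source system
     CustomerName   NVARCHAR(100) NOT NULL,
     City           NVARCHAR(50)  NULL             -- attribute used for filtering/sorting
);

-- Fact table: the center of the model, holding aggregatable columns
-- and Surrogate Keys pointing at the Dimensions.
CREATE TABLE dbo.FactSales
(
     CustomerSK  INT   NOT NULL REFERENCES dbo.DimCustomer(CustomerSK),
     DateSK      INT   NOT NULL,  -- e.g. 20110511, pointing at a Date Dimension
     SalesAmount MONEY NOT NULL,  -- measure: SUM/AVG-able
     OrderQty    INT   NOT NULL   -- measure: SUM-able
);
```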

After an introduction we moved to SSIS and spent the next several hours there.  We reviewed the many ways we could use SSIS, and in some cases SSIS & T-SQL, for Extract, Transform, and Load.  Configuring SSIS packages for migration through environments, storage of packages (MSDB vs. the File System), and deployment of packages took almost the remainder of the morning.

CHANGE DATA CAPTURE

When you start talking about loading millions or billions of rows of data into a data warehouse every night, as a DBA I get nervous.  We all know the overhead that BCPing data out of and into systems can have.  The overhead of disabling and rebuilding indexes, the ETL involved, Latches, Locks, and Blocks OH MY!  Part of you wants to say there has to be a better way: how do we find the data that has changed and migrate only that?
Change Data Capture was the way home.  Patrick had code and demos that I’m really looking forward to diving through and testing out.  This was a concept that really clicked, and it was one that I had not considered before.  Using the Log Sequence Number (LSN), and tracking the columns that we need to capture (AND ONLY THOSE COLUMNS), we can minimize the overhead required to load data into a data warehouse.
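This isn’t Patrick’s demo code, but as a sketch of the idea, here is how you would turn CDC on for a single table while capturing only the columns you need (the database, table, and column names are hypothetical):

```sql
-- Enable Change Data Capture at the database level, then on one table.
-- @captured_column_list limits tracking to ONLY the columns we need.
USE SourceDB;
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
     @source_schema        = N'dbo',
     @source_name          = N'Orders',
     @role_name            = NULL,
     @captured_column_list = N'OrderID, OrderDate, SalesAmount';

-- Pull only the changed rows between two Log Sequence Numbers (LSNs):
DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_Orders');
DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');
```

The nightly ETL can then load just that change set into the warehouse, instead of BCPing the whole table.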

LUNCH

Lunch was a great bag lunch with gourmet sandwiches, chips, water, an apple, and a very delicious chocolate chip cookie.  Tasty and delicious, but the best part was the company.  I walked around and had great conversations with Jack Corbett (@unclebigguns), Kendal Van Dyke (@SQLDBA), and Andy Warren (@SQLAndy), met Grant Fritchey (@GFritchey), and finally sat down and enjoyed the warm Florida sunshine with my cohort Dan Taylor (@DBABulldog) and Stacia Misner (@StaciaMisner).  The SQL Community is a tight one, and every time I get together with people it seems like we pick up the conversation right where it left off.

SSAS-IT'S ALL ABOUT THE CUBE BABY


Adam was up next to discuss SSAS.  He tied together concepts to show us how the regular SQL Engine & the SSAS Engine were similar and different all at the same time.  How integers are more efficient in queries, which is why in a Date Dimension the SK should be an INT instead of a datetime value, even though smalldatetime and INT are both 4 bytes; how XMLA is to DDL as MDX is to DML; and the drawbacks of calculated measures vs. Transparent Aggregation were all on the table.

He blew the audience away with the Usage Based Optimization wizard, which literally is a wizard that uses internal usage statistics to tune SSAS at a granular, per-user level.  He briefly touched on partitioning strategies, MOLAP, HOLAP, ROLAP, and when you would and should use each.

SSRS & SHAREPOINT



Mike did a great job of taking us through SharePoint, demonstrating many of the configuration options, SSRS, PerformancePoint Reporting, and the dynamic built-in nature of SSRS 2008 R2 when it has the power of a data cube behind it.


TIME TO WRAP IT UP



Needless to say these guys are some of the top professionals in the world when it comes to BI, and it was fun to watch them work.   

The crowd was very responsive and if you can’t tell I enjoyed it immensely.

This was 7 hours of training, 1 hour of lunch, and 7 pages of notes.  Let me say that one more time 7 PAGES OF NOTES!  No matter what I’ve written I couldn’t do justice to the information provided!

I’ve only touched on the highlights of the topics Dear Reader; if you’re interested in more, I’d encourage you to go to the Pragmatic Works website and watch some of the free training videos they have up.  They offer free webinars monthly on a variety of topics.  And head over to the SQL Lunch website as well, of which Patrick is the founder.

Now I’m off to bed, so I can be up and ready for tomorrow, SQL Rally Day 1!

Thanks,

Brad

Monday, May 9, 2011

T-SQL Tuesday 18: Common Table Expressions AKA CTEs

T-SQL Tuesday is brought to us this week by Bob Pusateri (blog|@SQLBob).  So Bob, what is the topic this month?

Have you ever solved or created a problem by using CTEs? Got a tip, trick, or something nifty to share? I’d love to see your posts about any of the above. Also don’t forget that T-SQL Tuesday is not limited to only T-SQL:
So we aren’t limited to CTEs, but I do enjoy them, so off we go!


HOT FUZZ FIZZ BUZZ


I first learned about Common Table Expressions while studying for the 70-432 exam back in 2007.  They were neat, they were cool.  They have a lot of power when it comes to recursion.  The first time I used them in any practical sense was in a job interview.  I was asked the Fizz Buzz question.
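Rob’s post (linked below) has the full Fizz Buzz solution; just for flavor, here is the classic recursive-CTE building block that makes set-based answers like his possible.  This is my own sketch, not Rob’s code:

```sql
-- A recursive CTE generating the numbers 1 through 10.
-- The anchor member seeds the set; the recursive member
-- references the CTE by name until the WHERE clause stops it.
WITH Numbers (n) AS
(
     SELECT 1          -- anchor member
     UNION ALL
     SELECT n + 1      -- recursive member
     FROM Numbers
     WHERE n < 10
)
SELECT n
FROM Numbers;
```

Join a set like that against your logic (or apply CASE expressions to it) and you have Fizz Buzz without a single loop.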

I remembered reading a blog Rob Boek (blog|@robboek) had written about this exact same question, read it here, Fizz Buzz by Rob Boek

I thought it was simple.  I thought it was elegant.  It was a great example of Set Based logic.  I was trying to make a good impression, and so I re-created it.  When asked how I came up with the solution that I had, I was honest about it.  It led to a great conversation about how we study as IT professionals, and learn from others.

“So Balls”, you say “We’re going to talk Fizz Buzz?”

Nope, but if you’re interested in CTEs you should stop and go read it.  We’re going to talk about one of my favorite topics: Compression.

SIMPLIFY THE COMPLEX


Common Table Expressions can be used as an alias for complex queries.  The example I’m going to work with today is one from the White Paper on Compression, Data Compression: Strategy, Capacity Planning and Best Practices by Sanjay Mishra.  I use some of the queries from this white paper to detect Scan and Update usage patterns of Indexes in my presentation on Page & Row Compression.  If you’re in the Orlando area you can come see that presentation this Friday at SQL Rally!

The queries look at the sys.dm_db_index_operational_stats DMF to gauge the Scan and Update patterns of Indexes.  This is important when you are determining a Compression Strategy.

If you have a high percentage of Updates then you will experience more overhead with Page Compression than if you had a high percentage of Scans.  A good example is an OLTP system vs. an OLAP system, Reference Data vs. Data that is constantly in motion.
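Once the usage patterns point you toward a candidate, SQL Server 2008 can also estimate the payoff before you commit.  This isn’t from the white paper queries below; it’s a separate sketch using the built-in estimation procedure (the schema and table names here are hypothetical):

```sql
-- Estimate the space savings of PAGE compression for one table,
-- across all of its indexes and partitions, without changing anything.
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'FactSales',
     @index_id         = NULL,    -- NULL = all indexes on the table
     @partition_number = NULL,    -- NULL = all partitions
     @data_compression = 'PAGE';  -- or 'ROW', or 'NONE' to estimate removal
```

Pair the estimated savings with the Scan/Update percentages and you have both halves of the Compression Strategy decision.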

The queries are not overly complicated, but getting the information you want requires a certain amount of massaging.  A CTE can help this process by providing a wrapper for a very complex query so you can work with the returning data in additional ways.

We’ll start out with our two original queries and then move to the CTE that I’ve put together.

Original Query to Detect Scans

SELECT
     so.name AS tableName,
     si.name AS indexName,
     ios.partition_number AS [Partition],
     ios.index_id AS indexID,
     si.type_desc AS indexType,
     (ios.range_scan_count *100.0/(ios.range_scan_count +
     ios.leaf_delete_count + ios.leaf_insert_count + ios.leaf_page_merge_count + ios.leaf_update_count + ios.singleton_lookup_count)) AS percentScan
FROM
     sys.dm_db_index_operational_stats(DB_ID(),NULL,NULL,NULL) ios
     JOIN sys.objects so
     ON so.object_id=ios.object_id
     JOIN sys.indexes si
     ON si.object_id = ios.object_id AND si.index_id = ios.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count +ios.leaf_page_merge_count + ios.singleton_lookup_count)!=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable')=1  
ORDER BY
     percentScan;
GO

Original Query to Detect Updates

SELECT
     so.name,
     si.NAME AS [indexName],
     ios.partition_number AS [Partition],
     ios.index_id AS [IndexID],
     si.type_desc AS [IndexType],
     (ios.leaf_update_count *100.0/(ios.range_scan_count +
     ios.leaf_insert_count +
     ios.leaf_delete_count +
     ios.leaf_update_count +
     ios.leaf_page_merge_count +
     ios.singleton_lookup_count)) AS [Update_Percentage]
FROM
     sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) ios
     JOIN sys.objects so
     ON so.object_id = ios.object_id
     JOIN sys.indexes si
     ON ios.object_id=si.OBJECT_ID AND ios.index_id=si.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count + ios.leaf_page_merge_count + ios.singleton_lookup_count) !=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable') =1
ORDER BY
     [Update_Percentage] ASC;
GO

Both of these queries will get you the information you need, but if you want to see the results side by side you would need to perform a UNION.  Further manipulation would require you to edit both queries, and could get complicated.  This is where our CTE can help simplify things.

We’ll start out with our expression, where we set the CTE name and the data to be returned, and then we define our query.  Finally we’ll have a separate SELECT where we can manipulate our CTE and return the data as we see fit.

Our CTE

WITH IndexUsage(tableName, IndexName, [Partition], IndexId, IndexType, Usage, Percentage)
AS (
SELECT
     (ss.name + '.'+ so.name)
     ,si.NAME
     ,ios.partition_number
     ,ios.index_id
     ,si.type_desc
     ,'UPDATES'
     ,(ios.leaf_update_count *100.0/(ios.range_scan_count +
     ios.leaf_insert_count +
     ios.leaf_delete_count +
     ios.leaf_update_count +
     ios.leaf_page_merge_count +
     ios.singleton_lookup_count))
FROM
     sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) ios
     JOIN sys.objects so
     ON so.object_id = ios.object_id
     JOIN sys.indexes si
     ON ios.object_id=si.OBJECT_ID AND ios.index_id=si.index_id
     JOIN sys.schemas ss
     ON so.schema_id=ss.schema_id
WHERE
     (
          ios.range_scan_count +
          ios.leaf_insert_count +
          ios.leaf_delete_count +
          ios.leaf_update_count +
          ios.leaf_page_merge_count +
          ios.singleton_lookup_count) !=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable') =1
UNION All
SELECT
     (ss.name + '.'+ so.name)
     ,si.name
     ,ios.partition_number
     ,ios.index_id
     ,si.type_desc
     ,'SCANS'
     ,(ios.range_scan_count *100.0/(ios.range_scan_count +
     ios.leaf_delete_count +
     ios.leaf_insert_count +
     ios.leaf_page_merge_count +
     ios.leaf_update_count +
     ios.singleton_lookup_count))
FROM
     sys.dm_db_index_operational_stats(DB_ID(),NULL,NULL,NULL) ios
     JOIN sys.objects so
     ON so.object_id=ios.object_id
     JOIN sys.indexes si
     ON si.object_id = ios.object_id AND si.index_id = ios.index_id
     JOIN sys.schemas ss
     ON so.schema_id=ss.schema_id
WHERE
          (ios.range_scan_count +
          ios.leaf_insert_count +
          ios.leaf_delete_count +
          ios.leaf_update_count +
          ios.leaf_page_merge_count +
          ios.singleton_lookup_count)!=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable')=1  
)

SELECT
     iu.tableName
     ,iu.IndexName
     ,iu.IndexType
     ,iu.Usage
     ,iu.Percentage
FROM
     IndexUsage iu
ORDER BY
     iu.tableName
     ,iu.IndexName
     ,iu.Percentage desc
   
   
Despite the complexity of the internal statements, our final SELECT is simple and easy to change and manipulate.

There’s an old phrase: “Work Smarter Not Harder”.  Using CTEs is a good way to achieve the smarter.  I hope you find this helpful.

Thanks,

Brad

SQL Rally: Business Intelligence Workshop

Just TWO DAYS until the start of SQL Rally!  I’m going to the Business Intelligence Workshop Pre-Con as my first stop.  I’m as excited as a kid on the 1st day of school!

“Balls”, you say, “Why BI?  Aren’t you a DBA?”

Excellent question Dear Reader.

I’m working on my MCITP: 2008 DBA certifications, and after that I’ll begin work on the Database Developer certifications.  My goal, for the end of this year, is to have all the pre-qualifications to go for the MCM sometime next year.  So where does BI fit in?

BECOMING MORE INTELLIGENT ABOUT BUSINESS 


Yep I’m a DBA, and it used to be we weren’t so compartmentalized.  It used to be if you were a DBA you were the Developer, the Administrator, and the BI person (and I would bet in some smaller shops that arrangement probably still exists).  A database, was a database, was a database and that’s all there was to it.  But as the technology has advanced, so has the set of skills required to do each job.

I’d like to say I couldn’t Cube my way out of a paper bag, but this won’t be my first go-round with SSAS.  Back in 2008 I had started studying to get my BI certifications for 2005, but life has a way of guiding you, and BI was not a place I was supposed to dedicate my focus back then.  I understand the concepts at a 5,000 foot level, but as for applying them I need some work.

So why BI?  Because our data can tell us a lot of things, and while storing it, maintaining it, and protecting it are important, understanding what it can tell us is important as well.  Our data can help us understand trends, patterns, and answers to questions that we might not know existed.  You probably use metrics from your servers to analyze your performance (tracking and trending).  Well, this is doing it with your business data.

KICKING SSAS ONE CUBE AT A TIME


The guys over at Pragmatic Works are the top dogs when it comes to Business Intelligence, and the Pre-Con for SQL Rally gets you training from not One, not Two, not Three, but FOUR of the top guys in the field today.

Go over to Kendal Van Dyke’s blog, Meet the SQL Rally Pre-Con Presenters, and read about them.  Patrick LeBlanc, Devin Knight, Adam Jorgensen, and Mike Davis are all at the top of this game.

This is an area where I really want to expand my knowledge and I can’t wait to attend.  I’ll blog about this more in depth on Wednesday night as part of my SQL Rally Day by Day Series. 

I can’t wait to get started!  Bring on Wednesday!

Thanks,

Brad