Monday, May 16, 2011

SQL RALLY DAY 1 RECAP

Day 1 at SQL Rally!  When I first learned about SQL Rally, I was watching the Day 3 PASS Keynote remotely from one of our conference rooms with a couple of other DBAs back in November 2010.  I'm not going to lie, I was excited!

It was close by, and that would make it an easier sell to our company. 


So fast forward 6 months, and here we are.  Finally the Big Day has arrived, here at the beautiful World Center Marriott in Orlando, FL.



SQL SERVER PARALLEL DATA WAREHOUSE- UNDER THE HOOD BY BRIAN MITCHELL

First session of the first day, and did it ever start out right!  Brian Mitchell (@BrianwMitchell | Blog) is a SQL 2008 MCM and a Microsoft Premier Field Engineer specializing in Parallel Data Warehouse.

Brian discussed how PDW is a black box: the OS and SQL Server installations receive updates quarterly.  From a physical standpoint, if your organization buys one you won't be patching it; it comes pre-configured with the software installed, so the majority of the work is already done.

After reviewing the basics, Brian took us into the physical architecture.  Each system occupies two racks: a Control Rack and a Data Rack.  Within each rack are nodes that handle the management of the appliance.  Familiar technologies are in use under the hood, as well as some new components.  RAID, Failover Clustering, Resource Governor, and Soft-NUMA are all utilized to ensure the system delivers high availability, redundancy, and stable yet high performance for the end user.

Brian did a fabulous job; go to his blog and read more about PDW, and if you get the chance to see him live, I would highly suggest it.

TROUBLESHOOTING PERFORMANCE PROBLEMS BY READING THE WAITS BY EDDIE WUERCH


Eddie Wuerch (@EddieW | Blog) continued the SQL goodness with his presentation on Waits.  SQL Server has come a long way over the years, and one of the great things it does in SQL 2005, 2008, and 2008 R2 is collect statistics on where it is spending its time.  Even when SQL Server is working perfectly fine, it will have Waits.

An easy way to think about Waits: the SQLOS scheduler, the way SQL Server optimizes its use of CPUs, is very similar to traffic patterns in a city.  If everything is working fine, you will still eventually stop at a red light.  Nothing is wrong; you just have to wait until a resource, the road ahead, is available for you to use.  Until then other cars may be using it.  But when a car accident backs traffic up in a way that could have been avoided, high Wait times can show you where your problem is located.
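If you want a quick look at where your own server is spending its time, the wait statistics Eddie described live in the sys.dm_os_wait_stats DMV.  Here is a minimal sketch, not Eddie's demo code; the benign waits filtered out below are only a small sample of the ones you would exclude in practice:

SELECT TOP 10
     wait_type,
     waiting_tasks_count,
     wait_time_ms,
     signal_wait_time_ms,     -- time spent waiting on a CPU after the resource freed up
     wait_time_ms - signal_wait_time_ms AS resource_wait_ms
FROM
     sys.dm_os_wait_stats
WHERE
     wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'BROKER_TASK_STOP',
     'SQLTRACE_BUFFER_FLUSH', 'REQUEST_FOR_DEADLOCK_SEARCH', 'WAITFOR')
ORDER BY
     wait_time_ms DESC;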

I always think a good session is one where I learned something; a great one is where I learned something I can apply quickly.  The two things I took away from Eddie's great session: Latches are locks on metadata in pages, and the Wait Stats I occasionally see for Service Broker, on systems that are not using Service Broker, are there because DBMail is enabled.

UNDERSTANDING STORAGE SYSTEMS AND SQL SERVER BY WESLEY BROWN


Wes Brown (@WesBrownSQL | Blog) is an amazing presenter.  He is at ease in front of an audience, and he had a large one.  He has a great cadence and presence, which is not something that can be taught.

He said funny things, like how becoming an Accidental DBA was an act of him being "volun-told" what he would do.  He said brilliantly simple things like "If you are a DBA, the fastest part of your machine doesn't matter, the slowest does."  He knows storage systems inside and out: he covered why you should never use the write cache on a local disk, and the differences in speed between SAN, local disk, and solid state.  He covered RAID, how your reads and writes are affected by each level, and how to weigh the costs vs. benefits.

One of the greatest points he made was that Transaction Logs are all about sequential writes.  If you load a lot of busy Transaction Logs onto one disk, you take logs sitting on disks that work great with sequential reads and writes, and you make them do random reads and random writes.  It was a great point that made us, the audience, think about our disk architecture.

WIT LUNCHEON AND PANEL DISCUSSION PRESENTED BY SQL SENTRY


Karen Lopez (@DataChick | Blog), Adam Jorgensen (@Adam_Jorgensen), and Melinda White were the panel, moderated by Jennifer McCown (@MidnightDBA | Blog).  It was a very good discussion of how to get a Mentor and how to be a Mentor, one that crosses both genders.  It is a subject that will be near and dear to me forever.  I'm a Dad; I have 4 kids, 2 boys and 2 girls.  Encouraging them, all of them, to be who they want to be and to dream of anything is just something that has always come with the territory.  It was also great to catch up with my friend and former co-worker Wayne Sheffield (DBAWayne) at the lunch!

T-SQL CODE SINS: THE WORST THINGS WE DO TO CODE, AND WHY BY JENNIFER McCOWN


Jennifer McCown is ½ of the Dynamic Duo of the Midnight DBAs.  Her presentation was on the worst things we do and have seen in code, with some suggestions on how to handle them.  Jen is a fabulous presenter, and she gave a very dynamic presentation with a lot of audience participation.

Jen did this presentation at 24 Hours of PASS; view it here, Session 8.  I had really enjoyed that presentation and wanted to see it again live, and it did not disappoint!  In any presentation with a lot of audience participation you really end up herding cats, and Jen did this masterfully.

SQL UNIVERSITY: LIGHTNING TALKS BY JORGE SEGARRA

Jorge Segarra (@SQLChicken | Blog), El Pollo Loco himself, is a master on the microphone, covering everything from Lady Gaga (although I suspect that was an evil twin), to the Cranberries (Death Metal Chicken), to being the Dean of SQL University.

If you are unfamiliar with SQL University, stop right now and click on the link.  Top SQL professionals have been offering up a free curriculum of learning since the Fall Semester of 2009.  MVPs and MCMs, all headliners, are blogging in order to teach and help further the knowledge of anyone looking to participate and better themselves.

The presentation was 5 minutes from each presenter, then a panel discussion with the audience.  Jorge, Karen Lopez, Jen McCown, Grant Fritchey (@GFritchey | Blog), Brian Moran (@BriancMoran | Blog), Mike Walsh (@Mike_Walsh | Blog), and Aaron Nelson (@SQLvariant | Blog) were the presenters, as well as the professors.  We discussed Database Design, Backup & Restore strategies, Professional Development, using Twitter for #SQLHELP, and PowerShell.

SQLRALLY OVERDRIVE – SPEED NETWORKING, SPONSORED BY MAGICPASS

Kendal Van Dyke (@SQLDBA | Blog), the chapter leader of MAGICPass, the chapter I regularly attend, hosted a session on speed networking.  We talked about handshakes, how to talk to people by asking questions that cannot be answered with a typical yes or no, and the do's and don'ts of reading names off of lanyards.

It was a great session and well put together; it was actually difficult to break up the conversations and rotate people.  That naturally led to the end of the evening: a good dinner, rousing conversation, and I think some karaoke... but that is difficult to recall.

While the Rally may be over, I'll recap Day 2 tomorrow, better late than never :)

Thanks,

Brad

Wednesday, May 11, 2011

SQL Rally: Pre-Con Business Intelligence Workshop, BI End to End


Hello and Greetings from SQL Rally in Sunny Orlando FL!  


Today was the day of Pre-Cons, with four different tracks to choose from, and I went with the Business Intelligence track.  All of the other tracks had one presenter; heck, every pre-con I've ever been to has had one presenter, but not today.


Today was an end-to-end presentation covering SSIS, SSAS, SSRS, PowerPivot, SharePoint, and PerformancePoint.  Something that big requires not one, not two, no Dear Reader not even three, but four presenters.


Mike Davis (@MikeDavisSQL), SQL MVP Patrick LeBlanc (@PatrickDBA), Devin Knight (@Knight_Devin), and the Big Man Himself, the Sultan of SSAS, the Conqueror of Cubes, SQL BI MVP Adam Jorgensen (@Adam_Jorgensen) delivered a presentation that was large in scope and in delivery.  All of the presenters work for Pragmatic Works, all are published authors, and all will be speaking during the main SQL Rally event.

Pragmatic Works is a company that offers consulting, SQL products, and training, and they are very active in the SQL Community.  If you've ever been to a SQL Saturday, then you have probably had the opportunity to hear one of them speak.  If you have visited or posted a BI question on BIDN, one of them has probably assisted you in some way.  They are good and gracious people, and it has been a pleasure interacting with them in the past and seeing them again today.


So with my buddy Gareth Swanepoel (@GarethSwan) showing the way to the Coffee, the experts were in place and we were Ready to Roll!

FUNDAMENTALS & SSIS

I thought it was very cool and original to have a rotating staff during a full-day pre-con; it kept things fresh and lively and, for the audience, kept lulls from occurring.  Patrick started out the day by introducing himself and the other members of his team.  They quickly moved into a format where one person would present while another was up front as a co-presenter.

Devin quickly briefed us on the concepts and terminology that we would be using for the remainder of the workshop.  We needed to know the fundamentals:

Fact Tables – the center of our BI "Data Model"; as a rule they should contain columns that you can aggregate.

Dimension Tables – the support tables that provide the information we use to sort or filter data.  Think of the information in Dimension tables as similar to the data you would use in a WHERE clause in T-SQL.

Surrogate Keys – keys without a natural identifier; think of an Integer Identity Column in SQL.  Their usage in BI is important on two fronts: not only are they used as identifiers, they insulate us, the users, from source data changes in Natural Keys.  (See the sketch below for how all three fit together.)
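To make those three concrete, here is a sketch of how they fit together in a query; the table and column names are hypothetical, mine rather than the workshop's.  The fact table carries the measures we aggregate, the dimensions feed the WHERE clause, and surrogate keys tie them together:

SELECT
     dd.CalendarYear,
     dp.ProductCategory,
     SUM(fs.SalesAmount) AS TotalSales     -- fact columns are the ones we aggregate
FROM
     FactSales fs                          -- fact table: surrogate keys plus measures
     JOIN DimDate dd
     ON dd.DateSK = fs.DateSK              -- joins ride on surrogate keys, not natural keys
     JOIN DimProduct dp
     ON dp.ProductSK = fs.ProductSK
WHERE
     dd.CalendarYear = 2010                -- dimension data doing its WHERE-clause job
GROUP BY
     dd.CalendarYear,
     dp.ProductCategory;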

Before the day was over we would cover Type 0, Type 1, and Type 2 Dimensions, Attributes, Members, Hierarchies, and Role-Playing Dimensions.  We looked at Star Schemas and discussed what makes a Snowflake Schema: a BI Server in a VERY COLD room.

After the introduction we moved to SSIS and spent the next several hours there.  We reviewed the many ways we could use SSIS, and in some cases SSIS & T-SQL, for Extract, Transform, and Load.  Configuring SSIS packages for migration through environments, storage of packages (MSDB vs. File System), and deployment of packages took up almost the remainder of the morning.

CHANGE DATA CAPTURE

When you start talking about loading millions or billions of rows of data into a data warehouse every night, as a DBA I get nervous.  We all know the overhead that BCPing data out of and into systems can have: disabling and rebuilding indexes, the ETL involved, Latches, Locks, and Blocks OH MY!  Part of you wants to say there has to be a better way: how do we find only the data that has changed and migrate just that?
Change Data Capture was the way home.  Patrick had code and demos that I'm really looking forward to diving through and testing out.  This was a concept that really clicked, and one that I had not considered before.  Using the Log Sequence Number (LSN), and tracking only the columns that we need to capture (AND ONLY THOSE COLUMNS), we can minimize the overhead required to load data into a data warehouse.
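I still need to work through Patrick's demos, but the basic setup is simple enough to sketch.  The two system procedures below are the real, documented ones (CDC requires Enterprise Edition in SQL 2008); the database, table, and column names are hypothetical:

USE SourceDB;                                    -- hypothetical source database
GO
EXEC sys.sp_cdc_enable_db;                       -- enable CDC at the database level
GO
EXEC sys.sp_cdc_enable_table
     @source_schema = N'dbo',
     @source_name = N'Orders',                   -- hypothetical table
     @role_name = NULL,                          -- no gating role for this sketch
     @captured_column_list = N'OrderID, OrderDate, Amount';   -- AND ONLY THOSE COLUMNS
GO

From there the changed rows come back from the generated cdc.fn_cdc_get_all_changes_<capture instance> function between two LSNs, which is exactly where the minimized load overhead comes from.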

LUNCH

Lunch was a great bag lunch with a gourmet sandwich, chips, water, an apple, and a very delicious chocolate chip cookie.  Tasty, but the best part was the company.  I walked around and had great conversations with Jack Corbett (@unclebigguns), Kendal Van Dyke (@SQLDBA), and Andy Warren (@SQLAndy), met Grant Fritchey (@GFritchey), and finally sat down and enjoyed the warm Florida sunshine with my cohort Dan Taylor (@DBABulldog) and Stacia Misner (@StaciaMisner).  The SQL Community is a tight one, and every time I get together with people it seems like we pick up the conversation right where it left off.

SSAS-IT'S ALL ABOUT THE CUBE BABY


Adam was up next to discuss SSAS.  He tied concepts together to show us how the regular SQL Engine and the SSAS Engine are similar and different at the same time: how integers are more efficient in queries, which is why the SK in a Date Dimension should be an INT instead of a datetime value, even though smalldatetime and INT are both 4 bytes; how XMLA is to DDL as MDX is to DML; and the drawbacks of calculated measures vs. transparent aggregations.  It was all on the table.

He blew the audience away with the Usage Based Optimization wizard, which, based on internal usage statistics, can be used to tune SSAS at a granular, per-user level.  He briefly touched on partitioning strategies, MOLAP, HOLAP, and ROLAP, and when you would and should use each.

SSRS & SHAREPOINT



Mike did a great job of taking us through SharePoint, demonstrating many of the configuration options, SSRS, PerformancePoint reporting, and the dynamic built-in nature of SSRS 2008 R2 when it has the power of a data cube behind it.


TIME TO WRAP IT UP



Needless to say these guys are some of the top professionals in the world when it comes to BI, and it was fun to watch them work.   

The crowd was very responsive, and if you can't tell, I enjoyed it immensely.

This was 7 hours of training, 1 hour of lunch, and 7 pages of notes.  Let me say that one more time 7 PAGES OF NOTES!  No matter what I’ve written I couldn’t do justice to the information provided!

I've only touched on the highlights of the topics, Dear Reader; if you're interested in more, I'd encourage you to go to the Pragmatic Works website and watch some of the free training videos they have up.  They offer free webinars monthly on a variety of topics.  And head over to the SQL Lunch website as well, of which Patrick is the founder.

Now I’m off to bed, so I can be up and ready for tomorrow, SQL Rally Day 1!

Thanks,

Brad

Monday, May 9, 2011

T-SQL Tuesday 18: Common Table Expressions AKA CTEs

T-SQL Tuesday is brought to us this week by Bob Pusateri (blog | @SQLBob).  So Bob, what is the topic this month?

Have you ever solved or created a problem by using CTEs? Got a tip, trick, or something nifty to share? I’d love to see your posts about any of the above. Also don’t forget that T-SQL Tuesday is not limited to only T-SQL:
So we aren't limited to CTEs, but I do enjoy them, so off we go!


HOT FUZZ FIZZ BUZZ


I first learned about Common Table Expressions while studying for the 70-432 exam back in 2007.  They were neat, they were cool, and they have a lot of power when it comes to recursion.  The first time I used one in any practical sense was in a job interview, when I was asked the Fizz Buzz question.

I remembered reading a blog post Rob Boek (blog | @robboek) had written about this exact question; read it here: Fizz Buzz by Rob Boek.

I thought it was simple.  I thought it was elegant.  It was a great example of set-based logic.  I was trying to make a good impression, so I re-created it.  When asked how I came up with the solution, I was honest about it, and it led to a great conversation about how we study as IT professionals and learn from others.
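Since I brought up recursion: if you have never seen the recursive side of CTEs, here is the classic minimal sketch (just a counting example, not Rob's solution).  The anchor member seeds the result set, and the recursive member references the CTE itself until the filter stops it:

WITH Numbers (n)
AS (
     SELECT 1            -- anchor member: seeds the recursion
     UNION ALL
     SELECT n + 1        -- recursive member: references Numbers itself
     FROM Numbers
     WHERE n < 10        -- stops the recursion at 10
)
SELECT n FROM Numbers;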

“So Balls,” you say, “we’re going to talk Fizz Buzz?”

Nope, but if you're interested in CTEs you should stop and go read it; we're going to talk about one of my favorite topics, Compression.

SIMPLIFY THE COMPLEX


Common Table Expressions can be used as an alias for complex queries.  The example I'm going to work with today comes from the white paper Data Compression: Strategy, Capacity Planning and Best Practices by Sanjay Mishra.  I use some of the queries from this white paper, to detect Scan and Update usage patterns of indexes, in my presentation on Page & Row Compression.  If you're in the Orlando area you can come see that presentation this Friday at SQL Rally!

The queries look at the sys.dm_db_index_operational_stats DMF to gauge the Scan and Update patterns of indexes.  This is important when you are determining a compression strategy.

If you have a high percentage of Updates you will experience more overhead with Page Compression than if you had a high percentage of Scans.  A good example is an OLTP system vs. an OLAP system: reference data vs. data that is constantly in motion.
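Once the usage pattern says an index is a candidate, you can also ask SQL Server to estimate the payoff before committing; sp_estimate_data_compression_savings copies a sample of the data into tempdb and compresses it there.  A quick sketch, with a hypothetical table name:

EXEC sp_estimate_data_compression_savings
     @schema_name = N'dbo',
     @object_name = N'Orders',        -- hypothetical table
     @index_id = NULL,                -- NULL = every index on the table
     @partition_number = NULL,        -- NULL = every partition
     @data_compression = N'PAGE';     -- or N'ROW' to compare row-level compression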

The queries are not overly complicated, but getting the information you want requires a certain amount of massaging.  A CTE can help this process by providing a wrapper for a very complex query, so you can work with the returned data in additional ways.

We’ll start out with our 2 original queries and then move to the CTE that I’ve put together.

Original Query to Detect Scans

SELECT
     so.name AS tableName,
     si.name AS indexName,
     ios.partition_number AS [Partition],
     ios.index_id AS indexID,
     si.type_desc AS indexType,
     (ios.range_scan_count *100.0/(ios.range_scan_count +
     ios.leaf_delete_count + ios.leaf_insert_count + ios.leaf_page_merge_count + ios.leaf_update_count + ios.singleton_lookup_count)) AS percentScan
FROM
     sys.dm_db_index_operational_stats(DB_ID(),NULL,NULL,NULL) ios
     JOIN sys.objects so
     ON so.object_id=ios.object_id
     JOIN sys.indexes si
     ON si.object_id = ios.object_id AND si.index_id = ios.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count +ios.leaf_page_merge_count + ios.singleton_lookup_count)!=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable')=1  
ORDER BY
     percentScan;
GO

Original Query to Detect Updates

SELECT
     so.name,
     si.NAME AS [indexName],
     ios.partition_number AS [Partition],
     ios.index_id AS [IndexID],
     si.type_desc AS [IndexType],
     (ios.leaf_update_count *100.0/(ios.range_scan_count +   -- 100.0 avoids integer-division truncation
     ios.leaf_insert_count +
     ios.leaf_delete_count +
     ios.leaf_update_count +
     ios.leaf_page_merge_count +
     ios.singleton_lookup_count)) AS [Update_Percentage]
FROM
     sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) ios
     JOIN sys.objects so
     ON so.object_id = ios.object_id
     JOIN sys.indexes si
     ON ios.object_id=si.OBJECT_ID AND ios.index_id=si.index_id
WHERE
     (ios.range_scan_count + ios.leaf_insert_count + ios.leaf_delete_count + ios.leaf_update_count + ios.leaf_page_merge_count + ios.singleton_lookup_count) !=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable') =1
ORDER BY
     [Update_Percentage] ASC;
GO

Both of these queries will get you the information you need, but if you want to see the results side by side you would need to perform a UNION.  Further manipulation would require editing both queries and could get complicated.  This is where our CTE can help simplify things.

We'll start out with our expression, where we set the CTE name and the columns to be returned, and then we define our query.  Finally, we'll have a separate SELECT where we can manipulate our CTE and return the data as we see fit.

Our CTE

WITH IndexUsage(tableName, IndexName, [Partition], IndexId, IndexType, Usage, Percentage)
AS (
SELECT
     (ss.name + '.'+ so.name)
     ,si.NAME
     ,ios.partition_number
     ,ios.index_id
     ,si.type_desc
     ,'UPDATES'
     ,(ios.leaf_update_count *100.0/(ios.range_scan_count +
     ios.leaf_insert_count +
     ios.leaf_delete_count +
     ios.leaf_update_count +
     ios.leaf_page_merge_count +
     ios.singleton_lookup_count))
FROM
     sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) ios
     JOIN sys.objects so
     ON so.object_id = ios.object_id
     JOIN sys.indexes si
     ON ios.object_id=si.OBJECT_ID AND ios.index_id=si.index_id
     JOIN sys.schemas ss
     ON so.schema_id=ss.schema_id
WHERE
     (
          ios.range_scan_count +
          ios.leaf_insert_count +
          ios.leaf_delete_count +
          ios.leaf_update_count +
          ios.leaf_page_merge_count +
          ios.singleton_lookup_count) !=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable') =1
UNION All
SELECT
     (ss.name + '.'+ so.name)
     ,si.name
     ,ios.partition_number
     ,ios.index_id
     ,si.type_desc
     ,'SCANS'
     ,(ios.range_scan_count *100.0/(ios.range_scan_count +
     ios.leaf_delete_count +
     ios.leaf_insert_count +
     ios.leaf_page_merge_count +
     ios.leaf_update_count +
     ios.singleton_lookup_count))
FROM
     sys.dm_db_index_operational_stats(DB_ID(),NULL,NULL,NULL) ios
     JOIN sys.objects so
     ON so.object_id=ios.object_id
     JOIN sys.indexes si
     ON si.object_id = ios.object_id AND si.index_id = ios.index_id
     JOIN sys.schemas ss
     ON so.schema_id=ss.schema_id
WHERE
          (ios.range_scan_count +
          ios.leaf_insert_count +
          ios.leaf_delete_count +
          ios.leaf_update_count +
          ios.leaf_page_merge_count +
          ios.singleton_lookup_count)!=0
     AND OBJECTPROPERTY(ios.object_id, 'IsUserTable')=1  
)

SELECT
     iu.tableName
     ,iu.IndexName
     ,iu.IndexType
     ,iu.Usage
     ,iu.Percentage
FROM
     IndexUsage iu
ORDER BY
     iu.tableName
     ,iu.IndexName
     ,iu.Percentage desc
   
   
Despite the complexity of the internal statements, our final SELECT is simple and easy to change and manipulate.
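For example, swapping in a different final SELECT, while the CTE body above stays exactly the same, pulls out just the update-heavy indexes that may be poor Page Compression candidates.  The 50 percent threshold here is purely for illustration:

SELECT
     iu.tableName
     ,iu.IndexName
     ,iu.Percentage
FROM
     IndexUsage iu
WHERE
     iu.Usage = 'UPDATES'
     AND iu.Percentage > 50     -- arbitrary threshold, adjust to taste
ORDER BY
     iu.Percentage DESC;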

There's an old phrase: "Work Smarter, Not Harder."  Using CTEs is a good way to achieve the smarter.  I hope you find this helpful.

Thanks,

Brad