Friday, July 5, 2013

SQL Saturday Orlando: LAST CALL for Speakers


Hello Dear Reader!  I just wanted to write to Thank All of you for the submissions to SQL Saturday 232 Orlando.  This has been a crazy event and we are still several months away.  

Everything started out as it normally does, Karla Landrum (@karlakay22 | Blog),   leading the way pulling a motley crew of Shawn McGehee (@SQLShawn | Blog), SQL MVP Kendal Van Dyke (@SQLDBA | Blog), SQL MVP Andy Warren (@SQLAndy | Blog), Ben Cork, and myself behind her.

We hit a hiccup early on.  Our venue wasn’t going to be available on the date we had originally announced.  Unexpectedly we had to shift our date.  Some speakers couldn’t make it, and it brought us into conflict with other SQL Saturdays that some speakers had committed to speak at.  At that time we put out a very public call for speakers.

There's Still time to get a seat at our table!
The response was overwhelming!  So overwhelming that we are closing the call a bit early.  The call for speakers will end on July 10th and we hope to have the schedule out within a week or two after that.  Speaking with Rodney Landrum, my speaker committee co-captain, our goal is still the same.  No speaker will get turned away.



An essential part of SQL Saturday is to provide free training to the community.  Equally important is to help grow the next generation of SQL Server professionals who will be our speakers.  Look no further than myself to see proof of this.

So Dear Reader, get those abstracts in, because we’ll expand the number of rooms to fit you in!  Get ready to be part of the biggest SQL Saturday Orlando Ever!  Besides You know you want one of these!

As always Thanks for stopping by.

Thanks,


Brad

Tuesday, July 2, 2013

T-SQL Tuesday #44 The Second Chance

Hello Dear Reader!  This is the first Tuesday of the month and you know what that means.  It’s time to announce the T-SQL Tuesday Topic of the month!  This is your opportunity to participate in the largest SQL Blog party on the intrawebs. 

T-SQL Tuesday is an event started by Adam Machanic(@AdamMachanic| Blog) back in 2009.  The basic idea one blogger hosts the party and others participate.  We announce the topic the first Tuesday of the month, July 2nd 2013 for today, and everyone will post their blogs on the second Tuesday of the month, July 9th 2013 for the actual posts.  This month the host is none other than ME!

I love T-SQL Tuesday, there is always so much to write about.  Our world of technology changes so fast.  Each of us has the daily constraints of a life and a job as well.  Sometimes it is great to have a topic to write on so you can express your point of view or get the opportunity to dive a little deeper into an area of SQL that may have piqued your interest.  Equally as wonderful is reading all the other blogs that people have put together on the subject.  Variety is the spice of life, and will we get it in spades.


“So Balls”, you say, “That’s great, but what’s the topic?”

Thanks for keeping me on task Dear Reader!  Without further ado the topic of T-SQL Tuesday #44, Second Chances.

SECOND CHANCES



 As a DBA or a Presenter/Speaker we have all had at least one moment we would like back.  The demo didn't work, you were green and got asked a question you now know in your sleep.  You had a presentation in front of a client, and it all went sideways.  Maybe you logged onto the prod server thinking it was dev and dropped something you shouldn't have.  These moments serve not just as painful reminders, but also as powerful instruments for learning.  Would you like another shot at getting it right?  WELL NOW'S YOUR CHANCE!   Or I guess actually your…. Second…. Chance.  Your missions should you choose to accept it, tell me one of the moments you had, and most importantly what you learned from it!

First and foremost the rules. 

Rule 1: Don’t get yourself fired.  If you almost dropped the prod DB last week, truncated an important table, or took down a prod server during critical business hours, and nobody knows it was you & the people you work for read your blog, you should probably avoid writing about it here.  You want to write about events we can look back on and reflect over, not events HR would *love* to know about.

Rule 2: Some Time next Tuesday using GMT, here’s a link to a GMT time convertor, publish your blog post.  For example in the US that would cover 8 pm Monday to 8 pm Tuesday.

Rule 3: Make sure that you include the Image at the top of the page helping to identify your post as a T-SQL Tuesday blog.  Then come back here and post a link in the comments so I can find them.  Before the end of the week I'll do a round up of all the blogs. 

Extra Credit!

Tweet your blog with the hash tag #tsql2sday & go read someone else’s blog on the subject!
As Always Dear Reader, Thanks for stopping by and I’ll see you next Tuesday!

Thanks,

Brad

Tuesday, June 11, 2013

T-SQL Tuesday #43 Crossing the Streams

Hello Dear Reader!  This post finds me back in Boston getting ready for the Pragmatic Works Performance Tuning Workshop.  But that is not the reason for the blog.  You see today is T-SQL Tuesday!  T-SQL Tuesday is the brain child of Adam Machanic (@AdamMachanic | Blog).  The idea is once a month we the SQL Community have a blog party where someone hosts a topic and we all write.  This month the host is none other than the Rob Farley (@Rob_Farley | Blog).  So without further ado, Rob what is our topic?

The topic is Plan Operators. If you ever write T-SQL, you will almost certainly have looked at execution plans (if you haven’t, go look at some now. I mean really – you should be looking at this stuff). As you look at these things, you will almost certainly have had your interest piqued by some, and tried to figure out a bit more about what’s going on.”

So Plan Operators it is! 

“So Balls”, you say, “What will you write about today?”

Excellent question Dear Reader!  Adam did a fantastic 3 hours presentation on parallelism this past year at the PASS Summit.  One of the biggest things I learned to watch was the Repartition Streams operator to see how rows were distributed across multiple cores.  It was a nice AHA moment for me, and I wanted to pass it along to you!

CROSSING THE STREAMS



The operator itself is purely logical and is only ever used in parallel plans.  When it happens you know that an operation is occurring that requires a Consumer and a Producer Thread.  The producer thread reads input rows from its subtree, assembles the rows into packets, which are then placed in the consumer subtree. The Repartition Streams operator is a push operator that consumes multiple streams of records.  The format of the records is unchanged as well as the contents.  By using this model it allows the producer and consumer threads to execute independently, but worry not there is flow control to prevent a fast producer from flooding a consumer with excessive packets.

In parallelism you use multiple cores to achieve more.  If you notice a bitmap filter after a Repatriation Streams operator, that means the number of rows in the output is going to be reduced.  You will notice changes in the Argument column if your table is partitioned or if the output is ordered.


SQL Server supports 5 different ways that producers stream results to consumers.  Broadcast, Demand, Hash, Range, and Round Robin.  These different streams partitions the data across the threads.

Broadcast sends all rows to all consumer threads. Demand is normally used with partitioned tables and is the only type of exchange that uses a pull rather than a push.  Hash evaluates a hash function to determine where to send rows using one or more columns from a row.  Range uses a range function on one column in the row to determine where to send each packet, this type is rare.  Round Robin this alternates sending packets of rows to the next consumer thread waiting in a sequence.

We’ve discussed it now we’ll actually take a look at a parallel query and how the repartition is being used.  Let’s start out with a query.  I’ve got a large table in my Adventure works that I’ll be blogging about very soon, more on that later.  Today I’m using it as the basis for my query.  We’re going to do a simple aggregate.

set statistics profile on

select
     p.EnglishProductName
     ,sum(salesAmount) as ProductSales
     ,avg(unitprice) as avgUnitPrice
     ,count(OrderQuantity) totalProductsSold
from
     dbo.factinternetsales_big fb
     inner join dbo.dimproduct p
     on fb.ProductKey=p.ProductKey
group by p.EnglishProductName
Option (MAXDOP 4)


The goal here is to select enough data that SQL Server will produce a plan that will push the cost to a parallel plan, and in doing so will read in data using multiple threads.  In this case we will get two repartition streams operators in our statements.  Let’s start off looking at the query plan. We’ll also get the output of the set statistics profile on, but we’ll cover that in a minute.


Great so we’ve got our query plan.  The icon doesn’t tell us much in and of itself.  When you click on it and hover over it we get a little more information.


We see that we pulled back over 10.5 Million rows, We see the Partitioning Type is Hash, and that our estimated subtree cost was high enough for us to get a parallel plan.  The trick is to hit F4 while you have selected to get even more information.


F4 opens the Properties tab.  In my laptop I’ve got one CPU, two cores, and Hyperthreading is turned on. So I should see a total of 5 active Threads.  Threads 0 – 4.  Thread 0 will always have 0 rows associated with it.  The reason for this is because Thread 0 is a control node.  It is overseeing all the work on the individual threads.


By expanding the Actual Number of Rows we can see that this was somewhat evenly distributed.  Thread 0 is our control Thread so it should be sitting at 0 rows.  Thread 1 had only 1.7 million rows, Thread 2 had 3.6 Million, Thread 3 had 2.3 million, and Thread 4 had 2.9 million.  


We can see there was a skew, based on the hash.  Looking further down at our properties we can see that this was a hash partition, confirmed from our graphical portion of the plan, but we can see the hash key was on the ProductKey column.  This was done as we were gathering data to prepare for our join.

Looking at the output from the set statistics profile on we can see the same information to validate the type of repartition that we had.



Examining the output even further you will see a second Repartition of the Streams, also using a Hash Partition based on the EnglishProductName (because that’s what we grouped on in our T-SQL Query).  If you use F4 to look at the properties you’ll see we’ve greatly reduced the number of columns in by the time we get to this part of the plan.


So what does this all mean?  Sometimes Parallelism messes up.  Sometimes we do not get an even breakdown of rows across threads.  When this occurs we can take steps to address it.  Adam cover’s this in depth in his presentation, and I’ll have to cover this another day Dear Reader, because the plane is about to board, and I’m Boston bound!  But now you know where to look to find out how your rows are being partitioned when they are parallelized!

As always Thanks for stopping by!

Thanks,

Brad