Tuesday, December 8, 2015

T-SQL Tuesday #73: IOPS on Azure VM's Naughty to Nicer

Hello Dear Reader!  Last week I posted the invitation to T-SQL Tuesday #73.  The premise? 

As you work with SQL Server look around you.  Is your environment Naughty or Nice?  If it is Naughty what’s wrong with it?  What would you do to fix it?  Do you have a scrooge that is giving you the Christmas chills?  Perhaps you have servers of past, present, and future haunting you.  Maybe you are looking at SQL Server 2016 like some bright shining star in the east.

I don’t have an environment of my own.  Not one that I get to interact with every day.  Recently I’ve had several experiences where some friends with various companies were having issues and the root cause was all hardware.  Memory, SAN, or CPU one had become a bottleneck.

Several had to do with I/O.  One had issues with their SAN, another had issues with their code (causing unnecessary I/O), and another on their Azure VM.

“So Balls,” you say, “What does this have to do with Naughty & Nice?”

As always Dear Reader, thanks for keeping me on track.  For this week I’m going to tackle how IOp’s can affect your SQL Server.   I’ll be combining this with the pseudo code to mimic the issue my client had.  Then a little on Azure VM’s, and how Premium Storage made a big difference. 

THE SET UP

I’ve got a really bad query.  This is from years past.  Lots of unnecessary row by row logic.  Duplicate logic.  It’s bad. 

Sufficient to say, I’m rewriting it with pseudo code because I don’t want to risk insulting the guilty.  I will post the stats though.  We were on an Azure D13 Series VM, and we moved to a DS 13 Series VM.  This fixes nothing by itself.  Click here to see the machine specs.  See below for a quick table.  We will be using the DS3 VM.  We will move from standard storage to premium storage on the same machine to highlight our fix.


SIZE – AZURE CLASSIC PORTAL\CMDLETS & APIS
CPU CORES
MEM
NICS (MAX)
MAX. DISK SIZES – VIRTUAL MACHINE
MAX. DATA DISKS (1023 GB EACH)
CACHE SIZE (GB)
MAX. DISK IOPS & BANDWIDTH
Standard_DS3\same
4
14
4
OS = 1023 GB
8
172
12,800
OUR DEMO MACHINE
Local SSD disk = 28 GB
128 MB per second
Standard_DS13\same
8
56
8
OS = 1023 GB
16
288
25,600
Local SSD disk = 112 GB
256 MB per second

A DS Series VM gives us the ability to use Premium Storage.  We will use that turn bad code into, faster bad code.  Fixing the code comes later.  The first thing I will do is run CrystalDiskmark against my disk. 

Our G drive is standard storage. 


M drive is our premium storage.



The M drive is a 1 TB Premium Storage Drive.  I have done nothing fancy here.  Just attached and formatted.  As you can see we received a little bit better than 128 MB per second sequential in my baseline.

Our really bad query?  It runs in 19:48 seconds on our regular storage.  When I found it in the wild it actually ran for 4 hours.  I could only replicate so much.  Sufficed to say, for this example it will do.



Without tuning the query, but moving the database to premium storage, dropping all data from memory, and rerunning the query we reduced the time to 5:49 seconds.  A roughly 72% improvement.




Premium storage can go faster.  It is a matter of configuring the VM & the Storage to hit the IOP's limits you need.  For example, this is a screen shot from the DS 13 VM I was talking about earlier.  



With a DS 14 we would have been able to go above 512 MB per second.  Premium storage on DS Series VM's is some serious good stuff.

THE WHOLE TRUTH


The query is bad.  Really bad.  I can fix the whole thing by converting it from row by row logic to set based logic.  I can get it down to 2 seconds using the merge statement.

We did.  It was a big fix.  Originally we were hoping to get the process from 4 hours down to 2 hours.  After I was done we had it down to 5 minutes.

I say this often when teaching.  I can throw super-fast hardware at bad code and make it run faster.  But there is still room for improvement.  Fixing the query is for another day.   Today we keep our IO on the ball.   Why is it so fast?


CPU
Memory
SSD
Spinning Disk
Nanosecond
Nanosecond
Microsecond
Millisecond

You can only get as fast as your slowest component.  Premium storage uses SSD's.  We are hastening our retrieval of data from our disks by a factor of x10.  In this case it made our query 72% faster.  There are other things to fix.  The IO issue is one of them.


“So Balls,” you say, “Keep our IO on the ball.  Really?!”


Sorry Dear Reader, I couldn’t resist.  As always, Thank You for stopping by.

Thanks,

Brad





Tuesday, December 1, 2015

T-SQL Tuesday #73 Invitation – Naughty or Nice?




Hello Dear Reader!  This is the first Tuesday of the month and you know what that means.  It’s time to announce the T-SQL Tuesday Topic of the month!  This is your opportunity to participate in the largest SQL Blog party on the intrawebs. 

T-SQL Tuesday is an event started by Adam Machanic(@AdamMachanic| Blog) back in 2009.  The basic idea one blogger hosts the party and others participate.  We announce the topic the first Tuesday of the month, December 1st 2015 for today, and everyone will post their blogs on the second Tuesday of the month, December 8th 2015 for the actual posts.  This month the host is none other than……….. drum roll………….. ME!

Since it is the holiday season, I wanted to keep this holiday themed.  I couldn’t help but think of the approaching Christmas holiday.  Every year I get my children to make a list of presents that they would like to get from Santa Clause.  If we are out and about and they start to misbehave I remind them about their list and that only Nice children get presents.

“So Balls,” you say, “What in the ever loving wide wide world of sports does this have to do with SQL Server?!?”

As you work with SQL Server look around you.  Is your environment Naughty or Nice?  If it is Naughty what’s wrong with it?  What would you do to fix it?  Do you have a scrooge that is giving you the Christmas chills?  Perhaps you have servers of past, present, and future haunting you.  Maybe you are looking at SQL Server 2016 like some bright shining star in the east.

First and foremost the rules. 

Rule 1: Don’t get yourself fired.  If your boss is a scrooge don’t name names, don’t hint to hard.  It’s okay to protect yourself.  You want to write about events we can look back on and reflect over, not events HR would *love* to know about.  Perhaps a reminder of SQL Server's past would be better than that of SQL present.

Rule 2: Some Time next Tuesday using GMT, here’s a link to a GMT time convertor, publish your blog post.  For example in the US that would cover 8 pm Monday to 8 pm Tuesday.

Rule 3: Make sure that you include the Image at the top of the page helping to identify your post as a T-SQL Tuesday blog.  Then come back here and post a link in the comments so I can find them.  Before the end of the week I'll do a round up of all the blogs. 

Extra Credit!

Tweet your blog with the hash tag #tsql2sday, use SQL Server 2016, & go read someone else’s blog on the subject!

As Always Dear Reader, Thanks for stopping by and I’ll see you next Tuesday!

Thanks,


Brad