Thursday, October 31, 2013

Tales from the Query Processor: II

Hello Dear Reader!  Today is Halloween, October 31st.  My Daughter turns a big 15 today!  Happy Birthday Ches!!!!!

To celebrate, two years ago I posted my first Halloween blog, Tales from the Query Processor.  It was a lot of fun to write a fiction/horror story revolving around SQL Server and computers.  I wanted to do this last year but got a little busy with family stuff. 

Leading up to this year, I wanted to do it again.  This is one of those stories that's been sitting in the back of my head for a while.  I hope you enjoy it!  



INTO THE DARK

“It’s the same dream every night, Doctor, and I don’t know how to make it stop.  It just keeps happening.”  The panic in my voice was clear.  “I’m just a simple data page, I don’t know how to stop something like this!”

“Settle down, and start from the beginning,” was the doctor’s soothing reply. 

I settle in on the doctor’s couch and take a deep breath.  I’m in a safe place, I tell myself.  I’m in the buffer far from the disk, there’s plenty of room in the cache and the LRU algorithm is fast asleep.  Nothing can get me here.

CRASH!!!!

“WHAT WAS THAT!”, I screamed as I bolted straight up.

“Settle down,” said the doctor, his tone growing a bit frustrated.  “It was just a stray hash bucket.  I’m sure it was nothing.  Lie back down, that’s it.  You were about to tell me about your dreams.”

Another deep breath, I’m in a safe… oh the hell with relaxing, “It always begins the same…. “

I’m at home in the disk.  I’m at the bottom of the B-Tree where data pages live.  Everyone is going about their business as usual.  It seems darker than normal, but I’m the only one who notices.  The lights of the Data Center cascading through the rack, the blink of packets coming and going during their commute, the hustle and bustle of page splits and re-orgs.  It’s all there.  But there’s a fog.  Like I’m the only one that can see it.  A haze in the air, and everything is in slow motion.  I move myself out of physical order, changing my pointer so I can be found and maintain logical order.

My table is large and there are many IAM pages I can hop to.  I traverse the chain trying to get a better view.  The spinning of the disk heads is in the air, the normal beat and pulse that we’ve all come to expect.  It comforts us, like a baby in the womb.  I stare out into the vastness.  Pages so far out it looks like the hex starts to fall off the end of the world into nothing but darkness.  That deep darkness captures me.  I cannot look away.  It’s deep like a night with no stars and no moon.  Objects just start to melt in it.  The longer I look at it the larger it grows.  Instantly I have a stab of panic.  I know that I should stop looking.  But I can’t!

It’s out there!  It sees me!  I shouldn’t have gone looking for it!  I should have stayed where I was.  I was safe back in my allocated space.  And I'm still staring!  I want to tear my eyes away from the darkness but I can't.  It's so empty and vast.  At least I thought it was.  The hairs on my neck stand up.  I can feel it.  I'm the only one looking for it and now it can see me.

I’m too close now.  I still can’t see it, but I can feel it with a certainty.  IT IS OUT THERE.  The normal spinning of the disk feels wrong.  It feels off.  Like a 757 flying 300 feet off the ground at 500 miles per hour, the disk continues to spin.  But if you watch it, you just know that any second something’s going to happen.  And that’s when I see it.

It lights up the sky with its spark.  Everyone else is still sitting there, like it’s a picnic on a sunny day.  Be lazy, enjoy the moment.  But I know.  I KNOW.  

I shouldn’t see it.  BUT I DO.  The light caused it to be visible.  The light that was somehow gone a moment ago; the veil of fog has been pierced.  Worst of all, now I know that it sees me too.  I was the only one looking for it.  I stood out on the edge, and IT SAW ME.

It’s big and malformed.  It’s hideous.  It shouldn’t be there.  I know it.  Now it must catch me and add me to it or I’ll tell everyone.

I start running.  I need to get back to my allocated position.  I’ve got 10 IAM pages to traverse to get there.  524,288 pages per allocation.  I’m running and running.  This should be taking nanoseconds, but the pages are stretching before me, more, and more, and more.  It’s taking microseconds just to traverse one page.  I can feel the ground quake behind me.  I can feel it closing on me.

I’m terrified.  Running.  This is the end and I know it.  I see my spark starting to go out in the darkness.  I know it wants my bits and it wants to re-arrange them. 

I scream and there is no sound.  I’m trying to get the Checksum’s attention.  Screaming and screaming, but there’s no sound!  Everyone is still going about their business.  Sunday picnics!  Maintenance windows, read-aheads from a query, none of them see it!

READ-AHEADS!  If I can grab onto one it can pull me up!  There’s a chance!  There’s a chance I can make it!  But the GAM Allocation stretches out even further.  I extend my hand, like that will do any good.  I reach as far as I can.  But it’s no good.  It falls on me.

“And that’s where I wake up.”

The Doctor was on the edge of his chair.  Moments ago he would have thought this simple data page just another fragmented mess in need of a bit of re-ordering.  Unknowingly, he has stopped taking notes, and his right hand is digging deep into the arm of his chair.

“What was it?”, asks the Doctor.

“Doc, you know what it was as well as I do.  I don’t want to say it.”

“No of course not,” said the doctor.  Forcibly relaxing.  “Look, dreams like this are somewhat common for data pages.  I wouldn’t worry about it.” 

“Really?”

“Yes, I’m sure everything is fine.  Oftentimes dreams are just subliminal ways of dealing with a long day.  Head on home and let’s meet again next week,” he says, jotting down some notes.  “Can I see your insurance card and driver’s license one more time?  I want to make sure I get all the numbers down correctly.”

“Sure,” I say handing them over.  “So next week then.”

“mmmhmmm,” mutters the Doctor as he copies down the number next to his notes.
I rise.  The doctor is right; I feel a little better just getting it off my chest.  We smile, shake hands, and I leave the office.

I’m almost to the door when I realize my keys fell out of my pocket.  I go back to knock on the door; it’s still cracked open.  That’s when I hear him.

The Doctor is yelling into the phone.  “What DO YOU MEAN THERE’S NO ALERTS ON THE SUSPECT_PAGES TABLE!  We need to raise some sort of alert!  They need to check the Windows Error Log! NO NO YOU FOOL, 823 & 824 DON’T APPEAR IN THE SQL ERROR LOG IT’S GOT TO BE WINDOWS!”

The other end of the phone squabbles back.  The Doctor cuts him off sharply.

“Look I know damn well CHECKSUM is not ON!  I just had a torn page in my office!  IN MY OFFICE!  CHECKSUM was enabled at some point, but it must have been disabled.” 

“Doctor, I’m a torn page!” I say, announcing my presence at the door.

“CHECKPOINT!!” The Doctor yells into the phone, “CHECKPOINT!!!”


And the world went white.


WRAP UP

As Always Dear Reader, Thanks for stopping by.  And Happy Halloween!

Thanks,

Brad

Friday, October 18, 2013

PASS Summit 2013 Deck and Demos Live

Ready to rock some Internals.
Hello Dear Reader!  This is just a quick blog to say that my Deck and Demos are now live for the PASS Summit 2013.  My presentation is at 9:45 am in Room 217-A.  You can get them by going to the Resource Page, or Clicking Here for the Deck or Here for the Demos.

I did something different this time that I hope you will enjoy.  Internals are great but why do they matter?  In the world of performance tuning and troubleshooting an understanding of Internals is essential.

Understanding what's going on under the hood often tells us where we need to be looking when we are problem solving.

"So Balls", you say, "You said you were doing something new?"

Thank You for keeping me on point Dear Reader!  I'll be mixing a whole list of Why Do Internals Matter with this presentation and I'd love your feedback.

Hope to see you there!  As always Thanks for stopping by!

Thanks,

Brad

Wednesday, October 16, 2013

PASS 2013 Keynotes Live Day 1

Final thoughts Dear Reader.  This is a big launch year for Microsoft and we are getting some very cool toys to play with today.  Make sure you go get the SQL 2014 CTP 2 bits.  I was telling Brent Ozar before the keynote that I thought they would extend the supported upgrade path from SQL 2005 to 2014.  I was 50/50 on that.  After the keynote I'm 90/10.

As always Thanks for stopping by!

Thanks,

Brad

9:50 am  We slice by International and Domestic, and using Power BI through the Office 365 portal we are able to use the power of Bing and get a world map breakout of calls.  Using a simple typed search you get a scatter plot.

They’re slicing on the fly via simple English-language queries.  They look at the data by country, by platform, then they add additional countries, receiving different reports by typing in a simple text box.  Just like using a search engine.

ANNOUNCING a Power BI contest that is aligned with the BACON, PASS BA Conference, coming in April.  You can vote on the winner on Facebook: http://www.facebook.com/microsoftbi

The winner will get a paid trip to BACON.  Who doesn’t love BACON?

We review how we've pushed the boundaries.  Looks like we are wrapping up.

9:40 am We will be pulling from multiple data sets and areas.  With Power Query they are pulling scrubbed user data; nice dig at Google and respecting users’ privacy.  A couple of clicks and we’ve gone from big data to structured data.

Now we are going to pair graphics with our data.  We head over to Power Query; you do an online search and you can tie that data to columns.  The data is shredded and displayed in a relational format.  A dig at Quentin’s ability to do joins by the speaker.  Lots of funny banter between the two.  Population statistics are pulled in to pair up with the country, divided by population to show what the Skype calls per capita are.

They are using M behind the scenes to, for lack of a better word, power Power Query.  M is how they internally refer to the language.  It is some powerful stuff.

We get a display of how this works on a Surface, an Android tablet, an iPad, and a Chrome laptop.  It just works, and it uses HTML5.  No more Silverlight.

We get a great breakout of the Skype calls per capita.

9:30  am  We get a video about how the City of Barcelona is using Hadoop and Big data to manage all the city services.  Nice plug that reminds me of how IBM advertises their infrastructure services.  Nice to see a Microsoft video showing how they do the same.

Every organization has an IT need, and must be an IT organization to better handle their data and distribute services.  Interesting to see a large international city that recognizes that.

NOW for some BI!

Quentin is telling us all about Power BI, Power Query, Power Pivot, Power View, and Power Map.  Sense a trend here?  You are empowered via Excel.

DEMO

Very cool, looks like we will be using Skype data; 35 TB a day are produced.

9:20 am We have a nice slide highlighting the improvements for HA/DR and how they utilize the cloud.  Next up Redefining mission critical data.

So how do we make the cloud mission critical?  Resource Governor in the cloud using Premium Cloud Services.

Next up Big Data.  Hadoop has a lot of ways it is being integrated.  HDInsight is Microsoft Hadoop in the cloud.  Hortonworks is the premier MS partner for Hadoop.  It runs on Apache, not Windows Server.

There are workloads not made for SQL Server.  Dr. DeWitt did a great deep dive on Hadoop at the Summit in 2011.  His keynote is online, and should be watched by any SQL Server person.  It contains the roadmap of what their team was looking at.  Hadoop tackles a lot of those non-SQL Server based workloads and data sets.  In PDW v2 we have PolyBase to be able to run queries against Hadoop.

Lot of cool stuff coming with PolyBase.


9:10 am  We are getting more about how hybrid DR and Availability can be used with on-prem and the cloud.  Backup for all versions, SQL 2005 - 2014, directly to Azure storage is announced.

Natively encrypted backups for SQL 2014.  More to come on that.  We also get a little more info on Managed Backup.  For shops without a full-time DBA this will be a great feature.  If you’re paying for a DBA, managing backups should be part of their job.

I would bet DIFF_MAP's are being monitored to trigger this.  Could be some interesting threasholds with this.  I do love good blog fodder.

Backup to Azure is being demoed.  We are walking through registering the Azure storage.  Nice thing is once you configure it, it can fire off automatically from there.
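For reference, the native syntax for this (which I believe shipped back in SQL 2012 SP1 CU2) looks roughly like the sketch below; the storage account, container, and credential names are placeholders, not the ones from the demo.

-- Sketch: back up a database directly to Azure Blob Storage.
-- The identity is the storage account name and the secret is its
-- access key; all names here are placeholders.
create credential AzureBackupCred
with identity = 'mystorageaccount'
    ,secret = '<storage account access key>';
go
backup database AdventureWorks2012
to url = 'https://mystorageaccount.blob.core.windows.net/backups/AdventureWorks2012.bak'
with credential = 'AzureBackupCred', compression, stats = 5;
go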

I understand that people are a little scared of the cloud.  But this is IaaS, Infrastructure as a Service, using Azure Blob Storage.  Think of it as backing up to a VM in another datacenter and the on-prem translation is easier to understand.

9:00 am Enough about the slides, let's do a DEMO!  Tracey from the SQL team comes out to do the demo.

We've got a game website for X-Box 360 games.  We see the load times when games are recommended.  We see a 10x improvement just by converting the table to an In-Memory table.  converting the natively compiled stored procedure gives us another x11 improvement.

Lookups taking almost 10 seconds are now sub-second.  We have a lookup process that runs for 26 minutes.  Using a Clustered Columnstore Index we get down to less than a minute.

No need for application coding changes to use Hekaton and Clustered Columnstore.  I can't tell you how many calls and presentations I've done on those two technologies.  They are the big two for 2014 and they are going center stage.
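For a flavor of the columnstore half of that demo, converting a lookup table is a one-statement change.  A hedged sketch, with a table and index name of my own invention rather than the ones on stage:

-- Sketch: compress a lookup table into a clustered columnstore index
-- (SQL 2014 syntax).  Table and index names are illustrative only.
create clustered columnstore index cci_FactGameLookup
on dbo.FactGameLookup;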

Back to pushing the boundaries, next up Breakthroughs in availability and recovery.


8:50 am  Quentin says "Good Morning"!  He thanks the community for the work we do to evangelize SQL Server.  There are a lot of pictures they took at the pre-cons of subject areas surrounding Cloud and Big Data.

Quentin is giving us a story about On-Premises and Cloud.  We are going to hear a lot about the merger between cloud and on-prem.  SQL 2012 CU 2 had a lot of integration, SQL 2014 takes that even further.  Nice graphic of climbing the Himalayas to signify the journey of bringing cloud and on prem together.

We will only be talking about product in GA, in preview, or that will be in GA soon.  Nice chuckle from the AZURE CAT team.

PDW V2, Power BI for Office, Power Map, SQL Server in Azure VM, HDInsight, 2012 SP1, SQL Server 2014 CTP 2.  (*There it is).

Quentin has announced that CTP 2 is the last publicly available CTP before RTM!  Next up, all about SQL 2014 and "Pushing the boundaries".  The keywords begin.

1st Up In-Memory, nice jab at Oracle. We are covering the high points of Hekaton/In-Memory OLTP.


8:40 am Amy Lewis has received the PASSION award for volunteerism.  Way to go AMY!!!  Honorable mentions are given to another nominee Ryan Adams.  Ryan does a lot of work in his local user group and the DBA performance virtual chapter.  All of the volunteers, speakers, and chapter leaders are asked to stand.  There is a myVolunteering link at the PASS website.  Bill is asking everyone to update their profile.

Quentin Clark is introduced to the stage.  We get a nice video about how SQL Server is awesome, nice community pictures, Plug for SQL Server 2014.


8:30 am  Bill Graziano is on stage making the PASS announcements.  Over 700,000 hours of free training were given this year.  The main focus will be volunteers.

Really cool video using Power BI to show the history of SQL Saturday, nice shout out to Andy Warren, Brian Knight, and Steve Jones for starting what has become a great community mainstay.



Hello Dear Reader!  It's that time of the year again, I'm sitting at the PASS Summit and I'll be live blogging the keynotes.  Read it from the bottom up!  With the 2014 CTP 2 bits already live on the web, I expect that announcement shortly!

Wednesday, October 2, 2013

DevConnections Deck & Demos Live!

Hello Dear Reader!  Just a quick blog today.  The Deck and Demos for my presentations today at SQL DevConnections are live!  

I just want to say a Big Thank You to the Organizers for having me and putting on a great conference.  I also want to Thank the people who attended the presentations today.

We had almost a packed house, standing room only, for the Inside the Query Optimizer session, and lots of great questions all around!   


Thanks again for a great day of SQL learning here in Vegas!

Thanks,


Brad

Thursday, September 26, 2013

Performance Tuning In LA!

Hello Dear Reader!  This year Pragmatic Works has brought our Performance Tuning Workshop to Chicago, Boston, and Atlanta.  Due to the success we've added one more stop before the end of the year, Los Angeles!

That's right Pragmatic Works is headed to the City of Angels on December 9th - 10th for two days of Deep Dives and Performance Tuning goodness.

"So Balls", you say "What will you be covering in you're Performance Tuning Workshop?"

Great question Dear Reader!  We start off our journey at the bottom talking about hardware; we discuss the details of CPUs, Memory, and Disks and how their throughput and proper configuration can prevent hardware bottlenecks.  We cover installing SQL Server, talking about our Pre-Installation Checklist and our Post-Installation Checklist.

Then we tackle Data files, Round Robin and Proportional Fill, Partitioning, and Compression.  Next we dive into the Query Optimizer to discuss how the Relational Engine works and how it works with the Storage Engine, along the way we'll stop to look at Execution Plans and Plan cache.  From there we dive into the world of Indexes, covering what makes the best Clustered Index, Non-Clustered Indexes, Columnstore Indexes, Index Maintenance, Duplicate Indexes, Reverse Indexes, Unused Indexes, and Missing Indexes.

And that's just Day 1.  We've still got Locking, Blocking, Latches, Wait Stats, Baselines, Alerting, and Extended Events before we are through.

As a member of the class you will get all of our slide decks and Demos to follow along and keep for yourself.  Lunch is also provided each day, as well as coffee and donuts or bagels and muffins.


For all the details Click Here to go to the course outline.  Let's not forget that you get access to me and my co-presenter for those two days.  You'll have a lot of questions, and during breaks and after the class you'll have access to us for Q & A.


Plus you'll get to network with other great folks from the SQL Community and your local SQL Server User Groups.

ALL THIS AND...

All this and I still haven't announced my co-Presenter for this yet!  We'll save that for another blog.


"So Balls", you say "Why are you being a tease?"

Yes, Dear Reader.  Shamelessly it is a tease, but I promise it will be worth it.

And now for one last shameless plug.  The price.  Two days of community, access to a top notch as-yet-to-be-announced speaker..... and me....., breakfast, lunch, and probably some swag, all for $399.  Right now we also have an Early Bird rate until November 9th: it is only $299.

NEXT YEAR


We are already actively planning our courses for next year Dear Reader.  As soon as I have the finalized schedule I will be posting it.  But right now we have NINE 2-day Workshops we are planning, FIVE 4 1/2-day virtual classes on Performance Tuning, and TWO 5-day week-long Bootcamps that will be staffed by a couple of SQL MVPs and an MCM.

We will be all over the US, hopefully in a city near you, and presented Virtually throughout the Globe.

Good stuff is coming and I'm very excited to be part of it!  As always Dear Reader, Thanks for stopping by.


Thanks,

Brad




Tuesday, September 24, 2013

Can You Compress a Temp Table?

Hello Dear Reader!  We are finishing up the final day of the Performance Tuning Workshop here in Atlanta and I got an interesting question on Compression from Tim Radney (@tradney | Blog).

The question: Can you compress a temp table? Just a quick blog to get the answer out there while Gareth Swanepoel (@GarethSwan | Blog)  teaches the class about Extended Events. 

My guess was yes.  Temp tables can have statistics and Clustered and Non-Clustered Indexes, and while they only exist in the session where they are created, I would bet they could be compressed.  Whether you would actually want to compress them is a different discussion, but let’s prove this out.

DEMO

Here’s a quick demo to show you can do this.  First up we will create our temp table, specifying WITH (DATA_COMPRESSION=ROW).

This will create our temp table #myTable1; we will then insert 5,000 rows of default values.

-- Drop the temp table if it already exists
if exists(select name from tempdb.sys.tables where name like '#myTable1%')
begin
     drop table #mytable1
end
go
-- Create the temp table with row compression enabled
create table #myTable1(
              myid int identity(1,1) primary key clustered
              ,mychar1 char(500) default 'a'
              ,mychar2 char(3000) default 'b'
              ) with (data_compression=row)
go
-- Insert 5,000 rows of default values
declare @i int
set @i=0
while(@i<5000)
begin
     set @i=@i+1
     insert into #myTable1
     default values
end

Now let’s use DBCC IND to view the pages associated with our table, and DBCC PAGE to validate that our data is compressed.

-- List the pages allocated to the temp table
dbcc ind(tempdb, '#myTable1', 1)
go


-- Send DBCC output to the client so we can read the page dump
dbcc traceon(3604)
go
-- The page id (376 here) comes from the DBCC IND output above;
-- substitute the data page id you get on your system
dbcc page('tempdb', 1, 376,3)
go



Looking at the output of DBCC Page I can see that the CD array for my compressed data is present near the header.  Row compression is indeed on.

Now let’s rebuild this with page compression via a rebuild operation, using sp_spaceused to measure the size of the table before and after.
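Here’s a minimal sketch of that step, assuming the #myTable1 table from above:

-- Measure, rebuild with page compression, then measure again
exec tempdb..sp_spaceused '#myTable1';
go
alter table #myTable1 rebuild with (data_compression = page);
go
exec tempdb..sp_spaceused '#myTable1';
go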


And it is now Page Compressed.  Thanks for the question Tim!  And as always Dear Reader Thank you for stopping by.

Thanks,

Brad




Tuesday, September 17, 2013

SQL 2014 SSMS is Killing my C:

I could hear SQL saying to my C: Drive "Why You Little...."
 Hello Dear Reader! 

The blog about SQL Saturday 232 is coming shortly, but first I needed to blog about an error that I’m getting.  I presented this weekend on SQL 2014: A First Look at What’s New.  One of the demos I did was using Buffer Pool Extensions. 

To confirm what I’m seeing, I tested this on two VMs and one physical instance.

VM1
I created VM1 on my SSD.  I allocated 4 processors and 8 GB of RAM for my VM.  I created four drives: data files M:, log files L:, backup files W:, and Buffer Pool Extensions S:.  I then installed CTP1 using Oracle VirtualBox.  Max Memory set to 4096 MB.

VM2
I also confirmed these results by creating a Hyper-V VM on our Host server in the Pragmatic Works Lab.  The guest has 4 processors and 8 GB of RAM.  Max Memory set to 4096 MB.  No BPE was used on this VM.

Physical Instance
I also tested this on my laptop on a SQL 2012 SP1 CU2 instance.  I have 4 cores and 16 GB of RAM.  Max Memory set to 6144 MB. 

I then used WinDirStat to confirm the results.  During my SQL Saturday presentation my VM unexpectedly ran out of space on the C drive.  This was a little confusing as I had specifically set up my SQL instance to not be installed on the C drive.

“So Balls”, you say, “What was taking up all that space on the C drive?”

Great Question Dear Reader!  That’s what I wanted to know as well.

KILLING ME SOFTLY… THEN NOT SO SOFTLY

This is the VM.  As you can see I didn’t go with a huge C drive, only 25 GB.  The only drive I didn’t mention above was my T drive that I used to store a database that I put a Clustered Columnstore Index on.



I’ve got a script that I need to blog on called MakeAdventureWorksDW_big, but I don’t have that typed up just yet.  I use that script to make my FactInternetSales_big table.  I used my friend and cohort Robert Cain’s (@ArcaneCode | Blog) script Make Adventure Works Modern for Today to make my AdventureWorks2013 Database.



My table has 42.9 Million rows in it and is a little over 8 GB.  My test for BPE was pretty simple.  I created a 15 GB BPE.  The instance has 8 GB of RAM.  My max memory for the instance was set to 4 GB.  Then I raised the max memory to 19 GB.  I ran sys.dm_os_buffer_descriptors using the is_in_bpool_extension column to find the data pages as they were allocated.
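For reference, before the monitoring query below, the BPE setup itself looks something like this sketch; the file path is from my VM layout, so point it at your own fast disk.

-- Hedged sketch of the BPE configuration described above (SQL 2014 CTP syntax)
exec sp_configure 'max server memory (MB)', 19456;
reconfigure;
go
alter server configuration
set buffer pool extension on
    (filename = N'S:\BPE\ExtensionFile.BPE', size = 15 GB);
go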

SELECT DB_NAME(database_id) AS [Database Name]
, case is_in_bpool_extension
     when 1 then 'Stored in BPE'
     when 0 then 'Stored in non-BPE cache'
  end as BufferLocation
,COUNT(*) * 8/1024.0 AS [Cached Size (MB)]
FROM sys.dm_os_buffer_descriptors
WHERE database_id > 4
AND database_id <> 32767
GROUP BY DB_NAME(database_id), is_in_bpool_extension
ORDER BY [Cached Size (MB)] DESC OPTION (RECOMPILE);


I then ran the following query to load up as much data as possible.

Use AdventureWorksDW2013
go
set transaction isolation level read uncommitted
go
set nocount on
select * from [dbo].[FactInternetSales_big]

It is just a simple select statement.  Rerunning my buffers query I could see things were loading up nicely.



My plan was to leave this query running so I could load all the data I needed to into my BPE.  That is when the C: ran out of space and the VM froze on me.  

In order to catch this bug I then ran WinDirStat.  Looking under my C:\Users\SQLBALLS.PW\AppData\Local\Temp\1 folder I found this.



Hmmm… right now it’s not that bad, still plenty of free space, but here’s this temp file.  Let’s fast forward to 3 minutes in.  We’re now at 2 GB.



At 5 minutes we are at 3 GB.

Looking at my C Drive, I’ve now dropped from 10 GB of free space to a little over 7 GB.


The result set is large enough that this will continue until my C drive runs out of space.  If I try to delete the file I get a nice error letting me know that the file is locked by SQL Server Management Studio.



15 minutes in my C drive is looking pretty scary.  We are now in the red.  I kill the query so I can keep looking at my VM.  The C drive doesn’t free up.  I know from experience I could now delete the temp file because it is no longer in use.





I asked the question on Twitter and Jason Kyle (@JasonNKyle) replied with a suggestion that maybe I was looking at a swap file. 


As I understand them, a swap file allows an operating system to use hard disk space to simulate extra memory when the system runs low on memory.  The way this behaves is that the OS swaps a section of RAM that an idle program is using onto the hard disk to free up memory for the program you are using.  Then when you switch back to the other program, the OS trades the memory back in from the bytes on the HD.

I don’t think this is a swap file though.  I could be wrong. 

When I open one of the smaller temp files in Notepad (the results of a re-run of my query, so a much smaller result set), this is what I see.



Not really useful, except that I realized the first data that I highlighted happened to be my sales order number from my query.



As I continue to mine through this I see that this is truly my query result set.

If I run this same query on my laptop, 16 GB of RAM, 6 GB max memory, after a restart with only around 1 GB in the buffers, I get the exact same results.  At first I thought this was an issue with Buffer Pool Extensions.  Now it looks like this is just the way queries return in SSMS.

Granted, you normally don’t run million-row transactions in SSMS; your application runs them.  So more digging to go, but I wanted to get the blog out to attract comments and help show what I was seeing.

As always Dear Reader, Thanks for stopping by.

Thanks,

Brad



Tuesday, September 10, 2013

T-SQL Tuesday #46 Rube Goldberg Machine aka Automating Deleting Older Certificates




Hello Dear Reader!  This is the second Tuesday of the month and you know what that means: T-SQL Tuesday, the largest blog party on the Intrawebs.  T-SQL Tuesday is the brainchild of SQL Community member extraordinaire Adam Machanic (@AdamMachanic | Blog), also the inventor of the word “Twote,” as in “To misquote a Tweet”; when used in a sentence it sounds like “He Twoted me wrong.”  This month our host is Rick Krueger (@DataOger | Blog).  So Rick, what’s our topic?

“My first exposure to Rube Goldberg Machines was playing the game Mouse Trap as a child. I work almost exclusively on the SQL development side of the house, where we sometimes build crazy creative solutions to solve business problems. We generally know the ‘right’ way to do things, but pesky issue like budgets, personnel, and deadlines get in the way. So, we channel our inner MacGyver, grab a handful paper clips and some duct tape, and then do things with SQL Server that we know shouldn’t be done (in an ideal world). And we hope nobody ever finds out how we bent the rules, because we know they will judge us (as we would judge them) and call our work a <gasp>HACK</gasp>.
So, if you would please, dust off one of those skeletons and tell us how you got really creative with SQL Server, instead of doing it ‘the right way’. In other words, tell us about your ugly SQL baby. If you’re worried about saving face, feel free to describe how you would have implemented the solution if you lived in that ideal world.”
I love Mouse Trap and MacGyver!  Over the years as a DBA sometimes you have to work with what you’ve got.  Other times your boss says do A, you say the best way to achieve A is by doing B & C, and they say do A.  I’ve got two of these that I can think of off the top of my head.  In one we used Change Data Capture in lieu of Auditing (don’t ask me why, because that version of SQL also had Auditing.  Oh hello, A…..).  The other may actually prove useful.  Which one to choose from?
“So Balls”, you say, “What’s the useful one?”
Good call Dear Reader, we’ll go with the useful one!

OUT OUT D@MN SPOT CERTIFICATE

When you are using Transparent Data Encryption one of the most important things is the certificate.  Once you enable it on a production database that certificate is just as important as your database backup.  Why?  Because in case of a catastrophic failure that backup is dependent on the certificate.  If you cannot restore the certificate to a new instance your backup is useless.  *There are some workarounds to this using backups of the Master DB, but we’ll save that for another day.*

When you look at setting up maintenance plans for your server you should create a job to back up your certificate daily.  A certificate is only 1 KB in size.  Very tiny file.  If you use a private key to encrypt your certificate it is only 1 KB in size as well.  So if you leave a year of them on your hard drive you haven’t taken up 1 MB.

As a DBA sometimes you can be a bit of a neat freak.  I don’t keep a year’s worth of backups on hand, so why would I keep a year’s worth of certificates on hand?  I’d like a process to automatically delete them and keep only the last two weeks, or month, on hand; whatever matches up with my backup retention policy.

The problem is the automated cleanup task doesn’t work.  Sure, you can go into the maintenance plan wizard and make one that looks in a directory for a .CER file, but the true problem lies in the data storage.  You have to custom script out the certificates.  If you didn’t think to add a line to the backup set history table with the .cer and .key extensions and the path to your Private Key or Certificate backups, then the job won’t work.

Inserting records into the MSDB tables could work, but as a DBA new to TDE that thought hadn’t crossed my mind.  I wanted a way to back up my certificates and delete my old ones.  So I built one.

MY RUBE GOLDBERG MACHINE

This is a demo I do in my TDE presentation.  It’s up on my Resource Page and has been for some time.  Today I realized I’d never blogged about it.  My scripts heavily use xp_cmdshell.  I had an audit setting in my environment that wouldn’t allow that to be on my servers, so in this script I turn it on in the beginning and off at the end.  The nice thing about the script is I unit tested it; even if there is an error in the script, the sp_configure settings are server-level commands that occur outside of transactions, so they run no matter what.  The script runs quick, but it will make logged entries in the SQL Server Error Log stating that xp_cmdshell was turned on and off.  My audit team could live with this, so I was able to implement it.
I also like to use a private key and a password for my TDE encryption.  I don’t want the password sitting around in plain text in the job either.  So I make a database called TDE.  In it I have one table called tdeKeys.  I put two columns in there: one is the name of the certificate that a private key will be created for, the other is the password to use for that private key.  In secure environments you could set up column-level encryption to ensure the password is not in plain text even in the table field.  The demo scripts I’m going to give you don’t use column-level encryption.  They contain a function that retrieves the password for the certificate name.
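A minimal sketch of those objects looks something like this; the names are illustrative, not the exact ones from the demo scripts on my Resource Page.

-- Sketch of the key-storage database described above (illustrative names)
create database TDE;
go
use TDE;
go
create table dbo.tdeKeys(
     certName sysname not null primary key
     ,certPassword nvarchar(128) not null  -- consider column-level encryption here
     );
go
-- Returns the private key password for a given certificate name
create function dbo.fn_tdeKeyLookup(@certName sysname)
returns nvarchar(128)
as
begin
     return(select certPassword from dbo.tdeKeys where certName=@certName);
end
go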
Next we will create the dynamic script to back up the certificate.  Note that I back up the Master Key as well.  If you are using column-level encryption you’ll want a copy of the Master Key.  You’ll need to specify the path that you want to back the certificates up to.  Also you will need to specify the certificate name.
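The full script is on my Resource Page; a hedged sketch of the idea, assuming the TDE database and lookup function from above and with placeholder paths and names, looks like this.

-- Sketch: back up the Master Key and a certificate with date-stamped
-- file names.  Path and certificate names are placeholders.
use master;
go
declare @certName sysname = 'MyTDECert'
       ,@path nvarchar(260) = 'X:\CertBackups\'
       ,@stamp char(8) = convert(char(8), getdate(), 112)
       ,@sql nvarchar(max);

set @sql = N'backup master key to file = ''' + @path + N'MasterKey_' + @stamp + N'.key''' +
           N' encryption by password = ''' + TDE.dbo.fn_tdeKeyLookup(N'MasterKey') + N''';' +
           N' backup certificate ' + quotename(@certName) +
           N' to file = ''' + @path + @certName + N'_' + @stamp + N'.cer''' +
           N' with private key (file = ''' + @path + @certName + N'_' + @stamp + N'.key''' +
           N', encryption by password = ''' + TDE.dbo.fn_tdeKeyLookup(@certName) + N''');';

exec sp_executesql @sql;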
Finally we will create the script that will use xp_cmdshell to traverse directories and manually delete our old backups.  You will need to edit the file path in this script and insert the Master Key and certificate names on line 74.  Finally, on line 103 you will need to alter the DATEADD function.  Right now it would only keep 4 days of certificates on hand; you’ll need to edit the DATEADD to match up with your backup retention policy.
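My full script builds the delete list itself with DATEADD; as a simpler hedged sketch of the same idea, forfiles can do the age check for you (the path and the four-day window here are placeholders; match them to your retention policy).

-- Sketch: enable xp_cmdshell, delete certificate and key backups older
-- than four days, then disable it again
exec sp_configure 'show advanced options', 1;
reconfigure;
exec sp_configure 'xp_cmdshell', 1;
reconfigure;
go
exec xp_cmdshell 'forfiles /p "X:\CertBackups" /m *.cer /d -4 /c "cmd /c del @file"';
exec xp_cmdshell 'forfiles /p "X:\CertBackups" /m *.key /d -4 /c "cmd /c del @file"';
go
exec sp_configure 'xp_cmdshell', 0;
reconfigure;
exec sp_configure 'show advanced options', 0;
reconfigure;
go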

Want to see the whole presentation live?  I’ve done this for Pragmatic Works Training on the T’s, click Here to watch.  You’ve got to sign up for a Pragmatic Works account if you don’t already have one, and you’ll get free info on all the free training we do monthly!


OVERKILL

“So Balls”, you say, “Is this overkill?”
<soapbox>
Well Dear Reader it depends on your environment.  You must consider Level of Effort and Level of Response, LOE and LOR.
LOE is one part you, one part the hacker.  The more secure you make something, the less likely a hacker will keep going for it, or the less far they will bother to go.  On your part it is how far you are willing to go to do your job.  We can all get dissuaded from going the extra mile sometimes.  Your LOE should be governed by your organization’s LOR.
LOR is the response that your organization will have to the event.  One thing I like to tell folks: if you are ever in the position that your security has been breached, and you are then talking to your boss, his/her boss, the CIO, a high-ranking officer in the military, or multiple high-ranking government officials, trust me when I say that you want to be able to say you took every last step possible to protect the data under your care.  The more detail you can provide the better.  So overkill?  Maybe.  CYA?  Absolutely.  Thankful that no fault on your part was found and you still have a job?  Yep.
Having been in this position, trust me: take the extra couple of steps.  If you ever need it, you’ll be glad you did.
</soapbox>
Thanks to Rick for hosting this today, and as always Thank You Dear Reader for stopping by!
Thanks,

Brad

Friday, August 16, 2013

24 Hours of PASS Q & A: Why are Bits Flipped?



Hello Dear Reader.  Almost two weeks ago I delivered a presentation for the 24 Hours of PASS on SQL Server Data Internals.  I received some really great questions that have pushed me to dive deeper in my understanding of SQL and computers in general.  One of the questions was as follows:

Why are Bytes sometimes swapped and why do you have to flip bits in order to decode a bitmap?

At the time I took a WAG, a Wild @$$ Guess, at the question.  I said that they were flipped because of a need to protect proprietary data.  I couldn’t have been more wrong.  Fortunately for me my friend and co-worker Roger Wolter (@RWolter50 | Blog) was watching and was quick to let me know that the source of the byte swapping was not proprietary.  It was because of Endians.


“So Balls,” you say, “It was because of Endi-a-Whats?”

Don’t worry Dear Reader, I was right there with you.  This was a new term to me; some of you clever former Computer Science majors are already seeing a light bulb form.  For my sake humor me and pretend you haven’t already figured this out.  Some of you former Literature Majors, or general readers of classic tales, are wondering what Gulliver’s Travels has to do with SQL Internals.

BIG AND LITTLE ENDIANS
Jack Black, Comedian, Singer, Dancer, ...Computer Scientist

In Jonathan Swift’s satirical novel Gulliver’s Travels, Gulliver ends up in a land called Lilliput.  Lilliput has hostilities with its neighbor Blefuscu.  You see, Lilliput likes to eat their hard-boiled eggs by cracking the little end, whereas Blefuscu likes to eat their eggs by cracking the big end.  Therefore the Lilliputians are known as Little-Endians and the Blefuscudians are Big-Endians.  Side-stepping Swift’s satirical play on societal issues, the term was utilized in Computer Science over 200 years later.

So what does this mean for Computers?  It is how we read data out of memory. 

We will be covering memory at a very high level, even though we are pretty deep.  Memory is one big array waiting to hold our data. 

What does that array hold?  Bytes.  How do we find where we are storing the bytes in our array?  We give it an address and we look up that memory address.

An address is not an Index, but for the correlation of how to look up memory data it is comparable to the way we store data on a Clustered Index page and then look that data up by its unique key.  So to make this an easy comparison for my SQL Family, let’s just say that an Address is our Index and how we will look up data in our array/table.

When we read data into memory there is a memory address assigned to the byte or bytes, depending on the chipset of the machine.  Big Endian processors store the most significant byte at the smallest address.  Motorola and quite a few others use Big Endian.  x86 and x64 processors use Little Endian, the least significant byte at the smallest address.  Since SQL Server runs on x86 and x64 hardware we will focus mainly on that.

For example, take the value XSWB.  If we translate each letter to the hex byte that we would place into memory, we get X=58, S=53, W=57, B=42, or 58535742.  Each hex pair translates to binary, which translates to an ASCII character, which becomes the regular letters we see.  How would we store that in memory?  The ASCII example below is for 8-bit access.

*We will disprove the flipping of ASCII bytes here in a moment using 64-bit access.  But what I want you to get from this is the concept.  More in a moment.  Also here’s a really nice graphic from the Wikipedia entry on Endianness, well worth the read.


BIG ENDIAN
Address   Value
1000      58=X
1001      53=S
1002      57=W
1003      42=B


LITTLE ENDIAN
Address   Value
1000      42=B
1001      57=W
1002      53=S
1003      58=X


This behavior is left over from when 8-bit processors had 16-bit memory registers and it was more efficient to load the lower byte first.  If it was only an 8-bit operation then the top byte could be skipped.  Thanks to Roger for all the technical explanations; more on that to come.

Since ASCII characters show up internally a little bit nicer than this, each letter a single byte written as a two-character hex pair, there is no need for swapping to decode them.  When we get to large numbers, though, we can really see this at work within SQL Server.  For example let’s use the following statement.

use master
go
-- Recreate the demo database from scratch
if exists(select name from sys.databases where name='demoInternals')
begin
     drop database demoInternals
end
go
Create database demoInternals
go
use demoInternals
go
-- Recreate the demo table
if exists(select name from sys.tables where name='brad1')
begin
     drop table brad1
end
go
-- A char(4) clustered key and an int column
create table brad1(mychar char(4) primary key clustered, myint int)
go
-- The values we will look for in the page dump
insert into brad1(mychar, myint)
values('XSWB', 12345678)
go

We’ll create the value we just looked at, XSWB, and an integer value of 12345678.  Now let’s do a DBCC IND, get our page number, and look at the page dump.

-- Find the table's data page
dbcc ind('demointernals', 'brad1', 1)
go
-- Send DBCC output to the client, then dump the page; the page id
-- (278 here) comes from the DBCC IND output, so substitute your own
dbcc traceon(3604)
go
dbcc page('demointernals', 1, 278, 3)
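Before reading the dump, a quick bit of arithmetic to make the swap visible (my addition, not from the dump itself): 12345678 decimal is 0x00BC614E in hex, so on little-endian hardware the page stores those bytes reversed, 4E 61 BC 00.  You can check the hex with a one-line conversion:

-- 12345678 decimal = 0x00BC614E; a little-endian page dump shows 4E 61 BC 00
select convert(varbinary(4), 12345678);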


The ASCII doesn’t look byte swapped, but the integers obviously are.  This led to another question that I asked Roger.  Ridiculously smart man that he is, he told me that ASCII characters do not need to be loaded into the registers in the arithmetic processors.  For that reason we do not have to swap their bytes.

Thanks Roger for all the great info.  This was a lot of fun to learn.  Thank You Dear Reader for stopping by.

Thanks,


Brad