Tuesday, September 17, 2013

SQL 2014 SSMS is Killing my C:

I could hear SQL saying to my C: Drive "Why You Little...."
 Hello Dear Reader! 

The blog about SQL Saturday 232 is coming shortly, but first I needed to blog about an error that I’m getting.  I presented this weekend on SQL 2014: A First Look at What’s New.  One of the demos I did used Buffer Pool Extensions. 

To confirm what I’m seeing, I tested this on two VMs and one physical instance.

VM1
I created VM1 on my SSD.  I allocated 4 processors and 8 GB of RAM for my VM.  I created four drives: data files M:, log files L:, backup files W:, and Buffer Pool Extensions S:.  I then installed CTP1 using Oracle’s VirtualBox.  Max Memory was set to 4096 MB.

VM2
I also confirmed these results by creating a Hyper-V VM on our Host server in the Pragmatic Works Lab.  The guest has 4 processors and 8 GB of RAM.  Max Memory set to 4096 MB.  No BPE was used on this VM.

Physical Instance
I also tested this on my laptop on a SQL 2012 SP1 CU2 instance.  I have 4 cores and 16 GB of RAM.  Max Memory set to 6144 MB. 

I then used WinDirStat to confirm the results.  During my SQL Saturday presentation my VM unexpectedly ran out of space on the C drive.  This was a little confusing as I had specifically set up my SQL instance to not be installed on the C drive.

“So Balls”, you say, “What was taking up all that space on the C drive?”

Great Question Dear Reader!  That’s what I wanted to know as well.

KILLING ME SOFTLY… THEN NOT SO SOFTLY

This is the VM.  As you can see I didn’t go with a huge C drive, only 25 GB.  The only drive I didn’t mention above was my T drive that I used to store a database that I put a Clustered Columnstore Index on.



I’ve got a script that I need to blog on called MakeAdventureWorksDW_big, but I don’t have that typed up just yet.  I use that script to make my FactInternetSales_big table.  I used my friend and cohort Robert Cain’s (@ArcaneCode | Blog) script Make Adventure Works Modern for Today to make my AdventureWorks2013 Database.



My table has 42.9 million rows in it and is a little over 8 GB.  My test for BPE was pretty simple.  I created a 15 GB BPE.  The instance has 8 GB of RAM.  My max memory for the instance was set to 4 GB.  Then I raised the max memory to 19 GB.  I ran sys.dm_os_buffer_descriptors, using the is_in_bpool_extension column, to find the data pages as they were allocated.

SELECT DB_NAME(database_id) AS [Database Name]
, CASE is_in_bpool_extension
     WHEN 1 THEN 'Stored in BPE'
     WHEN 0 THEN 'Stored in non-BPE cache'
  END AS BufferLocation
, COUNT(*) * 8/1024.0 AS [Cached Size (MB)]
FROM sys.dm_os_buffer_descriptors
WHERE database_id > 4
AND database_id <> 32767
GROUP BY DB_NAME(database_id), is_in_bpool_extension
ORDER BY [Cached Size (MB)] DESC OPTION (RECOMPILE);
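For reference, the 15 GB BPE itself is created with the new BUFFER POOL EXTENSION syntax.  This is a minimal sketch assuming the S: drive layout from my VM above; the file name is just a placeholder.

```sql
-- Sketch: create a 15 GB Buffer Pool Extension on the S: drive.
-- Syntax as of SQL Server 2014 CTP1; the file path is a placeholder
-- matching my VM's drive layout.
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
    (FILENAME = 'S:\BPE\SQLBPE.BPE', SIZE = 15 GB);

-- And to turn it back off when the demo is done:
-- ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION OFF;
```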


I then ran the following query to load up as much data as possible.

Use AdventureWorksDW2013
go
set transaction isolation level read uncommitted
go
set nocount on
select * from [dbo].[FactInternetSales_big]

It is just a simple select statement.  Rerunning my buffers query I could see things were loading up nicely.



My plan was to leave this query running so I could load all the data I needed into my BPE.  That is when the C: ran out of space and the VM froze on me.

In order to catch this bug I then ran WinDirStat.  Looking under my C:\Users\SQLBALLS.PW\AppData\Local\Temp\1 folder I found this.



Hmmm… right now it’s not that bad, still plenty of free space, but here’s this temp file.  Let’s fast forward to 3 minutes in.  We’re now at 2 GB.



At 5 minutes we are at 3 GB.

Looking at my C Drive, I’ve now dropped from 10 GB of free space to a little over 7 GB.


The result set is large enough that this will continue until my C drive runs out of space.  If I try to delete the file I get a nice error letting me know that the file is locked by SQL Server Management Studio.



Fifteen minutes in, my C drive is looking pretty scary.  We are now in the red.  I kill the query so I can keep looking at my VM.  The C drive doesn’t free up, but I know from experience I can now delete the temp file because it is no longer in use.





I asked the question on Twitter and Jason Kyle (@JasonNKyle) replied with a suggestion that maybe I was looking at a swap file.


As I understand them a swap file allows an operating system to use hard disk space to simulate extra memory when the system runs low on memory.  The way this behaves is that the OS swaps a section of RAM that an idle program is using onto the hard disk to free up memory for the program you are using.  Then when you switch back to the other program the OS trades out the memory for bytes on the HD.

I don’t think this is a swap file though.  I could be wrong. 

When I open one of the smaller temp files in Notepad (this one is from a re-run of my query, so it is a much smaller result set), this is what I see.



Not really useful, except that I realized the first data that I highlighted happened to be my sales order number from my query.



As I continue to mine through this I see that this is truly my query result set.

If I run this same query on my laptop (16 GB of RAM, 6 GB max memory, after a restart with only around 1 GB in the buffers), I get the exact same results.  At first I thought this was an issue with Buffer Pool Extensions.  Now it looks like this is just the way queries return in SSMS.

Granted, you normally don’t run million-row queries in SSMS; your application runs them.  So there is more digging to go, but I wanted to get the blog out to attract comments and help show what I was seeing.

As always Dear Reader, Thanks for stopping by.

Thanks,

Brad



Tuesday, September 10, 2013

T-SQL Tuesday #46 Rube Goldberg Machine aka Automating Deleting Older Certificates




Hello Dear Reader!  This is the second Tuesday of the month and you know what that means, T-SQL Tuesday, the largest blog party on the Intrawebs.  T-SQL Tuesday is the brainchild of SQL Community member extraordinaire Adam Machanic(@AdamMachanic | Blog), also the inventor of the word “Twote”, as in “To misquote a Tweet”; used in a sentence it sounds like “He Twoted me wrong”.  This month our host is Rick Krueger(@DataOger | Blog).  So Rick, what’s our topic?

“My first exposure to Rube Goldberg Machines was playing the game Mouse Trap as a child. I work almost exclusively on the SQL development side of the house, where we sometimes build crazy creative solutions to solve business problems. We generally know the ‘right’ way to do things, but pesky issues like budgets, personnel, and deadlines get in the way. So, we channel our inner MacGyver, grab a handful of paper clips and some duct tape, and then do things with SQL Server that we know shouldn’t be done (in an ideal world). And we hope nobody ever finds out how we bent the rules, because we know they will judge us (as we would judge them) and call our work a <gasp>HACK</gasp>.
So, if you would please, dust off one of those skeletons and tell us how you got really creative with SQL Server, instead of doing it ‘the right way’. In other words, tell us about your ugly SQL baby. If you’re worried about saving face, feel free to describe how you would have implemented the solution if you lived in that ideal world.”
I love Mouse Trap and MacGyver!  Over the years as a DBA, sometimes you have to work with what you’ve got.  Other times your boss says do A; you say the best way to achieve A is by doing B & C; and they say do A.   I’ve got two of these that I can think of off the top of my head.  In one we used Change Data Capture in lieu of Auditing (don’t ask me why, because that version of SQL also had Auditing.  Oh Hello A…..).  The other may actually prove useful.  Which one to choose from?
“So Balls”, you say, “What’s the useful one?”
Good call Dear Reader, we’ll go with the useful one!

OUT OUT D@MN SPOT CERTIFICATE

When you are using Transparent Data Encryption one of the most important things is the certificate.  Once you enable it on a production database, that certificate is just as important as your database backup.  Why?  Because in case of a catastrophic failure that backup is dependent on the certificate.  If you cannot restore the certificate to a new instance, your backup is useless.  *There are some workarounds to this using backups of the Master DB, but we’ll save that for another day.*

When you look at setting up maintenance plans for your server, you should create a job to back up your certificate daily.  A certificate is only 1 KB in size, a very tiny file.  If you use a private key to encrypt your certificate, it is only 1 KB in size as well.  So even if you leave a year of them on your hard drive, you haven’t taken up 1 MB.
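A daily job step for this can be as small as the following.  This is a minimal sketch; the certificate name, password, and W: paths are placeholders, not my actual demo values.

```sql
-- Sketch: back up a TDE certificate and its private key from master.
-- Certificate name, file paths, and password are placeholders.
USE master;
GO
BACKUP CERTIFICATE TDECert
TO FILE = 'W:\CertBackups\TDECert_20130910.cer'
WITH PRIVATE KEY
(
    FILE = 'W:\CertBackups\TDECert_20130910.key',
    ENCRYPTION BY PASSWORD = 'Str0ngP@ssw0rd!'
);
```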

As a DBA sometimes you can be a bit of a neat freak.  I don’t keep a year’s worth of backups on hand, so why would I keep a year’s worth of certificates on hand?  I’d like a process to automatically delete them and only keep the last two weeks, or month, on hand, whatever matches up with my backup retention policy.

The problem is the automated cleanup task doesn’t work.  Sure, you can go into the Maintenance Plan Wizard and make a cleanup task that looks in a directory for .CER files, but the true problem lies in the backup history.  You have to custom script out the certificates.  If you didn’t think to add a line to the backup set history table with the .cer and .key extensions and the path to your Private Key or Certificate backups, then the cleanup job won’t work.

Inserting records into the MSDB tables could work, but as a DBA new to TDE that thought hadn’t crossed my mind.  I wanted a way to back up my certificates and delete my old ones.  So I built one.

MY RUBE GOLDBERG MACHINE

This is a demo I do in my TDE presentation.  It’s up on my Resource Page and has been for some time.  Today I realized I’d never blogged about it.  My scripts heavily use xp_cmdshell.  I had an audit setting in my environment that wouldn’t allow that to be on my servers, so in this script I turn it on at the beginning and off at the end.  The nice thing about the script is that I unit tested it, and even if there is an error in the script the sp_configure settings are server-level commands that occur outside of transactions, so they run no matter what.  The script runs quickly, but it will make logged entries in the SQL Server Error Log stating that xp_cmdshell was turned on and off.  My audit team could live with this, so I was able to implement it.
I also like to use a private key and a password for my TDE encryption.  I don’t want the password sitting around in plain text in the job either, so I make a database called TDE.  In it I have one table called tdeKeys.  I put two columns in there: one is the name of the certificate that a private key will be created for, the other is the password to use for that private key.  In secure environments you could set up column-level encryption to ensure the password is not in plain text even in the table field.  The demo scripts I’m going to give you don’t use column-level encryption.  They contain a function that retrieves the password for the certificate name.
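A minimal sketch of that setup might look like this; the table, column, and function names are my placeholders here, not the exact demo script.

```sql
-- Sketch: password lookup storage for the certificate backup job.
-- Names and columns are placeholders for illustration.
CREATE DATABASE TDE;
GO
USE TDE;
GO
CREATE TABLE dbo.tdeKeys
(
    certificateName     sysname NOT NULL PRIMARY KEY,
    certificatePassword nvarchar(128) NOT NULL  -- consider column-level encryption here
);
GO
CREATE FUNCTION dbo.fn_GetCertPassword (@certificateName sysname)
RETURNS nvarchar(128)
AS
BEGIN
    DECLARE @password nvarchar(128);

    SELECT @password = certificatePassword
    FROM dbo.tdeKeys
    WHERE certificateName = @certificateName;

    RETURN @password;
END;
```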
Next we will create the dynamic script to back up the certificate.  Note that I back up the Master Key as well; if you are using column-level encryption you’ll want a copy of the Master Key.  You’ll need to specify the path to which you want to back up the certificates, and you’ll also need to specify the certificate name.
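The Master Key part of that step is a one-liner.  A sketch, with a placeholder path and password:

```sql
-- Sketch: back up the Database Master Key in master.
-- The file path and password are placeholders.
USE master;
GO
BACKUP MASTER KEY TO FILE = 'W:\CertBackups\MasterKey.key'
    ENCRYPTION BY PASSWORD = 'M@sterK3yP@ss!';
```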
Finally we will create the script that uses xp_cmdshell to traverse directories and delete our old backups.  You will need to edit the file path in this script and insert the Master Key and certificate names on line 74.  Then on line 103 you will need to alter the DATEADD function.  Right now it would only keep 4 days of certificates on hand; you’ll need to edit the DATEADD to match up with your backup retention policy.
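The delete step boils down to removing anything older than the retention window.  My script parses directory output with DATEADD logic, but the same idea can be sketched more compactly with the Windows forfiles utility; the path and the 4-day retention below are placeholders you would edit to match your policy.

```sql
-- Sketch: delete certificate and key backups older than 4 days
-- via xp_cmdshell and forfiles. Path and retention are placeholders.
DECLARE @cmd varchar(8000);

-- forfiles /d -4 matches files last modified more than 4 days ago
SET @cmd = 'forfiles /p "W:\CertBackups" /m *.cer /d -4 /c "cmd /c del @path"';
EXEC master..xp_cmdshell @cmd;

SET @cmd = 'forfiles /p "W:\CertBackups" /m *.key /d -4 /c "cmd /c del @path"';
EXEC master..xp_cmdshell @cmd;
```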

Want to see the whole presentation live?  I’ve done this for Pragmatic Works Training on the T’s, click Here to watch.  You’ve got to sign up for a Pragmatic Works account if you don’t already have one, and you’ll get free info on all the free training we do monthly!


OVERKILL

“So Balls”, you say, “Is this overkill?”
<soapbox>
Well Dear Reader it depends on your environment.  You must consider Level of Effort and Level of Response, LOE and LOR.
LOE is one part you, one part the hacker.  The more secure you make something, the less likely a hacker is to keep going after it, or the farther they will have to go.  On your part it is how far you are willing to go to do your job; we can all get dissuaded from going the extra mile sometimes.  Your LOE should be governed by your organization’s LOR.
LOR is the response that your organization will have to the event.  One thing I like to tell folks: if you are ever in the position that your security has been breached, and you are then talking to your boss, his/her boss, the CIO, a high-ranking officer in the military, or one or more high-ranking government officials, trust me when I say that you want to be able to say you took every last step possible to protect the data under your care.  The more detail you can provide the better.  So overkill?  Maybe.  CYA?  Absolutely.  Thankful that no fault on your part was found and you still have a job?  Yep.
Having been in this position, trust me, take the extra couple of steps; if you ever need it you’ll be glad you did.
</soapbox>
Thanks to Rick for hosting this today, and as always Thank You Dear Reader for stopping by!
Thanks,

Brad

Friday, August 16, 2013

24 Hours of PASS Q & A: Why are Bits Flipped?



http://bit.ly/16UyEPH
Hello Dear Reader.  Almost two weeks ago I delivered a presentation for the 24 Hours of PASS on SQL Server Data Internals.  I received some really great questions that have pushed me to dive deeper into my understanding of SQL and computers in general.  One of the questions was as follows:

Why are Bytes sometimes swapped and why do you have to flip bits in order to decode a bitmap?

At the time I took a WAG at the question, a Wild @$$ Guess.  I said that they were flipped because of a need to protect proprietary data.  I couldn’t have been more wrong.  Fortunately for me, my friend and co-worker Roger Wolter (@RWolter50 | Blog) was watching and was quick to let me know that the source of the byte swapping was not proprietary.  It was because of Endians.


“So Balls,” you say, “It was because of Endi-a-Whats?”

Don’t worry Dear Reader, I was right there with you.  This was a new term to me; some of you clever former Computer Science majors are already seeing a light bulb form.  For my sake humor me and pretend you haven’t already figured this out.  Some of you former Literature majors, or general readers of classic tales, are wondering what Gulliver’s Travels has to do with SQL Internals.

BIG AND LITTLE ENDIANS
Jack Black, Comedian, Singer, Dancer, ...Computer Scientist

In Jonathan Swift’s satirical novel Gulliver’s Travels, Gulliver ends up in a land called Lilliput.  Lilliput has hostilities with its neighbor Blefuscu.  You see, Lilliput likes to eat their hard-boiled eggs by cracking the little end, whereas Blefuscu likes to eat their eggs by cracking the big end.  Therefore the Lilliputians are known as Little-Endians and the Blefuscudians are Big-Endians.  Side-stepping Swift’s satirical play on societal issues, the term was picked up by Computer Science over 200 years later.

So what does this mean for Computers?  It is how we read data out of memory. 

We will be covering memory at a very high level, even though we are pretty deep.  Memory is one big array waiting to hold our data. 

What does that array hold?  Bytes.  How do we find where we are storing the bytes in our array?  We give it an address and we look up that memory address.

An address is not an Index, but for the correlation of how to look up memory data it is comparable to the way we store data on a Clustered Index page and then look that data up by its unique key.  So to make this an easy comparison for my SQL Family, let’s just say that an Address is our Index and how we will look up data in our array/table.

When we read data into memory there is a memory address assigned to the byte or bytes, depending on the chipset of the machine.  Big Endian processors store the data from left to right: the most significant byte goes in the smallest address.  Motorola and quite a few others use Big Endian.  x86 and x64 processors use Little Endian, and since SQL Server runs on x86 and x64 hardware we will focus mainly on that.

For example, take the value XSWB.  If we translate each letter to its one-byte ASCII code, written as a two-digit hex pair, we get X=58, S=53, W=57, B=42, or 58535742.  Each hex pair is one byte, the ASCII code that becomes the regular letter we see.  How would we store that in memory?  The example below is for 8-bit access.

*We will disprove the flipping of ASCII bytes here in a moment using a 64-bit access.  But what I want you to get from this is the concept.  More in a moment.  Also, here’s a really nice graphic from the Wikipedia entry on Endianness, well worth the read.


BIG ENDIAN

Address   Value
1000      58=X
1001      53=S
1002      57=W
1003      42=B


LITTLE ENDIAN

Address   Value
1000      42=B
1001      57=W
1002      53=S
1003      58=X


This behavior is left over from when 8 bit processors had 16 bit memory registers and it was more efficient to load the lower byte first.  If it was only an 8 bit operation then the top byte could be skipped.  Thanks to Roger for all the technical explanations, more on that to come.

Since ASCII characters are stored internally one byte per character, in order, there is no need for swapping to decode them.  When we get to large numbers, though, we can really see this at work within SQL Server.  For example, let’s use the following statement.

use master
go
if exists(select name from sys.databases where name='demoInternals')
begin
     drop database demoInternals
end
go
Create database demoInternals
go
use demoInternals
go
if exists(select name from sys.tables where name='brad1')
begin
     drop table brad1
end
go
create table brad1(mychar char(4) primary key clustered, myint int)
go
insert into brad1(mychar, myint)
values('XSWB', 12345678)
go

We’ll create the value we just looked at XSWB and an integer value of 12345678.  Now let’s do a DBCC IND, get our page number and look at the page dump.

dbcc ind('demointernals', 'brad1', 1)
go
dbcc page('demointernals', 1, 278, 3)


The ASCII doesn’t look byte-swapped, but the integers obviously are.  This led to another question that I asked Roger.  Ridiculously smart man that he is, he told me that ASCII characters do not need to be loaded into the registers in the arithmetic processors.  For that reason we do not have to swap their bytes.
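You can see the integer swap without reading a whole page dump.  A quick sketch; the byte order noted in the comment is what the DBCC PAGE dump of this row shows:

```sql
-- 12345678 decimal is 0x00BC614E in hex
SELECT CONVERT(varbinary(4), 12345678) AS LogicalHex;
-- In the DBCC PAGE dump those same bytes appear reversed,
-- 4e 61 bc 00, because x86/x64 stores integers little-endian.
```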

Thanks Roger for all the great info.  This was a lot of fun to learn.  Thank You Dear Reader for stopping by.

Thanks,


Brad