Tuesday, June 21, 2011

Thank You, I'm Speaking At PASS SUMMIT!



I just wanted to say a Big Thank You to you Dear Reader, and everyone that voted for my session!  I've been selected to speak at the 2011 PASS Summit on Compression.  It is a topic that I have really loved speaking and learning about, I'll be blogging more about the topic as I sharpen up all of my material for a Deep Dive!  Here is the abstract from my submission.

Page and Row Compression are powerful new tools. Page & Row with SQL 2008 RTM, and Page & Row with Unicode Compression with SQL 2008 R2. We can turn it on, we can turn it off, but we want more! What are Access Methods and how in the SQL Engine do they affect Compression? What are the “Gotchas” of Page Compression? How does Compression differ in the way it treats Non-Leaf & Leaf Level pages? What additional functionality did we get in DBCC Page, DMV’s, Perfmon Counters, and Extended Events to support our usage of Compression? How do complex query plans affect Compression? Come find out!


When I found out I wanted to run up and down the halls high-fiving people!  Was one of the thing's on my SQL Bucket List that I wanted to do.  Truthfully I wanted to make it TO the Summit, as I've never been to one. So the Prep is beginning and I'm super excited!

Thinking about going!  You should, and if you register by June 30th you'll recieve $600 off the full price!  Go sign Up Here.

Once again I just want to say a Big Thank You!

Thanks,

Brad

Thursday, June 16, 2011

SQL Rally Q & A Row Compression with Variable Legnth Unicode


This is another follow up on the Q & A’s that came about from SQL Rally, this question was asked by Jim Murphy (@SQLMurph | Blog).   This wasn’t so much of a question as it was a discussion during the presentation. 






I have Slide, see below, that lists the data types that can use Row Compression.  



Row Compression specifically takes all the empty space out of Fixed Length Data Types and stores them as efficiently as possible.  So in essence we take a fixed length field and treat it as if it has a variable size.

Jim very astutely pointed out that I had nvarchar listed as a Data Type that compresses even though it was not a Fixed Length Data Type and Jim wanted to know how Compression worked on an nvarchar Data Type.  To be perfectly honest I botched the answer.  But it led to a wonderful learning opportunity for me.

WHAT IN THE UNIVERSE IS UNI-CODE





The short answer is it is a Universal Character Set that allows for many non-English Characters and is governed by the International Standards ISO/IEC committee.  The Data Types in SQL Server that use Uni-Code are NCHAR & NVARCHAR and they use the UCS-2 character set.  UCS-2 allows exactly two bytes to represent each character.


So what in the wide wide world of sports does this have to do with Compression.  This means that when I declare an NCHAR with a 250 length that it is actually a 500 byte value.  When I type ‘Bradley Ball’ into an NVARCHAR data type instead of taking up 11 bytes of space it takes up 22.  So this encoding can utilize a lot of space that it will need when moving the data to the user, but not necessarily in Storage.


SQL 2008 R2 introduced Unicode Compression to SQL Server.  So to get back to the original question:

“Row Compression works by taking fixed length strings and Compressing out the unused white space from the record, NVARCH is a variable length string how does Row Compression work with a Variable Length Data Type?”


The short answer it strips out the unused extra encoding if there is any to strip out.  The long answer, TO THE DEMO MOBILE!

DUHNA DUHNA DUHNA DUHNA DEMO

Coolness is not Compressed for Batman

So Let’s start off by making our database.

USE master;
GO
IF EXISTS(SELECT name FROM sys.databases WHERE Name=N'demoCOMPRESSIONInternals')
     BEGIN
          DROP Database demoCOMPRESSIONInternals
     END
    
CREATE DATABASE demoCOMPRESSIONInternals
GO

USE demoCOMPRESSIONInternals
GO

Then we’ll create our table and populate it with a couple rows.

IF EXISTS(SELECT NAME FROM sys.tables WHERE name=N'myTable1')
BEGIN
     DROP TABLE dbo.myTable1
END

CREATE TABLE myTable1(
     myID INT IDENTITY(1,1)
     ,productName NCHAR(500) DEFAULT 'some product'
     ,productDescription NVARCHAR(1000) DEFAULT 'Product Description'
     ,PRIMARY KEY CLUSTERED(myID)     
) ; 

DECLARE @i INT
SET @i=0

WHILE (@i<5)
     BEGIN
          INSERT INTO dbo.myTable1(productName, productDescription)
          VALUES(
                   ('some product' + CAST((@i +1) AS VARCHAR(5)))
                   ,('Here is a Generic Product Description' + CAST((@i+2) AS VARCHAR(5)))
                   )
             

          SET @i = @i +1

     END

We’ll set on Trace flag 3604, and we’ll do a DBCC IND to get our page number, *Page Numbers will vary when you do this on your own.


DBCC IND(demoCOMPRESSIONInternals, 'mytable1', 1)
go

DBCC TRACEON(3604)
Go

And here are our pages.


Now a DBCC Page on page 153.

DBCC PAGE('demoCOMPRESSIONInternals', 1, 153, 3)
GO

Now let’s grab an edited look at our records, we mainly want to see the Length to each record in its pre-compressed state.

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

myID = 1                            

Slot 0 Column 2 Offset 0x8 Length 1000 Length (physical) 1000

productName = some product1                                                                                     
                                                                                                                 
                                                                                                                
                                                                                                                 
                                                                         

Slot 0 Column 3 Offset 0x3f7 Length 76 Length (physical) 76

productDescription = Here is a Generic Product Description2               

Slot 0 Offset 0x0 Length 0 Length (physical) 0

So we can see that even though Slot 0 Column 3 only contains 38 characters it’s length is 76.  You can see the doubling of the fixed length field as well looking at Slot 0 Column 2, even though it is an NCHAR(500) it has a length of 1000 bytes.  Now let’s apply Row Compression.


ALTER TABLE dbo.myTable1
REBUILD WITH (DATA_COMPRESSION=ROW)
go


We need to do another DBCC IND to get our newly rebuilt pages.

DBCC IND(demoCOMPRESSIONInternals, 'mytable1', 1)
go

Here are the new pages.



Now let’s do a DBCC Page on page 155 and let’s take a look at our data page and get the lengths.

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 1

myID = 1                             

Slot 0 Column 2 Offset 0xc Length 1000 Length (physical) 13

productName = some product1                                                                                     
                                                                                                                
                                                                                                                
                                                                                                                 
                                                                         

Slot 0 Column 3 Offset 0x19 Length 76 Length (physical) 39

productDescription = Here is a Generic Product Description2              

Slot 0 Offset 0x0 Length 0 Length (physical) 0

We see that the length is reduced greatly.   Slot 0 column 1 going from 4 bytes down to 1.  We see that Slot 2 went from 1000 Characters down to 13.

But the answer to the question lies in column 3 which went from 76 bytes down to 39.  When your nvarchar strings get to be even longer that get’s to be more space savings.  We did a nvarchar 1000.  If we maxed out that value it would be 2000 characters.

So let’s do that, we’ll update our values to max out our strings.

DECLARE @i INT
SET @i=0

WHILE (@i<5)
BEGIN
     UPDATE dbo.myTable1
     SET productname = (REPLICATE('a', 499) + CAST(@i AS VARCHAR(1)))
     WHERE myID=@i

     UPDATE dbo.myTable1
     SET productDescription = (REPLICATE('b', 999) + CAST(@i AS VARCHAR(1)))
     WHERE myID=@i

     SET @i=@i+1
END

And now we’ll take a look at page 155 again.
DBCC PAGE('demoCOMPRESSIONInternals', 1, 155, 3)
go
And what are the lengths of our slots?
Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 1

myID = 1                            

Slot 0 Column 2 Offset 0xc Length 1000 Length (physical) 501

productName = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1           

Slot 0 Column 3 Offset 0x201 Length 2000 Length (physical) 1001

productDescription = bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbb1                                

Slot 0 Offset 0x0 Length 0 Length (physical) 0

WOW, so even when we max out our fields, Row Compression still provides us with savings.  Slot 0 Column 2 is 1000 bytes in length, but it’s physical storage is only 501 bytes, and Slot 0 Column 3 our variable length field is physically 2000 bytes but is stored in 1001 bytes.

I’m sure Dear Reader you have noticed that instead of an even 1000 it is 1001, and I would love to tell you that I know why, but I don’t.  I suspect that the 1 byte is some sort of internal byte that contains the information needed for a Unicode Character to be translated by the access methods after compression.  But your guess would be as good as mine.

Thanks,

Brad


Wednesday, June 15, 2011

Should Denali work on Windows XP


Microsoft release a list of Operating Systems (OS) and SQL Version Upgrade Paths that will be compatible with SQL Denali, SQL 11, aka SQL Next or whatever it will be called,  the next version of SQL.

1) The current support matrix for OSes is as follows:
·    Windows Vista SP2 or later
·    Windows Server 2008 SP2 or later
·    Windows Server 2008 R2 SP1 or later
·    Windows 7 SP1 or later
2) Denali will support upgrading from these SQL Server versions:
·    SQL Server 2005 SP4 or later
·    SQL Server 2008 SP2 or later
·    SQL Server 2008 R2 SP1 or later
3) The installer is going to block installation on unsupported OSes and it will block unsupported upgrade paths.


“So Balls”, You say, “What’s your opinion?”
Well Dear Reader I’ve got a couple thoughts I’ll give those and then I’ll give you the post I put in place.
IT’S ALL ABOUT THE CUSTOMERS

No matter what industry you are in you serve a customer.  Every year Microsoft has a lot of people whose sole job is to find out how many Microsoft Licenses you company has and how many it needs to have.  They work out Licensing deals and companies pay money.  When this occurs Microsoft get’s an inventory of sorts that let’s them know what people are using. 

I don’t know how granular that gets, I have no access to that information, I can only speak for myself.  Every job I’ve had since graduating college or while I was still in college has been using computers that ran Windows XP.  It has been around a LONG time.

So I would hope that the numbers Microsoft get’s would tell them what Percentage of customers still running XP.  You would then assume a natural upgrade trend, whatever the average is they see in their analysis.

I would then hope that they look at the number of hours it would take to make Denali work on XP or any other OS, based off the amount of work, the talent and cost of the team (and there is some serious Talent working for them) vs. the Revenues they project to get from licensing. 

Then I would look at the number of shops that run a lot of SQL Server and what their XP breakdown is.  If it doesn’t make Dollars and Sense they shouldn’t do it.

MAYBE THEY SHOULDN’T
“But Balls,” You say, “Earlier you told us you would like to see it support XP?   WTH?”

Ahh Dear Reader, I would like it.  It would make it easier for me to demo at work, and sell to management.  Which would work out to some of those Licensees that I talked about up top.  I’ve Windows 7 at home, and I’ll install it and learn the heck out of it there. 

But what is right for me might not be right for Microsoft.  Or it could be, it would all depend on where their customers are.

Even here at work, there is an upgrade strategy that will eventually be put into place, but that may be months or a year or more from now, those decisions are currently above my pay grade.

So finally here is what I wrote.  In the end it is up to Microsoft.  And as they are with the Clouds so am I with them, when it comes to Microsoft “I’m All IN!”.
Hi Dan,
   I'm going to disagree with the crowd, I think support for XP should be put in place.  I'm on Windows 7 at home, but work is still on XP slowly moving to 7. 

For some of us IT is a fight, it is a fight to get our companies to adopt the latest and greatest, and sometimes the big ships turn slowly.  This would certainly hurt my ability to sell the technology internally and I would assume I'm not alone in this.

As a SQL DBA I feel by default I'm a Microsoft Advocate.  In IT there are 50 ways you can do anything, I choose to do those 50 things using Microsoft and predominately SQL Server.  I Advocate moving to new technology and using the latest and greatest.  But it was a fight to use Powershell 2.0, Mirroring, 2008 R2, and Policy Based Management. 

I won't stop Advocating but it will make my fight more of an uphill battle if I cannot install on XP.

Thanks,

Brad