Wednesday, June 8, 2011

HEAPS & COMPRESSION

Compression did not work out so well for Bessy

Heaps are really interesting animals when it comes to Compression.  They start out having a lot in common with Clustered Indexes, but they stop just after creation.

  You can apply compression to a Heap or Clustered Index upon creation, or you can apply compression to a Heap or Clustered Index using a Rebuild Operation and specifying the type of compression you would like to apply.

After that the behaviors diverge.  When a Heap is rebuilt and Page Compression is Applied, even if you only have enough data to occupy one page, two will be provisioned. 

You will always get One new page that does not have any data on it.  In the case of our table with only Two pages, Page One will contain your data in its Page Compressed format, as long as an internal algorithm can determine that you would save enough space to fully apply Page compression, and subsequent inserts will be in a row Compressed format.  Page Two will only be row compressed until it is rebuilt.  But don’t take my word for it let’s prove this out.

First I’m going to create my database and my table that we’ll be using for this demo.

USE master
GO
IF EXISTS(SELECT * FROM sys.databases WHERE name='demoCompressionINTERNALS')
BEGIN
    DROP DATABASE demoCompressionINTERNALS
END
CREATE DATABASE demoCompressionINTERNALS
GO
USE demoCompressionINTERNALS
GO
/*
Create a Heap Table
*/
IF EXISTS(SELECT * FROM sys.tables WHERE name='heap1')
BEGIN
    DROP TABLE dbo.heap1
END
CREATE TABLE heap1(
    myID int IDENTITY(1,1)
    ,myChar1 CHAR(500) default 'some text'
    ,myChar2 Char(1000) default 'more text for a larger field'
    );
GO

Our table has 1504 bytes per record, so we’ll insert 5 records to fill up our page.  Let’s run our script to populate the page with exactly 5 records.

DECLARE @i INT
SET @i=0

WHILE (@i<5)
    BEGIN
         INSERT INTO dbo.heap1(myChar1, myChar2)
         VALUES(
                 ('some product' + CAST((@i +1) AS VARCHAR(1)))
                 ,('Here is a Generic Product Description' + CAST((@i+2) AS VARCHAR(1)))
                 )
         SET @i = @i +1
    END

We’ll set trace flag 3604 so we can get the output of DBCC Page to SSMS, we’ll run DBCC IND to get  our pages, and finally let’s run DBCC Page on our data page and see the internal contents. 

DBCC TRACEON(3604)
GO
DBCC IND(demoCompressionINTERNALS, 'heap1',1)

We’ll Go ahead and run that.




You can see that I get back 2 pages 146 & 145.  Looking at the picture I’ve highlighted the PageType field.  Page Type 1 is a Data Page, and that is the type we are interested in.  Page 10 is an IAM page.  We want to look at our Data Page, so we’ll take page 145 and look at it.  For the sake of space I’m going to hop down to the first record on the page after the hex dump.  We’ll look at Slot 0.

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

myID = 1                            

Slot 0 Column 2 Offset 0x8 Length 500 Length (physical) 500

myChar1 = some product1                                           

Slot 0 Column 3 Offset 0x1fc Length 1000 Length (physical) 1000

myChar2 = Here is a Generic Product Description2 

We are looking at our un-compressed Record.  Page Compression has three parts to it that are applied in the following order, Row Compression, Column Prefix Compression, and Page Dictionary Compression.

So first let’s apply Row Compression and see our space savings.

ALTER TABLE dbo.heap1
REBUILD WITH (DATA_COMPRESSION=ROW)
GO

We need to re-run DBCC IND & DBCC Page as our Rebuild operation will cause our page numbers to change.

DBCC IND(demoCOMPRESSIONInternals, 'heap1', 1)
GO



As you can see our page numbers changed, and we now have our second page has been allocated.  If we look at page 160 our data is on that page, and 147 is blank.  So let’s look at Slot 0

DBCC PAGE('demoCOMPRESSIONInternals', 1, 160, 3)
GO
Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 1

myID = 1                            

Slot 0 Column 2 Offset 0xc Length 500 Length (physical) 13

myChar1 = some product1                                                                                         
           
Slot 0 Column 3 Offset 0x19 Length 1000 Length (physical) 38

myChar2 = Here is a Generic Product Description2



Row Compression did exactly what it was supposed to, it took the fixed length strings and stored only the space required.  So if we max out our columns we see that our sizes mirror what it was before row compression.

DECLARE @i INT
SET @i=0

WHILE (@i<5)
BEGIN
    UPDATE dbo.heap1
    SET myChar1 = (REPLICATE('a', 499) + CAST(@i AS VARCHAR(1)))
    WHERE myID=@i

    UPDATE dbo.heap1
    SET myChar2 = (REPLICATE('b', 999) + CAST(@i AS VARCHAR(1)))
    WHERE myID=@i

    SET @i=@i+1
END
DBCC PAGE('demoCOMPRESSIONInternals', 1, 160, 3)
GO

We see that our data has grown, and that the physical size now matches the logical size again, everything except our ID column.

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 1

myID = 1                            

Slot 0 Column 2 Offset 0xc Length 500 Length (physical) 500

myChar1 = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1               

Slot 0 Column 3 Offset 0x200 Length 1000 Length (physical) 1000

myChar2 = bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb1

Let’s add some more data so we have more than 1 page worth of data.  We will need this data in order to get Page Compression to apply.

DECLARE @i INT, @myID INT
SET @i=0

WHILE (@i<10)
BEGIN
         SET @myID=(SELECT (MAX(myid)+1) FROM dbo.heap1)


         INSERT INTO dbo.heap1(myChar1, myChar2)
         VALUES(
                  (REPLICATE('a', (250- LEN(@myID))) + CAST(@myID AS VARCHAR(3)))
                  ,(REPLICATE('b', (500- LEN(@myID))) + CAST(@myID AS VARCHAR(3)))
             )

    SET @i=@i+1
END

 If we take a look at our two pages, 160 & 140 we see that they are both populated with rows.

DBCC IND(demoCOMPRESSIONInternals, 'heap1', 1)
GO
DBCC PAGE('demoCOMPRESSIONInternals', 1, 160, 3)
GO

DBCC PAGE('demoCOMPRESSIONInternals', 1, 147, 3)
GO

So now we will apply Page Compression.  We will need to run DBCC IND & DBCC Page again, as this will rebuild our pages.

ALTER TABLE dbo.heap1
REBUILD WITH (DATA_COMPRESSION=Page)
GO
DBCC IND(demoCOMPRESSIONInternals, 'heap1', 1)
GO 
DBCC PAGE('demoCOMPRESSIONInternals', 1, 200, 3)
GO


As you can see we have our pages again.  And after running DBCC Page we’ll see that we have a CI, or Compression Information Header Section.  This is where Page Compression will store the CI that it uses for Column Prefix and Page Dictionary Compression.

 Just because the ID of 6 appears in our data doesn’t mean we are going to Slot 5 (remember zero based array’s).  This is a heap and our data is not sorted off of the ID column like it is with a Clustered Index.  Slot 0 contains our record with an ID of 6 and that corresponds with our CI information.  This will be our most compressed record.  I’ll go down further and grab Slot 6 for us to use as a comparison.

* Note the difference in the record lengths in the Compression Data Array, CD array entry. I’ll highlight those in yellow.

CD array entry = Column 1 (cluster 0, CD array offset 0): 0x02 (ONE_BYTE_SHORT)
CD array entry = Column 2 (cluster 0, CD array offset 0): 0x01 (EMPTY)   
CD array entry = Column 3 (cluster 0, CD array offset 1): 0x01 (EMPTY)   
Record Memory Dump

000000001443A35E:   01031211 86000000 00†††††††††††††††††....†....       

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 1

myID = 6                             

Slot 0 Column 2 Offset 0x0 Length 500 Length (physical) 0

myChar1 = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa6                                                                              
                                                                                                                 
                                                                         

Slot 0 Column 3 Offset 0x0 Length 1000 Length (physical) 0

myChar2 = bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb6 

Slot 6 Offset 0x3a2 Length 13

Record Type = (COMPRESSED) PRIMARY_RECORD                                 Record size = 13

CD Array

CD array entry = Column 1 (cluster 0, CD array offset 0): 0x02 (ONE_BYTE_SHORT)
CD array entry = Column 2 (cluster 0, CD array offset 0): 0x05 (FOUR_BYTE_SHORT)
CD array entry = Column 3 (cluster 0, CD array offset 1): 0x05 (FOUR_BYTE_SHORT)

Record Memory Dump

000000001443A3A2:   01035215 8c80f831 3281f231 32††††††††..R.Å’.ø12.ò12   

Slot 6 Column 1 Offset 0x4 Length 4 Length (physical) 1

myID = 12                            

Slot 6 Column 2 Offset 0x5 Length 500 Length (physical) 4

myChar1 = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa12                                                                              
                                                                                                                 
                                                                         

Slot 6 Column 3 Offset 0x9 Length 1000 Length (physical) 4

myChar2 = bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb12

So when we look at our other page we will see that there are no records on it.  Go do a DBCC Page and take a look just to confirm.

DBCC PAGE('demoCOMPRESSIONInternals', 1, 155, 3)
GO

So let’s insert a couple more records, we should have plenty of room to fin on this page right?

DECLARE @i INT, @myID INT
SET @i=0

WHILE (@i<10)
BEGIN
         SET @myID=(SELECT (MAX(myid)+1) FROM dbo.heap1)


         INSERT INTO dbo.heap1(myChar1, myChar2)
         VALUES(
                  (REPLICATE('a', (250- LEN(@myID))) + CAST(@myID AS VARCHAR(3)))
                  ,(REPLICATE('b', (500- LEN(@myID))) + CAST(@myID AS VARCHAR(3)))
             )

    SET @i=@i+1
END

Now take another look at Page 155, you’ll see that these rows take up around 762 bytes each, and that and are not in a Page Compressed format, this looks more like a Row Compressed format and it is.

Slot 0 Offset 0x60 Length 762

Record Type = (COMPRESSED) PRIMARY_RECORD                                 Record attributes =  LONG DATA REGION
Record size = 762                   
CD Array

CD array entry = Column 1 (cluster 0, CD array offset 0): 0x02 (ONE_BYTE_SHORT)
CD array entry = Column 2 (cluster 0, CD array offset 0): 0x0a (LONG)    
CD array entry = Column 3 (cluster 0, CD array offset 1): 0x0a (LONG)    

“But Balls,” you say, “I thought we enabled Page Compression?”  Ahh Dear Reader, we did.  However, Heaps do not automatically apply Page Compression when data is added to a page.   There are 3 exceptions to this
  1.  When data is inserted into a heap using Bulk Insert
  2. When data is inserted into and a full table lock is placed on the table.
  3. When a heap is Rebuilt specifying the with (Data_Compression) option, as we did above.


So when you use Page Compression together with Heaps keep their behavior in mind so you don’t encounter any surprises.  Good luck and Happy Compressing!

Thanks,

Brad

Wednesday, June 1, 2011

Looking For A Job: The Dreaded Money Conversation



A couple of days on Twitter I had an exchange with some fellow Twitterer’s (sp?).  One of them had re-tweeted someone, repeated what someone had posted, with regards to how an interviewee was obviously not a good fit because the first question they had asked was what was the Salary Range for the position and they were obviously just in it for the money.

I took exception to this.  Having recently changed jobs within the last year, and being the interviewee & interviewer plenty of times in the past, the importance of the money conversation was still fresh in my mind.

“So Balls,” you say, “it’s all about the money?”


Glad you asked Dear Reader, you need to have more that moves and motivates you than money, but it is important never to underestimate its importance.   I love what I do, but if I could not support my family and do it, I’d do something else.  Family is first to me, and that means money is pretty important when I look at taking a job.

SO YOU’VE DECIDED TO LOOK FOR A JOB

Maybe you’re under paid, maybe you’re overworked and underappreciated, or maybe you’re looking for better opportunities.  There is nothing wrong with any of that.  When you go on Monster, Dice, or [insert job board of choice] and you start looking, eventually, you will probably type in your position and sort by Salary.

This is a good thing to do, you should look at what the posted range is out there.  You should probably go on Salary.com and look at what your position range is for your area.  You should get an idea of what you are worth.  Perhaps you’ve been with your company so long that you are getting paid more than the average person in your area.  Perhaps you are way underpaid.  This knowledge will help you decide what you are looking for and what you should expect to get.


I LOVE THIS JOB BUT WHERE’S THE MONEY

So eventually you find a post, get an email with a job description, or a recruiter contacts you with an opportunity.   You’re psyched, you’re stoked, you’re…….wondering where the salary description is.  Did I miss it?  Did they forget to put it in the information they sent?  You look at the numbers description the averages for your area and you HOPE that this job will come with the salary you want.

This is the point where I will say shame on you COMPANY X or shame on you RECRUITER, not that they are bad people but they are playing the game.  Everyone wants to get good people for the BEST DEAL possible.  You want money from them, they want you to kick butt, take names, and work cheap.  So you have to play the game as well.

The money talk is uncomfortable, we don’t like asking about it.  If you are working with a good recruiter they will share this information with you. 

“But Balls”, you say “I feel a little uncomfortable about asking, how do I do it”



 

SHOW ME THE MONEY



You may be tempted to yell show me the money but that would be the wrong approach.  There are a couple approaches.  If there are questions that you have about the job or the position in an email or a phone call put those up front. 

I’m the kind of guy that I won’t be turned off from a Candidate if money is the first question, but most people aren’t like that, they want to believe in their heart of hearts that you want this jobs for bigger reasons than the money.   So let’s use a little etiquette.

  1.  Make sure and research the company if they have a website visit it, and mention how you are interested to work for this company (And I hope you are!)
  2. Ask a question about the job, or ask to clarify anything that seems ambiguous to you
  3. Politely point out that the salary and salary range information was missing and state that as excited as you are about this opportunity you wouldn’t want to waste the Companies time or yours perusing a position that isn’t right for either of you.

2.       

It’s all about the manners, I mean after all you’re excited for this job, you do want it, but the money does need to be there.  If you ask the right way they will tell you, if they are good and upfront company.

Occasionally you’ll get a response along the lines of “what are you looking for salary range?”, they are being purposely vague.  It’s not bad, but it is someone who is trying to cleverly find out what you want without showing you their cards.  I would respond with the following:

“ a salary range that is fair for my position and experience and allows room to grow and continue to support my family, here is what I’ve seen is the typical range in this area, insert numbers from what your research shows, can you tell me what the specific range for this position is?”


DOING WHAT YOU LOVE FOR ALL THE RIGHT REASONS



We all have somebody to take care of, ourselves or our family and it is hard to work at a job and not enjoy the labors.  We all start at the bottom, it is rare that someone is so brilliant they go straight to the top.  But finding the right job for you is the mixture of a lot of good things, the people around you, the team you work on, your manager, and yes the money.

Money is very important and it should not be over looked because people are afraid of being greedy.  Remember it's not greedy to feed your family, pay your bills on time, or earn an good wage.  Don't be afraid Dear Reader ask when it is important.

The fact is it is a bit of a dance, but if you have to dace be as polite and direct as possible, the end goal is to find a job, get the money, and do the you complete me speech in the living room work happily ever after.

Thanks,

Brad