Monday, August 6, 2012

How to Find Bad Page Splits


http://www.flickr.com/photos/brickapolis/5474835288/


I was working with a friend last week, doing SQL related teaching and tune ups.  Earlier this week I received an email from them regarding unusually high page splits they were seeing after deploying new code, upwards of 800 per second!

This sounds really bad at first glance, but is it?  It depends on the type of Page Split.

This is an excellent opportunity for us to dig into together.  We need to look at how active the database system is and determine what type of Page Splits are occurring to see whether this is bad or not.  The new code release may play into this, but then again maybe it doesn’t.

“So Balls,” you say, “What’s a Page Split?”

Great Question Dear Reader! And away we go.

DOING THE SPLITS
http://www.flickr.com/photos/kathy1243/5570985183/


First let’s talk about the setting.  Your table is either a Heap or a Clustered Index.  Page Splits can only occur in Indexes; Heaps get something called Forwarding Pointers instead.  A Heap, however, can have Non-Clustered Indexes, and Non-Clustered Indexes can get Page Splits at their leaf level.  So regardless of the base table type, only Indexes can get Page Splits.
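As an aside, if you want to see Forwarding Pointers on a Heap, sys.dm_db_index_physical_stats exposes a forwarded_record_count column when run in DETAILED mode.  A quick sketch; the table name dbo.MyHeap is just a placeholder for one of your own heaps:

```sql
-- Forwarded record count for a heap (index_id = 0); DETAILED mode is required
-- dbo.MyHeap is a placeholder name, substitute your own heap table
SELECT
     OBJECT_NAME(object_id) AS TableName
     ,forwarded_record_count
     ,avg_fragmentation_in_percent
FROM
     sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.MyHeap'), 0, NULL, 'DETAILED')
```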

Page Splits are a natural consequence of data growth.  Some cause fragmentation in indexes; we want to track those and figure out a way to keep them from happening, because fragmentation can lead to poor query performance.

A page split can occur during an Update or an Insert statement.  There are 3 scenarios that can cause Page Splits: Sequential Inserts, Non Sequential Inserts, and Updates.

“So Balls”, you ask, “Why does it matter if it is an Insert or an Update, Sequential or Non Sequential?”

Again with the great questions Dear Reader, well played.

The way a Sequential, End of Page, Split operation is performed is that the leaf page becomes full, and you go to insert one more record.  When the next record is inserted on the full page, a Split operation is performed and the data is placed on a newly allocated page.  The PFS (Page Free Space) allocation bytemap flips bits indicating that the old page is full, as well as bits for our new page, and the table's structures are updated to reflect that the new page has been created.  This is not a Split operation that we would want to track, because it does not cause Index Fragmentation.
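To picture the sequential case, here's a minimal sketch: with an ever-increasing IDENTITY clustered key, every insert lands at the end of the last page, so full pages stay full and new pages are simply allocated behind them.  The table name here is made up for illustration:

```sql
-- Sequential inserts: the IDENTITY clustered key always appends to the last page,
-- so full pages stay full and new pages are allocated at the end -- no fragmentation
CREATE TABLE dbo.seqDemo
     (myid INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
     ,filler CHAR(2000) NOT NULL DEFAULT 'A')
GO
INSERT INTO dbo.seqDemo DEFAULT VALUES
GO 100  -- SSMS batch separator shorthand: repeats the insert 100 times
```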

The way a Non Sequential Insert, or Mid Page Split, operation is performed is almost the same as an Update.  The leaf level page is already full.  An Insert operation occurs on the page, and records are moved off the page to make room for that Non Sequential Insert.  This is a Split operation that we would want to track.

The way a Split operation is performed on an Update is that a page is already full of data, one of the records on the page is updated, and as the data grows it pushes another record off the page.  This is a Split operation that we would want to track.
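A quick sketch of the Update scenario, using a made-up table: fill a page with rows holding a small VARCHAR value, then grow one of them so the page can no longer hold everything:

```sql
-- Rows start small, then an UPDATE grows one -- the page is full, so
-- roughly half its rows are moved to a new page: a Mid Page Split
CREATE TABLE dbo.updDemo
     (myid INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
     ,vardata VARCHAR(4000) NOT NULL DEFAULT REPLICATE('A', 1900))
GO
INSERT INTO dbo.updDemo DEFAULT VALUES
GO 4
-- Four ~1900-byte rows nearly fill the page; growing row 2 forces a split
UPDATE dbo.updDemo SET vardata = REPLICATE('B', 4000) WHERE myid = 2
```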

These Inserts and Updates not only affect the pages they are on, but could have a cascading effect onto other pages if those are all full as well.

The update could cause multiple pages to be altered, and changes to our Clustered and Non-Clustered Indexes; this translates to I/O which will cost us CPU cycles.  Our Indexes will point queries in the right direction, but we could cause fragmentation and out-of-date statistics that could lead to a less than optimal plan being selected.  So most Inserts are not bad, but Inserts and Updates that cause excessive Page Splits should be examined.

DETECTING THE SPLITS

So the question becomes how do we track these Mid Page Splits and separate them out from the End Page Splits?  The lucky ones using SQL 2012 have the Extended Event sqlserver.transaction_log that they can use.  Jonathan Kehayias (@SQLPoolboy|Blog) wrote a great blog about this earlier this year; to read more on it click here.
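If you are on 2012, a session along these lines captures just the split-related log records.  This is a sketch modeled on Jonathan's approach; the operation code 11 for LOP_DELETE_SPLIT and the database_id value are assumptions you should verify against sys.dm_xe_map_values and DB_ID() on your own instance before trusting them:

```sql
-- Sketch of an Extended Events session for Mid Page Splits (SQL 2012+).
-- Verify the LOP_DELETE_SPLIT operation code on your build first:
-- SELECT * FROM sys.dm_xe_map_values WHERE map_value = 'LOP_DELETE_SPLIT'
CREATE EVENT SESSION [TrackPageSplits] ON SERVER
ADD EVENT sqlserver.transaction_log(
     WHERE operation = 11   -- assumed code for LOP_DELETE_SPLIT, confirm it
       AND database_id = 5  -- placeholder: your database's DB_ID()
)
ADD TARGET package0.histogram(
     SET filtering_event_name = 'sqlserver.transaction_log',
         source_type = 0,
         source = 'alloc_unit_id')
GO
ALTER EVENT SESSION [TrackPageSplits] ON SERVER STATE = START
```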

But what if you’re not on SQL 2012?  We have the Transaction Log, which has recorded this same event.  LOP_DELETE_SPLIT is the logged operation that tells us a Mid Page Split has occurred; specifically it marks that rows were deleted from a page as a result of a split.  Using the undocumented function fn_dblog we can query the Transaction Log to find these events.


Jonathan provides a great script for producing Mid Page Splits, and I recommend reading his blog; even if you’re not on 2012 it will give you a good understanding of the logged event, as well as a practice script to use.  Just run the same table scripts in SQL 2005, 2008, or 2008 R2.  If you use 2005 you’ll have to change the DATETIME2 values in Jonathan’s script to DATETIME values…or at least something other than DATETIME2.

Once you have created some Mid Page Splits run the following query.
SELECT
     COUNT(1) AS NumberOfSplits
     ,AllocUnitName
     ,Context
FROM
     fn_dblog(NULL,NULL)
WHERE
     Operation = 'LOP_DELETE_SPLIT'
GROUP BY
     AllocUnitName, Context
ORDER BY
     NumberOfSplits DESC

Now you have the total number of operations and the full object name of the Index that is the cause of the Mid Page Splits.  Now you can go rebuild that Index with a lower Fill Factor to see if that keeps the splits from occurring.
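For example, once the AllocUnitName points you at an index, a rebuild might look like this; the index and table names are placeholders:

```sql
-- FILLFACTOR = 80 leaves 20% free space on each leaf page so
-- non-sequential inserts have room to land without splitting the page
ALTER INDEX ix_myIndex ON dbo.myTable
REBUILD WITH (FILLFACTOR = 80)
```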

Keep in mind that the LOP_DELETE_SPLIT records will only stay in your transaction log as long as the VLFs, Virtual Log Files (the internal components of the transaction log), are not overwritten by new transactions.

For best results you could place this script in a SQL Agent job, run it a couple times an hour, and log the results to a table.  That way you can analyze which indexes get Mid Page Splits, and check them against any nightly re-index jobs to see the level of fragmentation on those same indexes.
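A rough sketch of what that Agent job step could run; the logging table name and shape are just one way to do it:

```sql
-- One-time setup: a small logging table for the Agent job to write to
CREATE TABLE dbo.PageSplitLog
     (CaptureTime DATETIME NOT NULL DEFAULT GETDATE()
     ,AllocUnitName NVARCHAR(512)
     ,Context NVARCHAR(60)
     ,NumberOfSplits INT)
GO
-- Agent job step: snapshot the splits still visible in the transaction log
INSERT INTO dbo.PageSplitLog (AllocUnitName, Context, NumberOfSplits)
SELECT
     AllocUnitName
     ,Context
     ,COUNT(1)
FROM
     fn_dblog(NULL,NULL)
WHERE
     Operation = 'LOP_DELETE_SPLIT'
GROUP BY
     AllocUnitName, Context
```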

If you don’t have nightly index jobs that log their work, I’d recommend grabbing Ola Hallengren’s maintenance solution; click here to visit his site.

END PAGE SPLITS, OR NEEDLE I GIVE YOU THE HAYSTACK


http://www.flickr.com/photos/gymru1958/3418449455/


So we know how to find the Mid Page Splits, but what about the End Page Splits?  After all, you don't want to know one without the other, right?  Well Dear Reader, like most things Microsoft, finding the first one is easy; the second, not so much.

You would think that there is a logged event that signifies an End of Page Split.  If there is, I didn't find it.  But there are other fields in the fn_dblog function that we can use to find our End Page Splits.  When you do a SELECT * FROM fn_dblog(NULL, NULL) you get a lot of info back.  Parsing it can seem daunting, and if we were left with only our Context field we would truly be up the river.

However, if you look at the field [Transaction Name] you will find a value called 'SplitPage'.  Adding the Description field will get you more detail.  But let's take this a bit at a time.  First we'll create a database and a table, and insert just enough data that we get 1 page split.  Then we'll use the undocumented DBCC IND command to get a full list of page numbers for our table.


IF EXISTS(SELECT name FROM sys.databases WHERE name='endPageSplits')
BEGIN
     ALTER DATABASE endPageSplits SET SINGLE_USER WITH ROLLBACK IMMEDIATE
     DROP DATABASE endPageSplits
END
GO
CREATE DATABASE endPageSplits
GO
USE endPageSplits
GO
IF EXISTS(SELECT NAME FROM SYS.TABLES WHERE NAME='endSplits')
BEGIN
     DROP TABLE dbo.endSplits
END
GO
CREATE TABLE endSplits(
     myid INT IDENTITY(1,1),
     mychar CHAR(2000) NOT NULL DEFAULT 'A',
     CONSTRAINT pk_myid_endsplits PRIMARY KEY CLUSTERED(myid)
)
GO

DECLARE @i INT

SET @i=0

WHILE(@i < 5) -- CHAR(2000) rows fit 4 to a page, so 5 inserts force a page split
BEGIN
     SET @i=@i+1

     INSERT INTO dbo.endSplits
     DEFAULT VALUES
END

DBCC IND(endPageSplits, 'endSplits', 1)
GO




We see that our Allocation Page (PageType=10) is page 154, our First Data Page (PageType=1) is 153, and our Second Data Page (PageType=1) is 155; we also get an Index Page (PageType=2).  Now let's do a query on fn_dblog looking for [Transaction Name]='SplitPage'.

For this example we'll also add in the Description field; we need to get everything that is LIKE 'Changed type HOBT_ROOT (0)%'.  This filter is only good for this particular example: since we have a small clustered index we'll cause a root Index page to be created, and that will give us, in Hexadecimal, the values of the pages involved in the split.




SELECT
            [Transaction Name]
            ,Description
            ,*
FROM
            FN_DBLOG(NULL,NULL)
WHERE
            [Transaction Name]='SplitPage'
            or Description like 'Changed type HOBT_ROOT (0)%'




The Changed type HOBT_ROOT filter will get us more than one row.  We want to look at rows 2 & 3.  Row 3 is our SplitPage logged event.  Row 2 contains data in the Description that is directly related to our page split.  I've pasted its contents below.


Changed type HOBT_ROOT (0) for AU 72057594039697408 in rowset 72057594038779904 from page 0001:00000099 to page 0001:0000009b

These are Hexadecimal values for the page numbers in our SplitPage transaction.  If you convert the Hex value 0x99 you get the decimal value 153.  If you convert the Hex value 0x9b you get the decimal value 155.  And there we go: confirmation that we split from Page 153 to Page 155.
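You don't even need the website for this part; T-SQL can do the conversion itself:

```sql
-- 0x99 and 0x9b are the page numbers from the HOBT_ROOT description above
SELECT CONVERT(INT, 0x00000099) AS OldPage  -- returns 153
      ,CONVERT(INT, 0x0000009b) AS NewPage  -- returns 155
```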

The operation itself was at the Root level of the index, and depending on your index size and the split that occurs, the logged operation can change.  I'm simplifying the process quite a bit, but from this example we see that we had a SplitPage operation, and through the Description we could validate which page split to which.


These are logged operations that we don't need to single out, but I wanted to cover them in case you were curious.  These operations mean our table is growing and data is being added, and you can track that many better ways than scouring the transaction log for events and descriptions.

Good luck finding Mid Page Splits and as always thanks for stopping by and reading.

Thanks,

Brad

Monday, July 9, 2012

SQL Internals Reading Data Records Part 6: Variable Length Data




Today we will be discussing what that array was pointing to: the Variable Length Data.  The nice part is this should be easy going for you by now, because it is very similar to what we did in Part 3.  We are reading data.


One of the things that we will do today is use what we learned last time to bring it all home.  So without fanfare or ado, let’s dive right into it.


VARIABLE LENGTH DATA
First let’s update our chart so we know what part of the Data Record we are tackling.  Once again these images come by way of Paul Randal (@PaulRandal | Blog) and the MCM Video series on Data Structures and the good people from Microsoft.


The variable length columns are the last bit of data that we need to read out of a regular data record.  The offset array that we looked at yesterday gives us a value for the end of each variable length column, so that we can find where each one stops and read the data out.

We’ll use the same code we’ve been using for the last couple of days, but in case you missed it, here it is.  First we’ll create our table and insert a record.
IF EXISTS(SELECT NAME FROM sys.tables WHERE NAME='dataRecord2')
BEGIN
       DROP TABLE dataRecord2
END
GO
CREATE TABLE dataRecord2
                     (myID INT
                     ,myfixedData CHAR(4)
                     ,myVarData1 VARCHAR(6)
                     ,myVarData2 VARCHAR(9)
                     ,myVarData3 VARCHAR(6)
                     )

GO
INSERT INTO dataRecord2(myID, myfixedData, myVarData1, myVarData2, myVarData3)
VALUES (7, 'XXXX', 'SSS', 'WWWWWWWWW', 'BB')
GO

Now let’s do a DBCC IND and get our page numbers.

DBCC IND(demoInternals, 'dataRecord2',  1)
GO

Remember that page type 10 is an allocation page, and we are reading a data page, so look for the page number that has PageType=1.  We’ll follow that up with a DBCC PAGE on Page 296, remembering to turn on Trace Flag 3604 so that we get our output in our SSMS window.  *Remember your page numbers may be different than mine.

DBCC TRACEON(3604)
GO
DBCC PAGE('demoInternals', 1, 296, 1)
GO

Now that we’ve got our page let’s take a look at it, I’m only going to post the relevant output.
Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS
Record Size = 37                   
Memory Dump @0x0000000014DEA060

0000000000000000:   30000c00 07000000 58585858 05000003 001a0023  0.......XXXX.......#
0000000000000014:   00250053 53535757 57575757 57575742 42        .%.SSSWWWWWWWWWBB

The data that we are looking at is the variable length data at the tail of the record: the bytes 53 53 53, the nine 57's, and 42 42.  If we use the variable length offset array values we can find which data belongs to which column.  So let’s do that real quick.  Don’t forget to use our handy Hex to decimal and binary conversion tool, http://easycalculation.com/hex-converter.php.


Each block of data in the dump is 8 characters long but stands for 4 bytes of information.  If we look at all of the full blocks we have 9 * 4 = 36 bytes.  The last bit of information, 42, is two characters long and is 1 byte.  So our total record is 37 bytes long.  Since variable length data changes length with the values in place, we need the variable length offset array to find our data and read it.

Our first variable length column is 53 53 53.  Because this is regular character data we translate one byte at a time.  0x53 translates to binary 01010011, and if we use a binary to ASCII conversion tool we find that value translates to S.  There are three 53 values, so our first variable length column holds ‘SSS’, the value that we inserted into myVarData1.  Following this pattern you can very easily translate the contents of the other two variable length columns.
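T-SQL can shortcut the byte-to-character translation too, since CHAR() takes the ASCII code directly:

```sql
-- 0x53 is implicitly converted to the integer 83, and CHAR(83) = 'S'
SELECT CHAR(0x53) AS FirstByte  -- returns S
      ,CHAR(0x57) AS WValue     -- returns W
      ,CHAR(0x42) AS BValue     -- returns B
```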

I’ll leave you to it Dear Reader, as always thanks for stopping by.

Thanks,

Brad

Thursday, July 5, 2012

Differences in the Null Bitmap between SQL 2005 and 2012


Hello Dear Reader, I’ve been working on a series for BIDN.com, the Business Intelligence Developers Network, on how to read a Data Record.  I’ve always wanted to take the time to go rummaging around in the internals of SQL Data Records and this was my shot to do it.  If you are interested please feel free to read Part 1 Tag bytes, Part 2 The Null Bitmap Offset, Part 3 Fixed Data, and Part 4 The Null Bitmap.

While I was working on Part 4 I found that the results I was getting for the Null Bitmap were a bit different than what I was finding on the blogosphere and the Intrawebs.  I was reading the blogs of some very smart people, so that led me to one conclusion: I was doing it wrong.  So I started following their examples and comparing the results, and well…there are only so many ways you can type DBCC PAGE before you start thinking maybe I’ve found something.  I was working in SQL 2012, so my first thought was that it changed there.  So I stood up a VM with SQL 2008 R2 on it and got the same results.  Then I stood up another VM with 2005 and voila! I was looking at results just like everyone else’s.  So then I stood up ANOTHER VM with 2008 RTM, and once again I was back to what I was initially looking at.

“So Balls,” you say, “What was different?”

Sorry for the ramble Dear Reader.  What I found was that in 2005, when you translate the Null Bitmap and you have a number of columns that doesn’t divide evenly into 8 (the Null Bitmap is one byte, which translates out to a map of 8 individual bits, and you get 8 whether you need them or not), the unallocated bits are marked with 1’s; from 2008 onward they are marked with 0’s.

I understand not exactly a jaw dropping conclusion but if you’re a geek like me that likes to dive into internals this should be a fun read.

LET’S ROCK

So now that we’ve got some extra detail out of the way let’s dive into examples.  First we’ll create our table with two fixed length columns and three variable length.  We’ll keep the variable length columns as nulls at first so we can see how the Null Bitmap changes.

IF EXISTS(SELECT NAME FROM sys.tables WHERE NAME='dataRecord2')
BEGIN
       DROP TABLE dataRecord2
END
GO
CREATE TABLE dataRecord2
                     (myID INT
                     ,myfixedData CHAR(4)
                     ,myVarData1 VARCHAR(6)
                     ,myVarData2 VARCHAR(6)
                     ,myVarData3 VARCHAR(6)
                     )

GO
INSERT INTO dataRecord2(myID, myfixedData, myVarData1, myVarData2, myVarData3)
VALUES (7, 'XXXX', null, null, null)
GO

Now that we’ve got our table and a record inserted let’s use DBCC IND to find our page.

DBCC IND(demoInternals, 'dataRecord2',  1)
GO


Remember type 10 is an allocation page and we want a data page or a PageType 1, we see that Page 288 is our candidate.  Now let’s use DBCC Page to get page 288, don’t forget to turn on Trace Flag 3604.  I’m only going to copy the relevant bits from our page dump and remember your page numbers may differ.  I’m using SQL 2012 for this output.

DBCC TRACEON(3604)
GO
DBCC PAGE('demoInternals', 1, 288, 1)
GO

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 15

Memory Dump @0x00000000136EA060

0000000000000000:   10000c00 07000000 58585858 05001c             ........XXXX...

Here the portion of the record slated as our Null Bitmap is the bytes 0500 and 1c at the end of the dump.  0500 is the column count portion; remember two hex characters equal one byte.  Our bytes are 05 & 00, and they are byte-swapped, so you read them as Hex 0x0005, which translates to 5, and we do have 5 columns.  Don’t forget to use our handy Hex to binary converter, http://easycalculation.com/hex-converter.php.  The last portion of the record is the Null Bitmap itself, 1c.  In Hex this is 0x1c, which converts to binary as 00011100.  In order to read this binary value accurately we flip it, and it becomes 00111000.



We break this down per column: we have two columns that are not null (0, 0), three variable length columns that are null (1, 1, 1), and the remaining three 0’s are not in use.  So why are they there, you may ask?  Because 1 byte equals 8 bits, and these are the left over bits.
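You can test the individual bits with T-SQL's bitwise AND operator, which reads the bitmap low bit first, exactly the way we flipped it.  The column numbers here follow the table definition above:

```sql
-- Bit n of the null bitmap corresponds to column n; 0x1c = 00011100,
-- so bits 3, 4, and 5 (the three VARCHAR columns) are flagged NULL
SELECT n AS ColumnNumber
      ,CASE WHEN 0x1c & POWER(2, n - 1) > 0 THEN 'NULL' ELSE 'NOT NULL' END AS BitmapSays
FROM (VALUES (1),(2),(3),(4),(5)) AS cols(n)
```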

Now this is the difference that I noticed from 2005 to 2008 on up: in 2005 the left over bits were marked as 1’s.  Let’s take a look real quick to back that up.  I set up a VM running Windows 2008 R2 and SQL 2005, straight up RTM, no Service Packs applied.  We’ll run the exact same scripts as above and look at the Null Bitmap output.


IF EXISTS(SELECT NAME FROM sys.tables WHERE NAME='dataRecord2')
BEGIN
       DROP TABLE dataRecord2
END
GO
CREATE TABLE dataRecord2
                     (myID INT
                     ,myfixedData CHAR(4)
                     ,myVarData1 VARCHAR(6)
                     ,myVarData2 VARCHAR(6)
                     ,myVarData3 VARCHAR(6)
                     )

GO
INSERT INTO dataRecord2(myID, myfixedData, myVarData1, myVarData2, myVarData3)
VALUES (7, 'XXXX', null, null, null)
GO

Now that we’ve got our table and a record inserted let’s use DBCC IND to find our page.  Notice we use DBCC IND with a -1 this time; if you attempt this in 2005 with the exact same script as above you’ll get an output error for dump style 1.

DBCC IND(demoInternals, 'dataRecord2',  -1)
GO


Remember type 10 is an allocation page and we want a data page or a PageType 1, we see that Page 153 is our candidate.  Now let’s use DBCC Page to get page 153, don’t forget to turn on Trace Flag 3604.  I’m only going to copy the relevant bits from our page dump and remember your page numbers may differ. 

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP    
Memory Dump @0x0000000008BAC060

0000000000000000:   10000c00 07000000 58585858 0500fc             ........XXXX...
As you can see from above we get an output of fc in SQL 2005 instead of the 1c we got in SQL 2012.  If you take the Hex 0xfc and dump it to binary you get 11111100, which when we flip works out to 00111111.



Just to reiterate, unless you want to validate this for yourself: I stood up VMs for 2005 RTM, 2008 RTM, 2008 R2 RTM, and SQL 2012 RTM.  Only 2005 handles the unused bits as 1’s; the rest treat them as 0’s.  The Null Bitmap was an optimization that was added in 2005; read Paul Randal’s (@PaulRandal | Blog) blog post on that here.

Since it was introduced in 2005, I’m sure there was a good reason to change it in 2008 and up.  My guess is that instead of masking the bits for columns that were only there to pad out the bitmap, some logic is used based on the number of columns and the bitmap to determine how much of the record remains to be read.

Whatever the case, a lot of the documentation I came across regarding this pointed to the unused bits being labeled as 1’s.  That threw me for a loop at first, and I just wanted to toss this out there for anyone else searching on the subject.

Thanks for stopping by,

Brad