Tuesday, March 29, 2011

We Should Have A Talk: Your Database is Using ACID

I feel like a lot of my life as a DBA has been lived in reverse.  I’m started out my DBA life as an involuntary DBA.  My first real DBA issue was when my company’s production website was crashing.  The Database server was tossing errors that the Log file was full, as a matter of fact it had filled up the entire hard drive it was on.  Looking back at it now it seems like such a simple problem. 

One of the issues I’ve had with learning everything in reverse is I often find things and think “Why didn’t anyone tell me this!?”  There is so much to learn with being a DBA, Fundamentals, Internals, Troubleshooting, Monitoring, and Functionality of so many versions.   I was a certified MCITP for 2005, before the first time a guy named Paul Randal (blog|twitter) mentioned ACID in one of his blogs and I said, hey what’s that?  Chances are you’ve see this mentioned, but this is something we all need to know about.

So file this under Fundamentals.  Your Database is using, and while normally we’d say “Drugs are Bad m’kay”, we’re going with a little ACID is a good thing.

“But Balls”, you say, “ACID, Transactions, Fundementals why should I care about this, how is this going to help me impress my boss, or help me manage my SQL Servers?”

Well Dear Reader, I’m really glad you asked.  I’m of the firm belief that the more you know about how a database is supposed to work, the more you understand what it does.  The concepts that we will discuss Stretch across SQL Server. 

TRANSACTIONS

You’ve probably heard the term Transaction before.  Everything that occurs in SQL Server is a Transaction, the essential process of how data flows from end to end in SQL Server. In the 1970’s a very smart man named Jim Gray started working on theories about how a reliable transactional system should work.  He is literally the father of transactional databases, and he did quite a bit of work for Microsoft in helping to develop SQL Server. Jim Gray and Andreas Reuter wrote a book called Transaction processing: concepts and techniques,   this is the definitive book for explaining Transacactions, get a free preview from Google Books.

 In SQL Server you can start a statement by saying BEGIN TRANSACTION, and you can finish it by saying COMMIT or ROLLBACK.

Think of a Transaction like driving a car on the highway.  The Transaction is the car, and the highway is our SQL Server Instance.  The destination is our database.  Either we get there, think COMMIT, or we turn around and go back home without reaching our destination, think ROLLBACK.

There are two kinds of Transactions, Implicit and Explicit.  And Implicit Transaction is an implied Transaction.  SQL Server will always wrap a BEGIN and a COMMIT or ROLLBACK around a transaction.  Here’s a quick example, let’s say you type:

SELECT
    *
FROM
    dbo.myTable1

What you don’t see is that when SQL takes your command, it reads it as:

BEGIN TRANSACTION

SELECT
    *
FROM
    dbo.myTable1
   
COMMIT TRANSACTION

The BEGIN and the COMMIT are Implied, if you type them out then they are Explicit.  An Explicit Transaction is an unambiguous transaction.  We are specifying the beginning and the ending.


ACID

The term ACID was coined by Theo Haerder and Andreas Reuter in the academic paper Principles of Transaction Oriented Database Recovery.   ACID is an acronym that stands for Atomicity, Consistency, Isolation, and Durability.  And it describes the way that a Transaction, behaves.   Here is a link to the MSDN article on transactions, http://msdn.microsoft.com/en-us/library/ms190612.aspx.  All of the definitions are from the MSDN article because these are the terms you will want to know, defined in the way you would want to understand them for SQL.

ATOMICITY
                A transaction must be an atomic unit of work; either all of its data modifications are performed, or none of them is performed.”

CONSISTENCY
                When completed, a transaction must leave all data in a consistent state. In a relational database, all rules must be applied to the transaction's modifications to maintain all data integrity. All internal data structures, such as B-tree indexes or doubly-linked lists, must be correct at the end of the transaction.”

ISOLATION
                Modifications made by concurrent transactions must be isolated from the modifications made by any other concurrent transactions. A transaction either recognizes data in the state it was in before another concurrent transaction modified it, or it recognizes the data after the second transaction has completed, but it does not recognize an intermediate state. This is referred to as serializability because it results in the ability to reload the starting data and replay a series of transactions to end up with the data in the same state it was in after the original transactions were performed.”

DURABILITY
                After a transaction has completed, its effects are permanently in place in the system. The modifications persist even in the event of a system failure.”

"But Balls", you say, "High level Balls, tell me from a High Level what does this acronym effect something I can see within SQL Server?"

Well Dear Reader, you see Atomicity in the way transactions behave as a unit of work.  Remember all of our work must be done or none of it.  This is accomplished by how our Transactions behave.  A wonderful example of Atomicity is the COMMIT or ROLLBACK operators that we discussed earlier.

There are two forms of Consistency that I can think of as examples.   First we can use Referential Integrity and Constraints in order to guarantee that a data field is not dropped that is the parent to child data.  You have a Datafile and a Transaction Log that make up your basic database files.  These are our physical file structures.  But we have internal file structures as well.  Internally when SQL Server writes or deletes a record, SQL makes sure that all pointers that would lead to that record are updates as well.  This ensures that we always have a Consistent view of our data accessible to our Users.

A good example of Isolation is the way SQL Server uses Locks.  In SQL Server some of our most common Locks are the Shared Lock and the Exclusive Lock.  By default in SQL Server’s Transaction Isolation Level, is READ COMMITED.  This means that if I have a simple table like below:

ID
Product
Description
1
Bike
Huffy 28 in Bike
2
Light Saber
Star Wars Toy

And let’s pretend that there are a couple hundred rows.  Let’s give a simple example if I go to Delete or Update the Light Saber row, and a millisecond after my Transaction starts, someone else tries to Select the record, what would happen?  My Update would find my record with a Shared Lock, and then it would change to an Exclusive Lock on the record when I begin to Update my data.  The Select statement would attempt to take a Shared Lock on the same record and it would be Blocked, having to wait until my Update finished.

Now all of this would only take a fraction of a second, but it is Locking and Blocking function as intended.  This is a very basic level and we’ll cover Transaction Isolation Levels at a later date.

The best example I can think of for Durability is Recovery.  Durability means that you have an advanced logging mechanism that allows a Transaction that is COMMITED to persist, even if the computer is shut down at the second a COMMIT is received.  Likewise it would ROLLBACK any Transactions that may have been in progress, or In-Flight, that had not yet reached a COMMITTED state, ensuring that our database is Durable through unexpected shutdowns.    To see Recovery in Action open up your SQL Server Instance, go to Management, SQL Server Logs, and look at when your server came online. 



You’ll find that when the databases where brought online there is a record for Transactions Rolled Forward and for Transactions Rolled Back, and you will see the message “Recovery Complete” for databases.

We touched on a lot of topics today that each could be and deserve their own blog post if not series, Transaction Isolation Levels, Locking & Blocking, Internal File Structures, Logical File Structures, and Recovery.  To be honest we could have continued on for much longer, but what I want you to see is that ACID is fundamental to your database. 

So Dear Reader, I hope you can use this going forward.

Thanks,

Brad

Thursday, March 24, 2011

Thank You PASS DBA Virtual Chapter

Yesterday I was able to do my first webcast, a presentation for the PASS DBA Virtual Chapter.   A Big Thank You to Idera for being the Webcast Sponsor!  Go check them out they are a great SQL Community Member, and a Sponsor of user groups and SQL Saturday Events.  I want to Thank everyone who was able to take time away from work, or during, to attend.  I would also like to thank Sharon & PASS for having me present.  Last but not least I would like to say a big Thank You to Mike Clark for being my Presenter, he made the experience super easy, and was a pleasure to work with!  Mike You rock!

So Dear Reader, the topic at hand was Compression.  I did my presentation that will be part of the up-coming SQL Rally, May 11th - May 13th, in Beautiful Orlando FL.  We have an amazing line-up of SQL Proffesionals throughout the industry that will be on hand.   And for the cost of less than $500 (if you sign up now and that is INCLUDING the Pre-Con)!

DECK & SCRIPTS

My Deck & Scripts were the same that I've used before but I wanted to post them for download, in case anyone would like to use compression.  I want them to have all the tools they need to get started.

Get my Slide Deck Here.

And all the Demo's Here.

QUESTIONS
A Question was asked after the presentation, forgive me if I butcher it, But the main idea was as follows.

 "I've compressed all of the tables in my database, my CPU is below 40%, are there any KNOWN issues with this?  And what do you recommend?"

There is always a case that will be the exception of the rule.  I would not recommend compression your entire database, just like I would not recommend compressing the entire contents of your C Drive.  Sure you'll save some space, but your performance will go down hill because of the overhead.

When you look through the Demo's there is an order that I like to suggest people go through.

1. Look at the Size of your Tables.
2. Look at the Makup of your Tables Allocation Unit's (Only IN_ROW_DATA Compresses)
3. Look at your Scan & Update pattern usage, this will help determine the type of Compression you should use.
4. GET A BASELINE!!! Look at your Table and/or Indexes Size, I/O's, CPU, and Runtime Before.  Then Look at them After.  Do a comparison.   If anything has changed for the negative, then perhaps you shouldn't be using Compression on that Table and/or Index.

It is possible that someone out there has a system that could benefit from every table being compressed, It's not outside of the realm of possibilities.  However, I would wager that would be a very Rare Scenario.

Compression is like Indexes, used properly it is a beautiful thing, but too much can tank your performance.

Thank's Again to everyone who could make the Presentation, I hope you are motivated to go to SQL RALLY!

Thanks,

Brad

Tuesday, March 22, 2011

PASS DBA Virtual Chapter: SQL Rally Preview

On March 23rd at 12 noon Eastern Coast Time, I'll be hosting the PASS DBA Virtual Chapter meeting, Click Here to for the DBA Virtual Chapter webpage that has the Live Meeting Link.

This presentation is sponsered by the Good People over at Idera, and has a sponsor based presentation, here's a little info on that.

"Bullett Manale, a sales engineering manager at Idera, will present various use-case scenarios for utilizing virtual databases. A virtual database is a software implementation of a database that allows for data and storage and retrieval but exploits the backup files rather than the data base itself."




I'll be presenting my SQL Rally Topic, Page & Row Compression How, When, and Why, I'll be changing up some of the Demo's a bit, so if your free hop in and get a preview for SQL Rally.


I've blogged about this topic in relation to my Presentations, and I'm getting to the point that I just need to do a Series on it.  But until that Series Dear Reader, if your curious to read a little more here are some of my previous posts on the topic.



Vote For SQL Rally! And Why you should vote for Page and Row Compression.



Happy Reading and I hope to see you online.



Thanks,


Brad