I feel like a lot of my life as a DBA has been lived in reverse. I’m started out my DBA life as an involuntary DBA. My first real DBA issue was when my company’s production website was crashing. The Database server was tossing errors that the Log file was full, as a matter of fact it had filled up the entire hard drive it was on. Looking back at it now it seems like such a simple problem.
One of the issues I’ve had with learning everything in reverse is I often find things and think “Why didn’t anyone tell me this!?” There is so much to learn with being a DBA, Fundamentals, Internals, Troubleshooting, Monitoring, and Functionality of so many versions. I was a certified MCITP for 2005, before the first time a guy named Paul Randal (blog|twitter) mentioned ACID in one of his blogs and I said, hey what’s that? Chances are you’ve see this mentioned, but this is something we all need to know about.
So file this under Fundamentals. Your Database is using, and while normally we’d say “Drugs are Bad m’kay”, we’re going with a little ACID is a good thing.
“But Balls”, you say, “ACID, Transactions, Fundementals why should I care about this, how is this going to help me impress my boss, or help me manage my SQL Servers?”
Well Dear Reader, I’m really glad you asked. I’m of the firm belief that the more you know about how a database is supposed to work, the more you understand what it does. The concepts that we will discuss Stretch across SQL Server.
You’ve probably heard the term Transaction before. Everything that occurs in SQL Server is a Transaction, the essential process of how data flows from end to end in SQL Server. In the 1970’s a very smart man named Jim Gray started working on theories about how a reliable transactional system should work. He is literally the father of transactional databases, and he did quite a bit of work for Microsoft in helping to develop SQL Server. Jim Gray and Andreas Reuter wrote a book called Transaction processing: concepts and techniques, this is the definitive book for explaining Transacactions, get a free preview from Google Books.
In SQL Server you can start a statement by saying BEGIN TRANSACTION, and you can finish it by saying COMMIT or ROLLBACK.
Think of a Transaction like driving a car on the highway. The Transaction is the car, and the highway is our SQL Server Instance. The destination is our database. Either we get there, think COMMIT, or we turn around and go back home without reaching our destination, think ROLLBACK.
There are two kinds of Transactions, Implicit and Explicit. And Implicit Transaction is an implied Transaction. SQL Server will always wrap a BEGIN and a COMMIT or ROLLBACK around a transaction. Here’s a quick example, let’s say you type:
What you don’t see is that when SQL takes your command, it reads it as:
The BEGIN and the COMMIT are Implied, if you type them out then they are Explicit. An Explicit Transaction is an unambiguous transaction. We are specifying the beginning and the ending.
The term ACID was coined by Theo Haerder and Andreas Reuter in the academic paper Principles of Transaction Oriented Database Recovery. ACID is an acronym that stands for Atomicity, Consistency, Isolation, and Durability. And it describes the way that a Transaction, behaves. Here is a link to the MSDN article on transactions, http://msdn.microsoft.com/en-us/library/ms190612.aspx. All of the definitions are from the MSDN article because these are the terms you will want to know, defined in the way you would want to understand them for SQL.
“A transaction must be an atomic unit of work; either all of its data modifications are performed, or none of them is performed.”
“When completed, a transaction must leave all data in a consistent state. In a relational database, all rules must be applied to the transaction's modifications to maintain all data integrity. All internal data structures, such as B-tree indexes or doubly-linked lists, must be correct at the end of the transaction.”
“Modifications made by concurrent transactions must be isolated from the modifications made by any other concurrent transactions. A transaction either recognizes data in the state it was in before another concurrent transaction modified it, or it recognizes the data after the second transaction has completed, but it does not recognize an intermediate state. This is referred to as serializability because it results in the ability to reload the starting data and replay a series of transactions to end up with the data in the same state it was in after the original transactions were performed.”
“After a transaction has completed, its effects are permanently in place in the system. The modifications persist even in the event of a system failure.”
"But Balls", you say, "High level Balls, tell me from a High Level what does this acronym effect something I can see within SQL Server?"
Well Dear Reader, you see Atomicity in the way transactions behave as a unit of work. Remember all of our work must be done or none of it. This is accomplished by how our Transactions behave. A wonderful example of Atomicity is the COMMIT or ROLLBACK operators that we discussed earlier.
There are two forms of Consistency that I can think of as examples. First we can use Referential Integrity and Constraints in order to guarantee that a data field is not dropped that is the parent to child data. You have a Datafile and a Transaction Log that make up your basic database files. These are our physical file structures. But we have internal file structures as well. Internally when SQL Server writes or deletes a record, SQL makes sure that all pointers that would lead to that record are updates as well. This ensures that we always have a Consistent view of our data accessible to our Users.
A good example of Isolation is the way SQL Server uses Locks. In SQL Server some of our most common Locks are the Shared Lock and the Exclusive Lock. By default in SQL Server’s Transaction Isolation Level, is READ COMMITED. This means that if I have a simple table like below:
Huffy 28 in Bike
Star Wars Toy
And let’s pretend that there are a couple hundred rows. Let’s give a simple example if I go to Delete or Update the Light Saber row, and a millisecond after my Transaction starts, someone else tries to Select the record, what would happen? My Update would find my record with a Shared Lock, and then it would change to an Exclusive Lock on the record when I begin to Update my data. The Select statement would attempt to take a Shared Lock on the same record and it would be Blocked, having to wait until my Update finished.
Now all of this would only take a fraction of a second, but it is Locking and Blocking function as intended. This is a very basic level and we’ll cover Transaction Isolation Levels at a later date.
The best example I can think of for Durability is Recovery. Durability means that you have an advanced logging mechanism that allows a Transaction that is COMMITED to persist, even if the computer is shut down at the second a COMMIT is received. Likewise it would ROLLBACK any Transactions that may have been in progress, or In-Flight, that had not yet reached a COMMITTED state, ensuring that our database is Durable through unexpected shutdowns. To see Recovery in Action open up your SQL Server Instance, go to Management, SQL Server Logs, and look at when your server came online.
You’ll find that when the databases where brought online there is a record for Transactions Rolled Forward and for Transactions Rolled Back, and you will see the message “Recovery Complete” for databases.
We touched on a lot of topics today that each could be and deserve their own blog post if not series, Transaction Isolation Levels, Locking & Blocking, Internal File Structures, Logical File Structures, and Recovery. To be honest we could have continued on for much longer, but what I want you to see is that ACID is fundamental to your database.
So Dear Reader, I hope you can use this going forward.