Tuesday, March 29, 2011

We Should Have A Talk: Your Database is Using ACID

I feel like a lot of my life as a DBA has been lived in reverse. I started out my DBA life as an involuntary DBA. My first real DBA issue was when my company’s production website was crashing. The Database server was tossing errors that the Log file was full; as a matter of fact, it had filled up the entire hard drive it was on. Looking back at it now it seems like such a simple problem.

One of the issues I’ve had with learning everything in reverse is that I often find things and think, “Why didn’t anyone tell me this!?” There is so much to learn with being a DBA: Fundamentals, Internals, Troubleshooting, Monitoring, and the Functionality of so many versions. I was a certified MCITP for 2005 before the first time a guy named Paul Randal (blog|twitter) mentioned ACID in one of his blogs and I said, hey, what’s that? Chances are you’ve seen ACID mentioned before, but this is something we all need to know about.

So file this under Fundamentals. Your Database is using ACID, and while normally we’d say “Drugs are Bad m’kay,” we’re going with a little ACID being a good thing.

“But Balls”, you say, “ACID, Transactions, Fundamentals, why should I care about this? How is this going to help me impress my boss, or help me manage my SQL Servers?”

Well Dear Reader, I’m really glad you asked. I’m of the firm belief that the more you know about how a database is supposed to work, the more you understand what it does. The concepts that we will discuss stretch across SQL Server.

TRANSACTIONS

You’ve probably heard the term Transaction before. Everything that occurs in SQL Server is a Transaction, the essential process of how data flows from end to end in SQL Server. In the 1970’s a very smart man named Jim Gray started working on theories about how a reliable transactional system should work. He is literally the father of transactional databases, and he did quite a bit of work for Microsoft in helping to develop SQL Server. Jim Gray and Andreas Reuter wrote a book called Transaction Processing: Concepts and Techniques; this is the definitive book for explaining Transactions, and you can get a free preview from Google Books.

 In SQL Server you can start a statement by saying BEGIN TRANSACTION, and you can finish it by saying COMMIT or ROLLBACK.

Think of a Transaction like driving a car on the highway.  The Transaction is the car, and the highway is our SQL Server Instance.  The destination is our database.  Either we get there, think COMMIT, or we turn around and go back home without reaching our destination, think ROLLBACK.

There are two kinds of Transactions, Implicit and Explicit. An Implicit Transaction is an implied Transaction: SQL Server will always wrap a BEGIN and a COMMIT or ROLLBACK around your statement. Here’s a quick example, let’s say you type:

SELECT
    *
FROM
    dbo.myTable1

What you don’t see is that when SQL takes your command, it reads it as:

BEGIN TRANSACTION

SELECT
    *
FROM
    dbo.myTable1
   
COMMIT TRANSACTION

The BEGIN and the COMMIT are Implied; if you type them out yourself, then they are Explicit. An Explicit Transaction is an unambiguous transaction: we are specifying the beginning and the ending.
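
To see the difference, here is a minimal sketch of an Explicit Transaction, using the same hypothetical dbo.myTable1 from the example above; the column and values are made up for illustration.

BEGIN TRANSACTION

UPDATE dbo.myTable1
SET myColumn = 'New Value'   -- hypothetical column and value
WHERE myID = 1;

-- We control how the trip ends: arrive at the destination...
COMMIT TRANSACTION

-- ...or turn the car around and head home instead:
-- ROLLBACK TRANSACTION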


ACID

The term ACID was coined by Theo Haerder and Andreas Reuter in the academic paper Principles of Transaction-Oriented Database Recovery. ACID is an acronym that stands for Atomicity, Consistency, Isolation, and Durability, and it describes the way that a Transaction behaves. Here is a link to the MSDN article on transactions, http://msdn.microsoft.com/en-us/library/ms190612.aspx. All of the definitions below are from the MSDN article, because these are the terms you will want to know, defined in the way you would want to understand them for SQL.

ATOMICITY
                “A transaction must be an atomic unit of work; either all of its data modifications are performed, or none of them is performed.”

CONSISTENCY
                “When completed, a transaction must leave all data in a consistent state. In a relational database, all rules must be applied to the transaction's modifications to maintain all data integrity. All internal data structures, such as B-tree indexes or doubly-linked lists, must be correct at the end of the transaction.”

ISOLATION
                “Modifications made by concurrent transactions must be isolated from the modifications made by any other concurrent transactions. A transaction either recognizes data in the state it was in before another concurrent transaction modified it, or it recognizes the data after the second transaction has completed, but it does not recognize an intermediate state. This is referred to as serializability because it results in the ability to reload the starting data and replay a series of transactions to end up with the data in the same state it was in after the original transactions were performed.”

DURABILITY
                “After a transaction has completed, its effects are permanently in place in the system. The modifications persist even in the event of a system failure.”

"But Balls", you say, "high level, Balls: tell me from a High Level, how does this acronym affect something I can see within SQL Server?"

Well Dear Reader, you see Atomicity in the way transactions behave as a unit of work. Remember, all of our work must be done or none of it. This is accomplished by how our Transactions behave. A wonderful example of Atomicity is the COMMIT and ROLLBACK operators that we discussed earlier.
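
Here is a quick sketch of that all-or-nothing behavior, again against the hypothetical dbo.myTable1: the two UPDATEs below either stick together or are undone together.

BEGIN TRANSACTION

UPDATE dbo.myTable1 SET myColumn = 'First Change'  WHERE myID = 1;
UPDATE dbo.myTable1 SET myColumn = 'Second Change' WHERE myID = 2;

-- Something went wrong? ROLLBACK undoes BOTH modifications, not just one.
ROLLBACK TRANSACTION

-- Both rows are back to their original values: all of the work, or none of it.
SELECT myID, myColumn FROM dbo.myTable1 WHERE myID IN (1, 2);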

There are two forms of Consistency that I can think of as examples. First, we can use Referential Integrity and Constraints in order to guarantee that a record which is the parent to child data cannot be deleted out from under it. Second, there are our file structures. You have a Datafile and a Transaction Log that make up your basic database files; these are our physical file structures, but we have internal file structures as well. Internally, when SQL Server writes or deletes a record, SQL makes sure that all pointers that would lead to that record are updated as well. This ensures that we always have a Consistent view of our data accessible to our Users.
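
A minimal sketch of that first kind of Consistency, using hypothetical Parent and Child tables; the FOREIGN KEY constraint keeps a Transaction from leaving orphaned child data behind:

CREATE TABLE dbo.Parent (ParentID int PRIMARY KEY);
CREATE TABLE dbo.Child
(
    ChildID  int PRIMARY KEY,
    ParentID int NOT NULL
        CONSTRAINT FK_Child_Parent FOREIGN KEY REFERENCES dbo.Parent (ParentID)
);

INSERT INTO dbo.Parent (ParentID) VALUES (1);
INSERT INTO dbo.Child  (ChildID, ParentID) VALUES (100, 1);

-- This DELETE fails and is rolled back; the child row would be left pointing
-- at nothing, so SQL Server keeps the data in a Consistent state instead.
DELETE FROM dbo.Parent WHERE ParentID = 1;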

A good example of Isolation is the way SQL Server uses Locks. In SQL Server, some of our most common Locks are the Shared Lock and the Exclusive Lock. By default, SQL Server’s Transaction Isolation Level is READ COMMITTED. This means that if I have a simple table like the one below:

ID | Product     | Description
---+-------------+------------------
 1 | Bike        | Huffy 28 in Bike
 2 | Light Saber | Star Wars Toy

And let’s pretend that there are a couple hundred rows. Here’s a simple example: if I go to Delete or Update the Light Saber row, and a millisecond after my Transaction starts someone else tries to Select the record, what would happen? My Update would find my record, taking an Update Lock while it searches, and then it would change to an Exclusive Lock on the record when I begin to Update my data. The Select statement would attempt to take a Shared Lock on the same record and it would be Blocked, having to wait until my Update finished.

Now all of this would only take a fraction of a second, but that is Locking and Blocking functioning as intended. This is a very basic example, and we’ll cover Transaction Isolation Levels at a later date.
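
If you want to see the blocking for yourself, here is a sketch assuming a hypothetical dbo.Products table shaped like the one above. Run the first batch in one query window and the second in another; the SELECT will sit and wait until the UPDATE’s transaction finishes.

-- Session 1: take an Exclusive Lock on the Light Saber row and hold it open.
BEGIN TRANSACTION
UPDATE dbo.Products
SET [Description] = 'Star Wars Toy (Blue)'
WHERE ID = 2;
-- ...leave the transaction open for a moment to watch the blocking...

-- Session 2 (separate window): under READ COMMITTED this Shared Lock request
-- is Blocked until Session 1 commits or rolls back.
SELECT ID, Product, [Description]
FROM dbo.Products
WHERE ID = 2;

-- Back in Session 1: release the lock and let Session 2 through.
COMMIT TRANSACTION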

The best example I can think of for Durability is Recovery. Durability means that you have an advanced logging mechanism that allows a Transaction that is COMMITTED to persist, even if the computer is shut down the second the COMMIT is received. Likewise it will ROLLBACK any Transactions that may have been in progress, or In-Flight, that had not yet reached a COMMITTED state, ensuring that our database is Durable through unexpected shutdowns. To see Recovery in action open up your SQL Server Instance, go to Management, SQL Server Logs, and look at when your server came online.



You’ll find that when the databases were brought online there is a record of Transactions Rolled Forward and Transactions Rolled Back, and you will see the message “Recovery is complete” for your databases.
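
If you would rather pull those messages back with T-SQL than click through Management Studio, here is a quick sketch using the undocumented (but handy) sp_readerrorlog:

-- Current SQL Server error log (0), SQL Server log type (1),
-- filtered down to the recovery messages written at startup.
EXEC sp_readerrorlog 0, 1, 'Recovery';

-- Or look for the transactions rolled forward / rolled back per database:
EXEC sp_readerrorlog 0, 1, 'transactions rolled';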

We touched on a lot of topics today that each could be, and deserve to be, their own blog post if not series: Transaction Isolation Levels, Locking & Blocking, Internal File Structures, Logical File Structures, and Recovery. To be honest we could have continued on for much longer, but what I want you to see is that ACID is fundamental to your database.

So Dear Reader, I hope you can use this going forward.

Thanks,

Brad

Thursday, March 24, 2011

Thank You PASS DBA Virtual Chapter

Yesterday I was able to do my first webcast, a presentation for the PASS DBA Virtual Chapter. A big Thank You to Idera for being the Webcast Sponsor! Go check them out; they are a great SQL Community Member and a Sponsor of user groups and SQL Saturday Events. I want to thank everyone who was able to take time away from work, or during it, to attend. I would also like to thank Sharon & PASS for having me present. Last but not least I would like to say a big Thank You to Mike Clark for being my Presenter; he made the experience super easy and was a pleasure to work with! Mike, you rock!

So Dear Reader, the topic at hand was Compression. I did my presentation that will be part of the upcoming SQL Rally, May 11th - May 13th, in beautiful Orlando FL. We have an amazing line-up of SQL Professionals from throughout the industry that will be on hand. And for a cost of less than $500 (if you sign up now, and that is INCLUDING the Pre-Con)!

DECK & SCRIPTS

My Deck & Scripts are the same ones I've used before, but I wanted to post them for download in case anyone would like to use compression; I want them to have all the tools they need to get started.

Get my Slide Deck Here.

And all the Demos Here.

QUESTIONS
A question was asked after the presentation; forgive me if I butcher it, but the main idea was as follows.

 "I've compressed all of the tables in my database, my CPU is below 40%, are there any KNOWN issues with this?  And what do you recommend?"

There is always a case that will be the exception to the rule. I would not recommend compressing your entire database, just like I would not recommend compressing the entire contents of your C Drive. Sure, you'll save some space, but your performance will go downhill because of the overhead.

When you look through the Demos there is an order that I like to suggest people go through (a rough T-SQL sketch follows the list).

1. Look at the Size of your Tables.
2. Look at the Makeup of your Table's Allocation Units (Only IN_ROW_DATA Compresses)
3. Look at your Scan & Update pattern usage; this will help determine the type of Compression you should use.
4. GET A BASELINE!!! Look at your Table's and/or Indexes' Size, I/Os, CPU, and Runtime before, then look at them after and do a comparison. If anything has changed for the negative, then perhaps you shouldn't be using Compression on that Table and/or Index.
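
Here is the rough sketch I promised, covering steps 1 through 3. The table name dbo.myTable1 is a hypothetical placeholder, and sp_estimate_data_compression_savings is the built-in way to size things up before you change anything:

-- Steps 1 & 2: size of the table and the makeup of its Allocation Units.
-- Only IN_ROW_DATA compresses.
SELECT au.type_desc, SUM(au.total_pages) * 8 AS SizeKB
FROM sys.partitions p
JOIN sys.allocation_units au
    ON (au.[type] IN (1, 3) AND au.container_id = p.hobt_id)
    OR (au.[type] = 2 AND au.container_id = p.partition_id)
WHERE p.[object_id] = OBJECT_ID('dbo.myTable1')
GROUP BY au.type_desc;

-- Step 3 helper: estimate the savings for PAGE (or ROW) before you commit to it.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'myTable1',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';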

It is possible that someone out there has a system that could benefit from every table being compressed; it's not outside the realm of possibility. However, I would wager that would be a very rare scenario.

Compression is like Indexes: used properly it is a beautiful thing, but too much can tank your performance.

Thanks again to everyone who could make the Presentation, I hope you are motivated to go to SQL RALLY!

Thanks,

Brad

Tuesday, March 22, 2011

PASS DBA Virtual Chapter: SQL Rally Preview

On March 23rd at 12 noon Eastern Time, I'll be hosting the PASS DBA Virtual Chapter meeting. Click Here for the DBA Virtual Chapter webpage that has the Live Meeting Link.

This presentation is sponsored by the good people over at Idera, and there is a sponsor presentation as well; here's a little info on that.

"Bullett Manale, a sales engineering manager at Idera, will present various use-case scenarios for utilizing virtual databases. A virtual database is a software implementation of a database that allows for data storage and retrieval but exploits the backup files rather than the database itself."




I'll be presenting my SQL Rally topic, Page & Row Compression: How, When, and Why. I'll be changing up some of the Demos a bit, so if you're free hop in and get a preview of SQL Rally.


I've blogged about this topic in relation to my Presentations, and I'm getting to the point that I just need to do a Series on it. But until that Series, Dear Reader, if you're curious to read a little more, here are some of my previous posts on the topic.



Vote For SQL Rally! And Why you should vote for Page and Row Compression.



Happy Reading and I hope to see you online.



Thanks,


Brad

Sunday, March 20, 2011

Thank You OPASS! Transparent Data Encryption Recap

At the start of March I had the great pleasure of being the opening DBA shot for the OPASS meeting. Microsoft MVP Ken Tucker was the main act, giving us an inside look at the Silverlight MVVM Framework.

Going to OPASS is like going to the Oscars; the stars in the audience are part of the treat! Everyone had great questions, and they were very interactive. Jack Corbett (blog | twitter) and Karla Landrum (twitter) had a wonderful venue, Rikka Asian Bistro, for the OPASS meeting.

During my Talk I had promised to post my slide deck and the script that I used for my Demo, so I wanted to do that right off the bat.

Slide Deck – OPASS Transparent Data Encryption: The Lightning Round

Demo for the presentation

DEMO RECAP

So Dear Reader, during the talk we covered a couple of things: why you should use Transparent Data Encryption (TDE), how to set up TDE, how to disable TDE, and the effects TDE has on your backup and recovery strategy. The slide deck and demos are attached, so I'll let you go through those on your own; I'm just going to cover things from a logical level. We'll step through each of those, and then we'll get to some advanced scripts on how to self-manage certificates for TDE.

WHY?

Here is a slide from my deck, and this is the best reason why: if you are a DBA long enough, eventually you will find yourself in a situation where a backup is lost, something gets stolen, or something gets hacked. TDE will not save you from every scenario, but what it will do is make it so that when you are in this situation you have one more level of protection.

“But Balls,” you say “This only covers your back-side as the DBA, why would my Boss care about this when it’s me on the line?”   

Well Dear Reader, yes it is our back-side on the line, but so is our Boss’s, and the company’s. Take the point of view of Jerry Maguire: “help me, help you. HELP ME, HELP YOU!”

Don’t go and yell at your boss, but you should know them well enough to find an approach that they will respond to. If it’s ROI, bring up how much money companies lose because of incidents like these, and the future loss of profit if customers’ confidence in the business is shaken.

HOW TO TURN IT ON

Get your minds out of the gutter :). Turning on TDE is very easy; we just follow 4 easy steps (the T-SQL for them follows the list):
  1. In the Master Database Create a Master Key
  2. In the Master Database Create a Certificate
  3. In the User Database to be Encrypted Create a Database Encryption Key
  4. Alter the User Database and Set Encryption On
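
Here is what those 4 steps look like in T-SQL. This is a minimal sketch where MyUserDB, the certificate name, and the password are placeholders you would swap out for your own:

USE master;
GO
-- 1. In the Master Database Create a Master Key
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHere!1';
GO
-- 2. In the Master Database Create a Certificate
CREATE CERTIFICATE TDECert WITH SUBJECT = 'TDE Certificate for MyUserDB';
GO
-- 3. In the User Database to be Encrypted Create a Database Encryption Key
USE MyUserDB;
GO
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TDECert;
GO
-- 4. Alter the User Database and Set Encryption On
ALTER DATABASE MyUserDB SET ENCRYPTION ON;
GO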

It’s that easy. After step 4 an encryption scan will begin on the database. It is safe for user activity. The only thing you cannot do is alter the database at the Filegroup level: no dropping, no adding. While you are encrypting your database it’s probably not the best time to be making changes to the underlying file structure anyway (one thing at a time!).

Keep in mind that the second the 1st user database becomes encrypted with TDE the TempDB will be encrypted as well.
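
You can watch the encryption scan, and see TempDB get picked up along with it, in sys.dm_database_encryption_keys; a quick sketch:

-- encryption_state 2 = encryption in progress, 3 = encrypted.
-- Once the first user database is encrypted, tempdb shows up in here as well.
SELECT DB_NAME(database_id) AS DatabaseName,
       encryption_state,
       percent_complete,
       key_algorithm,
       key_length
FROM sys.dm_database_encryption_keys;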

TURNING IT OFF

“So Balls”, you say, “I turned on TDE on my DEV system, did a demo for my boss, we’re not going to use it, how do I turn it off!” Well Dear Reader, it is pretty simple (a quick sketch follows the list).

  1. Alter All user databases and Set Encryption Off
  2. Drop the Database Encryption Key for each User Database that was encrypted
  3. (To Unencrypt the TempDB)-Restart the SQL Service
  4. *DO NOT DROP THE CERTIFICATE
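
In T-SQL, again as a sketch with the same placeholder database name, that looks like the following; just make sure the decryption scan has finished before you drop the key:

-- 1. Alter the User Database and Set Encryption Off
ALTER DATABASE MyUserDB SET ENCRYPTION OFF;
GO
-- Wait until sys.dm_database_encryption_keys reports encryption_state = 1
-- (unencrypted) for the database before moving on.

-- 2. Drop the Database Encryption Key
USE MyUserDB;
GO
DROP DATABASE ENCRYPTION KEY;
GO
-- 3. Restart the SQL Service to unencrypt TempDB.
-- 4. DO NOT drop the certificate; any backups taken while TDE was on still need it.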

BACK THAT THANG DATABASE UP

When you turn on TDE it affects all database files at rest: all Data Files, Log Files (only the VLFs utilized while TDE was turned on), snapshot files, and backup files. So here are the things to consider.

If you turn on TDE on your Prod system, and you want to do a data refresh to your Dev or Staging environments you will need to restore a copy of the Certificate used to encrypt your Prod databases to your Dev or Staging server where the restore will take place.

If you want to use Log Shipping or Database Mirroring you will need to have the Certificate you use on your Primary or Principal Server on your Secondary or Mirror Server.

So if you have a Database using TDE and you are using a technology that relies on Restore and Recovery, then you will need a copy of the Certificates on each SQL Instance where a copy of the Database will reside.

Now keep in mind this doesn’t mean you can’t move data around.  TDE works with data at rest.  That means that if you BCP data out of the database it is in an unencrypted state.  You don’t need to have a copy of the Certificates in your destination. 

So how does this affect your recovery strategy? You need to back up your Certificates regularly. You need to make sure that you plan for them like you do your backups. If you sweep your backups to tape, you’ll want to sweep your certificates to tape. If you back up your databases to disk, back up your certificates to disk. Keep redundant copies. If you ever have to restore your database to a different server you will need them.
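
Backing up the certificate (and its private key) is a single statement; here is a minimal sketch with placeholder file paths and password:

USE master;
GO
BACKUP CERTIFICATE TDECert
    TO FILE = 'D:\Backups\TDECert.cer'
    WITH PRIVATE KEY
    (
        FILE = 'D:\Backups\TDECert_PrivateKey.pvk',
        ENCRYPTION BY PASSWORD = 'AnotherStrongPassword!2'
    );
GO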

I’ll be covering Key Management in my next post, but all of the scripts you will need are in the download.

Each of these scripts needs a little tweaking, but all the information is there.

Local DB, function, table, and encryption.sql (Sets Up a local DB, and TDE, to store passwords used in later scripts)

backup tde certificates.sql (T-SQL Script to automate certificate backups)

delete old TDE certificates.sql (T-SQL Script to automate certificate clean up)

Hope these get you on your way to learning how to use TDE!

Thanks,

Brad

Monday, March 7, 2011

T-SQL Tuesday 16: Aggregation




So it’s that time again, T-SQL Tuesday.  This week our event is Sponsored by Jes Borland (Blog|Twitter).  So Jes what’s the subject?

I want to hear how you solved business problems with aggregate functions. I want to see your cool T-SQL tricks. How are aggregates used in SSRS, SSAS, or SSIS? What have you learned about aggregate functions?

So Dear Reader, Aggregation it is!

SOME THINGS JUST GO TOGETHER




We’ve all probably done our fair share of fancy aggregates. I have to say that most of the code I’ve used aggregates for was for business functionality, written while sitting with clients that I’ve worked for in the past. I think we all have a nightmare report, script, or task that we accomplished using aggregates, and while those things make for interesting stories, they aren’t that useful. I’d like to give you something that you can use (if you don’t already have a script to do what I’ve done here).

Sometimes the most useful things are simple, like PB & J (simple and delicious!).  Here are a couple of queries that I use that have the SUM aggregate function in them.  Simple but effective.

The first is a quick script that extracts the information from DBCC SHOWFILESTATS and uses the "undocumented" sp_msforeachdb in order to gather total space used by all datafiles on an instance, total space reserved for all data files on an instance, and then a breakdown by datafile per database; lastly we take the info from DBCC SQLPERF(LOGSPACE) and sort by the log file size.

The second script is the same concept. You can use the built-in report "Disk Usage by Tables" to get the same info, but sometimes it's just nice to build a script to work for you. This script uses sp_spaceused and the "undocumented" sp_msforeachtable in order to build a list of tables by row count and the Reserved size for Data. (I use the second script in my Compression demos to help you find an appropriate table while examining compression; size-wise there are multiple steps :) .) Then I do a SUM function on each of the fields to show an aggregate of your database by user table.

SCRIPT1 – Get Database Size Stats

/*============================================================
--@SQLBalls: SQLBalls@gmail.com
--http://www.SQLBalls.com
--T-SQL TUESDAY 16 Aggregation
--
--This Sample Code is provided for the purpose of illustration only
--and -is not intended to be used in a production environment. 
--THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED "AS IS"
--WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING
--BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR
--FITNESS FOR A PARTICULAR PURPOSE.
==============================================================*/

IF EXISTS(SELECT * FROM tempdb.sys.tables WHERE name LIKE '#myTable%' AND name NOT LIKE '#myTable2%')
BEGIN
    DROP TABLE #myTable;
END
CREATE TABLE #myTable (
                      [myID] [int] IDENTITY(1,1) NOT NULL,
                      fileid varchar(4),
                      fileGroup varchar(4),
                      TotalExtents int,
                      usedExtents int,
                      DBName sysname,
                      filename1 varchar(8000));
GO
                     
IF exists(SELECT * FROM tempdb.sys.tables WHERE name like '#myTable2%')
BEGIN
    DROP TABLE #myTable2;
END
CREATE TABLE #myTable2(
                      dbName sysname,
                      logSize float,
                      spaceUsed float,
                      dbStatus int
                      );                   
GO

INSERT INTO #myTable
EXEC ('sp_msforeachdb @command1="use [?]; dbcc showfilestats"');
GO
                     
INSERT INTO #mytable2
EXEC( 'dbcc sqlperf(logspace)');
GO

SELECT
((SUM(usedExtents)*64)/1024) AS DatabaseUsedSpaceinMB
,((SUM(totalExtents)*64)/1024) AS DatabaseSizeinMB
FROM #myTable;
GO

SELECT
    DBName
    ,usedExtents
    ,TotalExtents
    ,((usedExtents*64)/1024) AS DatabaseUsedSpaceinMB
    ,((totalExtents*64)/1024) AS DatabaseSizeinMB
    ,(((totalExtents*64)/1024)-((usedExtents*64)/1024)) as DataFileFreeSpace
 FROM
    #mytable
 ORDER BY
    ((totalExtents*64)/1024) DESC;
 GO


SELECT
    *
FROM
    #myTable2
ORDER BY
    logsize DESC;
GO

DROP TABLE #myTable;

DROP TABLE #myTable2;
GO




SCRIPT2-Get Table Size Stats

/*============================================================
--@SQLBalls: SQLBalls@gmail.com
--http://www.SQLBalls.com
--T-SQL TUESDAY 16 Aggregation
--
--This Sample Code is provided for the purpose of illustration only
--and -is not intended to be used in a production environment. 
--THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED "AS IS"
--WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING
--BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR
--FITNESS FOR A PARTICULAR PURPOSE.
==============================================================*/

IF EXISTS(SELECT * FROM TEMPDB.sys.tables WHERE name like '#myTable%')
BEGIN
    DROP TABLE #myTable
END

CREATE TABLE #myTable(
                               [Name] sysname,
                               [Rows] int,
                               [Reserved] varchar(500),
                               [Data] varchar(500),
                               [Index_Size] varchar(500),
                               [Unused] varchar(500)
                               );

INSERT INTO #myTable
EXEC ('sp_msforeachtable @command1="sp_spaceused [?]"');
GO

SELECT * FROM #myTable
ORDER BY
    [rows] DESC,
    CAST(LEFT([reserved], (LEN([reserved]) - 3)) AS int) DESC;

SELECT
    SUM(CAST(LEFT([reserved],(LEN([reserved])-3)) as int)) as [Total_Reserved(KB)]
    ,SUM(CAST(LEFT([data],(LEN([data])-3)) as int)) as [Total_Data(KB)]
    ,SUM(CAST(LEFT([Index_Size],(LEN([Index_Size])-3)) as int)) as [Total_IndexData(KB)]
    ,SUM(CAST(LEFT([Unused],(LEN([Unused])-3)) as int)) as [Total_UnusedSpace(KB)]
FROM  #myTable

GO

DROP TABLE #myTable;

  
Happy T-SQL Tuesday, I hope you enjoy the scripts!

Thanks,
Brad