Tuesday, April 26, 2011

SQL Saturday 74 Jacksonville, Last Stop before SQL Rally

This Saturday is SQL Saturday #74 in Jacksonville, FL.  This will be the last time that I present on Data Compression in SQL Server, before SQL Rally in May.

This will be my first time in Jacksonville, but I’ve been told by many experienced SQL Saturday Alumni that the location is beautiful, and the event is always a good one.  We are on the University of North Florida Campus, in the College of Computing and Engineering Building, for the address Click Here


There is a great line up for this SQL Saturday.  You’ve got great Community Leaders & MVP’s like Kevin Boles (twitter), Andy Warren (blog|twitter), Plamen Ratchev (blog|twitter), Kendal Van Dyke (blog|twitter), Don Stevic (blog|twitter), Adam Jorgensen (twitter), Jose Chinchilla (blog|twitter), Troy Gallant (blog|twitter), Brian Knight , and many many more.  I also have it on good authority that this Balls fellow will be there.  For the full event schedule Click Here.

So far over 450 people have pre-registered to Attend, and you can be one of them the cost is A WHOLE LOTTA NOTHIN!  That’s right Dear Reader, the only cost is your time, you showing up and getting free training in SQL from the cast listed above, Click Here to Register.

This is also the last SQL Saturday in the US before SQL Rally, and I’m psyched. 

“But Balls,” you say, ”What if I’m not going to SQL Rally?”

Even more reason to attend!  Some of the presentations given today will be given at the Rally!  Not only that, but your feedback can help make the SQL Rally better.  

Every presentation I do I look for feedback to figure out if there is anything for my presentation that I’m missing or could be doing differently. 

So come out and enjoy a day of free training with your peers, discuss SQL, and if you find a place to leave feedback help influence the presenters to be better at what we do!

Thanks,

Brad





Monday, April 25, 2011

Cloudy With A Chance of SQL

We’ve all heard the phrases “The Cloud” and that Microsoft is “All in on the Cloud”. We manage SQL Server, we Develop on SQL Server, and we support SL Servers.  We make our living supporting a product that our companies use that was made by Microsoft.  Some are predicting that the Cloud will change the world, and plenty more just don’t buy it.

“So Balls,” you say,”As DBA’s what does this Cloud thing mean for us?” 

Well Dear Reader it means learning something new, which is always a good thing.   The Cloud offers us SQL Azure.

WHO IS THIS SQL AZURE, AND WHO SAID HE COULD TAKE MY BIRD SEED DATABASES


 SQL Azure is a totally different version and build of SQL, depending on the release you could get a different version number but executing SELECT @@VERSION should return 10.25.****.

SQL Azure by default disables DMO’s that work below an instance level.   You set up a SQL Azure Server, which is a logical container similar to a SQL Instance.  However, this is not the type of server that you would create Multiple instances on.  Your database will be automatically replicated to three separate locations, load balanced, and supported by High Availability. 


While those features are all sound pretty sweet, there are quite a few things that you cannot do, to see the list click here.

One thing that won’t be going out of style with anyone that uses Azure is from the Database on up.  By that I mean Indexing, Performance Tuning, and general Database Design.

NOTHING TO SEE HERE

I think one of the biggest drawbacks to the cloud is that you cannot look at the DMO commands or Perfmon for anything at a server level.  If an application starts to have a hiccup, the database and its performance normally come in to question at some point.  Having access to Server level objects helps us to know things are doing good, or not so good.

Then again, Azure is something new.  First we don’t always virtualize it.  Second if we virtualize it, we don’t cluster and replicate it to three different locations.  Third we don’t LOAD BALANCE it.  

So if we did have access to the server, there may be a lot of new things we would have to teach ourselves.   Personally I’d like that opportunity, but that could just be me.  

So we’ve got to Trust Dear Reader that everything is going just fine with the servers.  I’m sure there is a support staff we could contact if need be, but not having done that before I could not tell you the cost nor how to do it.

SQL VS SQL

So let’s take a look at some of the Pro’s and Con’s to this

Pro’s

Time Savings – You won’t be setting up database jobs, no SQL Agent, no Maintenance Tasks

HA/DR- Your High Availability is built in, replicated 3 times, Clustered, Load Balanced, and if one of your Virtual Machines Fails, another will be spun right up.

Hardware-You can stand up Hardware much faster, you don’t have to cost out the depreciation of hardware, you are charged by the usage.  Grant Fritchey (Blog|Twitter) wrote about the Cost of SQL Azure, go read that for more great info!

SLA’s-Microsoft has been working with Internet Facing Applications for quite some time, cough cough MSN, Hotmail, Microsoft.com, Technet, I hope you get the point.  They have published SLA’s that guarantee 99.95% uptime.

Security-When you go to MVA to learn about Azure there is a Security Model, that show’s how Microsoft is going above and beyond to provide a secure data center experience.

Con’s
Server Access- As we discussed before, your trouble shooting will only go so deep, you won’t be able to look at his for yourself.

No SQL Agent- Okay I get you want to take server management because you’re doing that, but how do I defragment my Indexes, what if I want to run a batch script, or have a job to update statistics.

Trust- When my boss says take a look at that SQL Server and see if anything is wrong, I’m limited, unless I involve boots on the ground that I don’t have a current working relationship with.  This will probably get tackled over time, but it does take time.

Lack of features- I posted that earlier, but there are things like XML, Filestream, and Encryption that some applications will need to have.

“So Balls”, you say “Do we use this or what?”

The places I see this being of most benefit, if you have a business group that needs to move fast, you can have a SQL Server, or app servers for that matter, set up in the blink of an eye and the swipe of a credit card.   I can see some places this will work great.  And I can see some places this will not.  I’m going to Total Recall it here Dear Reader, you’ve got to decide on your own, and like just about everything else SQL “It Depends”.

GO LEARN SOMETHING ALREADY!

Right now the Microsoft Virtual Academy is offering free training courses in Cloud technologies.  I would recommend you sign up and take the courses.  There are video’s, PDF’s and quizzes on Introduction To SQL Azure, Introduction to Windows Azure, Planning, Building and Managing a Private Cloud, and Windows Azure Security Overview.

 (By the way everyone end's up being #1 when you take All the tests, while I'd love to be super special you'll get to be #1 as well)

I’ve already taken my training and it was well spent time!  Go and learn something!

Thanks,

Brad

Tuesday, April 19, 2011

Let's Talk About Indexes: An Introduction

A very good friend of mine, whose got this kick-ass job working for the President J, wrote me a question about indexes.  And I thought how better than a phone call and a follow up blog talking about indexes in general.  The question centered on the uses of Clustered Indexes vs. Non-Clustered Indexes and the use for each.

CLUSTERED vs NON-CLUSTERED


The good news is this really isn't a VS kind of thing.  Clustered Indexes and Non-Clustered Indexes are both indexes.  They are both made of of an internal B-Tree structure.  




When you think of a B-Tree, think of your standard Hierarchy.  There are 3 named levels to a B-Tree structure.  The Root Level which is our top level and contains values that direct a path towards the next level, the Intermediate level and just like the Root Page points to the next level of Data, and finally the Leaf Page.

One of difference between a Clustered Index and a Non-Clustered Index is that a Clustered Index physically sorts the Data, based off of the Clustered Index.  Another difference is you can only have one Clustered Index per Table, but you can have multiple Non-Clustered Indexes.  The Clustered Index is most commonly thought of as a Primary Key, (the Clustered Index doesn't have to be the Primary Key, but that could get a little confusing so we'll save that for another day).

So we'll say that our B-Tree up above is using the letters of the Alphabet as it's Clustered Index.  I'll issue a quick T-SQL Select to get our letter.

SELECT
    *
FROM dbo.alphabet
WHERE
    letter='G'

If I wanted to go get the letter G, at the Root Level I deduce that G comes after A and before J so I would take a pointer and go to the Intermediate level that contained A, C, F, and H, From there I would further deduce that G is after F and before H, and I would travel down to our Leaf Page where I would find the letter G.

*IF your query's first Search Argument is the Clustered Index, and your range of values are narrow, then you will use your Index structure in a manner called a Seek.  If you have to read the entire table to get the contents of your information back, it is called Scan.  Scan's have their place and are not necessarily a bad thing.  But it is entirely dependent on the amount of data being returned, and the route taken to search for it.



One of the best examples I can think of to describe what an Index does, and further illuminate our B-Tree Structure, and show how  Clustered Indexes and  Non-Clustered Indexes  work together is the good old Phone Book.  If I told you to find Bradley Ball's address in the phone book you've got 2 ways of finding my name.

1. Turn through the book 1 page at a time until you get to Bradley Ball, and then read his address.

2. Flip to the B's (our Clustered Index) and scan until you get to Ball.  Now Scan Ball for the first name of Bradley (our Non-Clustered Index), and find the address.  You just performed a Seek, using the search arguments of Last Name and First Name to find Address.  

So now let's toss out a little vocabulary, to help further this discussion along.

HEAP- A heap is a table that has no clustered index.  Data is stored on a head in the order that it is inserted.  This means it is easily fragmented, and could require scanning as selecting data would require a full scan of the heap in order to find it.

CLUSTERED INDEX- A Clustered Index is a Unique value, that determines the physical sort order of the data as it is stored on a page.  The leaf level of a Clustered Index is the data of the table.  Clustered indexes have a limited key size of 900 bytes.

NON-CLUSTERED INDEX- A Non-Clustered Index is an index placed on a row in a Table, the leaf level of a Non-Clustered Index contains the data that makes up the Non-Clustered Index, as well as a physical pointer to the leaf level of the Clustered Index.  You can have many Non-Clustered Indexes on a Clustered Index Table, or a Heap.  Non-Clustered Indexes have the same 900 byte limitation and are limited to 16 columns, you can get around this using Include Columns.

"So Balls", you say, "Indexes sound great, we should put them on EVERYTHING! Right?"

I'm Glad you asked Dear Reader, and the response, regarding OLTP systems, is NO.

BALANCING ACT

Every Index that you place on a table has a cost of additional overhead.  You want to use them to speed up your database, but if you're not careful you could grind it to a halt.  

If you have a table with a Clustered Index, when you insert and update pages, and GROW the data, you are increasing your over head, slightly and it's worth it, to make sure that your Root and Intermediate level pages have the correct values to point to your leaf pages.

When you add a Non-Clustered Index, every time you insert a value for that Index you write to your table AND your Non-Clustered Index.  So if you Add another, and another, and another, then suddenly every time you write 1 record to your table, you could have many, many, many writes to keep all of your Indexes updated.

One of the first things I look at when I get a call about a "performance" problem on a database is to look at the indexes.  I've had instances where I have found tables with more indexes than columns.


COVERING INDEXES

The final high level topic that I want to talk about is Covering Indexes.  A Covering Index is an index that satisfies all of the return requirements of a query.  You could take all of the fields that you want to return, limit 16, in a query and if you add them to a single Non-Clustered Index you can do so.

One way to make a covering index is to add each column to the Non-Clustered index, another is to add additional columns as Include Columns.  Include columns allow you to get around the 900 byte limitation of an index, because Include columns are not stored as key values.  There are some data types that are not allowed to participate in Included Columns, Text and Image datatypes.  You can have 1023 Included columns on an index.

I just want to point out thought that, 1023 should not be a goal.  Every value you add, will add additional overhead to your leaf level Non-Clustered index.  Be sure to take baselines before and after, in order to know if your Index changes are having a positive affect on your query performance.




Thanks,

Brad