Thursday, October 31, 2013

Tales from the Query Processor: II

Hello Dear Reader!  Today is Halloween, October 31st.  My Daughter turns a big 15 today!  Happy Birthday Ches!!!!!

To celebrate two years ago I posted my first Halloween Blog.  Tales from the Query Processor.  It was a lot of fun to write a fiction/horror story revolving around SQL Server and computers.  I wanted to do this last year but got a little busy with family stuff. 

Leading up to this year, I wanted to do it again.  This is one of those stories that's been sitting in the back of my head for a while.  I hope you enjoy it!  



INTO THE DARK

“It’s the same dream every night Doctor, and I don’t know how to make it stop and it just keeps happening.”  The panic in my voice was clear.  “I’m just a simple data page, I don’t know how to stop something like this!”

“Settle down, and start from the beginning,” was the doctor’s soothing reply. 

I settled in on the doctor’s couch, take a deep breath.  I’m in a safe place, I tell myself.  I’m in the buffer far from the disk, there’s plenty of room in the cache and the LRU algorithm is fast asleep.  Nothing can get me here.

CRASH!!!!

“WHAT WAS THAT!”, I screamed as I bolted straight up.

“Settle down,” said the doctor his tone growing a bit frustrated.  “It was just a stray hash bucket.  I’m sure it was nothing.  Lie back down, that's it. You were about to tell me about your dreams.”

Another deep breath, I’m in a safe… oh the hell with relaxing, “It always begins the same…. “

I’m at home in the disk.  I’m at the bottom of the B-Tree where data pages live.  Everyone is going about their business as usual.  It seems darker than normal, but I’m the only one who notices.  The lights of Data Center cascading through the rack, the blink of packets coming and going during their commute, the hustle and bustle of page splits and re-orgs.  It’s all there.  But there’s a fog.  Like I’m the only one that can see it.  A haze in the air, and everything is in slow motion.  I move myself out of physical order, changing my pointer so I can be found and maintain logical order.

My table is large and there are many IAM pages I can hop to.  I transverse the chain trying to get a better view.  The spinning of the disk heads are in the air, their normal beat and pulse that we’ve all come to expect.  It comforts us, like a baby in the womb.  I stare out into the vastness.  Pages so far it looks like the hex starts to fall off the end of the world into nothing but darkness.  That deep darkness captures me.  I cannot look away.  It's deep like a night with no stars and no moon.  Objects just start to melt in it.  The longer I look at it the larger it grows. Instantly have a stab of panic.  I know that I should stop looking.  But I can't!

It’s out there!  It sees me!  I shouldn’t have gone looking for it!  I should have stayed where I was.  I was safe back in my allocated space.  And I'm still staring!  I want to tear my eyes away from the darkness but I can't.  It's so empty and vast.  At least I thought it was.  The hairs on my neck stand up.  I can feel it.  I'm the only one looking for it and now it can see me.

I’m to close now.  I still can’t see it, but I can feel it with a certainty.  IT IS OUT THERE.  The normal spinning of the disk feels wrong.  It feels off.  Like a 757 flying 300 feet off the ground at 500 miles per hour, the disk continues to spin.  But if you watch it, you just know that any second somethings going to happen.  And that’s when I see it.

It lights up the sky with its spark.  Everyone else is still sitting there, like it’s a picnic on a sunny day.  Be lazy, enjoy the moment.  But I know.  I KNOW.  

I shouldn’t see it.  BUT I DO.  The light caused it to be visible.  The light that was some how gone a moment ago, the vile of fog has been pierced.  Worst of all, now I know that it sees me too.  I was the only one looking for it.  I stood out on the edge, and IT SAW ME.

It’s big and malformed.  It’s hideous.  It shouldn’t be there.  I know it.  Now it must catch me and add me to it or I’ll tell everyone.

I start running.  I need to get back to my allocated position.  I’ve got 10 IAM pages to transverse to get there.  524,288 pages per allocation.  I’m running and running.  This should be taking nanoseconds, but the pages are stretching before me, more, and more, and more.  It’s taking microseconds just to transverse one page.  I can feel the ground quake behind me.  I can feel it closer to me.

I’m terrified.  Running.  This is the end and I know it.  I see my spark starting to go out in the darkness.  I know it wants my bits and it wants to re-arrange them. 

I scream and there is no sound.  I’m trying to get the Checksum’s attention.  Screaming and Screaming, but there’s no sound!  Everyone is still going about their business.  Sunday picnic’s!  Maintenance windows, Read-ahead’s from a query, none of them see it!

READ-AHEAD’S!  If I can grab onto one it can pull me up!  There’s a chance!  There’s a chance I can make it!  But the GAM Allocation stretches out even further.  I extend my hand, like that will do any good.  Reach for as far as I can.  But it’s no good.  It falls on me.

“And that’s where I wake up”

The Doctor was on the edge of his chair.  Moments ago, he would have thought this simple data page another fragmented mess just in need of a bit of re-ordering.  Unknowingly he has stopped taking notes, and his right hand is digging deep into the arm of his chair.

“What was it?”, asks the Doctor.

“Doc, you know what it was as well as I do.  I don’t want to say it.”

“No of course not,” said the doctor.  Forcibly relaxing.  “Look, dreams like this are somewhat common for data pages.  I wouldn’t worry about it.” 

“Really?”

“Yes I’m sure everything is fine.  Often times dreams are just subliminal ways of dealing with a long day.  Head on home and let’s meet again next week,” he says jotting down some notes.  “Can I see your insurance card and driver’s license one more time?  I’ve want to make sure I get all the numbers down correctly.”

“Sure,” I say handing them over.  “So next week then.”

“mmmhmmm”, mutters the Doctor and he copies down the number next to his notes.
I rise.  The doctor is right I feel a little better just getting it off my chest.  We smile shake hands and I leave the office.

I’m almost to the door when I realize my keys feel out of my pocket.  I go back to the knock on the door, it’s still cracked open.  That’s when I hear him.

The Doctor is yelling into the phone.  “What DO YOU MEAN THERE’S NO ALERTS ON THE SUSPECT_PAGES TABLE!  We need to raise some sort of alert!  They need to check the Windows Error Log! NO NO YOU FOOL, 823 & 824 DON’T APPEAR IN THE SQL ERROR LOG IT’S GOT TO BE WINDOWS!”

The other end of the phone squabbles back.  The Doctor cuts him off sharply.

“Look I know damn well CHECKSUM is not ON!  I just had a torn page in my office!  IN MY OFFICE!  CHECKSUM was enabled at some point, but it must have been disabled.” 

“Doctor, I’m a torn Page!”, I say announcing my presence at the door.

“CHECKPOINT!!” The Doctor yells into the phone, “CHECKPOINT!!!”


And the world went white.


WRAP UP

As Always Dear Reader, Thanks for stopping by.  And Happy Halloween!

Thanks,

Brad

Friday, October 18, 2013

PASS Summit 2013 Deck and Demos Live

Ready to rock some Internals.
Hello Dear Reader!  This is just a quick blog to say that my Deck and Demo's are now live for the PASS Summit 2013.  My presentation is at 9:45 am in Room 217-A.  You can get them by going to the Resource Page, or Clicking Here for the Deck or Here for the Demos.

I did something different this time that I hope you will enjoy.  Internals are great but why do they matter?  In the world of performance tuning and troubleshooting an understanding of Internals is essential.

Understanding what's going on under the hood often tells us where we need to be looking when we are problem solving.

"So Balls", you say, "You said you were doing something new?"

Thank You for keeping me on point Dear Reader!  I'll be mixing a whole list of Why Do Internals Matter with this presentation and I'd love your feedback.

Hope to see you there!  As always Thanks for stopping by!

Thanks,

Brad

Thursday, October 17, 2013

PASS Summit 2013 Keynotes Day 2

Okay Dear Reader one more time let me say that if you are interested in Hekaton In-Memory OLTP  Then you need to watch the video when this is live.  This is a great overview of how transactions and records work.  In my opinion Hekaton is the biggest thing to happen to OLTP work loads in SQL Server since.... well since I can remember.  Get the deck here scroll down the page.

Thank you for hanging in there with me today Dear Reader and as always Thanks for stopping by.

Thanks,

Brad

9:50 am We are looking at the Post Processing Phase.  there are 3 sub-phases.  1 generate log record containing all new version of rows, Primary key of all deleted rows  2. force log records as a single */) to sql server log  3. For all rows in transactions write set apply end timestamp.

Check points and recovery.  Holds data for memory optimized tables during shutdowns.  design goals wher incremental non blocking and parallel

restart recovery starts by loading a known checkpoint.

Query Execution is next.  Interpretation.

In Hekaton you have two options.  Regular T-SQL Queries.  T-SQL can access all three query engines.

In Hekaton you can make a natively compiled stored proc which will make a compiled DLL that works at the machine level and is much more efficient with CPU.

We get an example of a regular query against regular OLTP tables.  We review Interpreted Plan Execution.  We are charged for the generic capabilities of the regular T-SQL optimizer parsing.

Native Plan generation.  The initial steps are the same  Parse, semantics analysis, query optimization.  Physical plans are different.  We get a physical plan, go through a translator.  It is compiled to C code to the C compiler at the DLL.  Ugly Ugly coed.  Totally specific for the query, no function calls within the DLL.

The DLL is then loaded and Invoked.

We get a nice table with 3 columns one for a Classic/Non Hekaton Table.  Hekaton Table using Interop for T-SQL, then Hekaton with a Natively compiled proc.

row matching query

Classic 700 Instructions
Hekaton/interop 332 Instructions
Hekaton/Native 75 instructions

non-row matching query

Classic 300 Instructions
Hekaton/interop 110 Instructions
Hekaton/Native 30 instructions

Reducing Instructions reduces compute.  That is where we see our optimization.  This is the assembly line vs manual labor.

We get some random performance results.  Not every thing will get this level of improvement.  We see comparisons on # of lookups ranging between 10.8x - 20.4x improvement.

Updates range from 20.2x to 30.5x.

Okay, Kool-aid is firmly distributed.  We get some numbers from clients currently running customers.  


9:40 am We get to see a transaction flow of Locking vs. Optimistic.  Serialization using variables Xa and Xb. Good stuff and we are getting deep.

Now that we've covered Concurrency, what about all the old versions?  Do we get a lot of junk stored memory.  Hekaton garbage collector is non-blocking, cooperative, incremental, parallel, self-throttling and has minimal impact on performance.

We are starting the wrap up.


9:30 am  Rows are allocated space form SQL's heap storage.  All rows in a table are organized on Hash or range keys.  We are getting an example showing how the row is updated.  The Beginning timestamp is the end of the transaction that replaces it.  The End will be when the record is changed.

step 1

create a new version of the row
step 2 store it's identifier in the End TS and in the Beginning of the new time stamp.  If the End timestamp is later validated.  When validation occurs the delta end timestamp receives is final value at the moment of commit.

We call this timestamp and version.  No Latches is required.

two concurrent transactions.  We get our delta record from the previous example.  We get a concurrent read operation.

Things still occur in a dirty yet concurrent fashion based on the last committed timestamp.  


9:20 am   We get a nice view of SQL Server vs. Hekaton.  Lock Free vs. Latching in order to get this it truly is rocket science.  It was invented by Maurice Herlihy at Brown University.  When you hear Lock Free, it's important to realize version control gives us the ability to operate Latch Free.

Nice demo of Latch vs. Latch Lock Free and the way the speed of a database works speed test wise.  He actually set's the Latch on fire.  Incredible deck.  Again if you aren't watching this make sure to as soon as the video is live.

Optimisitc Concurrency will allow transactions to run and perform validation for conflict detection after the transaction runs.  Multirow version allows for multiple records to be gathered for updates.  Timestamps are placed on each row to create a total order for transactions to obtain equivalent datasets.

Transaction Phases in Hekaton.  Begin, Normal processing, Pre-Commit, Validation (this is where transaction concurrency occurs), Commit, Post-Processing, Terminate.

Each transaction has a Begin timestamp and they receive and End timestamp.  Each transaction will get a unique timestamp.  If you think back to the two phase locking demo, we need to get total order which is equivalent to serial order.  Using this method you can achieve a much lighter weight locking without latches.

We get a nice example on hash tables for tree utilization and table storage.



9:10 am  We are now looking at a locking example for our two queries.  we look at the serial manner that locking and blocking occur.  Setting a lock requires setting several latches.  Remember we get locking, blocking, and CPU.  Eventually we also need a mechanism for Deadlock detection and resolution.

Hekaton get's rid of all of this.  It's not something that was just made up it was a 5 year effort.

B-Tree's are expensive to interprete when they are on disk.  Putting them in main memory is inexpensive.

Hekaton introduces Lock-free data structures.  (We'll get to that).  We use Optimistic Multiversion optimistic concurrency control.  That's a mouthful.  Also we can compile queries and compile them into DLL's as natively compiled stored procs.


We get a nice view of SQL Server 2014.  We have another query engine specifically for Column Store Indexes.  For 2014 we got a THIRD engine for Hekaton.  Hekaton spans all three engines.

There will be some limitations in V1, Drastic improvements are coming.

Hekaton can have Durable and Non-Durable tables that are stored in memory.  Indexes are either Hash or Range.  We get a new type of tree called a D-Tree.  Schema limitations.  No Blob's, no XML but they will clean that up.

Populating a Hekaton table requires the exact size of the table to be reserved in memory.  You've got to plan for this.  You will get a minimum of 3x for Performance Boost by Hekaton.  x5 to x30 for Natively compiled stored procs.



9:00 am  Dr. DeWitt doesn't trash competitors with marketing slogans.  He does it with knowledge.  That is much more powerful than a catchy slogan.


He's walking us through the implications of a shared buffer pool.  He is giving us details of why Latches were introduced into SQL Server for the buffer pool.

He's talking about how Latches are put on Frames within the Buffer Pool to make sure that meta data over writes do not occur. Latches are great for Data Integrity, terrible for performance.  They cause Spinlocks, utilize a lot of CPU, and slow things down.

Reason #2 for a new Query Processor?  Concurrency Control.

The complexity of this information and the ease at which he presents is remarkable.  This is why DeWitt is so sought after as a speaaker.

We are looking at two transactions one doing 100 actions another doing 500 actions.

He shows how the separate transactions can cause dirty reads.

He is speaking about Jim Gray and the two phased locking mechanism in SQL Server using exclusive and shared locking.  Before access a query must acquire "appropriate" lock type form the Lock Manager, and once a query releases a lock, no further locks can be acquired. If the rules are followed the resulting schedule of actions is equivalent to some serial (good) schedules for actions.



8:50 am The Microsoft appreciation party is tonight and Tom is giving out the details of how to get there.

Tom Introduces Dr David DeWitt who is a technical fellow at teh Jim Gray Lab for Microsoft.

Dr DeWitt takes the stage!

Dr. DeWitt acknowledges his co-author.  He's got a lot of funny animations to start off.  This is streaming live and if you arent' watching it should be posted on the PASS Summit later.  This Dear Reader will be well worth a watch.

He tells us how he picked this talk.  He compares Hekaton to a rocket ship for OLTP databases.

Hekaton is Memory-Optimized but durable.  Very high performance OLTP Engine.  Fully integrated into SQL server 2014 ad completely architected for modern CPUs.

The Deck is already posted and Dr. DeWitt tells us to go get them.  Oh and VOTE for #Hekaton to be the name on twitter!!

A new Query engine was introduced in SQL 2014, why?  Historically OLTP performance improvements were the result of CPU Performance doubling every 2 years.  CPU's are not doubling anymore and we've got to find increases.  That well is running dry.

How did they choose the name Hekaton?  They wanted 100K improvement. Hekaton is a mythical creature with 100 Heads and 100 arms.  They don't get 100K, but they do get around 30 x.

In order to make things go faster you need to reduce the instruction set for the CPU.


8:40 am  Bill Graziano is now out, he's thanking Douglas for his 6 years of hard work on the Board.  Rob Farley also get's a nice honorable mention for his hard work.  Rushabh Mehta is on stage.  Rushabh is the immediate PASS president.  He has been on the Board or working on the Executive committee for PASS.

Thomas Larock, the one and only SQLRockstar the new incoming PASS President takes the stage.  He introduces the new board members and Executive committee for the next year.

Tom great's us with hashtags.  Next Year's PASS Summit will be November 4th - 7th back in Seattle next year.


8:30 am Douglas McDowell is up on stage doing the recap of finances for the year.  PASS holds the financial meeting because we don't travel just for a budgetary process, since they are a non profit and we are all members, they give us a transparent look at the funds.

PASS has done well this year, the majority of the funds that sustain us through the year come from the PASS Summit.

Last year PASS spent $7.6 MILLION DOLLARS, que the picture of Dr Evil.  30% of the money spent was focused on international growth and chapters.

In short PASS is in good financial state.


Hello Dear Reader!  Getting ready for the Day 2 Keynote for the 2013 PASS Summit.  Just a quick reminder the way it works is head down to the bottom and read from the beginning. And away we go!

Wednesday, October 16, 2013

PASS 2013 Keynotes Live Day 1

Final thoughts Dear Reader.  This is a big launch year for Microsoft and we are getting some very cool toys to play with today.  Make sure you go get the SQL 2014 CTP 2 bits.  I was telling Brent Ozar before the keynote that I thought they would extend the supported upgrade path from SQL 2005 to 2014.  I was 50/50 on that.  After the keynote I'm 90/10.

As always Thanks for stopping by!

Thanks,

Brad

9:50 am  We slice by International and Domestic, and using Power BI through the Office 365 portal we are able to use the power of Bing and get a world map breakout of calls.  Using a simple typed search you get a scatter plot.

The slicing on the fly via simple English language queries.  They look at the data by country, by platform, then they add additional countries receiving different reports by typing in a simple text box.  Just like using a search engine.

ANNOUNCING a Power BI contest that is aligned with the BACON, PASS BA Conference, coming in April.  You can vote on the winner on facebox http://www.facebook.com/microsoftbi

The winner will get a paid trip to BACON.  Who doesn't love BACON.

We review how we've pushed the boundaries.  Looks like we are wrapping up.

9:40 am We will be pulling from multiple data sets and area's.  Power Query they are pulling scrubbed user data, nice dig at Google and respecting users privacy.  Couple clicks we've gone from big data to structured data.

Now we are going to pair data and graphics with our data.  We head over to Power Query, you do an online search and you can tie that data to columns.  The data is shredded and displayed in a relational format.  Dig at Quentin's ability to do joins by the speaker.  Lot of funny banter between the two.  Population statistics are pulled in to pair up with the country, divided by population to show what the Skype calls per capita are.

They are using M behind the scenes to, for lack of a better word, power Power Query.  M is how they internally refer to the language.  It is some powerful stuff.

We we get a display of how this works on a Surface, Android Tablet, iPAD, Chrome laptop.  It just works, and it used HTML-5.  No more Silverlight.

We get a great breakout of the data calls per-capita.

9:30  am  We get a video about how the City of Barcelona is using Hadoop and Big data to manage all the city services.  Nice plug that reminds me of how IBM advertises their infrastructure services.  Nice to see a Microsoft video showing how they do the same.

Every organization has an IT need, and must be an IT organization to better handle their data and distribute services.  Interesting to see a large international city that recognizes that.

NOW for some BI!

Quentin is telling us all about Power BI, Power Query, Power Pivot, Power View, and Power Map.  Sense a trend here, you are empowered via excel.

DEMO

Very cool, looks like we will be using Skype data, 35 TB's a day are produced.

9:20 am We have a nice slide highlighting the improvements for HA/DR and how they utilize the cloud.  Next up Redefining mission critical data.

So how do we make the Cloud mission critical.  Resource Governor in the cloud using Premium Cloud Services.

Next up Big Data.  Hadoop has a lot of ways it is being integrated.  HDInsight is Microsoft Hadoop in the cloud.  Horton Works is the premier MS partner for Hadoop.  It runs on Apache, not Windows Server.

There are workloads not made for SQL Server.  Dr Dewitt did a great deep dive on Hadoop at the Summit 2011.  His keynote is online, and should be watched by any SQL Server person.  It contains the roadmap of what their team was looking at.  Hadoop tackles alot of those non-SQL Server based workloads and data sets.  In PDW v2 we have PolyBase to be able to run queries against Hadoop.

Lot of cool stuff coming with PolyBase.


9:10 am  We are getting more about hybrid DR and Availability can be used with on prem and the cloud.  Backup all versions SQL 2005 - 2014 directly to Azure storage is announced.

Natively encrypted backups for SQL 2014.  More to come on that. We also get a little more info on Managed Backup.  For shops without a full time DBA this will be a great feature.  If you're paying for a DBA managing Backups should be part of their job.

I would bet DIFF_MAP's are being monitored to trigger this.  Could be some interesting threasholds with this.  I do love good blog fodder.

Backup to Azure is being Demo.  We are walking through registering the Azure storage.  Nice thing is once you configure it, it can fire off automatically from there.

I understand that people are a little scared of the cloud.  But this is IaaS, Infrastructure as a Service using Azure Blog Storage.  Think backing up to a VM in another datacenter and the on-prem translation is easier to understand.

9:00 am Enough about the slides, let's do a DEMO!  Tracey from the SQL team comes out to do the demo.

We've got a game website for X-Box 360 games.  We see the load times when games are recommended.  We see a 10x improvement just by converting the table to an In-Memory table.  converting the natively compiled stored procedure gives us another x11 improvement.

Look ups taking almost 10 seconds are now sub-second.   We have a lookup process that runs for 26 minutes.  Using a Clustered Columnstore index we get down to less than a minute.

No need for application coding changes to use Hekaton and Clustered Columnstore.  I can't tell you how many calls and presentations I've done on those two technologies.  They are the big two for 2014 and they are going center stage.

Back to pushing the boundaries, next up Breakthroughs in availability and recovery.


8:50 am  Quentin says "Good Morning"!  He Thank's the community for the work we do to evangelize SQL Server.  There are a lot of pictures they were taking at the Pre-cons that are subject area's surrounding Cloud and Big Data.

Quentin is giving us a story about On-Premises and Cloud.  We are going to hear a lot about the merger between cloud and on-prem.  SQL 2012 CU 2 had a lot of integration, SQL 2014 takes that even further.  Nice graphic of climbing the Himalayas to signify the journey of bringing cloud and on prem together.

We will only be talking about product in GA, in preview, or that will be in GA soon.  Nice chuckle from the AZURE CAT team.

PDW V2, Power BI for Office, Power Map, SQL Server in Azure VM, HDInisght, 2012 SP1, SQL Server 2014 CTP 2.  (*There it is).

Quentin has announced that CTP 2 is the last publicly available CTP before RTM!  Next up, all about SQL 2014 and "Pushing the boundaries".  The keywords begin.

1st Up In-Memory, nice jab at Oracle. We are covering the high points of Hekaton/In-Memory OLTP.


8:40 am Amy Lewis has received the PASSION award for volunteerism.  Way to go AMY!!!  Honorable mentions are given to another nominee Ryan Adams.  Ryan does a lot of work in his local user group and the DBA performance virtual chapter.  All of the volunteers, speakers, and chapter leaders are asked to stand.  There is a myVolunteering link at the PASS website.  Bill is asking everyone to update their profile.

Quentin Clark is introduced to the stage.  We get a nice video about how SQL Server is awesome, nice community pictures, Plug for SQL Server 2014.


8:30  am     Bill Graziano is on stage making the PASS announcements.  Over 700,000 hours of free training where given this year.  The main focus will be volunteers.

Really cool video using Power BI to show the history of SQL Saturday, nice shout out to Andy Warren, Brian Knight, and Steve Jones for starting what has become a great community mainstay.



Hello Dear Reader!  It's that time of the year again, I'm sitting at the PASS Summit and I'll be live blogging the keynotes.  Read it from the bottom up!  With the 2014 CTP 2 bits already live on the web, I expect that announcement shortly!

Wednesday, October 2, 2013

DevConnections Deck & Demos Live!

Hello Dear Reader!  Just a quick blog today.  The Deck and Demo’s for my Presentations today at SQL DevConnections are Live!  

I just want to say a Big Thank You to the Organizers for having me and putting on a great conference.  I also want to Thank the people who attended the presentations today.

We had almost a packed house and standing room only for the Inside the query Optimizer session  and lots of great questions all around!   


Thanks again for a great day of SQL learning here in Vegas!

Thanks,


Brad