Thursday, October 31, 2013

Tales from the Query Processor: II

Hello Dear Reader!  Today is Halloween, October 31st.  My Daughter turns a big 15 today!  Happy Birthday Ches!!!!!

To celebrate: two years ago I posted my first Halloween blog, Tales from the Query Processor.  It was a lot of fun to write a fiction/horror story revolving around SQL Server and computers.  I wanted to do this last year but got a little busy with family stuff. 

Leading up to this year, I wanted to do it again.  This is one of those stories that's been sitting in the back of my head for a while.  I hope you enjoy it!  


“It’s the same dream every night Doctor, and I don’t know how to make it stop and it just keeps happening.”  The panic in my voice was clear.  “I’m just a simple data page, I don’t know how to stop something like this!”

“Settle down, and start from the beginning,” was the doctor’s soothing reply. 

I settled in on the doctor’s couch and took a deep breath.  I’m in a safe place, I tell myself.  I’m in the buffer, far from the disk; there’s plenty of room in the cache and the LRU algorithm is fast asleep.  Nothing can get me here.


“WHAT WAS THAT!”, I screamed as I bolted straight up.

“Settle down,” said the doctor, his tone growing a bit frustrated.  “It was just a stray hash bucket.  I’m sure it was nothing.  Lie back down, that's it.  You were about to tell me about your dreams.”

Another deep breath, I’m in a safe… oh the hell with relaxing, “It always begins the same…. “

I’m at home in the disk.  I’m at the bottom of the B-Tree where data pages live.  Everyone is going about their business as usual.  It seems darker than normal, but I’m the only one who notices.  The lights of the Data Center cascading through the rack, the blink of packets coming and going on their commute, the hustle and bustle of page splits and re-orgs.  It’s all there.  But there’s a fog.  Like I’m the only one that can see it.  A haze in the air, and everything is in slow motion.  I move myself out of physical order, changing my pointer so I can be found and maintain logical order.

My table is large and there are many IAM pages I can hop to.  I traverse the chain trying to get a better view.  The spinning of the disk heads is in the air, the normal beat and pulse that we’ve all come to expect.  It comforts us, like a baby in the womb.  I stare out into the vastness.  Pages so far it looks like the hex starts to fall off the end of the world into nothing but darkness.  That deep darkness captures me.  I cannot look away.  It's deep like a night with no stars and no moon.  Objects just start to melt in it.  The longer I look at it the larger it grows.  Instantly I have a stab of panic.  I know that I should stop looking.  But I can't!

It’s out there!  It sees me!  I shouldn’t have gone looking for it!  I should have stayed where I was.  I was safe back in my allocated space.  And I'm still staring!  I want to tear my eyes away from the darkness but I can't.  It's so empty and vast.  At least I thought it was.  The hairs on my neck stand up.  I can feel it.  I'm the only one looking for it and now it can see me.

I’m too close now.  I still can’t see it, but I can feel it with a certainty.  IT IS OUT THERE.  The normal spinning of the disk feels wrong.  It feels off.  Like a 757 flying 300 feet off the ground at 500 miles per hour, the disk continues to spin.  But if you watch it, you just know that any second something's going to happen.  And that’s when I see it.

It lights up the sky with its spark.  Everyone else is still sitting there, like it’s a picnic on a sunny day.  Be lazy, enjoy the moment.  But I know.  I KNOW.  

I shouldn’t see it.  BUT I DO.  The light caused it to be visible.  The light that was somehow gone a moment ago, the veil of fog has been pierced.  Worst of all, now I know that it sees me too.  I was the only one looking for it.  I stood out on the edge, and IT SAW ME.

It’s big and malformed.  It’s hideous.  It shouldn’t be there.  I know it.  Now it must catch me and add me to it or I’ll tell everyone.

I start running.  I need to get back to my allocated position.  I’ve got 10 IAM pages to traverse to get there.  524,288 pages per allocation.  I’m running and running.  This should be taking nanoseconds, but the pages are stretching before me, more, and more, and more.  It’s taking microseconds just to traverse one page.  I can feel the ground quake behind me.  I can feel it getting closer to me.

I’m terrified.  Running.  This is the end and I know it.  I see my spark starting to go out in the darkness.  I know it wants my bits and it wants to re-arrange them. 

I scream and there is no sound.  I’m trying to get the Checksum’s attention.  Screaming and Screaming, but there’s no sound!  Everyone is still going about their business.  Sunday picnics!  Maintenance windows, read-aheads from a query, none of them see it!

READ-AHEADS!  If I can grab onto one it can pull me up!  There’s a chance!  There’s a chance I can make it!  But the GAM allocation stretches out even further.  I extend my hand, like that will do any good, reaching as far as I can.  But it’s no good.  It falls on me.

“And that’s where I wake up.”

The Doctor was on the edge of his chair.  Moments ago, he would have thought this simple data page another fragmented mess just in need of a bit of re-ordering.  Unknowingly he had stopped taking notes, and his right hand was digging deep into the arm of his chair.

“What was it?”, asks the Doctor.

“Doc, you know what it was as well as I do.  I don’t want to say it.”

“No of course not,” said the doctor.  Forcibly relaxing.  “Look, dreams like this are somewhat common for data pages.  I wouldn’t worry about it.” 


“Yes, I’m sure everything is fine.  Oftentimes dreams are just subliminal ways of dealing with a long day.  Head on home and let’s meet again next week,” he says, jotting down some notes.  “Can I see your insurance card and driver’s license one more time?  I want to make sure I get all the numbers down correctly.”

“Sure,” I say handing them over.  “So next week then.”

“Mmmhmmm,” mutters the Doctor as he copies down the numbers next to his notes.

I rise.  The doctor is right; I feel a little better just getting it off my chest.  We smile, shake hands, and I leave the office.

I’m almost to the door when I realize my keys fell out of my pocket.  I go back to knock on the door; it’s still cracked open.  That’s when I hear him.

The Doctor is yelling into the phone.  “What DO YOU MEAN THERE ARE NO ALERTS ON THE SUSPECT_PAGES TABLE!  We need to raise some sort of alert!  They need to check the Windows Error Log!  NO, NO YOU FOOL, 823 & 824 DON’T APPEAR IN THE SQL ERROR LOG, IT’S GOT TO BE WINDOWS!”

The other end of the phone squabbles back.  The Doctor cuts him off sharply.

“Look I know damn well CHECKSUM is not ON!  I just had a torn page in my office!  IN MY OFFICE!  CHECKSUM was enabled at some point, but it must have been disabled.” 

“Doctor, I’m a torn page!” I say, announcing my presence at the door.

“CHECKPOINT!!” The Doctor yells into the phone, “CHECKPOINT!!!”

And the world went white.


As Always Dear Reader, Thanks for stopping by.  And Happy Halloween!



Friday, October 18, 2013

PASS Summit 2013 Deck and Demos Live

Ready to rock some Internals.
Hello Dear Reader!  This is just a quick blog to say that my Deck and Demos are now live for the PASS Summit 2013.  My presentation is at 9:45 am in Room 217-A.  You can get them by going to the Resource Page, or by Clicking Here for the Deck or Here for the Demos.

I did something different this time that I hope you will enjoy.  Internals are great but why do they matter?  In the world of performance tuning and troubleshooting an understanding of Internals is essential.

Understanding what's going on under the hood often tells us where we need to be looking when we are problem solving.

"So Balls", you say, "You said you were doing something new?"

Thank You for keeping me on point Dear Reader!  I'll be mixing a whole list of Why Do Internals Matter with this presentation and I'd love your feedback.

Hope to see you there!  As always Thanks for stopping by!



Thursday, October 17, 2013

PASS Summit 2013 Keynotes Day 2

Okay Dear Reader, one more time let me say that if you are interested in Hekaton In-Memory OLTP, then you need to watch the video when it goes live.  This is a great overview of how transactions and records work.  In my opinion Hekaton is the biggest thing to happen to OLTP workloads in SQL Server since.... well, since I can remember.  Get the deck here; scroll down the page.

Thank you for hanging in there with me today Dear Reader and as always Thanks for stopping by.



9:50 am We are looking at the Post-Processing phase.  There are 3 sub-phases: 1. Generate a log record containing all new versions of rows and the primary keys of all deleted rows.  2. Force the log records to the SQL Server log in a single I/O.  3. For all rows in the transaction's write set, apply the end timestamp.

Checkpoints and recovery hold data for memory-optimized tables across shutdowns.  The design goals were incremental, non-blocking, and parallel.

Restart recovery starts by loading a known checkpoint.

Query Execution is next.  Interpretation.

In Hekaton you have two options.  The first is regular T-SQL queries; T-SQL can access all three query engines.

The second is a natively compiled stored proc, which is built into a compiled DLL that works at the machine level and is much more efficient with CPU.
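As a rough sketch of what that looks like in SQL Server 2014 syntax (the table, columns, and procedure name here are my own hypothetical examples, not from the keynote):

```sql
-- Hypothetical natively compiled stored procedure (SQL Server 2014).
-- The body is translated to C and compiled into a DLL loaded by the engine.
CREATE PROCEDURE dbo.usp_GetOrder
    @OrderId INT
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH
    (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    -- dbo.Orders would need to be a memory-optimized table.
    SELECT OrderId, CustomerId, OrderTotal
    FROM dbo.Orders
    WHERE OrderId = @OrderId;
END;
```

Natively compiled procs can only reference memory-optimized tables, which is part of why the instruction counts drop so dramatically.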

We get an example of a regular query against regular OLTP tables.  We review Interpreted Plan Execution.  We are charged for the generic capabilities of the regular T-SQL optimizer parsing.

Native plan generation.  The initial steps are the same: parse, semantic analysis, query optimization.  The physical plans are different.  We get a physical plan, it goes through a translator, and it is compiled to C code and then by the C compiler into the DLL.  Ugly, ugly code.  Totally specific to the query, with no function calls within the DLL.

The DLL is then loaded and Invoked.

We get a nice table with 3 columns: a classic (non-Hekaton) table, a Hekaton table using interop T-SQL, and a Hekaton table with a natively compiled proc.

Row-matching query:

Classic: 700 instructions
Hekaton/interop: 332 instructions
Hekaton/native: 75 instructions

Non-row-matching query:

Classic: 300 instructions
Hekaton/interop: 110 instructions
Hekaton/native: 30 instructions

Reducing Instructions reduces compute.  That is where we see our optimization.  This is the assembly line vs manual labor.

We get some performance results.  Not everything will get this level of improvement.  We see comparisons on # of lookups ranging from 10.8x to 20.4x improvement.

Updates range from 20.2x to 30.5x.

Okay, Kool-Aid is firmly distributed.  We get some numbers from customers currently running Hekaton.

9:40 am We get to see a transaction flow of Locking vs. Optimistic.  Serialization using variables Xa and Xb. Good stuff and we are getting deep.

Now that we've covered concurrency, what about all the old versions?  Do we get a lot of junk stored in memory?  The Hekaton garbage collector is non-blocking, cooperative, incremental, parallel, and self-throttling, and has minimal impact on performance.

We are starting the wrap up.

9:30 am  Rows are allocated space from SQL's heap storage.  All rows in a table are organized on hash or range keys.  We are getting an example showing how a row is updated.  A row's Begin timestamp is set by the transaction that creates it; its End timestamp is set when the record is replaced.

Step 1: create a new version of the row.

Step 2: store the transaction's identifier in the End timestamp of the old row and in the Begin timestamp of the new row.  The End timestamp is later validated; when validation occurs, the end timestamp receives its final value at the moment of commit.

We call this timestamping and versioning.  No latches are required.

Two concurrent transactions: we get our delta record from the previous example, plus a concurrent read operation.

Reads still occur concurrently, based on the last committed timestamp.

9:20 am   We get a nice view of SQL Server vs. Hekaton: lock free vs. latching.  To pull this off truly is rocket science; the approach was invented by Maurice Herlihy at Brown University.  When you hear "lock free," it's important to realize version control gives us the ability to operate latch free.

Nice demo of latched vs. lock-free structures, and a speed test of how fast the database works each way.  He actually sets the latch on fire.  Incredible deck.  Again, if you aren't watching this, make sure to as soon as the video is live.

Optimistic concurrency allows transactions to run and performs validation for conflict detection after the transaction runs.  Multirow versioning allows multiple versions of a record to be kept during updates.  Timestamps are placed on each row to create a total order for transactions, so they obtain equivalent datasets.

Transaction Phases in Hekaton.  Begin, Normal processing, Pre-Commit, Validation (this is where transaction concurrency occurs), Commit, Post-Processing, Terminate.

Each transaction has a Begin timestamp and receives an End timestamp.  Each transaction will get a unique timestamp.  If you think back to the two-phase locking demo, we need to get a total order which is equivalent to serial order.  Using this method you can achieve much lighter-weight concurrency control without latches.
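Through the interop path this versioning surfaces as snapshot-based access.  A small sketch (the table and values are hypothetical, the SNAPSHOT hint syntax is SQL Server 2014's):

```sql
-- Interop T-SQL against a memory-optimized table; the read sees row
-- versions as of the transaction's begin timestamp.
BEGIN TRANSACTION;

SELECT OrderTotal
FROM dbo.Orders WITH (SNAPSHOT)  -- table hint for memory-optimized tables
WHERE OrderId = 42;

COMMIT TRANSACTION;
```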

We get a nice example on hash tables for tree utilization and table storage.

9:10 am  We are now looking at a locking example for our two queries.  We look at the serial manner in which locking and blocking occur.  Setting a lock requires setting several latches.  Remember, we get locking, blocking, and CPU overhead.  Eventually we also need a mechanism for deadlock detection and resolution.

Hekaton gets rid of all of this.  It's not something that was just made up; it was a 5-year effort.

B-Trees are expensive to interpret when they are on disk.  Putting them in main memory is inexpensive.

Hekaton introduces lock-free data structures (we'll get to that).  We use multiversion optimistic concurrency control.  That's a mouthful.  Also, we can compile queries into DLLs as natively compiled stored procs.

We get a nice view of SQL Server 2014.  We have another query engine specifically for Column Store Indexes.  For 2014 we got a THIRD engine for Hekaton.  Hekaton spans all three engines.

There will be some limitations in V1, Drastic improvements are coming.

Hekaton can have durable and non-durable tables that are stored in memory.  Indexes are either hash or range.  For range indexes we get a new type of tree, called the Bw-tree.  There are schema limitations: no BLOBs, no XML, but they will clean that up.

Populating a Hekaton table requires the full size of the table to be reserved in memory.  You've got to plan for this.  You will get a minimum of a 3x performance boost from Hekaton, and 5x to 30x for natively compiled stored procs.
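For reference, declaring one of these looks roughly like this; the table name, columns, and bucket count are my own hypothetical choices, the syntax is SQL Server 2014's:

```sql
-- A durable memory-optimized table with a hash primary key.
CREATE TABLE dbo.Orders
(
    OrderId    INT   NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
    CustomerId INT   NOT NULL,
    OrderTotal MONEY NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON,
      DURABILITY = SCHEMA_AND_DATA);  -- SCHEMA_ONLY for a non-durable table
```

The BUCKET_COUNT you pick is part of that up-front memory planning: it is fixed at create time and should be sized to the expected number of unique keys.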

9:00 am  Dr. DeWitt doesn't trash competitors with marketing slogans.  He does it with knowledge.  That is much more powerful than a catchy slogan.

He's walking us through the implications of a shared buffer pool.  He is giving us details of why Latches were introduced into SQL Server for the buffer pool.

He's talking about how latches are put on frames within the buffer pool to make sure that metadata overwrites do not occur.  Latches are great for data integrity, terrible for performance.  They cause spinlock contention, utilize a lot of CPU, and slow things down.
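If you want to see what latching costs on your own instance, the latch wait DMV gives a quick view (this query is mine, not from the talk):

```sql
-- Cumulative latch waits since instance startup; the BUFFER class covers
-- the buffer pool frame latches Dr. DeWitt is describing.
SELECT latch_class, waiting_requests_count, wait_time_ms
FROM sys.dm_os_latch_stats
ORDER BY wait_time_ms DESC;
```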

Reason #2 for a new Query Processor?  Concurrency Control.

The complexity of this information and the ease with which he presents it is remarkable.  This is why DeWitt is so sought after as a speaker.

We are looking at two transactions one doing 100 actions another doing 500 actions.

He shows how the separate transactions can cause dirty reads.

He is speaking about Jim Gray and the two-phase locking mechanism in SQL Server using exclusive and shared locks.  Before accessing data, a query must acquire the "appropriate" lock type from the Lock Manager, and once a query releases a lock, no further locks can be acquired.  If the rules are followed, the resulting schedule of actions is equivalent to some serial (good) schedule of actions.
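You can watch two-phase locking happen on a live instance; a small sketch (the table and values are hypothetical, the DMV is real):

```sql
-- Acquire an exclusive (X) lock during the "growing" phase of 2PL.
BEGIN TRANSACTION;

UPDATE dbo.Accounts
SET Balance = Balance - 100
WHERE AccountId = 1;

-- Inspect the locks this session is holding; expect an X lock on the row.
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;

-- COMMIT releases all locks at once, ending the locking (shrinking) phase.
COMMIT TRANSACTION;
```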

8:50 am The Microsoft appreciation party is tonight and Tom is giving out the details of how to get there.

Tom introduces Dr. David DeWitt, who is a Technical Fellow at the Jim Gray Systems Lab for Microsoft.

Dr DeWitt takes the stage!

Dr. DeWitt acknowledges his co-author.  He's got a lot of funny animations to start off.  This is streaming live, and if you aren't watching it, it should be posted on the PASS Summit site later.  This, Dear Reader, will be well worth a watch.

He tells us how he picked this talk.  He compares Hekaton to a rocket ship for OLTP databases.

Hekaton is memory-optimized but durable: a very high performance OLTP engine, fully integrated into SQL Server 2014 and completely architected for modern CPUs.

The Deck is already posted and Dr. DeWitt tells us to go get them.  Oh and VOTE for #Hekaton to be the name on twitter!!

A new query engine was introduced in SQL Server 2014.  Why?  Historically, OLTP performance improvements were the result of CPU performance doubling every 2 years.  CPUs are not doubling anymore, and we've got to find increases elsewhere.  That well is running dry.

How did they choose the name Hekaton?  They wanted a 100x improvement.  The Hekaton is a mythical creature with 100 heads and 100 arms.  They don't get 100x, but they do get around 30x.

In order to make things go faster you need to reduce the instruction set for the CPU.

8:40 am  Bill Graziano is now out; he's thanking Douglas for his 6 years of hard work on the Board.  Rob Farley also gets a nice honorable mention for his hard work.  Rushabh Mehta is on stage.  Rushabh is the immediate past PASS President.  He has been on the Board or working on the Executive Committee for PASS.

Thomas Larock, the one and only SQLRockstar the new incoming PASS President takes the stage.  He introduces the new board members and Executive committee for the next year.

Tom greets us with hashtags.  Next year's PASS Summit will be November 4th - 7th, back in Seattle.

8:30 am Douglas McDowell is up on stage doing the recap of finances for the year.  PASS holds the financial meeting because we don't travel just for a budgetary process, since they are a non profit and we are all members, they give us a transparent look at the funds.

PASS has done well this year, the majority of the funds that sustain us through the year come from the PASS Summit.

Last year PASS spent $7.6 MILLION DOLLARS; cue the picture of Dr. Evil.  30% of the money spent was focused on international growth and chapters.

In short PASS is in good financial state.

Hello Dear Reader!  Getting ready for the Day 2 Keynote for the 2013 PASS Summit.  Just a quick reminder the way it works is head down to the bottom and read from the beginning. And away we go!