Category Archives: SQL 2012

Skewed Data – Poor Cardinality Estimates… and Plans Gone Bad

The session Skewed Data, Poor Cardinality Estimates, and Plans Gone Bad by Kimberly Tripp (@KimberlyLTripp) has been published on the SQLPASS TV channel.
 
 
Abstract
 
When data distribution is heavily skewed, cardinality estimation (how many rows the query optimizer expects each operator to process) can be wildly incorrect, resulting in poor quality query plans and degraded performance. You’ve probably seen the advice to update all statistics if a query plan looks wrong – but is that the right advice? In many cases, no! These are “sledgehammer” approaches, and while they might solve some problems (usually parameter sniffing problems), they don’t solve the actual problem. In this session, you’ll learn a generalized yet tailored-to-the-table way to solve query plan quality problems for very large tables (VLTs). Topics will include creating, using, and updating filtered statistics; using forced parameterization and templatized plan guides; and understanding stored procedures and how they can leverage filtered statistics.
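
As a taste of one of the techniques covered, here is a minimal sketch of a filtered statistic, assuming a hypothetical dbo.Sales table whose data is heavily skewed toward recent orders (table, column, and predicate are illustrative only):

    -- Build a statistics object over just the hot, recent slice of the
    -- data, giving the optimizer an accurate histogram for that range.
    CREATE STATISTICS stat_Sales_RecentOrders
    ON dbo.Sales (OrderDate)
    WHERE OrderDate >= '20120101'
    WITH FULLSCAN;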
 
Enjoy!

Review of the book SQL Server 2012 Reporting Services Blueprints

Recently I had the opportunity to read the book SQL Server 2012 Reporting Services Blueprints written by Marlon Ribunal and Mickey Stuewe and published by Packt Publishing.

First, I have to tell you that I am not a Reporting Services guy; I have much more knowledge of other topics such as Database Development, Performance Tuning, and so on. When I heard about the opportunity to read this book, I thought it was a great chance to increase my knowledge of SQL Server Reporting Services, but time is always short, so I needed a practical book that gets to the point quickly.

If I had to describe the book in a single word, I would say that it is concrete and easy to read. There are many pictures in the book, and the authors guide the reader, step by step, from defining the data source for an SSRS project all the way up to the Reporting Services best practices and transactional replication topics.

The book covers all the features of SQL Server Reporting Services and provides all the information you need to start developing database reports with SSRS within a week. If you want to look inside the book, you can download a sample chapter here; this is the table of contents:

  • Chapter 1: Let’s Breakdown the Numbers
  • Chapter 2: Making a Good Impression
  • Chapter 3: Getting Some Actions
  • Chapter 4: Everything Looks Better with Charts
  • Chapter 5: Location, Location, Locations!
  • Chapter 6: Let’s Get Analytical!
  • Chapter 7: Sir, We’re Ready to Deploy!
  • Chapter 8: Beyond the Horizon!
  • Appendix A: SSRS Best Practices
  • Appendix B: Transactional Replication for Reporting Services

My favorite chapters are 2, 4, and 5.

This book should not be missing from your digital or physical library!

Building Your T-SQL Tool Kit: Window Function Fundamentals

The session Building Your T-SQL Tool Kit: Window Function Fundamentals by Christina E. Leo (@christinaleo) has been published on the SQLPASS TV channel.

Abstract

Have you pulled a script to identify duplicates from a blog post but couldn’t quite get it to work because you weren’t sure what that ROW_NUMBER() function was doing? Maybe you heard talk about creating running totals without using subqueries, but you got frustrated when the groups weren’t totaling correctly. Or maybe, you’ve never even heard of window functions. All are good reasons to attend this all-demo session, which demystifies this versatile T-SQL tool. First, we’ll break apart the OVER clause, the key to understanding how window functions work. Then we’ll expand on each group of functions that can use the OVER clause: ranking, aggregate, and analytic functions. Finally, we’ll look at real scenarios where this tool works and talk about performance considerations. When you leave, you’ll have the fundamentals you need to fully develop your mastery of window functions.
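
As a quick preview, here are two minimal sketches of the patterns the abstract mentions, assuming hypothetical dbo.Customers and dbo.Transactions tables:

    -- Identify duplicates: number the copies within each group so that
    -- everything after the first occurrence can be deleted.
    WITH NumberedRows AS
    (
        SELECT ROW_NUMBER() OVER (PARTITION BY Email
                                  ORDER BY CustomerId) AS rn
        FROM dbo.Customers
    )
    DELETE FROM NumberedRows
    WHERE rn > 1;

    -- Running total per account without a subquery (SQL Server 2012+).
    SELECT AccountId, TranDate, Amount,
           SUM(Amount) OVER (PARTITION BY AccountId
                             ORDER BY TranDate
                             ROWS UNBOUNDED PRECEDING) AS RunningTotal
    FROM dbo.Transactions;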

Enjoy!

Fundamentals of SQL Server 2012 Replication

From the Red Gate website you can download, for free (plus a trial of SQL Monitor), the eBook Fundamentals of SQL Server 2012 Replication, written by Sebastian Meine (@sqlity).

Data replication is traditionally considered a complex feature, and it is easy to get discouraged early on, even while preparing the test environment. Why is implementing SQL Server Replication considered a complex activity? Which problems might you encounter? The most common ones relate to the database schema, which has to be designed with replication scenarios in mind, and to the permissions of the replication agents. Finally, you may run into problems related to the network architecture and Internet bandwidth, especially if you want to implement web replication or have to use an FTP area for data exchange. Doing a good job requires a range of skills.

This book should not be missing from your digital library! And of course you can buy the paperback on Amazon.

Editorial Review

Many of my clients need to make data that lives on one server available on another server. There are many reasons for such a requirement. You might want to speed up cross-server queries by providing a local copy of the data. Or you might want to make the data available to resource intensive reporting queries without impacting the OLTP load, maybe even with an intentional delay so you’re always reporting against complete days only. Finally, you might be looking to implement high availability. In all these situations, SQL Server Replication is a viable option to look at when planning for the implemen­tation of such a requirement.

With this book, I want to introduce you to the vast world of SQL Server Replication and show you its most important strengths and weaknesses. After working through the exercises, you will be able to make an informed decision about whether replication is the right feature to use and which type of replication is the most advantageous in your situation. You will also know when to stay away from replication and use other features such as simple log shipping or the new “Always On” feature set.

Fundamentals of SQL Server 2012 Replication provides a hands-on introduction to SQL Server replication. The book begins with a short overview that introduces you to the technologies that make up replication. In the following chapters, the book will walk you through setting up different replication scenarios. All hands-on exercises are designed with security best practices in mind. When you’re finished working through the exercises, you will be able to implement your own multi-server replication setup while following the principle of least privilege.

Enjoy the book!

Enter Our Daily Sweepstakes to Celebrate the Anniversary of SQL Server 2012!

This month, we are celebrating the one year anniversary of SQL Server 2012, and we’d like to give a little shout out to our community to say, “Happy anniversary!” We can hardly believe that it was just one year ago that SQL Server 2012 was released to manufacturing. When we take a look back at how far we’ve come over the past year since this exciting occasion, we’re amazed by all the fantastic moments we’ve enjoyed with you! From the release of the TechNet Radio SQL Server 2012 and Fantastic 12 of 2012 series last year, to the great events we’ve attended together like PASS Summit and Strata, we are proud to continue this great journey onward with you!

On this first anniversary of SQL Server 2012, we are well over 63,000 fans strong on the SQL Server Facebook page, and nearly 128,000 fans strong on the SQL Server Twitter channel. So, to say thank you for your enthusiasm and support over the past year, we’d like to celebrate with a daily sweepstakes where we will give out a Microsoft/SQL Server branded swag pack to one of our amazing community members, each day in April, beginning tomorrow.

To enter the daily sweepstakes:

  • You must be following the @SQLServer Twitter account;
  • Beginning April 4th, 2013, watch for the SQL Server Anniversary Sweepstakes “question of the day” to be posted by @SQLServer at approximately 9 am PDT each day.
  • From your account, reply with an answer to the question of the day to @SQLServer;
  • Your tweeted reply must include the hashtag “#sql2012anniversary”;
  • Your entire tweeted reply should be no longer than 130 characters in length, including the #sql2012anniversary hashtag;
  • You may only reply to the Sweepstakes question of the day on the day that it is posted; you must enter each daily prize period separately. (A “day” begins at 9 am PT and ends at 11:59 pm PT each day during the Sweepstakes entry period.)
  • A qualifying reply is your entry into the Sweepstakes.
  • You may only submit one response to the question per day. If you submit more than one response, all of your responses (including the first) will be disqualified for the day;
  • Prizes are limited to one prize per promotional entrant, so if you are selected as a winner, you will be ineligible for additional prizes.

Be sure to follow @SQLServer throughout the month of April – the more daily questions you answer, the more chances you have to win!  See complete contest rules below.

Thank you again SQL Server community for a wonderful year and for joining us in this celebration of the one year anniversary of SQL Server 2012! 

*       *       *

SQL SERVER 2012 ANNIVERSARY DAILY SWEEPSTAKES – OFFICIAL RULES

NO PURCHASE NECESSARY. COMMON TERMS USED IN THESE RULES:

These are the official rules that govern how the ‘SQL Server 2012 Anniversary Daily Sweepstakes’ (“Sweepstakes”) promotion will operate. This promotion will be simply referred to as the “Sweepstakes” throughout the rest of these rules. In these rules, “we,” “our,” and “us” refer to Microsoft Corporation, the sponsor of the Sweepstakes. “You” refers to an eligible Sweepstakes entrant.

WHAT ARE THE START AND END DATES?

This Sweepstakes starts at 9 a.m. PT on April 4, 2013 and ends at 11:59 pm PT on April 30, 2013 (“Entry Period”). The Sweepstakes consists of 27 daily Prize Periods. Each Prize Period will begin at 9 a.m. PT and end at 11:59 p.m. PT each day during the Entry Period. You may enter one time per Prize Period.

CAN I ENTER?

You are eligible to enter this Sweepstakes if you meet the following requirements at time of entry:

· You are a professional or enthusiast with expertise in SQL Server or Business Intelligence and are 18 years of age or older; and

o If you are 18 years of age or older, but are considered a minor in your place of residence, you should ask your parent’s or legal guardian’s permission prior to submitting an entry into this Sweepstakes; and

· You are NOT a resident of any of the following countries: Cuba, Iran, North Korea, Sudan, and Syria.

PLEASE NOTE: U.S. export regulations prohibit the export of goods and services to Cuba, Iran, North Korea, Sudan, and Syria. Therefore, residents of these countries/regions are not eligible to participate.

· You are NOT an employee of Microsoft Corporation or an employee of a Microsoft subsidiary; and

· You are NOT involved in any part of the administration and execution of this Sweepstakes; and

· You are NOT an immediate family member (parent, sibling, spouse, child) or household member of a Microsoft employee, an employee of a Microsoft subsidiary, or a person involved in any part of the administration and execution of this Sweepstakes.

This Sweepstakes is void wherever prohibited by law.

HOW DO I ENTER?   

At 9:00 am PT each day a new Sweepstakes “question of the day” will be posted by @SQLServer. You will earn a Sweepstakes Entry when you do all of the following within the Entry Period:

  1. Sign in to your Twitter account. If you do not have an account, visit www.twitter.com to create one. Twitter accounts are free.
  2. Once logged into your Twitter account follow the links and instructions to become a follower of SQLServer at @SQLServer.
  3. From your own Twitter account, reply to @SQLServer with your answer to the posted Sweepstakes question of the day. You may only reply to the Sweepstakes question the day it is posted; you must enter each daily prize period separately.
  4. Your tweet must contain the #sql2012anniversary hashtag to be eligible for entry, and your tweet must not be longer than 130 characters, including the #sql2012anniversary hashtag.

Limit one Entry per person and Twitter account per daily Prize Period. For the purposes of these Official Rules, a “day” begins at 9 am PT and ends at 11:59 pm PT each day during the Entry Period. However, on April 30th, 2013 the day starts at 9:00 am PT and will end at 2 pm PT. If you reply with more than one answer per day, all replies received from you that day will be automatically disqualified.

We are not responsible for entries that we do not receive for any reason, or for entries that we receive but are not decipherable for any reason, or for entries that do not include your Twitter handle.

We will automatically disqualify:

  • Any incomplete or illegible entry; and
  • Any entries that we receive from you that do not meet the requirements described above.

WINNER SELECTION AND PRIZES

On or around 5 pm each day beginning April 4 and ending on May 1, we, or a company acting under our authorization, will randomly select one winner from among all eligible Entries received in the previous daily Prize Period to win a Microsoft branded swag pack, approximate retail value, US$100. Pack includes Microsoft/SQL Server branded personal and business use items. Prizes are limited to one prize per promotional entrant, so if you are selected as a winner during this promotion, you will be ineligible for additional prizes.

Selected winners will be notified via a Direct Message (DM) on Twitter within 48 business hours of the daily drawing. The winner must reply to our Direct Message (DM) within 48 hours of notification via Direct Message on Twitter. If the notification that we send is returned as undeliverable, or you are otherwise unreachable for any reason, or you do not respond within 48 business hours, we will award the prize to a randomly selected alternate winner. Only one alternate winner will be selected and notified; after that, if unclaimed, the prize will remain unawarded. Prize packs will be shipped within two weeks of each winner’s confirmed response.

If you are a potential winner, we may require you to sign an Affidavit of Eligibility, Liability/Publicity Release within 10 days of notification. If you are a potential winner and you are 18 or older, but are considered a minor in your place of legal residence, we may require your parent or legal guardian to sign all required forms on your behalf. If you do not complete the required forms as instructed and/or return the required forms within the time period listed on the winner notification message, we may disqualify you and select an alternate, randomly selected winner.

If you are confirmed as a winner of this Sweepstakes:

  • You may not exchange your prize for cash or any other merchandise or services. However, if for any reason an advertised prize is unavailable, we reserve the right to substitute a prize of equal or greater value; and
  • You may not designate someone else as the winner. If you are unable or unwilling to accept your prize, we will award it to an alternate potential winner; and
  • If you accept a prize, you will be solely responsible for all applicable taxes related to accepting the prize; and
  • If you are otherwise eligible for this Sweepstakes, but are considered a minor in your place of residence, we may award the prize to your parent/legal guardian on your behalf.

Seamless insights on structured and unstructured data with SQL Server 2012 Parallel Data Warehouse

In the fast evolving new world of Big Data, you are being asked to answer a new set of questions that require immediate responses on data that has changed in volume, variety, complexity and velocity. A modern data platform must be able to answer these new questions without costing IT millions of dollars to deploy complex and time consuming systems.

On November 7, we unveiled details for SQL Server 2012 Parallel Data Warehouse (PDW), our scale-out Massively Parallel Processing (MPP) data warehouse appliance, which has evolved to fully embrace this new world. SQL Server 2012 PDW is built for big data and will provide a fundamental breakthrough in data processing using familiar tools to do seamless analysis on relational and Hadoop data at the lowest total cost of ownership.

  • Built for Big Data: SQL Server 2012 PDW is powered by PolyBase, a breakthrough in data processing that enables integrated queries across Hadoop and relational data. Without manual intervention, the PolyBase query processor can accept a standard SQL query and join tables from a relational source with data from a Hadoop source, returning a combined result seamlessly to the user (see the sketch after this list). Going a step further, integration with Microsoft’s business intelligence tools allows users to join structured and unstructured data together in familiar tools like Excel to answer questions and make key business decisions quickly.
  • Next-generation Performance at Scale: With the primary storage engine upgraded to a new updatable version of xVelocity columnstore, users can gain in-memory performance (up to 50x faster) on datasets that scale linearly from small all the way up to 5 petabytes of structured data.
  • Engineered for Optimal Value: In SQL Server 2012 PDW, we optimized the hardware specifications required of an appliance through software innovations to deliver significantly greater value, with a roughly 2.5x lower cost per terabyte. Through features delivered in Windows Server 2012, SQL Server 2012 PDW has built-in performance, reliability, and scale for storage using economical high-density disks. Further, Windows Server 2012 Hyper-V virtualizes and streamlines an entire server rack of control functions down to a few nodes. Finally, xVelocity columnstore provides both compression and the potential to eliminate the rowstore copy, reducing storage usage by up to 70%. As a result of these innovations, SQL Server 2012 PDW has a price per terabyte that is significantly lower than all offers in the market today.
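
To make the PolyBase point concrete, here is a minimal sketch of such a cross-source query, assuming a hypothetical ClickStream external table has already been defined over data stored in Hadoop while dbo.Customers is an ordinary relational table:

    -- One standard T-SQL query spans both sources: PolyBase resolves
    -- ClickStream against Hadoop; dbo.Customers stays relational.
    SELECT c.CustomerName, COUNT(*) AS PageViews
    FROM dbo.Customers AS c
    JOIN ClickStream AS cs
        ON cs.UserIP = c.LastKnownIP
    GROUP BY c.CustomerName;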

With SQL Server 2008 R2 Parallel Data Warehouse, Microsoft already demonstrated high performance at scale when customers like HyVee improved their performance 100 times by moving from SQL Server 2008 R2 to SQL Server 2008 R2 Parallel Data Warehouse. SQL Server 2012 Parallel Data Warehouse takes a big leap forward in performance, scale, and the ability to do big data analysis while lowering costs. For the first time, customers of all shapes, sizes and data requirements from the low end to the highest data capacity requirements can get a data warehouse appliance within their reach.

We are very excited about SQL Server 2012 PDW, which will be released broadly in the first half of 2013, and invite you to learn more through the following resources:

  • Watch the latest PASS Summit 2012 Keynote or sessions here
  • Microsoft Official Blog Post on PASS Summit 2012, authored by Ted Kummert here
  • Read customer examples of SQL Server 2008 R2 PDW (HyVee)
  • Visit HP’s Enterprise Data Warehouse for SQL Server 2008 R2 Parallel Data Warehouse site
  • Find out more about Dell’s SQL Server 2008 R2 Parallel Data Warehouse here

Breakthrough performance with in-memory technologies

In a blog post earlier this year on “The coming database in-memory tipping point”, I mentioned that Microsoft was working on several in-memory database technologies. At the SQL PASS conference this week, Microsoft unveiled a new in-memory database capability, code-named “Hekaton” [1], which is slated to be released with the next major version of SQL Server. Hekaton dramatically improves the throughput and latency of SQL Server’s transaction processing (TP) capabilities. Hekaton is designed to meet the requirements of the most demanding TP applications, and we have worked closely with a number of companies to prove these gains. Hekaton’s technology adoption partners include financial services companies, online gaming companies, and others with extremely demanding TP requirements. What is most impressive about Hekaton is that it achieves breakthrough improvement in TP capabilities without requiring a separate data management product or a new programming model. It’s still SQL Server!

As I mentioned in the “tipping point” post, much of the energy around in-memory data management systems thus far has been around columnar storage and analytical workloads. As that post mentions, Microsoft already ships this form of technology in our xVelocity analytics engine and xVelocity columnstore index. The xVelocity columnstore index will be updated in SQL Server 2012 Parallel Data Warehouse (PDW v2) to support updatable clustered columnar indexes. Hekaton, in contrast, is a row-based technology squarely focused on transaction processing (TP) workloads. Note that these two approaches are not mutually exclusive: combining Hekaton with SQL Server’s existing xVelocity columnstore index and xVelocity analytics engine will make for a powerful pairing.

The fact that Hekaton and the xVelocity columnstore index are built into SQL Server, rather than shipped as a separate data engine, is a conscious design choice. Other vendors are either introducing separate in-memory optimized caches or building a unification layer over a set of technologies and introducing it as a completely new product. This adds complexity, forcing customers to deploy and manage a completely new product or, worse yet, to manage both a “memory-optimized” product for the hot data and a “storage-optimized” product for the application data that is not cost-effective to keep primarily in memory.

Hekaton is designed around four architectural principles:

1) Optimize for main memory data access: Storage-optimized engines (such as the current OLTP engine in SQL Server today) will retain hot data in a main memory buffer pool based upon access frequency. The data access and modification capabilities, however, are built around the viewpoint that data may be paged in or paged out to disk at any point. This perspective necessitates layers of indirection in buffer pools, extra code for sophisticated storage allocation and defragmentation, and logging of every minute operation that could affect storage. With Hekaton you place tables used in the extreme TP portion of an application in memory-optimized main memory structures. The remaining application tables, such as reference data details or historical data, are left in traditional storage optimized structures. This approach lets you memory-optimize hotspots without having to manage multiple data engines.

Hekaton’s main memory structures do away with the overhead and indirection of the storage optimized view while still providing the full ACID properties expected of a database system. For example, durability in Hekaton is achieved by streamlined logging and checkpointing that uses only efficient sequential IO.

2) Accelerate business logic processing: Given that the free ride on CPU clock rate is over, Hekaton must be more efficient in how it utilizes each core. Today, SQL Server’s query processor compiles queries and stored procedures into a set of data structures that are evaluated by an interpreter. With Hekaton, queries and procedural logic in T-SQL stored procedures are compiled directly into machine code, with aggressive optimizations applied at compilation time. This allows the stored procedure to be executed at the speed of native code.

3) Provide frictionless scale-up: It’s common to find 16 to 32 logical cores even on a 2-socket server nowadays. Storage-optimized engines rely on a variety of mechanisms such as locks and latches to provide concurrency control. These mechanisms often have significant contention issues when scaling up with more cores. Hekaton implements a highly scalable concurrency control mechanism and uses a series of lock-free data structures to eliminate traditional locks and latches while guaranteeing the correct transactional semantics that ensure data consistency.

4) Built-in to SQL Server: As I mentioned earlier, Hekaton is a new capability of SQL Server. This lays the foundation for a powerful customer scenario which has been proven out by our customer testing. Many existing TP systems have certain transactions or algorithms that benefit from Hekaton’s extreme TP capabilities: for example, the matching algorithm in financial trading, resource assignment or scheduling in manufacturing, or matchmaking in gaming scenarios. Hekaton enables optimizing these aspects of a TP system for in-memory processing while the cooler data and processing continue to be handled by the rest of SQL Server.

To make it easy to get started, we’ve built an analysis tool that you can run to identify the hot tables and stored procedures in an existing transactional database application. As a first step, you can migrate hot tables to Hekaton as in-memory tables. Doing this simply requires the following T-SQL statements [2]:

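A minimal sketch of what such a statement might look like (per the footnote below, the exact syntax is likely to change; the table and its columns are hypothetical, and memory-optimized tables live in a dedicated memory-optimized filegroup):

    -- Declare the hot table as memory-optimized and fully durable;
    -- Hekaton tables are indexed with hash indexes sized up front.
    CREATE TABLE dbo.ShoppingCart
    (
        CartId      INT       NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
        UserId      INT       NOT NULL,
        CreatedDate DATETIME2 NOT NULL
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);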

While Hekaton’s memory-optimized tables must fit entirely in main memory, the database as a whole need not. These in-memory tables can be used in queries just like any regular table, while already providing optimized, contention-free data operations at this stage.

After migrating to optimized in-memory storage, stored procedures operating on these tables can be transformed into natively compiled stored procedures, dramatically increasing the processing speed of in-database logic. Recompiling these stored procedures is, again, done through T-SQL, as shown below:

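Again, a minimal sketch of what this might look like (syntax subject to change per the footnote below; the procedure and its body are hypothetical):

    -- The procedure is compiled to machine code at creation time and
    -- runs as a single atomic block against memory-optimized tables.
    CREATE PROCEDURE dbo.usp_CreateCart
        @CartId INT,
        @UserId INT
    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
    AS
    BEGIN ATOMIC WITH
        (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')

        INSERT INTO dbo.ShoppingCart (CartId, UserId, CreatedDate)
        VALUES (@CartId, @UserId, SYSDATETIME());
    END;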

What can you expect as a performance gain from Hekaton? Customer testing has demonstrated 5X to 50X throughput gains on the same hardware, delivering extreme TP performance on mid-range servers. The actual speedup depends on multiple factors, such as how much data processing can be migrated into Hekaton and directly sped up, and how much cross-transaction contention is removed as a result of the speedup and other Hekaton optimizations such as lock-free data structures.

Hekaton is now in private technology preview with a small set of customers. Keep following our product blogs for updates and a future public technology preview.

Dave Campbell
Technical Fellow
Microsoft SQL Server


[1] Hekaton is from the Greek word ἑκατόν for “hundred”. Our design goal for the Hekaton original proof of concept prototype was to achieve 100x speedup for certain TP operations.

[2] The syntax for these operations will likely change. The examples demonstrate how easy it will be to take advantage of Hekaton’s capabilities.