Category Archives: In-Memory

Customers using Microsoft technologies to accelerate their insights

At yesterday’s Accelerate your insights event in San Francisco, we heard from CEO Satya Nadella, COO Kevin Turner and CVP Quentin Clark about how building a data culture in your company is critical to success. By combining data-driven DNA with the right analytics tools, anyone can transform data into action.

Many companies, and many of our customers, are already experiencing the power of data - taking advantage of the fastest performance for their critical apps, and revealing insights from all their data, big and small.

Since SQL Server 2014 was released to manufacturing in April, we’ve seen many stories featuring the new technical innovations in the product.  In-memory transaction processing (In-Memory OLTP) speeds up an already very fast experience, typically delivering performance improvements of up to 30x.  Korean entertainment giant CJ E&M is using In-Memory OLTP to attract more customers for its games by holding online giveaway events for digital accessories like character costumes and decorations soon after each game is released.  When it ran tests in an actual operational environment for one of its most popular games, SQL Server 2014 delivered 35-times-faster performance than the 2012 version in both batch requests per second and I/O throughput. 

SQL Server 2014 also enhances data warehouse storage and query performance. NASDAQ OMX is using the In-Memory Columnstore for a system that handles billions of transactions per day, multiple petabytes of online data, and single tables with quintillions of records of business transactions.  It has seen storage reduced by 50% and some query times reduced from days to minutes. 

Lufthansa Systems is using the hybrid features of SQL Server 2014 to anticipate customer needs for high-availability and disaster-recovery solutions.  In its pilot, the combined power of Microsoft SQL Server 2014 and Windows Azure led to faster and fuller data recovery, reduced costs, and the potential for a vastly increased focus on customer service and solutions, compared with the company’s current solutions.

Growth in data volumes provides multiple challenges and opportunities.  For executives and researchers at Oslo University Hospital, ease of access to data is important.  Using Power BI for Office 365, they can analyze data in hours rather than months, collaborate with colleagues around the country, and avoid traditional BI costs.  For Virginia Tech, the data deluge presents challenges for researchers in the life sciences, where new types of unstructured data from gene sequencing machines are generating petabytes of data.  They are using the power of the cloud with Microsoft Azure HDInsight not only to analyze data faster, but to analyze it more intelligently - work that may in the future help provide cures for cancer.  The Royal Bank of Scotland needed to handle multiple terabytes of data and an unprecedented level of query complexity more efficiently, which led it to the Analytics Platform System (formerly Parallel Data Warehouse).  As a result, it gained near-real-time insight into customers’ business needs as well as emerging economic trends, cut a typical four-hour query to less than 15 seconds, and simplified deployment. 

Whether you’ve already built a data culture in your organization, or if you’re new to exploring how you can turn insights into action, try the latest enhancements to these various technologies: SQL Server 2014, Power BI for Office 365, Microsoft Azure HDInsight, and the Microsoft Analytics Platform System.

Tune in tomorrow and accelerate your insights

Tomorrow’s the day! Tune in to hear from Microsoft CEO Satya Nadella, COO Kevin Turner, and Data Platform Group CVP Quentin Clark about Microsoft’s approach to data, and how the latest advancements in technology can help you transform data into action.

Who should watch?

Join us tomorrow morning at 10AM PDT if you like data or want to learn more about it. If you store it, you manage it, you explore it, you slice and dice it, you analyze it, you visualize it, you present it, or if you make decisions based on it. If you’re architecting data solutions or deciding on the best data technology for your business. If you’re a DBA, business analyst, data scientist, or even just a data geek on the side, join the live stream.

What will I hear about?

Data infrastructure. Data tools. And ultimately, the power of data. From finding the connections that could cure cancer, to predicting the success of advertising campaigns, data can do incredible things. Join us online and get inspired. You’ll see how your peers are putting their data, big and small, to work.

From a product perspective, we’ll celebrate the latest advancements in SQL Server 2014, Power BI for Office 365, SQL Server Parallel Data Warehouse, and Microsoft Azure HDInsight. And ultimately, we’ll explore how these offerings can help you organize, analyze, and make sense of your data – no matter the size, type, or location.

Where do I sign up?

Mark your calendar now or RSVP on Facebook so you’re ready to go tomorrow. When streaming goes live, you can join us here for all the action live from San Francisco.

When do things get started?

Tomorrow, April 15, at 10AM PDT. Be there.

See you tomorrow!

Using Natively Compiled Stored Procedures in SQL Server 2014

The new In-Memory OLTP feature in SQL Server 2014 greatly optimizes the performance of certain OLTP applications. By using the new memory-optimized tables you can speed up data access, in particular in high-concurrency situations, due to the lock- and latch-free architecture of the In-Memory OLTP Engine. This means that applications which suffer from a lot of contention between concurrent transactions can greatly benefit from simply migrating their hot tables to memory-optimized tables.

The other part of the equation is the new natively compiled stored procedures, which allow you to speed up query processing and business logic execution. These native procs are T-SQL stored procedures that are compiled to native code, in the form of DLLs, which are linked to the SQL Server process for very efficient execution. The efficiency of natively compiled procs comes from savings in the execution path, by baking the operations into machine code that can interact very efficiently with the In-Memory storage engine. For example, when scanning an index and identifying rows that match a certain predicate, a native proc requires around 1/4th the number of CPU instructions compared with traditional interpreted T-SQL queries and stored procedures.

In this post we walk through some of the considerations when developing and using natively compiled stored procedures in your application.

OLTP-Style Operations

Natively compiled stored procedures are optimized for OLTP-style operations. Now, what do we mean by that? Some characteristics: a) single-threaded execution (MAXDOP=1); b) point lookups and small range scans, no full table scans, in general operations that touch a relatively small number of rows; c) nested-loops joins and stream aggregation; d) short-running transactions, in the ideal case a transaction spans a single execution of a natively compiled stored procedure.

Some examples of OLTP-style operations:

  • Inserting a sales order along with its line items.
  • Updating the status of a sales order.
  • Money transfers between bank accounts.
  • Stock purchase.

Native procs are not optimized for reporting-style queries which require joins between and aggregation over large data sets.

ATOMIC Blocks in Native Procs

The body of a natively compiled stored procedure must comprise exactly one ATOMIC block. ATOMIC blocks are a new concept in SQL Server 2014 that can be used only with native procs. The basic thing an ATOMIC block gives you is that all statements within the block either succeed or fail, atomically. In the context of transactions this means that:

  1. If no transaction exists in the session, the atomic block will start a new transaction
  2. If a transaction already exists, the atomic block will create a savepoint in the running transaction
  3. If the block throws an exception, the block is rolled back as a whole
  4. If execution of the block succeeds, the transaction is committed at the end of the block or, if there was an existing transaction, the savepoint that was created for the atomic block is committed

Because transactions are handled through the ATOMIC block, there is no need to bother with BEGIN TRANSACTION, ROLLBACK, or COMMIT inside natively compiled stored procedures. In fact, that syntax is not supported.
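To make this concrete, here is a minimal sketch of a natively compiled stored procedure with an ATOMIC block. The table dbo.SalesOrders and its columns are hypothetical and assumed to be memory-optimized; only the overall shape of the procedure is the point here.

CREATE PROCEDURE dbo.usp_UpdateOrderStatus
    @order_id int,
    @status tinyint
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    -- all statements in the block succeed or fail atomically;
    -- no explicit BEGIN TRANSACTION / COMMIT is needed (or allowed)
    UPDATE dbo.SalesOrders
    SET order_status = @status
    WHERE order_id = @order_id
END
GO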

For more details about ATOMIC blocks and transaction and error handling, see the corresponding topic in Books Online.

Retry logic for handling failures

As with all transactions that touch memory-optimized tables, with natively compiled stored procedures you will need to consider retry logic to deal with potential failures such as write conflicts (error 41302) or dependency failures (error 41301). In most applications the failure rate will be low, but it is still necessary to deal with the failures by retrying the transaction. Two suggested ways of implementing retry logic are:

  1. Client-side retries. This is the recommended way to implement retry logic in the general case. The client application would catch the error thrown by the procedure, and would retry the procedure call. If an existing client application already has retry logic to deal with deadlocks, extending the app to deal with the new error codes is straightforward.
  2. Using a wrapper stored procedure (see the sketch after this list). The client would call an interpreted T-SQL stored procedure that calls the native proc. The wrapper procedure then uses try/catch logic to catch the error and retry the procedure call if needed. In this case you do need to consider the possibility that results are returned to the client before the failure, and the client would not know to discard them. Therefore, to be safe, it is best to use this method only with native procs that do not return any result sets to the client. In addition, consider that using a wrapper stored procedure does impact performance. For optimal performance, use client-side retries.
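As an illustration, here is a minimal sketch of such a wrapper, assuming a hypothetical native proc dbo.usp_InsertOrder that returns no result sets; the retried error numbers are the write-conflict and dependency-failure errors mentioned above.

CREATE PROCEDURE dbo.usp_InsertOrder_Retry @order_id int, @customer_id int
AS
BEGIN
    DECLARE @retries int = 10
    WHILE (@retries > 0)
    BEGIN
        BEGIN TRY
            EXEC dbo.usp_InsertOrder @order_id, @customer_id
            SET @retries = 0   -- success: exit the loop
        END TRY
        BEGIN CATCH
            SET @retries -= 1
            -- retry only on write conflict (41302) or dependency failure (41301)
            IF (@retries > 0 AND ERROR_NUMBER() IN (41302, 41301))
            BEGIN
                WAITFOR DELAY '00:00:00.010';   -- brief back-off before retrying
            END
            ELSE
            BEGIN
                ;THROW;   -- not a retryable error, or out of retries
            END
        END CATCH
    END
END
GO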

For more details on retry logic and the error conditions to consider, see the corresponding topic in Books Online.

Table-Valued Parameters

Like traditional interpreted T-SQL stored procedures, natively compiled stored procedures support table-valued parameters (TVPs), which allow you to pass a rowset into a stored procedure. For example, if you want to insert a sales order along with its line items, you can use a TVP to encapsulate the line items.

The syntax and mechanisms to define and use table-valued parameters with natively compiled procs are the same as for interpreted procs. The only thing you need to take care of is that you use a memory-optimized table type for the TVP. You can use memory-optimized table types with parameters in both native and interpreted stored procedures, as in the sketch below.
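Here is a minimal sketch, assuming a hypothetical memory-optimized table dbo.SalesOrderLines with matching columns; the table type and procedure names are illustrative.

CREATE TYPE dbo.SalesOrderLineType AS TABLE (
    order_id int NOT NULL,
    line_no  int NOT NULL,
    quantity int NOT NULL,
    PRIMARY KEY NONCLUSTERED (order_id, line_no)
) WITH (MEMORY_OPTIMIZED = ON)
GO

CREATE PROCEDURE dbo.usp_InsertOrderLines @lines dbo.SalesOrderLineType READONLY
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    -- insert all rows passed in through the memory-optimized TVP
    INSERT dbo.SalesOrderLines (order_id, line_no, quantity)
    SELECT order_id, line_no, quantity FROM @lines
END
GO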

For more details and an example of the use of TVPs in natively compiled stored procedures, see the Books Online topic on memory-optimized table variables.

Optimizing Client Invocation

In general, you can view a stored procedure execution as consisting of three phases:

Figure 1: Phases of stored procedure execution

  1. Client invocation. This includes the call from the client, as well as the TDS processing and parsing in the server.
  2. Query and DML processing. This is the part of the stack that is highly optimized with natively compiled stored procedures, and it is this phase of the stored procedure execution where you will see the gain provided by In-Memory OLTP.
  3. Validation and commit processing. This includes hardening the commit log record to disk. The length of this phase of the procedure execution is in most cases a function of the latency of the log IO subsystem. If more time is spent writing to the log than processing the queries and DML, the gain to be had from native compilation is going to be limited.

If your application is sensitive to latency, i.e. how long it takes for a single stored procedure to execute, you will also want to optimize how you call the stored procedure from the client, in order to limit the call overhead. Optimizing client calls of stored procedures is not specific to natively compiled stored procedures, but it does play a bigger role for natively compiled procs, as the client invocation is proportionately a larger part of the overall procedure execution time, due to the optimization in the processing of queries and DML operations.

To optimize stored procedure calls, we recommend:

  • Use the ODBC driver in SQL Server Native Client, version 11 or higher.
  • Bind the parameters using SQLBindParameter, and use numbered parameters. Using numbered parameters is more efficient than using named parameters.
  • Match the types of the parameters passed to the procedure with the types in the procedure definition. This avoids type casting, which introduces additional overhead.
  • If you need to invoke the procedure multiple times, use prepared execution – first prepare using SQLPrepare, and then execute using SQLExecute. If you need to execute a procedure only once, issue a direct RPC call using SQLExecDirect. Do not use the T-SQL EXECUTE statement.

For an example of both direct and prepared execution with the ODBC driver in SQL Native Client see Books Online here.

T-SQL Surface Area Limitations

SQL Server 2014 has some limitations on the features supported inside natively compiled stored procedures, which you should consider when using these stored procs, and if you want to get the most out of native procs.

Because of these limitations, it can be challenging to migrate stored procedures in your existing application to native. We suggest you look for patterns that fit the surface area for native procs and migrate those patterns to native. You do not always need to migrate an entire stored procedure: if the existing stored procedure has a substantial piece of logic that can be migrated to native, you can consider putting only that piece into a new native proc, and modify the existing proc to call the new one. Note that migrating a single statement to a natively compiled stored procedure may not be beneficial due to the overhead of stored procedure invocation - you really want to move a larger subset of the proc to native.

To understand which features in an existing interpreted T-SQL stored procedure are supported in natively compiled stored procedures, we recommend using the Native Compilation Advisor, which is part of Management Studio in SQL Server 2014. The Advisor will tell you which features used in the stored procedure are not supported in native, which will help in identifying the parts of the procedure that can be migrated to native, and will indicate the limitations you may need to work around.

The following two screenshots show an example of how to use the Advisor with the stored procedure dbo.uspGetBillOfMaterials in AdventureWorks.

Figure 2: Starting the Native Compilation Advisor for dbo.uspGetBillOfMaterials

 

Figure 3: Results of the Native Compilation Advisor for dbo.uspGetBillOfMaterials

In future blog posts we will go into more detail on how to perform a stored procedure migration given the surface area limitations, and give examples for migrating existing stored procedures to native.

Database Backup with Memory-Optimized Tables

Memory-optimized tables are backed up as part of regular database backups, so you don’t need to do anything special to manage backups on databases with memory-optimized tables. As for disk-based tables, the CHECKSUM of data/delta files is validated as part of database backup to proactively detect any storage corruption. However, as described in the blog storage allocation and management, the storage used by memory-optimized tables can be much larger than their footprint in memory.  A full backup of a database with one or more memory-optimized tables consists of the allocated storage for disk-based tables, the active transaction log, and the data/delta file pairs (i.e. checkpoint file pairs) for memory-optimized tables. This blog focuses on the size of database backups that you can expect for a database with memory-optimized tables.

Full Database Backup

For the discussion here, we will focus on database backups for databases with just durable memory-optimized tables, because the backup of disk-based tables is the same irrespective of the existence of memory-optimized tables. The data/delta file pairs, also referred to as Checkpoint File Pairs or CFPs, residing in the filegroup can be in various states at a given time. Please refer to merge-operation-in-memory-optimized-tables for details. The table below describes what part of the files is backed up, both in CTP2 and in RTM.

CFP State                   | Backup in CTP2                  | Backup in RTM
PRECREATED                  | File metadata only              | File metadata only
UNDER CONSTRUCTION          | File metadata + allocated bytes | File metadata only
ACTIVE                      | File metadata + allocated bytes | File metadata + used bytes
MERGE SOURCE                | File metadata + allocated bytes | File metadata + used bytes
MERGE TARGET                | File metadata + allocated bytes | File metadata only
REQUIRED FOR BACKUP/HA      | File metadata + allocated bytes | File metadata + used bytes
IN TRANSITION TO TOMBSTONE  | File metadata + allocated bytes | File metadata only
TOMBSTONE                   | File metadata + allocated bytes | File metadata only

Table-1: CFP states and database backup

You will find that the size of a database backup in SQL Server 2014 RTM is relatively smaller than what you had in CTP2.

Let us walk through a few examples to show the size of the backups. All these examples are based on the following database and table schema, using pre-RTM bits. For the states of the checkpoint file pairs (i.e. CFPs) used in these examples, please refer to the blog state-transition-of-checkpoint-files-in-databases-with-memory-optimized-tables.

CREATE DATABASE imoltp
GO

ALTER DATABASE imoltp ADD FILEGROUP imoltp_mod CONTAINS MEMORY_OPTIMIZED_DATA
ALTER DATABASE imoltp ADD FILE (name='imoltp_mod', filename='c:\data\imoltp_mod') TO FILEGROUP imoltp_mod
GO

use imoltp
go

-- create the table with each row around 8K
CREATE TABLE dbo.t_memopt (
       c1 int NOT NULL,
       c2 char(40) NOT NULL,
       c3 char(8000) NOT NULL,
       CONSTRAINT [pk_t_memopt_c1] PRIMARY KEY NONCLUSTERED HASH (c1)
       WITH (BUCKET_COUNT = 100000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
Go

We will use the following query to look at the states of CFPs

select file_type_desc, state, state_desc, internal_storage_slot, file_size_in_bytes, file_size_used_in_bytes, inserted_row_count, deleted_row_count, lower_bound_tsn, upper_bound_tsn
from sys.dm_db_xtp_checkpoint_files
order by file_type_desc, upper_bound_tsn

Example-1: Backup a database with no rows in memory-optimized tables

  • Backup Command:

BACKUP DATABASE [imoltp] TO DISK = N'C:\data\imoltp-empty-data.bak'
WITH NOFORMAT, INIT, NAME = N'imoltp-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10

  • Result: The size of the full database backup is around 5.3MB.  As you can see, even though the total storage taken on disk is (16MB*9) + (1MB*9) = 153MB, the size of the backup is much smaller. Another thing to note is that even though no data rows have been inserted, significant storage (i.e. 144MB of data files) is already taken, primarily because of the fixed storage overhead.

Example-2: Backup the database after loading 8000 rows

-- load 8000 rows. This should use 5 16MB files
declare @i int = 0
while (@i < 8000)
begin
       insert t_memopt values (@i, 'a', replicate ('b', 8000))
       set @i += 1;
end

  • Checkpoint files: Besides 8 CFPs in ‘PRECREATED’, there are now 5 CFPs ‘UNDER CONSTRUCTION’, each storing up to 1870 rows, to contain the 8000 data rows just inserted. Please refer to the blog state-transition-of-checkpoint-files-in-databases-with-memory-optimized-tables.

  • Backup Command:

BACKUP DATABASE [imoltp] TO DISK = N'C:\data\imoltp-full-data.bak'
WITH NOFORMAT, INIT, NAME = N'imoltp-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10

  • Output:

Processed 304 pages for database 'imoltp', file 'imoltp' on file 1.
Processed 0 pages for database 'imoltp', file 'imoltp_mod' on file 1.
10 percent processed.
20 percent processed.
Processed 543 pages for database 'imoltp', file 'imoltp_log' on file 1.
100 percent processed.
BACKUP DATABASE successfully processed 8847 pages in 1.191 seconds (58.027 MB/sec).

  • Result: The size of the full database backup is around 73.5MB.  Note that the size of the transaction log is 70MB and there is no data actually backed up, because the CFPs are either in ‘PRECREATED’ or ‘UNDER CONSTRUCTION’ state, as shown in Table-1. The data is only guaranteed to be in the data/delta files after a CFP transitions into the ‘ACTIVE’ state.

Example-3: Backup the database after taking an explicit checkpoint

-- do a manual checkpoint
checkpoint

  • Checkpoint files: Besides 8 CFPs in ‘PRECREATED’, there are now 5 CFPs with state ‘ACTIVE’. The manual checkpoint transitions the CFPs in ‘UNDER CONSTRUCTION’ state to ‘ACTIVE’.
  • Backup Command:

-- the backup will include full log and the data. So the size is double - 138MB
BACKUP DATABASE [imoltp] TO DISK = N'C:\data\imoltp-full-data-chkpt.bak'
WITH NOFORMAT, INIT, NAME = N'imoltp-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

  • Output:

Processed 304 pages for database 'imoltp', file 'imoltp' on file 1.
10 percent processed.
20 percent processed.
Processed 8064 pages for database 'imoltp', file 'imoltp_mod' on file 1.
30 percent processed.
40 percent processed.
Processed 8548 pages for database 'imoltp', file 'imoltp_log' on file 1.
100 percent processed.
BACKUP DATABASE successfully processed 16916 pages in 1.872 seconds (70.594 MB/sec).

  • Result: The size of the full database backup is around 138MB. It consists of 70MB of transaction log and also 66MB (8064 8k pages) of contents from data/delta files.  

Example-4: Backup the database after deleting 50% rows and merging the CFPs

-- now delete 50% rows
declare @i int = 0
while (@i <= 8000)
begin
       delete t_memopt where c1 = @i
       set @i += 2;
end

checkpoint

-- Do the manual merge. It generates merge-target and other files stay as regular files
-- The transaction range here is picked up by querying the DMV
-- sys.dm_db_xtp_checkpoint_files. Please find the appropriate range for your test.
--
exec sys.sp_xtp_merge_checkpoint_files 'imoltp',  1877, 12004
go

-- This installs the merge leading to 5 CFPs as MERGE SOURCE and the merge target
-- transitions to ACTIVE state
checkpoint
go

BACKUP DATABASE [imoltp]
TO DISK = N'C:\data\imoltp-full-data-chkpt-del50-merge-chkpt.bak'
WITH NOFORMAT, INIT, NAME = N'imoltp-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

  • Checkpoint files: Besides 8 CFPs in ‘PRECREATED’, there are now 5 CFPs with state ‘MERGE SOURCE’ and 2 CFPs as ‘ACTIVE’.
  • Backup Command:

-- the backup will include full log and the data.
BACKUP DATABASE [imoltp] TO DISK = N'C:\data\imoltp-full-data-chkpt-del50-merge-chkpt.bak'
WITH NOFORMAT, INIT, NAME = N'imoltp-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

  • Output:

Processed 304 pages for database 'imoltp', file 'imoltp' on file 1.
10 percent processed.
20 percent processed.
Processed 12143 pages for database 'imoltp', file 'imoltp_mod' on file 1.
30 percent processed.
40 percent processed.
Processed 8815 pages for database 'imoltp', file 'imoltp_log' on file 1.
100 percent processed.
BACKUP DATABASE successfully processed 21262 pages in 1.794 seconds (92.588 MB/sec).

  • Result: The size of the full database backup is around 172MB. It consists of 72MB of transaction log and 100MB of data/delta files. The reason the size has grown by around 50% even though we deleted 4000 rows is that the ‘MERGE SOURCE’ CFPs still have all 8000 rows.

Example-5: Backup after MERGE SOURCE CFPs transition to TOMBSTONE state

checkpoint
go

-- Do the log backup. This log backup is around 71MB
BACKUP LOG [imoltp] TO DISK = N'C:\data\imoltp-log-1.bak'
WITH NOFORMAT, INIT, NAME = N'imoltp-log Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

checkpoint
go

-- Do one more log backup. This backup reduces active transaction log size
-- to 7MB
BACKUP LOG [imoltp] TO DISK = N'C:\data\imoltp-log-2.bak'
WITH NOFORMAT, INIT, NAME = N'imoltp-log Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

checkpoint
go

-- do one more log backup
BACKUP LOG [imoltp] TO DISK = N'C:\data\imoltp-log-3.bak'
WITH NOFORMAT, INIT, NAME = N'imoltp-log Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

  • Checkpoint files: Besides 8 CFPs in ‘PRECREATED’, there are now 6 CFPs ‘IN TRANSITION TO TOMBSTONE’ and 5 CFPs ‘ACTIVE’. Only one ACTIVE CFP contains the 4000 data rows; the other ACTIVE CFPs were created due to the manual checkpoints. In a production environment, the checkpoints will occur automatically and log backups will be taken as part of the regular operational process.
  • Backup Command:

Since the merged-source CFPs have either transitioned to TOMBSTONE or are in transition to it, the size of the database backup is now 38MB (only one copy of the data, as the transaction log has been backed up and freed).

BACKUP DATABASE [imoltp]
TO DISK = N'C:\data\imoltp-full-data-chkpt-del50-merge-chkpt-logbkup3.bak'
WITH NOFORMAT, INIT, NAME = N'imoltp-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

  • Output:

Processed 288 pages for database 'imoltp', file 'imoltp' on file 1.
10 percent processed.
Processed 4128 pages for database 'imoltp', file 'imoltp_mod' on file 1.
Processed 23 pages for database 'imoltp', file 'imoltp_log' on file 1.
100 percent processed.
BACKUP DATABASE successfully processed 4439 pages in 0.394 seconds (88.012 MB/sec).

  • Result: The size of the full database backup is around 38MB. This is because we have around 32MB of data in the ACTIVE CFPs. There are no CFPs in the ‘MERGE SOURCE’ or ‘REQUIRED FOR BACKUP/HA’ states.

Summary

The size of a backup of a database with one or more memory-optimized tables is typically bigger than the in-memory size of the memory-optimized tables, but smaller than their on-disk storage. The extra size depends on the number of Checkpoint File Pairs (i.e. CFPs) in the ‘MERGE SOURCE’ and ‘REQUIRED FOR BACKUP/HA’ states, which in turn depends on the workload.

In-Memory OLTP Index Troubleshooting, Part II

With the In-Memory OLTP feature in SQL Server 2014 we introduce two new types of indexes for memory-optimized tables: the memory-optimized NONCLUSTERED indexes, and the memory-optimized HASH indexes.

Memory-optimized NONCLUSTERED indexes behave similarly to traditional NONCLUSTERED indexes from an application point-of-view. The main difference is that memory-optimized indexes are always covering (i.e. all columns are virtually included), while with traditional disk-based NONCLUSTERED indexes you need to specify which columns you want to include alongside the index key columns.

Memory-optimized HASH indexes behave differently from NONCLUSTERED indexes. They are optimized for point-lookup operations, and do not support ordered scans or inequality seek operations. In addition, you need to specify a BUCKET_COUNT when creating the index, and you need to pick the right value in order for the index to perform optimally. In earlier posts we covered the following three index troubleshooting aspects:

  • Bucket_count troubleshooting for HASH indexes: how to find out whether a bucket_count is too low, and how to determine what should be the correct bucket_count. You can also find more information about determining the right bucket_count in Books Online.
  • Searching on a subset of the index key: HASH indexes do not support searching on the leading columns of an index key.
  • Searching on inequality predicates and ordered scans: both operations are not supported with HASH indexes, but they are supported with NONCLUSTERED indexes. Note, however, that NONCLUSTERED indexes support ordered scans only in the direction indicated by the index key: for example, if the index key is (c1 DESC), the index supports retrieving the values for c1 in descending order, but not in ascending order.

All that said, it seems that HASH indexes have quite a few limitations, compared with NONCLUSTERED indexes. For that reason, it is usually a safer bet to start with NONCLUSTERED indexes, both for new applications and when migrating existing disk-based tables to memory-optimized. You can then use HASH indexes to further optimize the workload. Indeed, HASH indexes are the most efficient index for equality search (point lookup) and full table scan operations.
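For reference, here is a minimal sketch of a memory-optimized table declaring one index of each type. The table and column names are hypothetical; note that in SQL Server 2014 all indexes on a memory-optimized table must be specified when the table is created.

CREATE TABLE dbo.t_orders (
    order_id    int       NOT NULL,
    customer_id int       NOT NULL,
    order_date  datetime2 NOT NULL,
    -- HASH index: best for point lookups; size BUCKET_COUNT to the expected number of unique keys
    CONSTRAINT pk_t_orders PRIMARY KEY NONCLUSTERED HASH (order_id) WITH (BUCKET_COUNT = 1000000),
    -- memory-optimized NONCLUSTERED index: supports inequality seeks and ordered (ascending) scans
    INDEX ix_orders_customer NONCLUSTERED (customer_id, order_date)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO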

In the remainder of this post we:

  • Describe a way to troubleshoot seeks vs. scans on memory-optimized indexes using the new DMV sys.dm_db_xtp_index_stats: if there are more full index scans than expected, one of the above-mentioned issues may be the cause: trying to search on a subset of a HASH index key, or trying to search on inequality predicates with a HASH index.
  • Troubleshoot duplicates: you may run into problems if the index keys of a memory-optimized index contain a lot of duplicate values, particularly when using a HASH index. We describe how to detect this situation and how to work around it.

Troubleshooting seek vs. scan using XTP index DMVs

The DMV sys.dm_db_xtp_index_stats shows statistics for index operations performed by the in-memory storage engine. It contains stats about the usage of each index since its creation in memory - note that memory-optimized indexes are always recreated on database restart. You can use the following query to retrieve key statistics about the usage of the indexes on your table:

SELECT ix.index_id, ix.name, scans_started, rows_returned
FROM sys.dm_db_xtp_index_stats ixs JOIN sys.indexes ix ON
ix.object_id=ixs.object_id AND ix.index_id=ixs.index_id
WHERE ix.object_id=object_id('<table name>')

For troubleshooting indexes, the columns ‘scans_started’ and ‘rows_returned’ contain key information:

  • scans_started – this is the number of scan operations the in-memory storage engine has started. Note that from the storage engine point-of-view, all operations to locate a row or the location to insert a new row are scans: for example, a full index scan, a point lookup, and a row insert all require a single scan operation.
  • rows_returned – the cumulative number of rows returned by all scan operations in the storage engine. Note that this number reflects the rows returned by the storage engine, not the number of rows returned to the client. For example, the query plan may call for a filter condition or aggregation that reduces the number of rows before they are returned to the client.
    • Insert operations do not result in rows being returned. Update and delete operations consist of a scan, to locate the row to be updated, followed by the actual row update/delete.

If the number of rows_returned is significantly larger than scans_started, this is an indication that, on average, index operations scan a large part of the index. If all index operations are expected to be point lookups, this could be an indication of one of the earlier-mentioned problems, where the query calls for an operation that is not supported by the index and thus causes a revert to a full index scan: for example, a search on a subset of the hash index key columns, or a search on inequality predicates with a hash index.

If scans_started is larger than rows_returned, this is an indication that the workload is insert-heavy, or that a lot of point lookups failed to locate a row.

Index keys with many duplicate values

Issue: Index keys with many duplicate values can cause performance problems. If each index key has 5 duplicates this is usually not a problem, but if the discrepancy between the number of unique index keys and the number of rows in the tables becomes very large – more than 10X – this can become problematic.

All rows with the same index key end up in the same duplicate chain. For hash indexes this can create a lot of overhead in case of hash collisions: if multiple index keys end up in the same bucket due to a hash collision, index scanners always need to scan the full duplicate chain for the first value before they can locate the first row corresponding to the second value. For nonclustered indexes this causes additional overhead for garbage collection.

Symptom: For hash indexes the performance of DML operations degrades and CPU utilization increases. In addition, there is an increase in CPU utilization during database startup, and a potential increase in recovery time. This becomes especially clear when inserting a large number of rows.

For nonclustered indexes garbage collection will start to consume more resources, resulting in an overall increased CPU utilization in the system. The problem does not affect DML operations directly,[1] but it does put more overall pressure on system resources.

How to troubleshoot: The average number of duplicate rows for a given index key can be obtained using T-SQL queries. First determine the row count of the table, then determine the number of unique index key values. Divide the row count by the number of unique index keys to obtain the average number of duplicates.

To determine the row count for the table use the following query:

select count(*) as 'Row count' from <tableName>

To determine the number of unique index key values use the following query:

select count(*) as 'Distinct index key values' from (select distinct <indexKeyColumns> from <tableName>) a
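If you prefer a single statement, here is a sketch that combines the two steps above into one average-duplicates calculation, using the same <tableName> and <indexKeyColumns> placeholders:

select (select count(*) from <tableName>) * 1.0 /
       (select count(*) from (select distinct <indexKeyColumns> from <tableName>) a)
       as 'Average duplicates per index key'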

For hash indexes specifically, you can also troubleshoot using the hash index stats DMV. Use the following query:

SELECT hs.object_id, object_name(hs.object_id) AS 'object name', i.name as 'index name', hs.*
FROM sys.dm_db_xtp_hash_index_stats AS hs
JOIN sys.indexes AS i ON hs.object_id=i.object_id AND hs.index_id=i.index_id;

If the average chain length is high and the number of empty buckets is high, it is likely that there are many rows with duplicate index key values or there is a skew in the key values.

Workaround: First, evaluate if the index is truly needed. The index may have been added in the past to support queries that are no longer part of the workload.

For HASH indexes there are two ways to work around the issue of many duplicates:

  • In most cases you will want to use a NONCLUSTERED index instead, as NONCLUSTERED indexes generally perform better in case of duplicates. If you go for this option, consider uniquifying the index key, as indicated below.
  • Alternatively, you can over-size the index by using a very high bucket count; for example, 20 – 100 times the number of unique index key values. This will make hash collisions unlikely.

For NONCLUSTERED indexes with a lot of duplicates, consider adding additional columns to the index key. For example, you can add the primary key columns to the index key to make it unique, in other words to uniquify the index.

Adding columns to the index key does come with a performance penalty for DML operations. Therefore, it is important to test the new uniquified index under production load, and compare the performance with the old index. Consider the overall throughput of the system, query performance, and also the overall resource utilization, especially CPU.
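As an illustration of uniquifying, here is a sketch with hypothetical table and column names; because indexes on memory-optimized tables in SQL Server 2014 can only be defined at table creation time, changing an index key means recreating the table.

CREATE TABLE dbo.t_order_lines (
    order_id    int NOT NULL,
    line_no     int NOT NULL,
    customer_id int NOT NULL,
    CONSTRAINT pk_t_order_lines PRIMARY KEY NONCLUSTERED (order_id, line_no),
    -- customer_id alone has many duplicates; appending the primary key columns
    -- makes the index key unique and keeps duplicate chains short
    INDEX ix_customer NONCLUSTERED (customer_id, order_id, line_no)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO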



[1] In SQL 2014 CTP2 user transactions could be affected, due to garbage collection performed at commit time. This issue will be resolved in SQL 2014 RTM: garbage collection no longer directly affects user transactions.

State Transition of Checkpoint Files in Databases with Memory-Optimized Tables

In the previous blog Merge Operations, we described how a CFP transitions across various stages in its life cycle. In this blog, we will walk through an actual example to show the state transitions. Note, some of the states shown for CFPs are only visible in the RTM bits (not in CTP2), but internally the state transitions are the same. Basing this example on RTM bits adds a lot more clarity to how storage is managed. Another note is that the RTM bits allocate smaller CFPs (16MB data file and 1MB delta file) on machines with physical memory <= 16GB. This example was run on a machine with 8GB of RAM using pre-RTM bits.

CREATE DATABASE hkdb1 ON
 PRIMARY (NAME = [hkdb1_hk_fs_data], FILENAME = 'C:\hekaton_test\data\hkdb1_data.mdf'),
 FILEGROUP [hkdb1_hk_fs_fg] CONTAINS MEMORY_OPTIMIZED_DATA
 (NAME = [hkdb1_hk_fs_dir],  FILENAME = 'C:\hekaton_test\data\hkdb1_hk_fs_dir')
 LOG ON (name = [hktest_log], Filename='C:\hekaton_test\data\hkdb1_log.ldf', size=100MB)
go

-- disable auto-merge so that we can show the merge
-- in a predictable way
dbcc traceon (9851, -1)

-- set the database to full recovery. This is a common practice for production scenarios
alter database hkdb1 set recovery full
go

use hkdb1
go

-- create a memory-optimized table with each row of size > 8KB
CREATE TABLE dbo.t_memopt (
       c1 int NOT NULL,
       c2 char(40) NOT NULL,
       c3 char(8000) NOT NULL,
       CONSTRAINT [pk_t_memopt_c1] PRIMARY KEY NONCLUSTERED HASH (c1) WITH (BUCKET_COUNT = 100000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
go

-- do the database backup. We do this so that we can do transaction log backups
BACKUP DATABASE [hkdb1] TO  DISK = N'C:\hekaton_test\data\hk-empty-data.bak'
WITH NOFORMAT, INIT,  NAME = N'hkdb1-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

-- load 8000 rows. This should load 5 16MB data files on a machine with <= 16GB of memory
declare @i int = 0
while (@i < 8000)
begin
        insert t_memopt values (@i, 'a', replicate ('b', 8000))
        set @i += 1;
end
go

In this example, we will show the output of DMV sys.dm_db_xtp_checkpoint_files using the following query.

select file_type_desc, state_desc, internal_storage_slot, file_size_in_bytes, file_size_used_in_bytes,
inserted_row_count, deleted_row_count, lower_bound_tsn, upper_bound_tsn, last_backup_page_count, drop_table_deleted_row_count
from sys.dm_db_xtp_checkpoint_files
order by container_id, file_type_desc, upper_bound_tsn

Here is the output after removing some columns and rows (showing only rows for DATA files).

  • There are 8 data files in the ‘PRECREATED’ state. This demo was run on a machine with 4 logical processors and 8GB of memory. The size of the data files is 16MB (supported in SQL Server 2014 RTM on machines with <= 16GB of memory).
  • There are 5 data files marked ‘UNDER CONSTRUCTION’, as a checkpoint has not been taken, but these files together store the 8000 rows that we inserted. If the database is restarted, the data rows will be loaded using the transaction log and NOT from these files, as they are not part of a durable checkpoint.

Now, execute a manual checkpoint by running the following command and observe the state transition of the CFPs. You will note that the UNDER CONSTRUCTION CFPs are now marked ACTIVE, as they are part of the durable checkpoint that we have just completed. The manual checkpoint closed the last data file (see its internal_storage_slot) even though it was only 25% full. This data file can potentially be merged in the future as it qualifies under the merge policy.

checkpoint

go

Now, we will delete 50% of the rows:

-- now delete 50% rows
declare @i int = 0
while (@i <= 8000)
begin
       delete t_memopt where c1 = @i
       set @i += 2;
end
go

checkpoint
go

Here is the output of the DMV for CFPs in ACTIVE state using the following query

select file_type_desc, state_desc, internal_storage_slot, file_size_in_bytes, file_size_used_in_bytes,
inserted_row_count, deleted_row_count, lower_bound_tsn, upper_bound_tsn, last_backup_page_count, drop_table_deleted_row_count
from sys.dm_db_xtp_checkpoint_files
where state = 2
order by container_id, file_type_desc, upper_bound_tsn

Note that the delta files now have references to the deleted rows. There is a CFP with 0 data rows due to a manual checkpoint.

For this example, we disabled auto-merge so that we could see the state transitions with no surprises.

We will now force a manual merge (merging all the ACTIVE CFPs) by executing the following command

-- now do the manual merge
-- it generates merge-target and other files stay as regular files
exec sys.sp_xtp_merge_checkpoint_files 'hkdb1',  1877, 12004
go

The DMV shows that the merge was executed and there is a new CFP marked as MERGE TARGET containing 4000 rows. Also, note that the data file is around 33MB, much larger than 16MB. This happened because we forced the merge; the auto-merge would not have merged all ACTIVE CFPs because doing so would not meet the merge policy.

At this time, the CFPs that are sources of the MERGE are still marked as ACTIVE. This is because the merge has not been installed. You can query the state of the merge request as follows:

select request_state_desc, lower_bound_tsn, upper_bound_tsn
from sys.dm_db_xtp_merge_requests
go

request_state_desc   lower_bound_tsn      upper_bound_tsn
-------------------- -------------------- --------------------
PENDING              0                    12007

 

Now, we will force a checkpoint to install the merge.

checkpoint

go

Here is the output of the DMV. It shows that the MERGE TARGET has now changed to the ACTIVE state and all the CFPs that were used as sources of the MERGE are now marked as MERGED SOURCE. Referring to the blog Storage Allocation and Management for Memory-Optimized Tables, these CFPs are now in category-3.

The source CFPs will transition into category-4 automatically as automatic checkpoints and log backups are taken. For this example, we will force the checkpoint and log backups as follows:

checkpoint
go

-- do the log backup.
BACKUP LOG [hkdb1] TO  DISK = N'C:\hekaton_test\data\hk-log-1.bak'
WITH NOFORMAT, INIT,  NAME = N'hk-log Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

Here is the output of the DMV showing that the earlier CFPs have now transitioned to the state REQUIRED FOR BACKUP/HA, meaning that these CFPs can’t be garbage collected, as they are needed for the operational correctness of the database with memory-optimized tables.

 

After a couple of manual checkpoints followed by log backups, the DMV output shows that the CFPs are finally being handed off to the FILESTREAM garbage collector. At this stage, these CFPs are not needed for the operational correctness of our database. To state it differently, a full database backup will NOT include these CFPs, but they continue to take storage space in the memory-optimized filegroup until they are garbage collected.

After one more cycle of manual checkpoint followed by log backup, the CFPs transition to the TOMBSTONE state, which indicates that they will be removed as part of FILESTREAM garbage collection. Please refer to FS Garbage Collection for details.

 

This concludes the state transition of CFPs. The main point to remember is that it takes a few checkpoints and log backups after a set of CFPs has been merged before the storage can be garbage collected. We expect these state transitions to occur automatically in a production environment, as checkpoints occur automatically for each additional 512MB of transaction log generated, and regular log backups are taken as part of the backup strategy. However, you will need to account for this extra storage as CFPs transition from MERGED SOURCE to finally getting garbage collected.

Merge Operation in Memory-Optimized Tables

In Storage Allocation and Management for Memory-Optimized Tables, we briefly touched upon the Merge operation. This blog describes the Merge operation in detail. We also recommend referring to implementing durability for memory-optimized tables for a good understanding of data/delta files and CFPs.

The metadata of all Checkpoint File Pairs (i.e. CFPs) that exist in storage is stored in an internal array structure referred to as the storage array. It is a finitely sized array (4096 entries in CTP2 and 8192 entries in RTM) of CFPs, supporting a cumulative size of 256GB for durable memory-optimized tables in the database. The entries in the storage array are ordered by transaction range. The CFPs in the storage array (along with the tail of the log) represent all the on-disk state required to recover a database with memory-optimized tables.

Running an OLTP workload over time can cause the storage array to fill up, leading to the inefficiencies described here:

  1. Deleted rows – The deleted rows are not actually removed from the data file; instead, a reference to the deleted row is added to the corresponding delta file. This wastes storage, which in turn impacts recovery time negatively.
  2. Storage array manipulation overhead – Internal processes search the storage array for operations such as finding the delta file in which to record a deleted row. The cost of these operations increases with the number of entries.

To alleviate these inefficiencies, the older closed CFPs are merged, based on a merge policy (described below), so the storage array is compacted to represent the same set of data with a reduced number of CFPs and reduced storage.

Merge Operation

The Merge operation takes one or more closed CFPs, called MERGE SOURCE(s), based on an internally defined merge policy, described later, and produces one resultant CFP, called MERGE TARGET. The entries in each delta file of source CFP(s) are used to filter rows from the corresponding data file to remove the data rows that are not needed anymore. The remaining rows in the source CFPs are consolidated into one target CFP.  After the merge is complete, the resultant CFP (i.e. the merge-target) replaces the source CFPs (i.e. the merge sources). The merge-source CFPs go through a transition phase before they are eventually removed from the storage.

In the example below, the memory-optimized table filegroup has four data and delta file pairs at timestamp 500, containing data from previous transactions. For example, the rows in the first data file correspond to transactions with timestamp > 100 and <= 200, alternatively represented as (100, 200].  The second and third data files are shown to be less than 50% full after accounting for the rows marked deleted. The merge operation combines these two CFPs and creates a new CFP containing transactions with timestamp > 200 and <= 400, which is the combined range of these two CFPs.  The presence of another CFP with range (500, 600] and a non-empty delta file for transaction range (200, 400] shows that the Merge operation can be done concurrently with transactional activity, including deleting more rows from the source CFPs.

In the In-Memory OLTP engine, a background thread evaluates all closed CFPs using a merge policy and then initiates one or more merge requests for the qualifying CFPs. These merge requests are processed by the offline checkpoint thread. The evaluation of the merge policy is done periodically and also when a checkpoint is closed.

Merge Policy

SQL Server 2014 implements the following merge policy

  • A merge is scheduled if 2 or more consecutive CFPs can be consolidated, after accounting for deleted rows, such that the resultant rows fit into one data file of the ideal size. The ideal size of a data file is determined as follows:
    • For machines with <= 16GB of memory, the data file is 16MB and the delta file is 1MB (not supported in CTP2).
    • For machines with more than 16GB of memory, the data file is 128MB and the delta file is 8MB.

A maximum of 10 consecutive CFPs can be part of one merge operation.

  • A single CFP can be self-merged if the data file is larger than 256MB and over half of the rows are marked deleted. A data file can grow larger than 128MB if, for example, a single transaction or multiple concurrent transactions insert/update a large amount of data. This causes the data file to grow beyond its ideal size, because a transaction cannot span multiple CFPs.  Note, self-merge is not supported in CTP2.

Here are some examples that show the CFPs that will be merged under the merge policy.

Adjacent CFP Source Files (% full)             | Merge Selection
CFP0 (30%), CFP1 (50%), CFP2 (50%), CFP3 (90%) | (CFP0, CFP1). CFP2 is not chosen, as it would make the resultant data file > 100% of the ideal size.
CFP0 (30%), CFP1 (20%), CFP2 (50%), CFP3 (10%) | (CFP0, CFP1, CFP2). Files are chosen starting from the left. CFP3 is not combined, as it would make the resultant data file > 100% of the ideal size.
CFP0 (80%), CFP1 (30%), CFP2 (10%), CFP3 (40%) | (CFP1, CFP2, CFP3). Files are chosen starting from the left. CFP0 is skipped because, if combined with CFP1, the resultant data file would be > 100% of the ideal size.

Not all CFPs with available space qualify for merge. For example, if two adjacent CFPs are each 60% full, they will not qualify for merge, which results in 40% wasted storage in those CFPs. In the worst case, all CFPs can be a little over 50% full, leading to storage utilization of only approximately 50%. It is important to note that deleted rows may exist in storage even though they have already been removed (i.e. garbage collected) from memory; the management of storage and of memory is somewhat independent from a garbage collection perspective. A rule of thumb is that the storage taken by ‘ACTIVE’ CFPs (note, this does not account for all CFPs) can be up to 2x larger than the size of durable tables in memory.

You can force a merge by using the stored procedure sys.sp_xtp_merge_checkpoint_files (Transact-SQL) to override the merge policy. For details, please refer to http://msdn.microsoft.com/en-us/library/dn198330(v=sql.120).aspx

Life Cycle of a CFP

As mentioned earlier, once CFPs are merged, they go through a state transition before they can be deallocated. At any given time, the CFP data/delta file pairs can be in one of the following states:

  • PRECREATED – A small set of CFPs are kept pre-allocated to minimize or eliminate any waits to allocate new files as transactions are being executed. These are full sized with data file size of 128MB and delta file size of 8 MB but contain no data. The number of CFPs is computed as the number of logical processors or schedulers with a minimum  of 8. This is a fixed storage overhead in databases with memory-optimized tables
  • UNDER CONSTRUCTION – Set of CFPs that store newly inserted and possibly deleted data rows since the last checkpoint.
  • ACTIVE – These contain the inserted/deleted rows from previous closed checkpoints. These CFPs contain all the inserted/deleted rows required before applying the active part of the transaction log at database restart. We expect the size of these CFPs to be approximately 2x the in-memory size of the memory-optimized tables, assuming the merge operation is keeping up with the transactional workload.
  • MERGE TARGET – CFP stores the consolidated data rows from the CFP(s) that were identified by the merge policy. Once the merge is installed, the MERGE TARGET transitions into ACTIVE state
  • MERGED SOURCE – Once the merge operation is installed, the source CFPs are marked as MERGED SOURCE. Note, the merge policy evaluator may identify multiple potential merges, but a CFP can only participate in one merge operation.
  • REQUIRED FOR BACKUP/HA – Once the merge has been installed and the MERGE TARGET CFP is part of a durable checkpoint, the merge source CFPs transition into this state. CFPs in this state are needed for operational correctness of the database with memory-optimized tables, for example, to recover from a durable checkpoint to go back in time. A CFP can be marked for garbage collection once the log truncation point moves beyond its transaction range.
  • IN TRANSITION TO TOMBSTONE – These CFPs are not needed by the In-Memory OLTP engine and can be garbage collected. This state indicates that these CFPs are waiting for the background thread to transition them to the next state, TOMBSTONE.
  • TOMBSTONE – These CFPs are waiting to be garbage collected by the filestream garbage collector.  Please refer to FS Garbage Collection for details

After accounting for the storage taken by CFPs in various states, the overall storage taken by durable memory-optimized tables can be much larger than 2x. The DMV sys.dm_db_xtp_checkpoint_files can be queried to list all the CFPs in the memory-optimized filegroup, including the state they are in. Transitioning CFPs from the MERGED SOURCE state to TOMBSTONE, and ultimately getting them garbage collected, can take up to 5 checkpoint and transaction log backup steps (if the database is configured for the full or bulk-logged recovery model). You can, of course, manually force checkpoints followed by log backups to expedite the garbage collection, but then this will add 5 empty CFPs (i.e. 5 data/delta file pairs with a data file of size 128MB each). In production scenarios, the automatic checkpoints and the log backups taken as part of the backup strategy will seamlessly transition CFPs through the various states without requiring any manual intervention. The impact of the garbage collection process is that databases with memory-optimized tables may have a disproportionate storage footprint compared to their size in memory. It is not uncommon for the storage occupied by memory-optimized tables to be 3x their size in memory.
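To see where that storage sits at any point in time, here is a quick sketch that summarizes the checkpoint files by state; the DMV and columns are the same ones queried elsewhere in these posts (each row of the DMV is an individual data or delta file, so the counts are files, not CFPs):

select state_desc,
       count(*) as [checkpoint file count],
       sum(file_size_in_bytes) / 1024 / 1024 as [on-disk size (MB)]
from sys.dm_db_xtp_checkpoint_files
group by state_desc
order by state_desc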

In the next blog, we will walk through an example showing transition of CFPs across all phases.

Storage Allocation and Management for Memory-Optimized Tables

As described in implementing durability for memory-optimized tables, the storage requirements and storage management for memory-optimized tables are very different from those of disk-based tables. If you are migrating a subset of your disk-based tables to memory-optimized tables, or are developing a new application with memory-optimized tables, you will notice these differences. Unlike disk-based tables in SQL Server, which use 8K pages to store data in the same format both on disk and in memory, memory-optimized tables use a different size and format in memory and on disk. This has been a point of concern for many customers, as they have sometimes observed disproportionate storage consumed by memory-optimized tables. The goal of this blog is to help alleviate this confusion by describing how storage is managed over the lifetime of data rows.

The data for memory-optimized tables is stored in one or more data/delta file pairs (also referred to as checkpoint file pairs or CFPs), with data files storing inserted rows and delta files referencing deleted rows.  During the execution of an OLTP workload, as DML operations update, insert, and delete rows, new data/delta files are created to persist the data, and existing delta files are updated to record deletes of existing rows.  Over time, if the number of active rows (after accounting for deleted rows) in two or more consecutive CFPs falls below a threshold (usually < 50%) such that they can be merged into one CFP of 128MB, they are merged automatically by a background merge process into a new CFP.  Once the merge operation is complete, the older CFPs go through a transition phase and are eventually removed (i.e. garbage collected) from storage.  Note, SQL Server 2014 CTP2 supports up to 4096 CFPs within a database, but this limit will be changed to 8192 CFPs in RTM.

At any given time, the data/delta file pairs are in one of the following 4 categories

  1. Pre-allocated CFPs (data and delta): A small set of CFPs are kept pre-allocated to minimize or eliminate any waits to allocate new files as transactions are being executed. These are full sized with data file size of 128MB and delta file size of 8MB but contain no data. The number of CFPs is computed as the number of logical processors or schedulers with a minimum of 8. This is a fixed storage overhead in databases with memory-optimized tables
  2. Active CFPs: These contain the inserted/deleted rows for the last ‘durable’ checkpoint. These CFPs contain all inserted/deleted rows required before applying the active part of the transaction log at the database restart. We expect that the combined size of the ACTIVE CFPs to be at most 2x of the in-memory size of memory-optimized tables. However, there are cases, specifically with data files > 128MB which can happen due to large concurrent transactions or if the merge operation falls behind, when we may exceed this 2x limit. To simplify the discussion here, let us just assume all data files are 128MB and that the merge/checkpoint operations are keeping up.
  3. CFPs required for Operational Correctness: These represent files that were a source of MERGE operations, where rows from these files were moved to new data/delta files. These files (i.e. CFPs) are not needed for database restart.  However, these CFPs can’t be deallocated or marked for deallocation (i.e. garbage collection) because they are needed if SQL Server needs to recover from a previous durable checkpoint.  A CFP can be marked for garbage collection once the log truncation point moves beyond its transaction range. This requires a combination of checkpoint operations and transaction log backups. Transaction log backups are not needed if the database has been configured in the simple recovery model.
  4. CFPs that can be removed from storage: These are garbage collected in the same way that file-stream files are garbage collected. Please refer to FS Garbage Collection for details.
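To see how many checkpoint files are in each state and how much storage they consume, you can query the checkpoint-file DMV. The following is a minimal sketch; it assumes a SQL Server 2014 CTP2 database with a memory-optimized filegroup, and the exact column names of this DMV may change between CTP2 and RTM.

SELECT file_type_desc, state_desc,
       COUNT(*) AS file_count,
       SUM(file_size_in_bytes) / 1024 / 1024 AS total_size_mb  -- on-disk size per state
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY file_type_desc, state_desc
ORDER BY file_type_desc, state_desc;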

Transitioning CFPs out of category 3 and category 4 can take up to 5 checkpoints and, unless the database uses the simple recovery model, transaction log backup steps. For databases with memory-optimized tables, an automatic checkpoint is taken when the transaction log has grown by more than 512 MB since the last checkpoint. You can, of course, manually force a checkpoint followed by a log backup to expedite garbage collection, but this will add 5 empty CFPs (i.e. 5 data/delta file pairs with a data file of 128 MB each). Typically, in a test environment, you may have to force transaction log backups and checkpoints to remove CFPs that are no longer needed. In production scenarios, however, we don’t expect customers to issue manual checkpoints; instead, rely on automatic checkpoints and the log backups taken as part of your backup strategy. The net effect of this garbage collection process is that databases with memory-optimized tables may have a disproportionate storage footprint compared to their size in memory.
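For example, in a test environment you could expedite the transition with a sequence like the one below. This is a sketch only; the database name InMemDB and the backup path are hypothetical, and each checkpoint/log-backup cycle simply advances the log truncation point so merged-source CFPs can eventually be garbage collected.

CHECKPOINT;
BACKUP LOG InMemDB TO DISK = N'D:\Backups\InMemDB_log.trn';
-- Repeat the checkpoint + log backup pair (up to about 5 cycles) to move
-- older CFPs through their transition states so they can be reclaimed.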

To clarify the point further, let us walk through some examples of databases with memory-optimized tables under various scenarios and compare their size in memory and on storage. For these examples, assume a SQL Server instance with 8 logical processors and a single memory-optimized table with the following schema. Note that the row size is approximately 8 KB.

CREATE TABLE dbo.t_memopt (
       c1 int NOT NULL,
       c2 char(40) NOT NULL,
       c3 char(8000) NOT NULL,

       CONSTRAINT [pk_t_memopt_c1] PRIMARY KEY NONCLUSTERED HASH (c1) WITH
       (BUCKET_COUNT = 100000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)

As indicated earlier, the fixed storage overhead for this configuration will be 1 GB (i.e. 8 data files of 128MB each).
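For completeness, a table like this can only be created in a database that has a MEMORY_OPTIMIZED_DATA filegroup. A minimal sketch follows; the database name and file paths are hypothetical.

CREATE DATABASE InMemDB
ON PRIMARY (NAME = InMemDB_data, FILENAME = N'D:\Data\InMemDB.mdf'),
FILEGROUP InMemDB_mod CONTAINS MEMORY_OPTIMIZED_DATA
       (NAME = InMemDB_mod_container, FILENAME = N'D:\Data\InMemDB_mod')
LOG ON (NAME = InMemDB_log, FILENAME = N'D:\Data\InMemDB_log.ldf');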

  • Example-1:
    • DML Operation: Insert 1 row
    • In-memory Size – you can get the size by querying sys.dm_db_xtp_table_memory_stats (a sample query is shown after these examples). The data row takes 8 KB, while the hash index takes 1 MB for its 100K buckets:

Name       memory_used_by_indexes_kb memory_used_by_table_kb
---------- ------------------------- -----------------------
t1         1024                      8

    • Storage for memory-optimized table – The storage is more interesting. On a machine with 8 logical processors, there are 8 data and 8 delta files pre-allocated, plus 1 data/delta file pair storing the currently active rows. The overall storage for the memory-optimized filegroup is therefore (9 * 128 MB) = 1.12 GB, which is disproportionately larger than the in-memory size of the table. This is expected: this configuration has a fixed overhead of 1 GB, and the granularity of storage allocation is 128 MB. That differs from disk-based tables, where the unit of storage allocation is an 8 KB page for tables smaller than 64 KB. This additional storage may surprise you if you are just playing with small data sets, but we don’t expect it to be a cause for concern, as a typical customer workload will store a much larger dataset (e.g. tens or hundreds of GB) in memory-optimized tables.
  • Example-2
    • Operation: Insert 1-million rows
    • In-memory Size – 7.8GB
    • Storage for memory-optimized table – 8.9 GB, which is quite reasonable compared to the previous example. One million rows take approximately 8 GB of storage, plus the 1 GB of fixed overhead, which is now only around 10% of the total.
    • Storage for disk-based table – around 8 GB, so it is of the same order as the memory-optimized table.
  • Example-3
    • Operation: update non-key values for the first 500K rows of the table loaded in the previous example.
    • In-memory Size (used) – 8 GB. When I measured this, most of the old row versions had already been garbage collected.
    • Storage for memory-optimized table – 13 GB. You may wonder why the storage size has gone up for an update operation. An update on a memory-optimized table is performed as a delete followed by an insert: the delta file(s) are updated to reflect the deleted rows, and the new row versions are inserted into another data file. At this point both the old and new versions of the data rows are persisted, which is why storage consumption has gone up by approximately 50%. The data/delta files will eventually be merged and garbage collected as described earlier.
    • Storage for disk-based tables – it stays around 8 GB. This is expected, because updates of non-key columns are done in place for disk-based tables.
  • Example-4
    • Workload/Operations – an update-only OLTP workload running against a 150 GB database for over 24 hours, to measure storage and memory in the steady state.
    • In-memory size (used) – 167 GB. This shows that stale row versions are being garbage collected. Please refer to Memory Management for Memory-Optimized Tables for details.
    • Storage for memory-optimized table – 320 GB. Given that the database is 150 GB, the storage in the steady state is around 2.2 times the in-memory size.
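For reference, the in-memory sizes quoted in these examples can be retrieved with a query along these lines (a sketch; the DMV reports the memory used by the table and its indexes in KB):

SELECT OBJECT_NAME(object_id) AS [name],
       memory_used_by_indexes_kb,
       memory_used_by_table_kb
FROM sys.dm_db_xtp_table_memory_stats
WHERE object_id = OBJECT_ID(N'dbo.t_memopt');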

Though these examples are oversimplified, they do illustrate the difference in storage provisioning for memory-optimized tables compared with disk-based tables. Your storage requirement will depend on the following key factors:

    • The size of your durable memory-optimized tables.
    • The DML operations on the durable tables. For example, an update-heavy workload leads to merges, inflating the storage requirement until the affected files are garbage collected.
    • The frequency of checkpoints and log backups.
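If you want to track how your on-disk footprint compares to the in-memory size over time, a rough comparison can be made with queries like the following sketch, which simply sums the figures from the two DMVs discussed above:

-- Total size of checkpoint files (data + delta) on storage, in MB
SELECT SUM(file_size_in_bytes) / 1024 / 1024 AS checkpoint_file_storage_mb
FROM sys.dm_db_xtp_checkpoint_files;

-- Total memory used by memory-optimized tables and their indexes, in MB
SELECT SUM(memory_used_by_table_kb + memory_used_by_indexes_kb) / 1024 AS in_memory_size_mb
FROM sys.dm_db_xtp_table_memory_stats;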

Boosting Transaction Performance in Windows Azure Virtual Machines with In-Memory OLTP

With the release of SQL Server 2014 CTP2, you can now significantly boost the performance of your OLTP workloads in Windows Azure Virtual Machines. If you create a new VM from our preloaded image of SQL Server 2014 CTP2 on Windows Server 2012 R2, or install SQL Server 2014 CTP2 on an existing VM, In-Memory OLTP functionality is immediately available to you. This blog post provides a good guide on how to create a Windows Azure VM.

However, since the transition to In-Memory OLTP is not as simple as flipping a switch, you must carefully evaluate your application scenario and see if it is the right solution for you.

Recommended Scenarios

For SQL Server 2014 CTP 2, we recommend the following scenarios for SQL Server In-Memory OLTP on a Windows Azure Virtual Machine:

  • Development and test scenarios, e.g. familiarizing yourself with In-Memory OLTP’s capabilities. The on-demand provisioning of Windows Azure VMs and their low cost make it easy to gain full access to In-Memory OLTP functionality without a large capital investment. Once you are comfortable with its capabilities and understand its limitations, you can deploy In-Memory OLTP on your local server or keep using the VM if it suits your business needs.
  • Scenarios with relaxed data persistence requirements, e.g. web browser session state. SQL Server In-Memory OLTP provides non-durable tables that are perfect for transient data, regardless of the transaction read/write mix. By completely eliminating I/O operations, non-durable tables can provide dramatic performance boosts for non-persistent workloads. If some persistence is still required but strict durability is not, you can leverage the new Delayed Durability feature added in CTP2: delayed durable transactions commit without immediately writing their log records to disk, lightening log I/O pressure by allowing larger and less frequent log writes. For more details see the Books Online topic here. (A brief sketch of both options appears after this list.)
  • Read-mostly and read-only scenarios, e.g. an online product catalogue. In-Memory OLTP provides extreme performance and parallelism in read-mostly and read-only scenarios due to new algorithms and data structures optimized for memory-resident data. Furthermore, native compilation for stored procedures can dramatically increase CPU efficiency and throughput.
  • Durable read-write scenarios not under log I/O pressure, e.g. workloads with heavy contention. In-Memory OLTP can also benefit workloads with full durability requirements that are not under pressure from log I/O latency or throughput. Because In-Memory OLTP eliminates page contention by using an optimistic concurrency system, it can provide a significant boost in parallelism if your workload suffers from contention problems. In addition, native compilation can improve the speed of stored procedures with heavy business-logic processing.
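To illustrate the relaxed-durability options mentioned above, here is a minimal sketch. The table name and database name are hypothetical; it creates a non-durable (SCHEMA_ONLY) memory-optimized table and then commits a transaction with delayed durability.

-- Non-durable memory-optimized table: the schema is persisted, the data is not
CREATE TABLE dbo.SessionState (
       SessionId int NOT NULL,
       Payload varbinary(4000) NOT NULL,
       CONSTRAINT pk_SessionState PRIMARY KEY NONCLUSTERED HASH (SessionId)
              WITH (BUCKET_COUNT = 1000000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

-- Allow delayed durability at the database level (database name is a placeholder),
-- then request it on an individual commit
ALTER DATABASE [YourDatabase] SET DELAYED_DURABILITY = ALLOWED;

BEGIN TRANSACTION;
       INSERT INTO dbo.SessionState (SessionId, Payload) VALUES (1, 0x00);
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);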

However, if your workload is suffering from long log I/O latency or is under pressure from log I/O throughput, and at the same time you require strict durability of your data, In-Memory OLTP on a Windows Azure VM will not alleviate these problems.

SQL Server 2014 CTP2 on a Windows Azure VM is also not suitable for testing the performance of applications that will be deployed on on-premises machines with a similar configuration.

Selecting the Right Windows Azure VM Size

The selection of VM sizes is important for any workload running in the IaaS space, whether you are provisioning a new VM using our preloaded CTP2 image or adding CTP2 onto an existing instance. Windows Azure provides a selection of virtual machine sizes for a variety of purposes, as listed in this article.

Since In-Memory OLTP is designed to serve extremely high session concurrency, and memory-resident data requires sufficient memory space, we recommend the following Windows Azure VM sizes for adopting In-Memory OLTP:

Compute Instance Name    Virtual Cores    RAM
Extra Large (A4)         8                14 GB
A6                       4                28 GB
A7                       8                56 GB

The exact instance size you choose will depend on the scenario you wish to run and the size of the data you wish to make memory-resident in SQL Server In-Memory OLTP. We recommend provisioning 100% more memory than the estimated size of the data in memory, while leaving sufficient space for the buffer pool used by disk-based workloads on the same machine. For example, if you estimate roughly 20 GB of memory-resident data, plan for about 40 GB of RAM plus buffer-pool headroom, which points toward the A7 size. This article on SQL Server Books Online has more information on how to estimate the size of a table in memory, and we have published a blog post on hardware considerations for In-Memory OLTP, some of which apply to provisioning VMs as well.

Virtual Machine Configuration

To configure a Windows Azure Virtual Machine for the best In-Memory OLTP performance, we suggest you follow the best practices outlined in this whitepaper. A summary of key considerations, plus some attributes unique to In-Memory OLTP, is listed below:

  • Use a single storage account for all disks attached to the same VM.
  • Do not turn on disk host caching for the data and log drives.
    When creating new drives for a Windows Azure VM, the disk host caching option is turned off by default. We do not recommend you change this option.
  • Do not use the Temporary Disk (D: Drive) for log storage.
    This will sacrifice durability for your database, as the content on the Temporary Disk is transient and may be lost on a VM failure or reboot. Please use a persistent Windows Azure Storage disk for your log storage needs.
  • Do not use native operating system striping for storing the log of an In-Memory OLTP-enabled database.
    Logging for memory-optimized tables is latency-sensitive, and striping drives increases the latency of I/O operations. If you wish to use any form of striping for your log drive, test your configuration to ensure that the I/O latency is sufficiently low.
  • Compression is not supported for In-Memory OLTP tables. Compression will still work with regular tables in an In-Memory OLTP-enabled database.
