Tuesday, August 13, 2013

Performance Tuning SQL Server Cursors

If possible, avoid using SQL Server cursors. They generally use a lot of SQL Server resources and reduce the performance and scalability of your applications. If you need to perform row-by-row operations, try to find another method to perform the task.
Here are some alternatives to using a cursor:
  • Use temp tables
  • Use derived tables
  • Use correlated sub-queries
  • Use the CASE statement
  • Perform multiple queries
More often than not, there are non-cursor techniques that can be used to perform the same tasks as a SQL Server cursor. [2000, 2005, 2008] Updated 1-29-2009
If you do find you must use a cursor, try to reduce the number of records to process.
One way to do this is to move the records that need to be processed into a temp table first, then create the cursor to use the records in the temp table, not from the original table. This of course assumes that the subset of records to be inserted into the temp table are substantially less than those in the original table.
The lower the number of records to process, the faster the cursor will finish. [2000, 2005, 2008] Updated 1-29-2009
If the number of rows you need to return from a query is small, and you need to perform row-by-row operations on them, don’t use a server-side cursor. Instead, consider returning the entire rowset to the client and have the client perform the necessary action on each row, then return any updated rows to the server. [2000, 2005, 2008] Updated 1-29-2009
If you have no choice but to use a server-side cursor in your application, try to use a FORWARD-ONLY or FAST-FORWARD, READ-ONLY cursor. When working with unidirectional, read-only data, use the FAST_FORWARD option instead of the FORWARD_ONLY option, as it has some internal performance optimizations to speed performance. This type of cursor produces the least amount of overhead on SQL Server.
If you are unable to use a fast-forward cursor, then try the following cursors in this order, until you find one that meets your needs. They are listed in the order of their performance characteristics, from fastest to slowest: dynamic, static, and keyset. [2000, 2005, 2008] Updated 1-29-2009
Avoid using static/insensitive and keyset cursors, unless you have no other choice. This is because they cause a temporary table to be created in TEMPDB, which increases overhead and can cause resource contention issues. [2000, 2005, 2008] Updated 1-29-2009
If you have no choice but to use cursors in your application, try to locate the SQL Server tempdb database on its own physical device for best performance. This is because cursors may use the tempdb for temporary storage of cursor data. The faster your disk array running tempdb, the faster your cursor will be. [2000, 2005, 2008] Updated 1-29-2009
Using cursors can reduce concurrency and lead to unnecessary locking and blocking. To help avoid this, use the READ_ONLY cursor option if applicable, or if you need to perform updates, try to use the OPTIMISTIC cursor option to reduce locking. Try to avoid the SCROLL_LOCKS cursor option, which reduces concurrency. [2000, 2005, 2008] Updated 1-29-2009
When you are done using a cursor, don’t just CLOSE it, you must also DEALLOCATE it. Deallocation is required to free up the SQL Server resources used by the cursor. If you only CLOSE the cursor, locks are freed, but SQL Server resources are not. If you don’t DEALLOCATE your cursors, the resources used by the cursor will stay allocated, degrading the performance of your server until they are released. [2000, 2005, 2008] Updated 1-29-2009
If it is appropriate for your application, try to load the cursor as soon as possible by moving to the last row of the result set. This releases the share locks created when the cursor was built, freeing up SQL Server resources. [2000, 2005, 2008] Updated 1-29-2009
If you have to use a cursor because your application needs to manually scroll through records and update them, try to avoid client-side cursors, unless the number of rows is small or the data is static. If the number of rows is large, or the data is not static, consider using a server-side keyset cursor instead of a client-side cursor. Performance is usually boosted because of a reduction in network traffic between the client and the server. For optimum performance, you may have to try both types of cursors under realistic loads to determine which is best for your particular environment. [2000, 2005, 2008] Updated 1-29-2009
When using a server-side cursor, always try to fetch as small a result set as possible. This includes fetching only those rows and columns the client needs immediately. The smaller the cursor, no matter what type of server-side cursor it is, the fewer resources it will use, and performance will benefit. [2000, 2005, 2008] Updated 1-29-2009
If you need to perform a JOIN as part of your cursor, keyset and static cursors are generally faster than dynamic cursors, and should be used when possible. [2000, 2005, 2008] Updated 1-29-2009
If a transaction you have created contains a cursor (try to avoid this if at all possible), ensure that the number of rows being modified by the cursor is small. This is because the modified rows may be locked until the transaction completes or aborts. The greater the number of rows being modified, the greater the locks, and the higher the likelihood of lock contention on the server, hurting performance. [2000, 2005, 2008] Updated 1-29-2009
In SQL Server, there are two options to define the scope of a cursor. LOCAL and GLOBAL keywords in the DECLARE CURSOR statement are used to specify the scope of a cursor. A GLOBAL cursor can be referenced in any stored procedure or batch executed by a connection. LOCAL cursors are more secure as they cannot be referenced outside the procedure or trigger unless they are passed back to the calling procedure or trigger, or by using an output parameter. GLOBAL cursors must be explicitly deallocated or they will be available until the connection is closed. For optimum performance, you should always explicitly deallocate a cursor when you are done using it. LOCAL cursors are implicitly deallocated when the stored procedure, the trigger, or the batch in which they were created terminates. We can use LOCAL cursors for more security and better scope of the cursor in our application, which also helps to reduce resources on the serve.

Top 10 SQL Server Performance Tuning Tips

SQL Server is a very complex product. And when it comes to performance tuning, many DBAs simply don't know where to start. There many facets to the program and many things to consider if you want to correctly tune a SQL Server instance. Performance tuning is definitely one of those areas in which experience is the best teacher. But you need to start somewhere. And, as is common to many things, it all starts by having a good foundation to build upon.
We're going to get back to the basics of performance tuning by learning 10 tips that will help you get off on the right foot when you try to identify existing performance issues and prevent future ones. Before we start, take a look at the "SQL Server 2005 Waits and Queues" white paper on the SQL Server Customer Advisory Team (CAT) website. It contains example code and much more detail than I can provide here. I highly recommend that you read this white paper and use it as a reference for this article.

Tip 1: Stop Waiting Around

Every time that SQL Server tries to do something but gets held up for any reason, it tracks the incident in the form of something known as wait statistics. (See the CAT white paper for more information about wait statistics.) This is one of the areas of SQL Server that you must begin to understand to correctly determine the resources that SQL Server is waiting on at any given moment. For example, if you find that most of your waits are related to a page_IO_latch issue, you can be pretty sure that your bottleneck has to do with I/O. And, if you see many LCK_XX type waits occur, you're seeing a blocking issue. In this case, you would spend your time more wisely by investigating the cause of the blockage instead of by looking at the I/O issues.
The CAT white paper contains plenty of detail about the different wait types, and also code examples for capturing and deciphering the results. After you've read the white paper, examine your waits from time to time so that you can stop chasing your tail and start narrowing down where the real bottleneck lies. This is, without a doubt, where you should start your performance tuning day.

Tip 2: Locate I/O Bottlenecks

I/O bottlenecks are one of the key reasons why performance suffers in SQL Server. You have three relatively easy methods at your disposal to determine whether you have I/O issues:
  • Check whether you see high page_IO_latch waits or log_write waits in your wait statistics.
  • Use the DMF sys.dm_io_virtual_file_stats() to locate any areas in which you have excessive physical I/O or excessive stalls on that I/O. These issues can occur at the database level or even at the file level. 
  • Use the trusty PerfMon counters. At a minimum, use the Avg. Disk sec/Read and Avg. Disk sec/Write counters to see the latency of the reads and writes. On an OLTP system, you would, ideally, want to see log file latency to be just a few ms and data file latency to be less than 10ms. Remember that these are ideal values. Your system might tolerate larger latency and still be fine. Also keep in mind that many times when you find that the storage subsystem can't keep up with the current demand, the cause might not be an I/O bottleneck at all. It might, instead, be pooled.
When you find that you have many physical I/O bottlenecks occurring, your first instinct should be to find the queries that are causing all the physical I/O, and then try to tune them before you add more hardware. One performance aspect that you should never ignore is high latency for log writes. If you start to hold up the writing to the log file, all further DML operations can quickly become impeded, and they’ll remain so until you alleviate the bottleneck. High latency in log writes is a sure way to hinder performance in the database. For a better understanding of storage issues and I/O subsystems in a SQL Server environment, I recommend that you read the Microsoft articles "Storage Top 10 Practices" and "SQL Server Best Practices Article."

Tip 3: Root Out Problem Queries

In any given SQL Server instance, there are probably 8 to 10 queries or stored procedures that are responsible for 80 to 90 percent of the poorly tuned activity that you see throughout the day. Naturally, this isn't true for everyone, but it’s true often enough to matter. If you can identify these problem queries, and if you can prioritize tuning them, you can make a significant impact on the overall performance of your server.
The CAT white paper contains information about this subject. But one way that you can easily identify expensive statements is by using the code in Listing 1.
Listing 1: The sys.dm_exec_query_stats DMV Query for Performance Statistics
COALESCE(DB_NAME(t.[dbid]),'Unknown') AS [DB Name],
ecp.objtype AS [Object Type],
t.[text] AS [Adhoc Batch or Object Call],
            SUBSTRING(t.[text], (qs.[statement_start_offset]/2) + 1,
            ((CASE qs.[statement_end_offset]
                        WHEN -1 THEN DATALENGTH(t.[text]) ELSE qs.[statement_end_offset] END
                                    - qs.[statement_start_offset])/2) + 1) AS [Executed Statement]
            , qs.[execution_count] AS [Counts]
            , qs.[total_worker_time] AS [Total Worker Time], (qs.[total_worker_time] /
qs.[execution_count]) AS [Avg Worker Time]
            , qs.[total_physical_reads] AS [Total Physical Reads],
(qs.[total_physical_reads] / qs.[execution_count]) AS [Avg Physical Reads]
            , qs.[total_logical_writes] AS [Total Logical Writes],
(qs.[total_logical_writes] / qs.[execution_count]) AS [Avg Logical Writes]
            , qs.[total_logical_reads] AS [Total Logical Reads],
(qs.[total_logical_reads] / qs.[execution_count]) AS [Avg Logical Reads]
            , qs.[total_clr_time] AS [Total CLR Time], (qs.[total_clr_time] /
qs.[execution_count]) AS [Avg CLR Time]
            , qs.[total_elapsed_time] AS [Total Elapsed Time], (qs.[total_elapsed_time]
/ qs.[execution_count]) AS [Avg Elapsed Time]
            , qs.[last_execution_time] AS [Last Exec Time], qs.[creation_time] AS [Creation Time]
FROM sys.dm_exec_query_stats AS qs
    JOIN sys.dm_exec_cached_plans ecp ON qs.plan_handle = ecp.plan_handle
            CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
--    ORDER BY [Total Worker Time] DESC
--    ORDER BY [Total Physical Reads] DESC
--    ORDER BY [Total Logical Writes] DESC
--    ORDER BY [Total Logical Reads] DESC
--    ORDER BY [Total CLR Time] DESC
--    ORDER BY [Total Elapsed Time] DESC
            ORDER BY [Counts] DESC
The sys.dm_exec_query_stats DMV query contains aggregate performance statistics that are associated with each of the cached query plans in the SQL Server instance. This query easily enables you to order the results in several ways, depending on what type of resource usage you want to concentrate on. The key is to weigh the number of times that the statement was executed versus the average resource usage to better determine which statements would have the largest impact on performance if they were optimized better.
I will caution you not to put too much weight on the total elapsed time or on the overall duration of the statement because other factors, such as blocking, can influence the overall duration. But by using this query, you should be able to quickly identify the top offending statements in your system, and then prioritize the statements so that you can tune them as efficiently as possible.

Tip 4: Plan To Reuse

Today's database applications deal with an ever-increasing transaction rate as hardware gets more powerful and less expensive. For this reason, you are often trying to pack more and more into a single server, thus increasing the activity. But one of the most expensive operations in terms of memory and CPU usage in the SQL Server instance is the compiling of query plans. We have also seen an increase in applications that can speed development but that often result in calls to the database that make it nearly impossible to reuse the query plan. I'm talking about applications that don't use stored procedures or that don't correctly parameterize their queries. This can result in extremely poor performance, especially as the number of transactions increases. I highly recommend that you also read the "Plan Caching in SQL Server 2008" white paper as an excellent source for describing how SQL Server handles query plans and reuse.
The CAT white paper also goes into some detail about this subject. But the following code example is a quick and easy way to determine which statements aren't reusing query plans:
SELECT b.[cacheobjtype], b.[objtype], b.[usecounts], a.[dbid], a.[objectid], b.[size_in_bytes], a.[text]
FROM sys.dm_exec_cached_plans as b
CROSS APPLY sys.dm_exec_sql_text (b.[plan_handle]) AS a
ORDER BY [usecounts] DESC
This query sorts all of the plans in the procedure cache in descending order of use counts. The use counts column is incremented every time that a plan is reused, and it lets us easily identify which plans have reuse. You can also order the plans by the text column to determine which statements have many similar entries that have a use count of one. This value indicates statements that you call often but that don't reuse the existing plan. After you've identified these statements, you can prioritize which parts of the application you must work on first to get the biggest bang for your buck in terms of plan reuse. Don’t underestimate how seriously a lack of plan reuse can affect performance as the transaction rate increases.

Tip 5: Monitor Index Usage

The sys.dm_db_index_operational_stats() DMF is a widely underutilized source of information. It can provide you valuable information about your index usage. By using this DMF, you can decipher all kinds of information to determine not only which indexes are used but also how they're used. For example, are you scanning the index or are you using seeks? This DMF will tell you. It will even tell you things such as the time elapsed for processes, such as latching and locking. Have you ever asked yourself, "Is this index being used?" or, "Which indexes am I using for a given table?" We've all asked these questions at one time or another. So you can use this DMF to get a handle on your index usage.

Tip 6: Separate Data and Log Files

One of the most basic but often disregarded rules for good performance is to separate the data and the log files onto separate physical drive arrays whenever possible. This is especially true when you use DAS, but it also applies to a SAN. The key principle here is to separate the mostly random access of the data files from the sequential access that occurs by writing to the transaction logs. One aspect familiar to a SAN environment is that even though you're presented with different drive letters or LUNs, you can't be sure that these represent different physical drive arrays.  Often, these apparent drives are carved from the same larger disk array, and this will defeat the intent of separating them in the first place. So make sure that you know what you're really getting when you ask for your storage on a SAN. You'll be amazed at how much difference this can make as the volume of your transactions increases.

Tip 7: Use Separate Staging Databases

I see far too many production databases that are also used for scratch space when it comes to importing or exporting data. People frequently import data into a table that is real but temporary in nature. Then, they proceed to manipulate the data by performing extensive updates to prepare it for one of the final production tables. The problem with this is that most production databases are in full recovery mode. This means that practically all activity is fully logged in the transaction log, and these operations can be expensive.
By using a separate staging database within the SQL Server instance that's in simple recovery mode, you can achieve several performance benefits. One is that you can often get a minimally logged load instead of a fully logged load. This can dramatically speed up the import process to begin with. With a minimally logged load, the amount of data that's logged to the transaction log is very small in relation to what it would be for a fully logged load, so you have a much lighter burden on the server as a whole.
If you were to perform these operations in the production database, all that activity would have to use resources such as I/O, CPU, and memory. All the preparation would be fully logged in either database; but at least in the staging database, you wouldn't have to back up the log. If this were done in the production database instead, that activity would increase the time required to back up the log. Therefore, the resulting backup files would be larger than necessary. Create a staging database, and perform your initial importing and manipulation in that database to reduce the overall burden on the server and to minimize the impact to the production databases.

Tip 8: Pay Attention to Log Files

Too many people underestimate the importance of the transaction log file in relation to performance. The most common mistake people make is not leaving enough free space in the transaction log file for your normal operations to occur without forcing an auto-grow operation. Growing the log file can be extremely time-consuming. It can force any DML operations to wait until the growth is complete before the operation can proceed.
By making sure that you always have plenty of free space in the log file, you can avoid these performance hits altogether. The second most common mistake is having far too many virtual log files (VLFs) in the transaction log. For an excellent explanation of this practice, see Kimberly L.Tripp's blog post, "Transaction Log VLFs - too many or too few?" Personally, I prefer to have the VLFs set at 512MB each. You can obtain this value by growing the log file in 8GB increments.

Tip 9: Minimize tempdb Contention

If your application makes heavy use of tempdb, it's possible that you could run into some contention regarding internal structures that are associated with the tempdb files. There are ways to minimize the contention, outlined in the TechNet topic "Working with tempdb in SQL Server 2005."
This topic goes into great detail about many aspects of tempdb. However, I want to point out that the article recommends that you create one tempdb data file for each processor core. In many cases, this is excessive and can actually degrade performance. What I recommend is that you keep an eye on the page_latch_UP waits for tempdb and that you increase the number of files until the wait either ends or is close to zero. Just remember to make all the files the exact same size; otherwise, you don't experience any benefit because of the allocation algorithm that's based on the amount of free space in each file. Another thing to keep in mind is not to explicitly drop temporary tables in your code if you can help it. Let SQL Server do that on its own. In this way, you can maximize the effect of the caching enhancements that are available in recent versions of SQL Server.

Tip 10: Change the MAX Memory Limit

There have been improvements in the 64-bit versions of SQL Server regarding memory allocation and sharing with the OS and other applications, but I've yet to see where leaving the MAX Memory setting at the default is ideal in real life. Even though your host server might be dedicated to SQL Server, there are always other applications or parts of the OS that require memory from time to time or even all the time. Do yourself a favor and set the MAX memory setting to at least 1 to 2GB less than the total amount of memory on the server if this is a single instance. If you have multiple instances, or if you're in a multi-instance cluster, you also have to account for that. How much memory you leave depends on what else you have running and how much memory it requires to operate efficiently. But you can adjust this value up or down over time as you see fit.

Bonus Tip: Just Say No to Shrinking Data Files

OK, this makes 11 tips. But shrinking data files has been a general bad practice for a long time, and it can really impact performance in one of three ways. The shrinking can be very painful to begin with. But because it can cause a lot of fragmentation, your subsequent queries might suffer as a result. And if you don’t have Instant File Initialization turned on, the resultant growth later can also hinder performance and potentially cause timeouts. Although there are times when shrinking a file might be necessary, make sure that you know the impact before you try it. For more details, see Paul Randal's blog post "Why you should not shrink your data files."

Make These Tips Your Own

By following these guidelines and keeping this information in mind, you should be able to identify the most common performance issues in SQL Server—and prevent or minimize future ones, as well. By now, you will have noticed that some of these topics will require more reading and some actual experience for you to fully grasp the concepts and techniques. But none of these topics are out-of-reach for the average DBA. And even a beginner has to start somewhere. So why not here?

Popular Posts