Inside SQL Server: Stored Procedure Plans
Improve performance through caching
September 30, 1999
In SQL Server 7.0, reusing compiled plans for stored procedures can make a dramatic difference in overall system performance. (For more information on plan reuse, see "SQL Server 7.0 Plan Caching," September 1999.) The greater complexity of the Query Optimizer saves parse-and-compile time (which includes query optimization). But reuse of compiled plans isn't always the most important reason for using stored procedures. With earlier SQL Server releases, the savings aren't great, and SQL Server 7.0 includes other methods of saving and reusing query plans. So why should you use stored procedures?
Stored procedures let you develop SQL code in programmable units, which anyone working on the application can use. Stored procedures allow controlled access to data, so a user can have permission to run a procedure without having ad hoc access to the object that the stored procedure references. From a performance standpoint, the savings in network traffic of not needing to send the entire block of code from client to server can be an important reason for using stored procedures. So let's look closer at stored procedure plan caching and how best to use it.
Stored Procedure Caching vs. Plan Caching
SQL Server uses several mechanisms to save and reuse plans for ad hoc or autoparameterizable queries. Most of these techniques deal with optimizing a SQL statement. With stored procedure plan caching, you're dealing with multiple plans—one plan for each statement in the procedure. Generally, all statements in a procedure are compiled at the same time, so you can consider the procedure's plan to be the combination of the plans for all the statements.
Procedures can be parameterized, and in general, the plan for a procedure is based on the arguments that are passed the first time the procedure is called. Regardless of the value or values passed in, subsequent executions of the same procedure use the same plan, which isn't always a good idea. For example, suppose you have a stored procedure that computes information about customers' purchasing habits by country. If the procedure accepts a parameter for the country, it might contain a statement such as
SELECT country(*) FROM sales WHERE country = @country
Suppose you have a nonclustered index on the country column. If, the first time the procedure is called, the country you supply as the parameter occurs only a few times in the table, the optimizer might use the index. (Nonclustered indexes are typically a good choice only when the rows that need to be accessed form a small percentage of the table.) After that, the saved plan for this stored procedure always tells SQL Server to use the nonclustered index on the country column. If someone executes the procedure with a country value that occurs in many rows, the original plan is still used, even though it's not an optimal plan for this second case. The mechanisms that SQL Server uses to autoparameterize queries that aren't stored procedures are very conservative, and SQL Server doesn't autoparameterize any query in which a parameter change could lead to a different plan. However, when writing stored procedures, you can control when a stored procedure is recompiled.
The mechanisms for recompiling procedures are well-described in the online documentation, so I won't go into detail here. If you check the syntax for the CREATE PROCEDURE statement, you'll see that you can add the option WITH RECOMPILE to the procedure definition. This option causes SQL Server to compile and optimize the procedure every time it runs. If you know that different parameters might require vastly different plans, consider this option. Also, you can recompile one execution of a procedure by using the WITH RECOMPILE option with the EXECUTE statement. Any time a procedure is executed with this option, SQL Server doesn't check for any existing plans in cache, and the new plan developed for this single execution is saved in cache.
The option of recompiling a stored procedure is different from the new SQL Server 7.0 capability of automatically updating statistics. (See "SQL Server Statistics: A Useful Query Optimizer Tool," August 1999, for details on automatic updating of statistics.) Automatic updating of statistics doesn't mean that plans automatically recompile. The statistics update only if SQL Server recompiles and reoptimizes a plan or if the query optimizer detects that the statistics are out of date. However, to take advantage of updated statistics after SQL Server saves a procedure plan, you need to specify one of the recompile options to invoke the optimizer.
Another feature of stored procedures in SQL Server 7.0 sometimes can cause a recompilation. If you build a temporary table, then create an index on that table, and later in the same procedure add data to the table, your stored procedure might be compiled several times. If SQL Server tries to compile the entire plan at once before executing the statements in the procedure that populate the table, the optimizer has no way to know what the data will be when you execute the procedure. In fact, if a procedure contains the creation of a table, SQL Server compiles it only up to the first reference to the table. At execution time, it executes the plan up to the first reference to the table and recompiles from that point. As you add more data to the table, the procedure might be recompiled again. If SQL Server detects that a table was empty when the procedure was first compiled (e.g., if you'd just created the table), the query processor sets a low recompilation threshold for the table. At present, this threshold is five data-modification statements (INSERT, UPDATE, or DELETE). So every time five or more changes are made to the newly created table, the procedure recompiles.
You can use SQL Server Profiler to watch automatic recompilation occur: Trace the event called SP:Recompile in the Stored Procedures category. If you also trace the event called SP:StmtStarting, you can see at what points the procedure is recompiling. Note that this event shows up in the Profiler's event list only if you enable the Profiler option to show all events. One way to avoid repeated recompilation is to include, at the beginning of the procedure, all data-definition statements dealing with temporary tables and inserting rows into the temporary tables. So if the procedure must be recompiled, the compilation won't happen more than once. Another way to prevent recompilation is to include the query hint KEEP PLAN in your statements that access the temporary tables. This hint is part of the OPTION clause and looks something like
SELECT FROM #newly_built_temp_table OPTION (KEEP PLAN)
This hint tells SQL Server to use any plan that has been saved for this statement, even if you've made several changes to the temporary table.
Limiting the Number of Cached Plans
SQL Server limits the number of plans for each stored procedure. Because plans are reentrant in SQL Server 7.0, limiting plans is much easier than in previous releases. The online documentation states that any procedure can have, at most, two compiled plans (one for parallel plans—those that execute on multiple processors—and one for nonparallel plans), but there are exceptions. Certain situations cause SQL Server to store multiple plans for the same procedure. The most likely situation is a difference in certain SET options, database options, or configuration options. For example, a stored procedure that concatenates strings might compile the concatenation differently depending on whether the option CONCAT_NULL_YIELDS_NULL is on, or whether the corresponding database option is true. If a user executes the procedure with an option on, that execution uses a different plan from the one it would use if the option were off.
As with ad hoc and autoparameterized queries, the system table syscacheobjects can help you see when new plans are created and when existing ones are reused. You can get this information from the following query:
SELECT sql, cacheobjtype, usecounts FROM master..syscacheobjects WHERE objtype = 'Proc'
The value in the sql column is the name of the stored procedure. The value of the cacheobjtype column identifies whether the plan is a compiled plan (the part of a plan that is reentrant) or an executable plan (an instance of the compiled plan that contains information describing a process that is executing the query). In most cases, both compiled and executable plans remain in the memory cache, but you'll find exceptions if you run tests. In particular, if your stored procedure involves any sorting operations, you'll see the compiled plans in cache, but the executable plans aren't saved. Watching the usecounts result column from this query shows you how often a particular plan is reused. In addition, if running this query shows you multiple plans for the same procedure (i.e., the same procedure name shows up more than once with the same cacheobjtype value), you might want to return the value of the setopts column. This value is a bitmap that tells you which session and database options are enabled for the procedure. Two plans for the same procedure usually have different values for setopts.
Although many people assume that recompilation is something to avoid, this assumption is not always valid. If you know that updated statistics can improve the query plan, or if you know that you have wildly different possible parameter values, recompilation can be a good thing. Using stored procedures when developing your applications lets you determine when SQL Server reuses an existing query plan and when SQL Server creates a new one.
About the Author
You May Also Like