What You Need to Know about Adaptive Joins over Rowstore

Itzik Ben-Gan

May 9, 2017


SQL Server 2017 CTP 2.0 introduces support for adaptive query processing capabilities. Traditionally, the query optimizer made all of its plan choices ahead of query execution, and SQL Server wasn’t able to change those during execution. With adaptive query processing, SQL Server is able to dynamically adapt its optimization choices to actual run time conditions such as cardinality misestimations.

The first set of improvements includes batch mode adaptive memory grant feedback, batch mode adaptive joins and interleaved execution for multi-statement table valued functions—all of which are eloquently described by Microsoft’s Joe Sack. (Watch out for this guy; his stuff is gold!)

In this article, I focus on adaptive joins. I first demonstrate an example for the classic plan reuse shortcoming with parameter-sensitive queries. I then explain how adaptive joins can alleviate the problem. I also explain how you can enable adaptive joins over rowstore despite the fact that, formally, in the first cycle of this feature, it is supported only as a batch mode operator over columnstore.

Parameter-Sensitive Join Queries

We’ve all been there: You have a stored procedure with a parameter-sensitive join query. SQL Server optimizes the query for the sniffed parameter values and caches the plan. There’s a classic loop versus hash join algorithm choice that the optimizer needs to make (say the conditions are not adequate for a merge join) based on the cardinality estimate for the outer input of the join. With a small number of outer rows, a loop algorithm is usually optimal; beyond a certain threshold, a hash algorithm is optimal. Whichever choice the optimizer makes, it makes it for the sniffed parameter values at optimization time, and that plan is cached and reused until a recompile event takes place, even if the procedure is executed regularly with widely varying input values.

To demonstrate this, I’ll create a stored procedure in the sample database PerformanceV3. You can download the source code to create the sample database from http://tsql.solidq.com/SampleDatabases/PerformanceV3.zip.

I’ll use the following parameter-sensitive query in my stored procedure:

SELECT C.custid, C.custname, O.orderid, O.empid, O.shipperid, O.orderdate
FROM dbo.Customers AS C
  INNER JOIN dbo.Orders AS O
    ON O.custid = C.custid
WHERE C.custname LIKE @custprefix + N'%'
  AND O.orderdate BETWEEN @fromdate AND @todate;

The user passes a prefix for the customer name and a date range as inputs and gets back orders placed by customers with the specified prefix in their company name during the input period.

Assuming that you’re currently not considering columnstore indexing on the tables in question, think about what the optimal rowstore indexes to support this query would be. The optimal index on Customers is one with custname as the key (to support the filter) and custid as an included column (for coverage). As for the recommended indexing on Orders, it depends on whether you get a small or a large number of qualifying customers. With a small number, a nested loops algorithm would be ideal given an index with custid and orderdate forming the index key list (to support the join plus filter predicates), and orderid, empid, shipperid included (for coverage). With a large number of qualifying customers, a hash match algorithm would be ideal given an index with orderdate as the key (to support the date range filter), and the rest of the columns from Orders included (for coverage).

The Orders table already has a clustered index with orderdate as the key. Run the following code to create the remaining recommended indexes to support optimal nested loops and hash match algorithms, depending on the cardinality of the filter against Customers:

SET NOCOUNT ON;
USE PerformanceV3; -- http://tsql.solidq.com/SampleDatabases/PerformanceV3.zip

DROP INDEX IF EXISTS idx_nc_cn_i_cid ON dbo.Customers;
DROP INDEX IF EXISTS idx_nc_cid_od_i_oid_eid_sid ON dbo.Orders;
DROP INDEX IF EXISTS idx_yesplease ON dbo.Orders;

CREATE INDEX idx_nc_cn_i_cid
  ON dbo.Customers(custname) INCLUDE(custid);

CREATE INDEX idx_nc_cid_od_i_oid_eid_sid
  ON dbo.Orders(custid, orderdate)
  INCLUDE(orderid, empid, shipperid);

Run the following code to create the GetOrders procedure with the aforementioned query:

CREATE OR ALTER PROC dbo.GetOrders
  @custprefix NVARCHAR(200) = N'',
  @fromdate AS DATE = '19000101',
  @todate AS DATE = '99991231'
AS

SET NOCOUNT ON;

SELECT C.custid, C.custname, O.orderid, O.empid, O.shipperid, O.orderdate
FROM dbo.Customers AS C
  INNER JOIN dbo.Orders AS O
    ON O.custid = C.custid
WHERE C.custname LIKE @custprefix + N'%'
  AND O.orderdate BETWEEN @fromdate AND @todate
OPTION (LABEL = 'F1352E2F-866A-4A10-B660-2875D18B05D8');
GO

Note that I added a label with a specific GUID to the query purely for troubleshooting purposes, so that I can easily track down the plan in cache using the following query:

SELECT CP.usecounts, CP.plan_handle, ST.text
FROM sys.dm_exec_cached_plans AS CP
  CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
WHERE ST.text LIKE '%F1352E2F-866A-4A10-' + 'B660-2875D18B05D8%';

Run the following code to enable I/O and time statistics to measure the performance of the procedure executions:

SET STATISTICS IO, TIME ON;

Run the following code to execute the procedure for the first time (execution 1) given a customer prefix that is highly selective:

EXEC dbo.GetOrders
  @custprefix = N'Cust[_]1000',
  @fromdate = '20140101',
  @todate = '20140430';

The plan for execution 1 is shown in Figure 1.

Figure 1 – Plan for execution 1

The purpose of the four operators at the top-right part of the plan is to obtain the qualifying customers from the Customers table, to be used as the outer input of the join with Orders. The Constant Scan and Compute Scalar operators convert the LIKE pattern to two delimiters of a range, and then the top-middle Nested Loops operator applies a seek plus range scan to obtain the qualifying customers from the covering index on the Customers table. Notice that the estimated number of customer rows is very small (2), and similarly the actual number is quite small as well (11). With such a small number of estimated customer rows, the optimal join algorithm is a loop algorithm, as evidenced by the optimizer’s choice (the top-left Nested Loops operator).
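For the curious, the delimiters are computed by internal functions (you’ll see names such as LikeRangeStart and LikeRangeEnd in the Compute Scalar operator’s Defined Values property). The following is only a conceptual sketch of the rewrite, using a plain prefix and an illustrative upper delimiter, not the optimizer’s exact internal expressions:

-- A conceptual sketch only (not the optimizer's exact internal logic):
-- a prefix LIKE pattern is seeked much like a range predicate over the key
DECLARE @custprefix AS NVARCHAR(200) = N'Cust';

SELECT C.custid, C.custname
FROM dbo.Customers AS C
WHERE C.custname >= @custprefix                  -- range start delimiter
  AND C.custname <  @custprefix + NCHAR(0xFFFF); -- illustrative range end delimiter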

I got the following I/O and CPU stats for execution 1 on my laptop: logical reads: 35, cpu: 0 ms.

This query plan is now cached.

Run the following code to execute the procedure for the second time (execution 2) given a customer prefix that results in returning all 20,000 existing customers:

EXEC dbo.GetOrders
  @custprefix = N'Cust[_]',
  @fromdate = '20140101',
  @todate = '20140430';

The plan for execution 2 is shown in Figure 2.

Figure 2 – Plan for execution 2

Had there not been a cached plan, the optimizer normally would have chosen a hash join algorithm. However, since there was a cached plan, SQL Server simply reused it, resulting in a large I/O hit. I got the following I/O and CPU stats for this execution: logical reads: 60613, cpu: 250 ms.

You can verify that plan reuse occurred by using the following query:

SELECT CP.usecounts, CP.plan_handle, ST.text
FROM sys.dm_exec_cached_plans AS CP
  CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
WHERE ST.text LIKE '%F1352E2F-866A-4A10-' + 'B660-2875D18B05D8%';

I got the following output on my machine (your plan handle will naturally be different):

usecounts   plan_handle                                                                                 text
----------- ------------------------------------------------------------------------------------------- ------------------------------
2           0x05000600B982EE5A304AC1580002000001000000000000000000000000000000000000000000000000000000  CREATE   PROC dbo.GetOrders...

Notice that the reported usecounts value is 2, indicating that the plan was reused.

Of course, you could have faced the opposite situation if the first execution was with a nonselective input and a subsequent execution with a selective one. To demonstrate this, first run the following code to free the cached plan (replace the handle in my code with the one you got):

DBCC FREEPROCCACHE ( 0x05000600B982EE5A304AC1580002000001000000000000000000000000000000000000000000000000000000 );

Verify there’s no plan for our query in cache:

SELECT CP.usecounts, CP.plan_handle, ST.text
FROM sys.dm_exec_cached_plans AS CP
  CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
WHERE ST.text LIKE '%F1352E2F-866A-4A10-' + 'B660-2875D18B05D8%';

You should get an empty output:

usecounts   plan_handle                                                                                 text
----------- ------------------------------------------------------------------------------------------- ------------------------------

Execute the procedure for the third time, providing the nonselective customer prefix:

EXEC dbo.GetOrders
  @custprefix = N'Cust[_]',
  @fromdate = '20140101',
  @todate = '20140430';

The plan I got for execution 3 is shown in Figure 3.

Figure 3 – Plan for execution 3

This time the optimizer aptly chose a hash join algorithm. Also notice that since the cardinality estimate against Customers was for 20,000 rows (all customers), the optimizer didn’t bother using a seek against the index on Customers, but rather performed a full scan of it. I got the following performance statistics for execution 3: logical reads: 2163, cpu: 141 ms.

Now the plan with the hash join is cached.

Run the following code to execute the procedure for the fourth time, providing the selective customer prefix as input:

EXEC dbo.GetOrders
  @custprefix = N'Cust[_]1000',
  @fromdate = '20140101',
  @todate = '20140430';

Figure 4 shows the plan that I got for execution 4.

Figure 4 – Plan for execution 4

The plan was reused, resulting in a performance penalty. I got the following performance stats for this execution: logical reads: 2163, cpu: 63 ms.

Use the following code to verify that the cached plan was reused:

SELECT CP.usecounts, CP.plan_handle, ST.text
FROM sys.dm_exec_cached_plans AS CP
  CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
WHERE ST.text LIKE '%F1352E2F-866A-4A10-' + 'B660-2875D18B05D8%';

This query generated the following output on my system:

usecounts   plan_handle                                                                                 text
----------- ------------------------------------------------------------------------------------------- ------------------------------
2           0x05000600B982EE5A304AC1580002000001000000000000000000000000000000000000000000000000000000  CREATE   PROC dbo.GetOrders...

Use the following code to free the cached plan (again, replace the plan handle with the one that you got):

DBCC FREEPROCCACHE ( 0x05000600B982EE5A304AC1580002000001000000000000000000000000000000000000000000000000000000 );

Verify that there’s currently no plan in cache for the query:

SELECT CP.usecounts, CP.plan_handle, ST.text
FROM sys.dm_exec_cached_plans AS CP
  CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
WHERE ST.text LIKE '%F1352E2F-866A-4A10-' + 'B660-2875D18B05D8%';

This query should generate an empty output:

usecounts   plan_handle                                                                                 text
----------- ------------------------------------------------------------------------------------------- ------------------------------

Adaptive Joins over Rowstore

A common way to handle procedures with parameter-sensitive queries like the one in our example is to force a recompile of the query. This results in optimization that targets the current input values in every execution, but it comes at the cost of a recompilation in every execution. In SQL Server 2017, if your query qualifies for adaptive join optimization, you can instead allow plan reuse and at the same time have the optimal join strategy chosen at run time based on the actual number of rows found.

If the right conditions are met, SQL Server considers using an Adaptive Join operator that has one outer input and two inner inputs: one for a hash join and another for a loop join. The operator has a row-threshold property, computed from the cost estimates for the two possible strategies, that defines the tipping point for choosing a hash versus a loop strategy. If at run time the outer input’s row count is greater than or equal to the threshold, the operator sticks to a hash strategy; if it’s lower than the threshold, the operator switches to a loop strategy.
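To make the decision rule concrete, here’s a runnable pseudologic sketch of the run-time choice. This is my own illustration, not actual engine code; the 932.911 threshold and the 11 outer rows are the values you’ll see in the demo that follows:

-- A pseudologic sketch of the adaptive join's run-time decision
-- (my illustration only, not actual engine code)
DECLARE
  @actual_outer_rows       AS FLOAT = 11,      -- rows observed in the outer input at run time
  @adaptive_threshold_rows AS FLOAT = 932.911; -- precomputed from cost estimates

IF @actual_outer_rows >= @adaptive_threshold_rows
  PRINT 'Resolve to the hash join strategy.';
ELSE
  PRINT 'Resolve to the nested loops strategy.';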

So what are the conditions that need to be met for SQL Server to consider using an adaptive join? Joe Sack details those in his blog on the topic:

  • The database compatibility level is 140 (see the snippet after this list for a way to check and set it).

  • The join is eligible to be executed either by an indexed nested loop join or by a hash join physical algorithm.

  • The hash join uses batch mode – either through the presence of a Columnstore index in the query overall or a Columnstore indexed table being referenced directly by the join.

  • The generated alternative solutions of the nested loop join and hash join should have the same first child (outer reference).
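Regarding the first condition, you can check and, if needed, set the database compatibility level with standard T-SQL, shown here as a convenience:

-- Check the current compatibility level of the sample database
SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'PerformanceV3';

-- Set it to 140 (SQL Server 2017) if it isn't already
ALTER DATABASE PerformanceV3 SET COMPATIBILITY_LEVEL = 140;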

Notice the third item in Joe’s list, indicating the batch mode over columnstore requirement. Later in the blog, in the Q&A part, Joe also addresses the possibility of expanding adaptive join support to include row mode:

“Will you be expanding the scope of batch mode adaptive joins to include row mode?

This first version supports batch mode execution, however we are exploring row mode as a future possibility as well.”

So, formally, adaptive joins are initially supported only over columnstore. But I would like to remind you of a trick that I covered in my column on the batch mode Window Aggregate operator, showing how to enable batch processing over rowstore by creating a dummy empty filtered columnstore index. As it turns out, the same trick can also enable the use of adaptive joins over rowstore, already in SQL Server 2017 CTP 2.0. The point is that as long as a columnstore index is present on one of the tables participating in the query, the optimizer can consider using adaptive joins even if it doesn’t actually use the columnstore index in the plan. With this in mind, run the following code to create such an index on the Orders table to enable adaptive joins with our procedure:

CREATE NONCLUSTERED COLUMNSTORE INDEX idx_yesplease
  ON dbo.Orders(orderid)
  WHERE orderid = -1 AND orderid = -2;

Now execute the procedure for the fifth time, with a selective input customer prefix:

EXEC dbo.GetOrders
  @custprefix = N'Cust[_]1000',
  @fromdate = '20140101',
  @todate = '20140430';

Figure 5 shows the plan that I got for this execution on my system.

Figure 5 – Plan for execution 5

Observe the new Adaptive Join operator. The outer input of the join is the same as shown earlier in Figure 1—the relevant customer rows from the Customers table. The middle branch of the plan represents the inner input of the join in case a hash algorithm is chosen. The bottom branch of the plan represents the inner input of the join in case a loop algorithm is chosen.

Observe the Adaptive Threshold Rows property, which, based on the cost estimates for our query, defines 932.911 rows as the threshold for determining which join algorithm to use. An actual number of rows from the outer input that is greater than or equal to this number causes the operator to use a hash algorithm. Since in our case the actual number of rows is 11, the inner input for the hash join is not activated at all (number of executions: 0). Instead, the operator switches to a loop algorithm, activating the inner input for the loop join 11 times, once per outer row. You can see that, based on the initial cardinality estimate, the property Estimated Join Type is Nested Loops. Currently the property Actual Join Type says AdaptiveJoin, but the plan is to show the correct actual join type chosen in a later build, post CTP 2.0.
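If you want to retrieve the threshold programmatically rather than from the graphical plan, the value surfaces in the showplan XML. The following query is only a sketch: it assumes the cached plan’s Adaptive Join operator carries an AdaptiveThresholdRows attribute mirroring the Adaptive Threshold Rows property (treat the attribute name as an assumption and verify it against your build’s plan XML):

-- A sketch: extract the adaptive threshold from the cached plan's XML.
-- Assumes the Adaptive Join RelOp exposes an AdaptiveThresholdRows attribute
-- (mirroring the Adaptive Threshold Rows property in the graphical plan).
WITH XMLNAMESPACES
  (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
  QP.query_plan.value(
    '(//RelOp[@PhysicalOp="Adaptive Join"]/@AdaptiveThresholdRows)[1]',
    'FLOAT') AS adaptive_threshold_rows
FROM sys.dm_exec_cached_plans AS CP
  CROSS APPLY sys.dm_exec_query_plan(CP.plan_handle) AS QP
  CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
WHERE ST.text LIKE '%F1352E2F-866A-4A10-' + 'B660-2875D18B05D8%';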

I got the following I/O and CPU stats for this execution: logical reads: 35, cpu: 0 ms.

This plan is now cached.

Execute the procedure for the sixth time, using a nonselective input:

EXEC dbo.GetOrders
  @custprefix = N'Cust[_]',
  @fromdate = '20140101',
  @todate = '20140430';

The cached plan is reused. Figure 6 has the actual plan that I got for this execution on my system.

Figure 6 – Plan for execution 6

This time, since the actual number of outer rows did not fall below the threshold, the adaptive join applied a hash algorithm. Observe that the inner input of the hash join is activated (the number of executions of the middle branch in the plan is 1), and the inner input of the loop join isn’t activated at all (the number of executions of the bottom branch in the plan is 0).

I got the following performance stats for this execution on my system: logical reads: 2163, cpu: 203 ms.

Run the following code to verify that plan reuse occurred:

SELECT CP.usecounts, CP.plan_handle, ST.text
FROM sys.dm_exec_cached_plans AS CP
  CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
WHERE ST.text LIKE '%F1352E2F-866A-4A10-' + 'B660-2875D18B05D8%';

I got the following output on my system, showing a usecounts value of 2, indicating that plan reuse occurred:

usecounts   plan_handle                                                                                 text
----------- ------------------------------------------------------------------------------------------- ------------------------------
2           0x05000600B982EE5A304AC1580002000001000000000000000000000000000000000000000000000000000000  CREATE   PROC dbo.GetOrders...

Use the following code to free the cached plan (again, replace the plan handle in the code with the one that you got):

DBCC FREEPROCCACHE ( 0x05000600B982EE5A304AC1580002000001000000000000000000000000000000000000000000000000000000 );

Run the following code to verify that there’s currently no plan in cache for our query:

SELECT CP.usecounts, CP.plan_handle, ST.text
FROM sys.dm_exec_cached_plans AS CP
  CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
WHERE ST.text LIKE '%F1352E2F-866A-4A10-' + 'B660-2875D18B05D8%';

This query should return an empty set.

Keep in mind that adaptive joins address only the loop versus hash algorithm choice in parameter-sensitive queries; this feature is not a magic bullet that will solve all parameter-sensitive query performance problems. As an example, consider the following seventh execution of our procedure with a nonselective input, when there’s currently no plan in cache:

EXEC dbo.GetOrders
  @custprefix = N'Cust[_]',
  @fromdate = '20140101',
  @todate = '20140430';

Figure 7 shows the plan that I got for this execution.

Figure 7 – Plan for execution 7

The Adaptive Join operator aptly picks a hash join algorithm here. But also observe that because the estimated number of rows from Customers is 20,000, which is the total number of customers in the table, the optimizer chose to scan the index on Customers instead of using a seek plus a range scan. This plan is now cached.

Execute the procedure for the eighth time, passing a selective input:

EXEC dbo.GetOrders
  @custprefix = N'Cust[_]1000',
  @fromdate = '20140101',
  @todate = '20140430';

Figure 8 shows the plan that I got for this execution.

Figure 8 – Plan for execution 8

The cached plan is reused. The Adaptive Join operator does its job well in picking a loop algorithm here, which is fine. However, because the plan was reused, it still scans the index on Customers, even though a seek plus range scan is clearly better for our selective input. You could consider tackling this by adding a FORCESEEK hint against Customers, which would be a reasonable option for both selective and nonselective inputs since the index in question is a covering one. The point is that, beyond the specific problem that adaptive joins (and the other adaptive query processing enhancements) solve, you’ll still need to tackle other parameter-sensitive query performance problems just as you did until now.
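To illustrate the suggestion, here’s a sketch of the procedure redefined with a FORCESEEK table hint against Customers. This is my variation, not code from the demo above; it keeps the troubleshooting label so the cache-tracking queries still work:

-- A sketch of GetOrders with a FORCESEEK hint against Customers,
-- forcing a seek plus range scan regardless of which plan got cached
CREATE OR ALTER PROC dbo.GetOrders
  @custprefix NVARCHAR(200) = N'',
  @fromdate AS DATE = '19000101',
  @todate AS DATE = '99991231'
AS

SET NOCOUNT ON;

SELECT C.custid, C.custname, O.orderid, O.empid, O.shipperid, O.orderdate
FROM dbo.Customers AS C WITH (FORCESEEK)
  INNER JOIN dbo.Orders AS O
    ON O.custid = C.custid
WHERE C.custname LIKE @custprefix + N'%'
  AND O.orderdate BETWEEN @fromdate AND @todate
OPTION (LABEL = 'F1352E2F-866A-4A10-B660-2875D18B05D8');
GO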

When you’re done, run the following code to clean up the objects that you created for the examples in this article:

DROP INDEX IF EXISTS idx_nc_cn_i_cid ON dbo.Customers;
DROP INDEX IF EXISTS idx_nc_cid_od_i_oid_eid_sid ON dbo.Orders;
DROP INDEX IF EXISTS idx_yesplease ON dbo.Orders;
DROP PROC IF EXISTS dbo.GetOrders;

Conclusion

In this article, I focused on enabling adaptive joins over rowstore in SQL Server 2017 CTP 2.0. Chances are that future versions of SQL Server will not require any tricks; rather, they will natively support adaptive joins over rowstore, as well as batch processing over rowstore.

Adaptive query processing is a great addition to SQL Server that will doubtless improve the performance of our queries. Thanks to Joe Sack and the team who worked on these features. Let’s hope that this first set of adaptive query processing improvements is just the tip of the iceberg and that we’ll see many more added in the future.

 
