Solve Blocking Problems
Alleviate contention between reading and writing processes
June 27, 2011
In any multi-user database, data integrity must be maintained by preventing more than one process from updating the same data at the same time. Relationaldatabases all implement some kind of transaction isolation model to ensure data consistency. The better-known database vendors use two isolation models:optimistic and pessimistic.
The pessimistic isolation model assumes that many data conflicts will be encountered. It preserves data integrity by avoiding the possibility of conflicts.It does this by locking database objects while a process is changing them. This prevents an object from being changed by more than one process at a time.The pessimistic isolation mode is largely invisible to the application, although it can be tuned a bit using table-level locking in T-SQL code. There'snothing inherently wrong with the pessimistic isolation model, except perhaps the name.
The optimistic isolation model assumes that few conflicts will occur. It maintains data consistency by forcing conflicting transactions to roll back. Itdoes this by giving processes versioned copies of the data they're requesting. If two or more processes update the same data at the same time, the databaseengine uses the version date stamps to decide which change will be accepted. The other processes will get a concurrency error when they try to commit theirchanges. The optimistic isolation model requires that the application gracefully trap and handle concurrency errors.
By default, SQL Server implements the pessimistic isolation model. However, in SQL Server 2005 and later, you can choose to use the optimistic isolationmodel, which Microsoft refers to as snapshot isolation. SQL Server offers two options when using snapshot isolation:
Statement-level row versioning implementation of Read Committed isolation, which is often referred to as Read Committed Snapshot Isolation (RCSI)
Transaction-level row versioning isolation level, which is often referred to as transaction-level snapshot isolation
You might be wondering which of the three models -- pessimistic isolation, RCSI, or transaction-level snapshot isolation -- is better. It's not a questionof better or worse. It's a question of what will work best for your particular database and the applications that use it.
If many data modification statements occur in your database and the potential for data conflicts is high, the default pessimistic isolation model mightwork better for you. If there are many more reading processes than statements that modify data, snapshot isolation might work better for you. Keep in mind,however, that your application must be written to trap concurrency errors if you choose to use transaction-level snapshot isolation. If it wasn't writtento do that, it might require significant effort to rewrite the code.
If your blocking problems come from conflicts between reading and writing processes (which is common in many databases), RCSI might work well for you. Animportant feature of RCSI is that it doesn't require any code changes to legacy applications that have been running under SQL Server's default pessimisticisolation model (except in rare cases in which the application was written to depend on a certain locking behavior). Let's take a closer look at RCSI byexamining how it deals with a common problem, compared with the behavior of the default pessimistic isolation model.
The Problem with Dirty Reads
Conflicts between reading and writing processes can lead to a situation in which data is being changed at the same time it's being read. This is oftencalled a dirty read.
Let's look at an example of what can happen if data is changed while being read. Suppose you have a table named Accounts, which contains information abouta number of different bank accounts. Process A is involved in a transaction to deduct $100 from a savings account and then add it to a checking account. Atthe same time, process B is reading the balances in all accounts in the table and summing the total balance across all accounts. Without some mechanism tokeep the data from being changed during the read operation, process B might read the table after the deduction from the savings account is made but beforeit's added to the checking account. As a result, the balance computed by process B will be off by $100.
The Default Behavior
The default pessimistic isolation mode in SQL Server prevents this problem by requiring a reading process to get a shared lock on the data, preventing anychanges to the data while it's being read. When there's a shared lock on the data, other processes can read that data but no process can get the exclusivelock necessary to change it. Conversely, no process can get a shared lock to read data that's locked by a process changing the data. This is the source ofmany of the blocking problems that occur in the default pessimistic isolation mode. A long-running report or a big update statement can block manyprocesses and slow database operations to a crawl.
To fix this blocking problem, DBAs often apply a READUNCOMMITTED table-level locking hint. (This hint was formerly called NOLOCK, but Microsoft deprecatedthis terminology in UPDATE and DELETE statements, although it's allowed in SELECT statements. For consistency, you might consider using the READUNCOMMITTEDhint in new work.) However, using these hints endangers data consistency by allowing a dirty read.
Although many reading processes don't need perfect consistency in the data, many times DBAs explicitly set the transaction isolation level to ReadCommitted in their code or sprinkle READUNCOMMITTED or NOLOCK hints throughout their code without sufficiently examining the possible data consistencyproblems. You can't really blame them. They're caught between the proverbial rock and hard place. The application must function at a certain minimumperformance level. The choice is often between a READUNCOMMITTED hint or a massive redesign of the application and maybe even the database schema.
RCSI Behavior
The Read Committed isolation level is responsible for controlling data consistency in reading processes. It prevents data from being read while it's beingchanged and prevents data from being changed while it's being read. RCSI applies snapshot isolation only at the Read Committed isolation level. This meansthat:
Writing processes will continue to use the default locking model, but their exclusive locks won't affect reading processes. The locks only affect other processes that try to change the same data.
Reading processes are supplied with versioned copies of the data they need to read. This ensures that the data won't be changed while a process is reading it.
With RCSI, the underlying data might change, but the versioned copy ensures that the reading process has a consistent statement-level view of the data atthe time it was read. A dirty read can't occur, and the blocking situation just described can't happen. Besides mitigating the contention between readingand writing processes, RCSI helps reduce the risk of deadlocks for many of the same reasons it reduces blocking.
The Downside to RCSI
Like transaction-level snapshot isolation, RCSI uses tempdb to store the data versions it manages. RCSI doesn't affect tempdb nearly as much astransaction-level snapshot isolation does, but the effect is significant. If you enable RCSI, you need to watch disk space carefully on the tempdb driveuntil tempdb stabilizes at the necessary size.
There's also a cost for creating and managing version stores. Your server hardware must have some CPU, memory, and disk resources to spare. If your currenthardware is marginal, you might want to upgrade before implementing RCSI. Note that in some circumstances, you might see slower performance on some readsand update statements because of the overhead in managing the version stores.
How to Enable RCSI
Implementing RCSI is simple. All you need to do is run this ALTER DATABASE statement:
ALTER DATABASE database_nameSET READ_COMMITTED_SNAPSHOT ON;
This command won't complete until there are no other active connections in the database.
The database doesn't have to be in single-user mode to execute this command. However, for a busy database, it might be better to set the database to thismode by using one of the available rollback options. Otherwise, your command might take a long time to complete. Here's an example of the T-SQL codeinvolved:
ALTER DATABASE MyDB SET SINGLE_USER WITH ROLLBACK AFTER 20;ALTER DATABASE MyDB SET READ_COMMITTED_SNAPSHOT ON;ALTER DATABASE MyDB SET MULTI_USER;
This form of the rollback command will wait 20 seconds for open transactions to complete before it rolls back any remaining open transactions. If youprefer, you could use ROLLBACK IMMEDIATE, which will start rolling back transactions immediately. There's no point in using the NO_WAIT option because,like READ_COMMITTED_SNAPSHOT, it too waits until there are no active sessions running in the database.
Great Potential
RCSI has great potential for fixing systems that have blocking problems due to contention between reading and writing processes. RCSI was introduced in SQLServer 2005, but in my experience as a consultant, I haven't seen it widely adopted. This is puzzling because RCSI eliminates a common blocking problemthat can be quite severe. Although there's a cost to using RCSI, in most cases the cost is small compared with the potential benefit.
Before implementing RCSI, consider whether fixing your blocking problems outweighs the potential cost. As always, test carefully before applying the changeto a production environment.
About the Author
You May Also Like