Preventing Runaway Connection Growth
Microsoft's Patrick Conlan describes how to keep your connection pool size and number of connections from runaway growth.
April 18, 2005
I'm running SQL Server 2000 Standard Edition on a Windows Server 2003 system with 2GB of RAM, and one of my SQL Servers is showing a large number of connections, hundreds of which have a connection status of EXECUTE. We are using Microsoft's JDBC driver to connect to this server, which started out with 600 connections but now has more than 1000; our other SQL Servers have fewer than 650 connections each.
In addition, sp_who active shows that 260 to 270 of the 1000 connections all have a cmd status of EXECUTE. Response to the Web server is very slow, and the backup database command failed, showing error message 15444: "Cannot make worker threads." I thought our max worker threads value might give us the answer, but the default max value of 255 is reasonable for the database. Why is the number of connections to this server growing abnormally?
Here's a description of what might be happening on your server. This problem is common in the kind of system you're running.
First, a user hits a Web page that calls into the database. Under the covers, the database middleware gets a connection from the connection pool (sp_reset_connection is called to clean up the connection so that it's ready for use). Then, the call encounters a problem, such as a command timeout caused by blocking or by the sheer amount of work it's being asked to do or a non-fatal application error. But the application doesn't handle the problem properly; it ignores the error code, continues processing, and eventually completes, leaving a connection in the pool with a "dead" transaction and, thus, locking in an active state.
Because the dead transaction will hold resources until it times out or is killed, transactions from other users will start to block on the same resources, causing users to hit Refresh on their browsers—and spin up yet more connections. This is when you'll see pool sizes and the number of active connections grow. Another user's call can then pick up the dead connection, which is fatal: Resources that this new call touches will lock and, unless the application is testing for transaction nesting, will stay locked even when the application commits and exits. Unlike rollback, a commit really commits only at the root level of nesting.
Eventually, the dead transaction will time out or be killed. Then, your number of connections will start to decrease. However, during this period, you'll likely see deadlocks in your SQL Server ERRORLOG file caused by blocked transactions holding resources longer than normal and exposing conflicting resource requests that you don't normally see because the locks aren't held long enough to be a problem. When the load on the connection pool decreases, the pool will spin down connections to conserve resources.
To fix this situation, first make sure that you handle any code that touches the database within a Try-Catch block and that the Catch code cleans up any open transactions and database resources by rolling back and closing all open database resources. You can use SQL Server Profiler to look for transactions that never commit and try to find the code path with the missing error handling, but this can be a tedious process. Second, look at the deadlocks in your ERRORLOG, and minimize the chance of them recurring by always accessing data in order through specific access paths (indexes). Keep an eye open for edge conditions such as foreign key locks in tables you might not even be updating directly; these can be difficult to find, but reducing them often leads to big performance improvements by minimizing blocking.
Finally, it's possible that Service Pack 3 (SP3) of the JDBC driver has a bug. I didn't find any bug reports that addressed your situation, but I'd recommend that you discuss the problem with your Microsoft support contact just in case.
—Patrick Conlan
Platform Program Manager
Microsoft Project Team
About the Author
You May Also Like