Spring Cleaning with SQL Server Stored Procedures
While browsing SQL Server’s documentation recently, I found two stored procedures for removing ghost records: sp_clean_db_free_space and sp_clean_db_file_free_space.
April 21, 2011
While browsing through SQL Server’s documentation looking for supplied stored procedures related to one of my current writing projects, I noticed a couple of procedures I had never noticed before: sp_clean_db_free_space and sp_clean_db_file_free_space. Since it looks like spring might be just about here in the Northwest corner of the continental US—which means it’s time to start thinking about spring cleaning—I thought I’d take a look at these procedures.
Related: Debugging Stored Procedures in Visual Studio 2010
It turns out that both of these stored procedures remove ghost records, which are deleted rows that haven’t actually been physically removed. For performance reasons, rows sometimes aren’t removed during an actual DELETE operation but are removed by a background process called Ghost Cleanup. These procedures are provided for use in cases in which security is sometimes of equal importance to performance, and you don’t want the data left on the pages in case someone has access to the physical files and can use particular tools to read the bytes that haven’t been physically deleted. The procedure sp_clean_db_free_space cleans all the ghost records from a database, and sp_clean_db_file_free_space cleans up all the ghost records from a single file.
In most cases, SQL Server is quite efficient at removing ghosts, especially on small test databases with little concurrent activity—which is what I have access to right now on my laptop, and which is why I couldn’t come up with an example to show you some ghost records, then show you that these procedures would remove them. The Ghost Cleanup process is just too efficient on my own SQL Server machines!
I looked around for other commands related to cleaning, and I found DBCC CLEANTABLE, which, according to the documentation, “reclaims space from dropped variable-length columns in tables or indexed views.” I remembered a blog post I had written quite a few years ago that discussed SQL Server not reclaiming space after modifying a column’s length, and I thought perhaps I could rework some of the code in order to come up with an example for how DBCC CLEANTABLE might be useful. The original blog post, titled “Altering the length of a fixed-length column,” discussed increasing the length of a fixed-length column in a table.
I created a table called change_var, in which I was going to create some variable-length columns, INSERT a row, change the maximum length of one of the columns, and then UPDATE the value to use the whole changed length.
USE TestDB -- you can use any test or sample databaseIF EXISTS (SELECT 1 FROM sys.tables WHERE name = 'change_var') DROP TABLE change_var;GOCREATE TABLE change_var(col1 smallint, col2 varchar(2000), col3 varchar(1000));GOINSERT INTO change_var SELECT 1, REPLICATE('a', 2000), REPLICATE('b', 1000);GO--Now alter the second column to increase its max length--to 3000 bytes. The maximum length of a row should now be--just a bit more than 4000 bytes.ALTER TABLE change_var ALTER COLUMN col2 char(3000);GOUPDATE change_var SET col2 = REPLICATE('a', 3000);SELECT col1, col2, DATALENGTH(col2), col3, DATALENGTH(col3) FROM change_var;GO
Everything worked as expected, but I wondered what would happen if I dropped one of the columns and then added another, similar to what I did in my blog post.
ALTER TABLE change_var DROP COLUMN col2;GOALTER TABLE change_var ADD col4 varchar(4000);GO
Since these are variable-length columns, I actually have to put a value into the column to use the whole length, so I then updated the new column.
UPDATE change_var SET col4 = REPLICATE('c', 4000) WHERE col1 = 1;
But things didn’t work as expected. When I first tested this statement, I received the following very unpleasant error: Msg 682, Level 22, State 214, Line 2 Internal error. Buffer provided to read column value is too small. Run DBCC CHECKDB to check for any corruption.
When I tried to reproduce the problem so that I could describe it for this article, the behavior was even stranger. Every time I ran the UPDATE, SQL Server would switch to the master database before trying to perform the UPDATE, and then I would get a 208 error because the change_var table was unknown in master. I found that even though I was in my testdb database when I tried to run the UPDATE, I had to include the USE command in the batch in order to get the 682 message shown above.
USE TestDB;UPDATE change_var SET col4 = REPLICATE('c', 4000) WHERE col1 = 1;
I was getting more and more confused by the strange behavior I was seeing, and trying to figure out how to reproduce it, as well as trying to determine how to avoid it. I had completely forgotten the command I had originally started trying to find a use for. But I finally remembered—and it turned out to be just the command I needed.
DBCC CLEANTABLE(TestDB, change_var);
Now the final UPDATE works just fine.
UPDATE change_var SET col4 = REPLICATE('c', 4000) WHERE col1 = 1;
This certainly doesn’t explain why I was getting the strange behavior, and there obviously is more research to be done. However, often the best immediate solution is just to have the problem go away. And I found that a bit of cleanup was just what I needed. I wonder if cleaning up my desk will also help me solve any outstanding troublesome problems?
About the Author
You May Also Like