Gathering Statistics
Readers help Dan the DBA write the SQL code required to gather statistics on filegroups and files in his databases.
June 9, 2003
Congratulations to Brian Andrews, a systems developer in Alexandria, Virginia, and Nathan Hassan, a training manager for Inforsys PLC, in Ababa, Ethiopia. Brian won first prize of $100 for the best solution to the June Reader Challenge, "Gathering Statistics." Nathan won second prize of $50. Here’s a recap of the problem and the solution to the June Reader Challenge.
Problem:
Dan manages databases hosted on SQL Server 2000 and needs to write a script that gathers statistics on files and filegroups in every database. Dan wants to obtain filegroup attributes and statistics on the disk space that each filegroup uses. He’s also interested in the files that contain each filegroup, their attributes and space used, growth parameters for each file, reads performed on a file, and so on. He wants to use this information for capacity-planning and monitoring purposes, such as keeping track of file growth. Help Dan write SQL code that will let him easily obtain this information.
Solution:
Dan queries the system tables sysfilegroups and sysfiles to get details about filegroups and files in a database, respectively. He creates his own views on these tables to easily query them from his script. In addition to using the information that the system tables provide, Dan uses the VIRTUALFILESTATS() system function to get I/O statistics for each database file.
Using the system tables’ details as documented in SQL Server Books Online (BOL), Dan obtains the information he needs. For example, Dan can create a view called DB_FILEGROUPS that provides attributes of each filegroup and the space used by each filegroup, as the following definition shows:
IF OBJECTPROPERTY( OBJECT_ID('dbo.DB_FILEGROUPS') , 'IsView') = 1 DROP VIEW dbo.DB_FILEGROUPS GO CREATE VIEW dbo.DB_FILEGROUPS ( FILEGROUP_ID, FILEGROUP_NAME, IS_PRIMARY, IS_USERDEFINED, IS_READONLY, IS_DEFAULT, SPACE_USED_MB, SPACE_USED_PAGES)ASSELECT fg.groupid, fg.groupname, CASE fg.groupname WHEN 'PRIMARY' THEN 'YES' ELSE 'NO' END, CASE FILEGROUPPROPERTY ( fg.groupname, 'IsUserDefinedFG' ) WHEN 1 THEN 'YES' ELSE 'NO' END, CASE FILEGROUPPROPERTY ( fg.groupname, 'IsReadOnly' ) WHEN 1 THEN 'YES' ELSE 'NO' END, CASE FILEGROUPPROPERTY ( fg.groupname, 'IsDefault' ) WHEN 1 THEN 'YES' ELSE 'NO' END, CAST( (f.TotalFileSize * 8) / 1024. AS numeric( 10, 2 ) ), f.TotalFileSize FROM ( SELECT fg1.groupid, fg1.groupname FROM sysfilegroups AS fg1 UNION ALL SELECT 0, 'LOG' ) AS fg JOIN ( SELECT f1.groupid, SUM( f1.size ) FROM sysfiles AS f1 GROUP BY f1.groupid ) AS f( groupid, TotalFileSize ) ON f.groupid = fg.groupidGO
In the DB_FILEGROUPS view, Dan can use the built-in system function FILEGROUPPROPERTY() to obtain attributes such as whether the filegroup is user-defined, read-only, or the default.
Next, to obtain the attributes of each file in a filegroup, Dan queries the sysfiles system table. By joining the table with the VIRTUALFILESTATS() system function, he obtains additional usage characteristics for the files. First, he creates a view called DB_FILES that provides the details of each file in a filegroup:
IF OBJECTPROPERTY( OBJECT_ID('dbo.DB_FILES') , 'IsView') = 1 DROP VIEW dbo.DB_FILES GO CREATE VIEW dbo.DB_FILES ( FILEGROUP_ID, FILEGROUP_NAME, FILE_ID, FILE_LOGICAL_NAME, FILE_PHYSICAL_NAME, IS_PRIMARY_FILE, IS_READONLY, IS_LOGFILE, SIZE_MB, SPACE_USED_MB, MAX_SIZE_MB, GROWTH, [#READS], [#WRITES], BYTES_READ, BYTES_WRITTEN, IO_STALL_MILLISECONDS, SIZE_PAGES, SPACE_USED_PAGES, MAX_SIZE_PAGES)ASSELECT COALESCE( fg.groupid, 0 ), COALESCE( fg.groupname, 'LOG' ), f.fileid, f.name, f.filename, CASE FILEPROPERTY ( f.name, 'IsPrimaryFile' ) WHEN 1 THEN 'YES' ELSE 'NO' END, CASE FILEPROPERTY ( f.name, 'IsReadOnly' ) WHEN 1 THEN 'YES' ELSE 'NO' END, CASE FILEPROPERTY ( f.name, 'IsLogFile') WHEN 1 THEN 'YES' ELSE 'NO' END, CAST( ( f.size * 8 ) / 1024. AS numeric( 10, 2 ) ), CAST( ( FILEPROPERTY( f.name, 'SpaceUsed' ) * 8 ) / 1024. AS numeric( 10, 2 ) ), CASE WHEN f.maxsize > 0 THEN STR( ( f.maxsize * 8 ) / 1024. , 20, 2 ) + 'MB' WHEN f.maxsize = 0 THEN 'NO GROWTH' WHEN f.maxsize = -1 THEN 'UNLIMITED' END, CASE WHEN f.growth = 0 THEN 'NONE' WHEN f.status & 0x100000 > 0 THEN STR( f.growth, 20 ) + '%' ELSE STR( ( f.growth * 8 ) / 1024., 20, 0 ) + 'MB' END, fs.NumberReads, fs.NumberWrites, fs.BytesRead, fs.BytesWritten, fs.IOStallMS, f.size, FILEPROPERTY( f.name, 'SpaceUsed' ), f.maxsize FROM sysfiles AS f LEFT JOIN sysfilegroups AS fg ON fg.groupid = f.groupid JOIN ::fn_virtualfilestats( DEFAULT, DEFAULT ) AS fs ON fs.fileid = f.fileid AND fs.DbId = DB_ID() GO
After creating these views on each database that he monitors, all Dan has to do when he wants to check the health of his database files and filegroups is run the following commands:
SELECT * FROM db_filegroupsGOSELECT * FROM db_filesGO
Dan can alternatively create a stored procedure, such as sp_UtilFileInfo (which Listing 1 shows), in the master database to return information from the sysfilegroups and sysfiles system tables and the FILEGROUPPROPERTY() and FILEPROPERTY() functions. He can mark this stored procedure as a system object by using the undocumented stored procedure sp_MS_marksystemobject, enabling the stored procedure to be invoked within the context of any database just as system stored procedures are invoked. For example, you could run EXEC pubs..sp_UtilFileInfo to return information from the Pubs database, but you should use this technique with care because it isn’t documented and might not work in future versions of SQL Server. The stored-procedure approach limits the searching and querying ability that the views approach easily provides.
To consolidate information from all databases, Dan can then create a stored procedure such as sp_UtilFileInfoAllDB (which Listing 2 shows) that can execute sp_UtilFileInfo in each database and return the information.
JULY READER CHALLENGE:
Now, test your SQL Server savvy in the July Reader Challenge, "Writing a Stored Procedure" (below). Submit your solution in an email message to [email protected] by June 19. SQL Server MVP Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We’ll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.
George is a database analyst for a company that provides realtime reporting applications. His databases run on SQL Server 2000 and 7.0. George is receiving timeouts from the stored procedures that the Web pages call. Upon investigation, he determines that a particular query on large aggregation tables under heavy loads is causing the timeouts. All these tables have a primary key or composite index consisting of several columns—the tables of interest are partitioned, and new tables are created. The timeouts are happening on these newly created tables, resulting in the creation of auto-statistics on secondary columns (any column other than the first.) When the database server is under heavy load, multiple queries execute for the first time against the new tables, and the auto-statistics creation process can’t complete quickly. Subsequently, users either get tired of waiting and cancel the execution of the Web page or the pages timeout.
George temporarily circumvents the problem by manually executing the stored procedure from Query Analyzer, where the query finishes without intervention. By executing the stored procedure manually, George makes sure the auto-statistics are created on the secondary columns of the index, and subsequent executions of the stored procedure from the Web page work within the timeout interval. To automate the process of creating statistics on the secondary columns of the index, George decides to write a stored procedure that does the following:
Retrieves a list of tables based on a pattern of text or specific search value (For convenience, partitioning of tables is done by using a common prefix, hence the need for this parameter.)
Determines the index that contains a given set of columns
Creates statistics on the secondary columns of the index if none exist
Help George write this stored procedure.
About the Author
You May Also Like