SSIS Package Documentor
May 13, 2015
SSIS Package Documenter
Creating an SSIS Package is easy and even lot of reference is available for the same, however when it comes to read or let me say document the content of the package it is a night mare. Especially, when you have many packages in case you are trying to study them or you want to review the coding standards.
I have to deal with a similar requirement and created a utility to read the package code. An SSIS package code is made up of XML made up from a specific XSD. We can read this code using SQL server xquery.
Let me take an example here. We can fetch all the task name details used in the package by querying the executables path till Object name.
value('declare namespace p1="www.microsoft.com/SqlServer/Dts";//DTS:Executable/DTS:Executables/@p1:ObjectName[1]', 'nvarchar(max)')
To query the xml code of the package, we have to load the complete package xml into a table, which can be done using cmdshell command.
First of all load all the package names with path in a temporary table.
SELECT @CommandLine = LEFT('dir "' + @Path + '" /A-D /B /S ', 8000);
INSERT INTO @MyFiles (FullPath)
EXECUTE xp_cmdshell @CommandLine;
Load the package code in the table now one by one
DECLARE @sql NVARCHAR(max);
SET @sql = '
INSERT INTO pkgStats (PackagePath,PackageXML)
select ''@FullPath'' as PackagePath
,cast(BulkColumn as XML) as PackageXML
from openrowset(bulk ''@FullPath'',single_blob) as pkgColumn';
SELECT @sql = REPLACE(@sql, '@FullPath', @FullPath)
EXEC sp_executesql @sql;
The main categories of the package code can be divided into four major parts:
Connection objects
Variables
Control flow tasks
Data flow tasks
The attached stored procedure script queries the complete package xml and generates an output report. The procedure accepts the input path for the package files and iterates over the files to document the package code. It uses a connection to a sql server database which creates temporary tables for the processing.
The output of the procedure returns the complete package details in the form of a grid report.
Report Column Headers | Description |
---|---|
Row ID, Table Row ID | These are all columns used to sequence the report output. RowID is unique for a single package. TableRowID is an incremental sequence number for all the rows of a package. |
Package Path, Package Name | Package Name and Package Path on the drive. |
Category | Describes Package categories into four types - Connection, Variable, Control Flow Task, Data Flow Task. |
Task Path | Path of the Task within a package. |
Object Name, Object Type | Object Name is Name of the task or object as given in package. Object Type is type of task, transformation or variable data type. |
ObjectValue | This is the most critical column showing the value of the Object such as queries, connections. |
Connection Manager | Describes the connection details used inside any task or transformation. |
Source Column, Destination Column | Gives the source and destination column name used inside all the Data flow transformations. |
Derived Value | Gives the derived column transformation expressions used in Derived Column Transformation. |
Is Disabled | Shows True / False for all the Control flow tasks if the task is disabled. |
Parameter Binding Sequence, Parameter Binding Parameter Name, Result Set Parameter Name | Describes all the property values used in Execute SQL Task or OLEDB Source. |
Execute Package Expression, Execute Package Name | Shows the name or the execute package expression used in Execute Package Task. |
Script | Shows the script written in a Script Task. |
Lookup Joins | Details of Lookup Joins used in Lookup Transformation. |
About the Author
You May Also Like