Reporting Book Sales

Readers help Phil import data into the StoreSalesSummary table by unpivoting the month columns in an Excel spreadsheet.

Umachandar Jayachandran

February 10, 2004

7 Min Read
ITPro Today logo

Congratulations to Narasimhan Jayachandran, a database management consultant for HTC Global Services in Troy, Michigan, and John Hanson, vice president of operations for MEDePass, Inc.. Narasimhan won first prize of $100 for the best solution to the February Reader Challenge, "Reporting Book Sales." John won second prize of $50. Here’s a recap of the problem and the solution to the February Reader Challenge.

Problem:


Phil generates reports for a sales team in a company that sells books. The company stores publication data for its books in a SQL Server 2000 database. Phil receives sales data updates in a Microsoft Excel file that has the following header labels for columns: stor_id, yr, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, and Dec. The spreadsheet contains quantity of books sold in each store by year and month. You can generate sample data for the Excel spreadsheet from the Sales table in the Pubs database by using the following code:

SELECT s.stor_id, year(s.ord_date) AS yr,    SUM(CASE month(s.ord_date) WHEN 1 THEN s.qty ELSE 0 END) AS Jan,    SUM(CASE month(s.ord_date) WHEN 2 THEN s.qty ELSE 0 END) AS Feb,    SUM(CASE month(s.ord_date) WHEN 3 THEN s.qty ELSE 0 END) AS Mar,    SUM(CASE month(s.ord_date) WHEN 4 THEN s.qty ELSE 0 END) AS Apr,    SUM(CASE month(s.ord_date) WHEN 5 THEN s.qty ELSE 0 END) AS May,    SUM(CASE month(s.ord_date) WHEN 6 THEN s.qty ELSE 0 END) AS Jun,    SUM(CASE month(s.ord_date) WHEN 7 THEN s.qty ELSE 0 END) AS Jul,    SUM(CASE month(s.ord_date) WHEN 8 THEN s.qty ELSE 0 END) AS Aug,    SUM(CASE month(s.ord_date) WHEN 9 THEN s.qty ELSE 0 END) AS Sep,    SUM(CASE month(s.ord_date) WHEN 10 THEN s.qty ELSE 0 END) AS Oct,    SUM(CASE month(s.ord_date) WHEN 11 THEN s.qty ELSE 0 END) AS Nov,    SUM(CASE month(s.ord_date) WHEN 12 THEN s.qty ELSE 0 END) AS Dec    FROM Sales AS s    GROUP BY s.stor_id, year(s.ord_date)

Phil needs to import the data from the Excel file into a SQL Server table called StoreSalesSummary by unpivoting the month columns from the spreadsheet. You can create the StoreSalesSummary table by using the following code:

CREATE TABLE StoreSalesSummary (    stor_id int NOT NULL,    qty int NOT NULL,    yr smallint NOT NULL,    mn tinyint NOT NULL,    PRIMARY KEY(stor_id, yr, mn)    )

Help Phil import only the Excel spreadsheet data into the StoreSalesSummary table, then insert and update each store’s sales from the spreadsheet. Import only the stores with a nonzero quantity value for any month.

Solution:


Phil can use the OPENDATASOURCE() rowset function in SQL Server 2000 to read the Excel file data as a table. If the Excel file contains a worksheet called YearlySales, he can use the following code to read the Excel spreadsheet as a table:

SELECT *    FROM OpenDataSource( 'Microsoft.Jet.OLEDB.4.0',    'Data Source="C:DataYearlySalesSummary.xls";User ID=Admin;Password=    ;Extended properties=Excel 5.0')...[YearlySales$] s

Phil could also set up a linked server connection to the Excel file or use the OPENROWSET() function to read the data from the Excel file as a table.

Now Phil can manipulate the data from the Excel spreadsheet in a SQL Server table. The following code puts the Excel data into a temporary table called #Sales in the desired format:

SELECT s1.stor_id, s1.qty, s1.yr, s1.mn    INTO #Sales    FROM (    SELECT s.stor_id,    CASE m.mn    WHEN 1 THEN s.[Jan]    WHEN 2 THEN s.[Feb]    WHEN 3 THEN s.[Mar]    WHEN 4 THEN s.[Apr]    WHEN 5 THEN s.[May]    WHEN 6 THEN s.[Jun]    WHEN 7 THEN s.[Jul]    WHEN 8 THEN s.[Aug]        WHEN 9 THEN s.[Sep]    WHEN 10 THEN s.[Oct]    WHEN 11 THEN s.[Nov]    WHEN 12 THEN s.[Dec]    END AS qty,    s.yr,    m.mn    FROM OpenDataSource( 'Microsoft.Jet.OLEDB.4.0',    'Data Source="C:DataYearlySalesSummary.xls";User ID=Admin;Password=        ;Extended properties=Excel 5.0')...[YearlySales$] s    CROSS JOIN (    SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4    UNION ALL    SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8    UNION ALL    SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12    ) AS m(mn)    ) s1    WHERE s1.qty > 0.0SELECT * FROM #Sales

The query’s cross join lets Phil generate a row for each month column. Once Phil converts the columns to rows, he can obtain the correct quantity value by using a CASE expression in the SELECT list, based on the pseudo month column, mn. Finally, the code’s WHERE clause filters the rows that have a nonzero qty value for each year and month combination.

Now Phil can use the data from the temporary table #Sales to update the data in the StoreSalesSummary table, as the following code shows:

-- Update the existing rows first:        UPDATE s2        SET qty = s2.qty + s1.qty        FROM #Sales AS s1        JOIN StoreSalesSummary AS s2        ON s2.stor_id = s1.stor_id        AND s2.yr = s1.yr        AND s2.mn = s1.mn
-- Add the new rows next:INSERT INTO storesalessummarySELECT s1.stor_id, s1.qty, s1.yr, s1.mn    FROM #Sales AS s1    WHERE NOT EXISTS(SELECT *    FROM StoreSalesSummary AS s2    WHERE s2.stor_id = s1.stor_id    AND s2.yr = s1.yr    AND s2.mn = s1.mn)SELECT * FROM #Sales

Phil could also unpivot the Excel spreadsheet’s month columns by using a UNION ALL query. The following example assumes that Phil has dumped the data from the worksheet into a temporary table called #ExcelSales:

SELECT stor_id, yr, 1 AS mn, Jan AS qty    FROM #ExcelSales    WHERE Jan > 0    UNION ALLSELECT stor_id, yr, 2 AS mn, Feb AS qty    FROM #ExcelSales    WHERE Feb > 0    UNION ALLSELECT stor_id, yr, 3 AS mn, Mar AS qty    FROM #ExcelSales    WHERE Mar > 0    UNION ALLSELECT stor_id, yr, 4 AS mn, Apr AS qty    FROM #ExcelSales    WHERE Apr > 0    UNION ALLSELECT stor_id, yr, 5 AS mn, May AS qty    FROM #ExcelSales    WHERE May > 0    UNION ALLSELECT stor_id, yr, 6 AS mn, Jun AS qty    FROM #ExcelSales    WHERE Jun > 0    UNION ALLSELECT stor_id, yr, 7 AS mn, Jul AS qty    FROM #ExcelSales    WHERE Jul > 0    UNION ALLSELECT stor_id, yr, 8 AS mn, Aug AS qty    FROM #ExcelSales    WHERE Aug > 0    UNION ALLSELECT stor_id, yr, 9 AS mn, Sep AS qty    FROM #ExcelSales    WHERE Sep > 0    UNION ALLSELECT stor_id, yr, 10 AS mn, Oct AS qty    FROM #ExcelSales    WHERE Oct > 0    UNION ALLSELECT stor_id, yr, 11 AS mn, Nov AS qty    FROM #ExcelSales    WHERE Nov > 0    UNION ALLSELECT stor_id, yr, 12 AS mn, Dec AS qty    FROM #ExcelSales    WHERE Dec > 0

The cross-join technique typically performs better than the UNION ALL query for unpivoting operations. The cross-join involves less code and fewer joins on the main table. You can compare the performance of both solutions by looking at their execution plan costs, I/O statistics, and the time SQL Server takes to execute the queries.

MARCH READER CHALLENGE:


Now, test your SQL Server savvy in the March Reader Challenge, "Restoring a Database" (below). Submit your solution in an email message to [email protected] by February 19. Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We’ll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.

Keith is the DBA for a company that runs several SQL Server 2000 data warehouses. He has a crucial database that contains the fact and dimension tables for the data warehouse. Keith performs full backups every week and periodic log backups after the database backup. The company’s development team has requested the latest copy of the database along with any log backups for testing purposes. The team needs to run ad hoc queries against the database at different points in time to collect statistics. Keith uses the following sequence of steps to create the database and log backups:

CREATE DATABASE DWALTER DATABASE DW SET recovery bulk_loggedRAISERROR ('-- Performing full backup...', 0, 1) WITH nowait-- Full backup of databaseBACKUP DATABASE DW TO DISK = 'c:tempDW.bak' WITH initGO-- Create table t1CREATE TABLE DW..t1 ( i int IDENTITY )INSERT INTO DW..t1 DEFAULT VALUES-- Initial log backupRASIERROR ('-- Initial log backup...', 0, 1) WITH nowaitBACKUP log DW TO DISK = 'c:tempDW.trn.1' WITH init-- Create table t2 for bulk loadingCREATE TABLE DW..t2 ( c char( 8000 ) DEFAULT 'x' )INSERT DW..t2 DEFAULT VALUES-- Add new log file on a different volume because of space constraintsALTER DATABASE DW ADD log FILE ( name = 'DW_TempLog' , filename = 'c:tempDW_TempLog.ldf' )-- Bulk inserts and other operations here-- Log backup after first ALTER DATABASE commandRAISERROR ('-- Log backup after first ALTER DATABASE...', 0, 1) WITH nowaitBACKUP log DW TO DISK = 'c:tempDW.trn.2' WITH init-- Remove temporary log fileALTER DATABASE DW REMOVE FILE 'DW_Templog'-- Log backup after second ALTER DATABASE commandRAISERROR ('-- Log backup after second ALTER DATABASE...', 0, 1) WITH nowaitBACKUP log DW TO DISK = 'c:tempDW.trn.3' WITH initDROP DATABASE DWGO

Keith also needs to provide the commands for restoring the database (in read-only format) up to and including the latest log backup, DW.trn.3. Help Keith write the script to restore a read-only copy of the database after different backups have been restored on a development server.

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like