Reporting Book Sales
Readers help Phil import data into the StoreSalesSummary table by unpivoting the month columns in an Excel spreadsheet.
February 10, 2004
Congratulations to Narasimhan Jayachandran, a database management consultant for HTC Global Services in Troy, Michigan, and John Hanson, vice president of operations for MEDePass, Inc.. Narasimhan won first prize of $100 for the best solution to the February Reader Challenge, "Reporting Book Sales." John won second prize of $50. Here’s a recap of the problem and the solution to the February Reader Challenge.
Problem:
Phil generates reports for a sales team in a company that sells books. The company stores publication data for its books in a SQL Server 2000 database. Phil receives sales data updates in a Microsoft Excel file that has the following header labels for columns: stor_id, yr, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, and Dec. The spreadsheet contains quantity of books sold in each store by year and month. You can generate sample data for the Excel spreadsheet from the Sales table in the Pubs database by using the following code:
SELECT s.stor_id, year(s.ord_date) AS yr, SUM(CASE month(s.ord_date) WHEN 1 THEN s.qty ELSE 0 END) AS Jan, SUM(CASE month(s.ord_date) WHEN 2 THEN s.qty ELSE 0 END) AS Feb, SUM(CASE month(s.ord_date) WHEN 3 THEN s.qty ELSE 0 END) AS Mar, SUM(CASE month(s.ord_date) WHEN 4 THEN s.qty ELSE 0 END) AS Apr, SUM(CASE month(s.ord_date) WHEN 5 THEN s.qty ELSE 0 END) AS May, SUM(CASE month(s.ord_date) WHEN 6 THEN s.qty ELSE 0 END) AS Jun, SUM(CASE month(s.ord_date) WHEN 7 THEN s.qty ELSE 0 END) AS Jul, SUM(CASE month(s.ord_date) WHEN 8 THEN s.qty ELSE 0 END) AS Aug, SUM(CASE month(s.ord_date) WHEN 9 THEN s.qty ELSE 0 END) AS Sep, SUM(CASE month(s.ord_date) WHEN 10 THEN s.qty ELSE 0 END) AS Oct, SUM(CASE month(s.ord_date) WHEN 11 THEN s.qty ELSE 0 END) AS Nov, SUM(CASE month(s.ord_date) WHEN 12 THEN s.qty ELSE 0 END) AS Dec FROM Sales AS s GROUP BY s.stor_id, year(s.ord_date)
Phil needs to import the data from the Excel file into a SQL Server table called StoreSalesSummary by unpivoting the month columns from the spreadsheet. You can create the StoreSalesSummary table by using the following code:
CREATE TABLE StoreSalesSummary ( stor_id int NOT NULL, qty int NOT NULL, yr smallint NOT NULL, mn tinyint NOT NULL, PRIMARY KEY(stor_id, yr, mn) )
Help Phil import only the Excel spreadsheet data into the StoreSalesSummary table, then insert and update each store’s sales from the spreadsheet. Import only the stores with a nonzero quantity value for any month.
Solution:
Phil can use the OPENDATASOURCE() rowset function in SQL Server 2000 to read the Excel file data as a table. If the Excel file contains a worksheet called YearlySales, he can use the following code to read the Excel spreadsheet as a table:
SELECT * FROM OpenDataSource( 'Microsoft.Jet.OLEDB.4.0', 'Data Source="C:DataYearlySalesSummary.xls";User ID=Admin;Password= ;Extended properties=Excel 5.0')...[YearlySales$] s
Phil could also set up a linked server connection to the Excel file or use the OPENROWSET() function to read the data from the Excel file as a table.
Now Phil can manipulate the data from the Excel spreadsheet in a SQL Server table. The following code puts the Excel data into a temporary table called #Sales in the desired format:
SELECT s1.stor_id, s1.qty, s1.yr, s1.mn INTO #Sales FROM ( SELECT s.stor_id, CASE m.mn WHEN 1 THEN s.[Jan] WHEN 2 THEN s.[Feb] WHEN 3 THEN s.[Mar] WHEN 4 THEN s.[Apr] WHEN 5 THEN s.[May] WHEN 6 THEN s.[Jun] WHEN 7 THEN s.[Jul] WHEN 8 THEN s.[Aug] WHEN 9 THEN s.[Sep] WHEN 10 THEN s.[Oct] WHEN 11 THEN s.[Nov] WHEN 12 THEN s.[Dec] END AS qty, s.yr, m.mn FROM OpenDataSource( 'Microsoft.Jet.OLEDB.4.0', 'Data Source="C:DataYearlySalesSummary.xls";User ID=Admin;Password= ;Extended properties=Excel 5.0')...[YearlySales$] s CROSS JOIN ( SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12 ) AS m(mn) ) s1 WHERE s1.qty > 0.0SELECT * FROM #Sales
The query’s cross join lets Phil generate a row for each month column. Once Phil converts the columns to rows, he can obtain the correct quantity value by using a CASE expression in the SELECT list, based on the pseudo month column, mn. Finally, the code’s WHERE clause filters the rows that have a nonzero qty value for each year and month combination.
Now Phil can use the data from the temporary table #Sales to update the data in the StoreSalesSummary table, as the following code shows:
-- Update the existing rows first: UPDATE s2 SET qty = s2.qty + s1.qty FROM #Sales AS s1 JOIN StoreSalesSummary AS s2 ON s2.stor_id = s1.stor_id AND s2.yr = s1.yr AND s2.mn = s1.mn
-- Add the new rows next:INSERT INTO storesalessummarySELECT s1.stor_id, s1.qty, s1.yr, s1.mn FROM #Sales AS s1 WHERE NOT EXISTS(SELECT * FROM StoreSalesSummary AS s2 WHERE s2.stor_id = s1.stor_id AND s2.yr = s1.yr AND s2.mn = s1.mn)SELECT * FROM #Sales
Phil could also unpivot the Excel spreadsheet’s month columns by using a UNION ALL query. The following example assumes that Phil has dumped the data from the worksheet into a temporary table called #ExcelSales:
SELECT stor_id, yr, 1 AS mn, Jan AS qty FROM #ExcelSales WHERE Jan > 0 UNION ALLSELECT stor_id, yr, 2 AS mn, Feb AS qty FROM #ExcelSales WHERE Feb > 0 UNION ALLSELECT stor_id, yr, 3 AS mn, Mar AS qty FROM #ExcelSales WHERE Mar > 0 UNION ALLSELECT stor_id, yr, 4 AS mn, Apr AS qty FROM #ExcelSales WHERE Apr > 0 UNION ALLSELECT stor_id, yr, 5 AS mn, May AS qty FROM #ExcelSales WHERE May > 0 UNION ALLSELECT stor_id, yr, 6 AS mn, Jun AS qty FROM #ExcelSales WHERE Jun > 0 UNION ALLSELECT stor_id, yr, 7 AS mn, Jul AS qty FROM #ExcelSales WHERE Jul > 0 UNION ALLSELECT stor_id, yr, 8 AS mn, Aug AS qty FROM #ExcelSales WHERE Aug > 0 UNION ALLSELECT stor_id, yr, 9 AS mn, Sep AS qty FROM #ExcelSales WHERE Sep > 0 UNION ALLSELECT stor_id, yr, 10 AS mn, Oct AS qty FROM #ExcelSales WHERE Oct > 0 UNION ALLSELECT stor_id, yr, 11 AS mn, Nov AS qty FROM #ExcelSales WHERE Nov > 0 UNION ALLSELECT stor_id, yr, 12 AS mn, Dec AS qty FROM #ExcelSales WHERE Dec > 0
The cross-join technique typically performs better than the UNION ALL query for unpivoting operations. The cross-join involves less code and fewer joins on the main table. You can compare the performance of both solutions by looking at their execution plan costs, I/O statistics, and the time SQL Server takes to execute the queries.
MARCH READER CHALLENGE:
Now, test your SQL Server savvy in the March Reader Challenge, "Restoring a Database" (below). Submit your solution in an email message to [email protected] by February 19. Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We’ll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.
Keith is the DBA for a company that runs several SQL Server 2000 data warehouses. He has a crucial database that contains the fact and dimension tables for the data warehouse. Keith performs full backups every week and periodic log backups after the database backup. The company’s development team has requested the latest copy of the database along with any log backups for testing purposes. The team needs to run ad hoc queries against the database at different points in time to collect statistics. Keith uses the following sequence of steps to create the database and log backups:
CREATE DATABASE DWALTER DATABASE DW SET recovery bulk_loggedRAISERROR ('-- Performing full backup...', 0, 1) WITH nowait-- Full backup of databaseBACKUP DATABASE DW TO DISK = 'c:tempDW.bak' WITH initGO-- Create table t1CREATE TABLE DW..t1 ( i int IDENTITY )INSERT INTO DW..t1 DEFAULT VALUES-- Initial log backupRASIERROR ('-- Initial log backup...', 0, 1) WITH nowaitBACKUP log DW TO DISK = 'c:tempDW.trn.1' WITH init-- Create table t2 for bulk loadingCREATE TABLE DW..t2 ( c char( 8000 ) DEFAULT 'x' )INSERT DW..t2 DEFAULT VALUES-- Add new log file on a different volume because of space constraintsALTER DATABASE DW ADD log FILE ( name = 'DW_TempLog' , filename = 'c:tempDW_TempLog.ldf' )-- Bulk inserts and other operations here-- Log backup after first ALTER DATABASE commandRAISERROR ('-- Log backup after first ALTER DATABASE...', 0, 1) WITH nowaitBACKUP log DW TO DISK = 'c:tempDW.trn.2' WITH init-- Remove temporary log fileALTER DATABASE DW REMOVE FILE 'DW_Templog'-- Log backup after second ALTER DATABASE commandRAISERROR ('-- Log backup after second ALTER DATABASE...', 0, 1) WITH nowaitBACKUP log DW TO DISK = 'c:tempDW.trn.3' WITH initDROP DATABASE DWGO
Keith also needs to provide the commands for restoring the database (in read-only format) up to and including the latest log backup, DW.trn.3. Help Keith write the script to restore a read-only copy of the database after different backups have been restored on a development server.
About the Author
You May Also Like