Starting Your Modeling Career with Analysis Services Tabular Models Part 1

This is the first of a four-part series about getting started with Tabular model design using SQL Server Analysis Services in SQL Server 2012 and 2014. You will learn the basics from start to finish and build a complete solution. A sample project is provided for for each stage of the solution so you can follow-along with each article.

Paul Turley

January 12, 2015

15 Min Read
Starting Your Modeling Career with Analysis Services Tabular Models Part 1

Getting Started with SSAS Tabular

Microsoft's introduction of the xVelocity in-memory analytics engine in 2010 is a significant game-changer for business intelligence and data management.  The availability of this impressive technology and its future application in new product features will change the way you store and consume data, and it's at the heart of SSAS Tabular.  In this first in a series of four articles about designing and using Tabular models, you will experience how to create a model and analyze data.  In the articles that follow, we will add more sophisticated features to deliver production-scale solutions.  Watch for these articles in the continuation of this series:

Part 1 - Getting Started with SSAS Tabular
Part 2 - Easy DAX – Getting Started with Data Analysis Expressions
Part 3 -  Tabular Model Administration
Part 4 - Deep Dive DAX – Solving Complex Business Problems with Data Analysis Expressions

Why Choose Tabular?

On a server equipped with sufficient memory and CPU horsepower, Tabular SSAS can be very fast.  xVelocity, also known as VertiPaq, is the engine behind Tabular and Power Pivot.  In simple terms, it applies some foundational data principles elegantly to achieve high performance.  If you work with SQL Server databases, you know that table lookups are much faster when the columns used in a query are indexed.  Tabular take this principle to the next level.  The xVelocity column store essentially creates a separate index for every column and then compresses the data, eliminating redundant and sparse values.  Unlike other compression methods, the storage for each column is unaffected by adjacent columns.  For numeric, date and short text values; this approach can drastically reduce the size of the stored data.  All of the compressed data is loaded into memory so queries incur no disk IO cost and return results quickly.

In the sixteen year history of multidimensional Analysis Services, many features and capabilities were added to the platform to address some specific business needs and "newer" is not always "better".  SSAS Tabular is not designed to be a comprehensive replacement for multidimensional Analysis Services.  It is, however, a streamlined platform for analytic reporting and data exploration.  Tabular is a ground-up re-engineered platform with a lot of solid business value today and even more promise for the future.  Think if it as a Formula One race car without cup holders and seat warmers; just designed to go really fast.  A number of publications and resources are at your disposal to compare both options and determine which best meets your requirements.  I encourage you to take Tabular for a test drive.

Create a Tabular Project

The Developer, Business Intelligence and Enterprise editions of SQL Server 2012 and 2014 support SSAS Tabular projects.  SQL Server Data Tools for BI installs with SQL Server 2012.  For SQL Server 2014, install any edition of Visual Studio 2012 or 2013 and then install the SSDT-BI add-in.  You should also have Excel 2010 or 2013 installed.  To follow-along, you'll need the AdventureWorksDW2012 or AdventureWorksDW2014 sample database.  Links to these downloads and more background information are included in the Resources at the end of this article.

Open SSDT and create a new Analysis Services Tabular Project.

Figure 1 – New Analysis Services Tabular Project in Visual Studio 2013

A unique facet of the Tabular design experience is that you are always connected to a live Tabular model.  This differs from the disconnected experience in multidimensional cube projects because you are working with data and immediately see the result of your design changes.  You can work with a small subset of production data by filtering large tables and then designing partitions when you deploy to production.  I'll introduce partitioning in a latter article in the series.

When you create the project, you are prompted for a Tabular instance of Analysis Services to store your workspace database.  Ideally, this should be a local instance or a non-production server where you have exclusive permission to create and manage your workspace database.  The Compatibility level options are currently SQL Server 2012 (1100) and SQL Server 2012 SP1 (1103).  Choose the latter option if your SSAS server supports it.

Figure 2- Workspace server

Every object in the model: tables, measures, relationships, KPIs – everything is stored in a single definition file named Model.BIM.  Double-click this file in the Solution Explorer to open the model designer.

Data Source Schemas and Table Data Sources

Tabular affords a great deal of model design flexibility which can be a curse or a blessing.  Since the tabular designer doesn't force you to differentiate between facts and dimensions, every entity is simply called a table.  You can design your model as a conventional star or snowflake schema with clear separation between facts and dimensions, or you can design hybrid models with measures intermingled with dimensional attributes.  With careful planning, the latter option can actually be simple and efficient but it can also get unruly and out-of-control if you don't plan well.  If you're not sure, I suggest that you follow the rules of dimensional design prescribed by The Kimball Group in The Data Warehouse Toolkit, by organizing data into a relational data warehouse or data mart to support the model.  Some of the hard-and-fast rules enforced by conventional SSAS no longer limit certain design options but the essential rules of relational model behavior still apply.  The point is that you don't have to use a relational data mart and the data for each table can come from a different data source.  Tables don't have to be related in the source database but they should have matching key values to form relationships in the model.  Consider this freedom to be a set of powerful tools to be used by a skilled and disciplined craftsman only after gaining the necessary experience.

Import Data into the Tabular Model

To create tables for the model, use the Import From Data Source… option on the MODEL menu to open the Table Import Wizard.  You'll see that you can import data from just about any standard data source.  Choose Microsoft SQL Server, choose the server instance and connect to the AdventureWorksDW2014 database.  You can connect to the data source using your own credentials or with the identity of the service account running the Analysis Service Tabular instance.  This requirement is the same as it's always been.  If in doubt, connect to your own local instance using your Windows user name and password as a local administrator.  In production, you will typically configure the SSAS service account or a service principal with read permission to the database.  Before you take a lot of time to import all the tables, go through the following steps with only one table to test connectivity.  You can always delete a table and import it again.  I recommend that you review this blog post before you continue: http://sqlserverbiblog.wordpress.com/2015/01/08/how-to-resolve-connection-errors-when-loading-data-in-ssas-tabular-project

The next few steps could take several pages to explain in detail but the routine is actually quite simple.  Pay attention to detail as you select and rename tables and select and rename columns.  It's common to iterate through this process.  You can check your design against the sample project I'll make available for download.

Figure 3 – Table Import Wizard

Select and check each table shown in Figure 3.  Use the Friendly Name column to rename the table and then click the Preview & Filter button and deselect the tables that are not included in the following list:

 

Source Table

Friendly Name

Selected Columns

DimCustomer

Customer

CustomerKey, GeographyKey, FirstName, LastName, BirthDate, MaritalStatus, Gender, YearlyIncome, TotalChildren, EnglishEducation, EnglishOccupation, HouseOwnerFlag, NumberCarsOwned, CommuteDistance

DimDate

Order Date

FullDateAlternateKey, DayNumberOfWeek, EnglishDayNameOfWeek, DayNumberOfMonth, EnglishMonthName, MonthNumberOfYear, CalendarQuarter, CalendarYear

DimGeography

Geography

GeographyKey, City, StateProvinceCode, EnglishCountryRegionName

DimProduct

Product

ProductKey, EnglishProductName, ProductSubcategoryKey

DimProductCategory

Product Category

ProductCategoryKey, EnglishProductCategoryName

DimProductSubcategory

Product Subcategory

ProductSubcategoryKey, ProductCategoryKey, EnglishProductSubcategoryName

DimReseller

Reseller

ResellerKey, GeographyKey, BusinessType, ResellerName, NumberOfEmployees, AnnualSales, AnnualRevenue

DimSalesTerritory

Sales Territory

SalesTerritoryKey, SalesTerritoryRegion, SalesTerritoryCountry, SalesTerritoryGroup

FactInternetSales

Internet Sales

ProductKey, CustomerKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, OrderQuantity, UnitPrice, ExtendedAmount, TotalProductCost, SalesAmount, TaxAmount, Freight, OrderDate, DueDate, ShipDate

FactResellerSales

Reseller Sales

ProductKey, ResellerKey, EmployeeKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, OrderQuantity, UnitPrice, ExtendedAmount, TotalProductCost, SalesAmount, TaxAmount, Freight, OrderDate, DueDate, ShipDate

 

Apply Friendly Column Names

Now, select each table in the model, rename the columns according to the following guidelines; and hide the columns as directed.  To select a table, use the tabs at the bottom of the designer, right-click the column heading and choose Rename Column and then press Enter after making changes.  For most of these columns, you're simply adding spaces or simplifying the name to make the column name more readable.  To hide a column, right-click the column and choose Hide from Client Tools.  Don't change column names that indicate and don't make any changes to the tables not listed below.

Customer table

Field Name

New Field Name

CustomerKey

Yes

GeographyKey

Yes

FirstName

First Name

LastName

Last Name

BirthDate

Birth Date

MaritalStatus

Marital Status

Gender

Gender

YearlyIncome

Yearly Income

TotalChildren

Total Children

EnglishEducation

Education

EnglishOccupation

Occupation

HouseOwnerFlag

House Owner Flag

NumberCarsOwned

Number Cars Owned

CommuteDistance

Commute Distance

 

Order Date table

Field Name

New Field Name

FullDateAlternateKey

Date

DayNumberOfWeek

Yes

EnglishDayNameOfWeek

Day Of Week

DayNumberOfMonth

Day Of Month

EnglishMonthName

Month Name

MonthNumberOfYear

Yes

CalendarQuarter

Calendar Quarter

CalendarYear

Calendar Year

 

Geography table

Field Name

New Field Name

GeographyKey

Yes

City

StateProvinceCode

State Code

EnglishCountryRegionName

Country Region

SalesTerritoryKey

Yes

 

Product table

Field Name

New Field Name

ProductKey

Yes

ProductSubcategoryKey

Yes

EnglishProductName

Product Name

 

Reseller table

Field Name

New Field Name

ResellerKey

Yes

GeographyKey

Yes

BusinessType

Business Type

ResellerName

Reseller Name

NumberEmployees

Number Of Employees

AnnualSales

Annual Sales

AnnualRevenue

Annual Revenue

 

Sales Territory table

Field Name

New Field Name

SalesTerritoryKey

Yes

SalesTerritoryRegion

Region

SalesTerritoryCountry

Country

SalesTerritoryGroup

Group

 

Internet Sales table

Field Name

New Field Name

ProductKey

Yes

CustomerKey

Yes

SalesTerritoryKey

Yes

SalesOrderNumber

Sales Order Number

SalesOrderLineNumber

Sales Order Line Number

OrderQuantity

Order Quantity

UnitPrice

Unit Price

ExtendedAmount

Extended Amount

TotalProductCost

Total Product Cost

SalesAmount

Sales Amount

TaxAmount

Tax Amount

Freight

Yes

OrderDate

Yes

DueDate

Yes

ShipDate

Yes

 

Reseller Sales table

Field Name

New Field Name

ProductKey

Yes

ResellerKey

Yes

EmployeeKey

Yes

SalesTerritoryKey

Yes

SalesOrderNumber

Sales Order Number

SalesOrderLineNumber

Sales Order Line Number

OrderQuantity

Order Quantity

UnitPrice

Unit Price

ExtendedAmount

Extended Amount

TotalProductCost

Total Product Cost

SalesAmount

Sales Amount

TaxAmount

Tax Amount

Freight

Yes

OrderDate

Yes

DueDate

Yes

ShipDate

Yes

 

Use the MODEL menu to switch the Model View to Diagram View.  There are also two small icons in the bottom, right-hand side of the model designer to switch views. 

Create Relationships

Depending on how you imported the tables, some relationships are created for you.  In the diagram view shown in Figure 4, drag and drop columns from the source table to the target table to create relationships as indicated in the table below.  Check Figure 4 to make sure all the relationship are in your model.

Source Table

Source Column

Target Table

Target Column

Internet Sales

CustomerKey

Customer

CustomerKey

Internet Sales

OrderDate

Order Date

Date

Internet Sales

ProductKey

Product

ProductKey

Reseller Sales

OrderDate

Order Date

Date

Reseller Sales

ProductKey

Product

ProductKey

Reseller Sales

ResellerKey

Reseller

ResellerKey

Reseller Sales

SalesTerritoryKey

Sales Territory

SalesTerritoryKey

Customer

GeographyKey

Geography

GeographyKey

Product

ProductSubcategoryKey

Product Subcategory

ProductSubcategoryKey

Product Subcategory

ProducyCategoryKey

Product Category

ProducyCategoryKey

Reseller

GeographyKey

Geography

GeographyKey

 

Add Calculated Columns

In the Product table, you will add the Category and Subcategory columns using simple DAX expressions.  The right-most column in the grid is titled Add Column.  Right-click and rename the column Category.  Press Enter and then type the following text into the formula bar above the grid (the box to the right of the label fx). 

=RELATED('Product Category'[EnglishProductCategoryName])

The DAX expression designer is going to try to help you.  With some practice, you'll learn to work with it but the auto-completion feature doesn't always get things right.  Double-check the expression before you press Enter to save the column.  You should see new product category values in the column cells.

Repeat the last step to add another column named Subcategory.  Type the following into the formula bar and press Enter:

=RELATED('Product Subcategory'[EnglishProductSubcategoryName])

Create Hierarchies

Hierarchies provide a path for drill-down reporting and help users navigate the model.  This is an easy task in the model designer.  Right-click the Calendar Year column in the Order Date table and choose Create Hierarchy.  Rename the new hierarchy from Hierarchy1 to Calendar and press Enter.  Right-click the Month Name column and choose Add to Hierarchy and then choose the Calendar hierarchy.  Repeat the same step to add the Date column to the Calendar hierarchy.  Note that you can also drag and drop columns under the hierarchy but make sure that the cursor is indented before you release the mouse button.  In the Product table, right-click the Category column and add a new hierarchy name Product Hierarchy.  Add the Subcategory and Product Name columns below the Category.  Figure 4 shows the final result.

Figure 4 – Model diagram

Add Measures

You can switch to grid view by right-clicking a table in diagram view and select Go To.  Use this method to go to the Internet Sales table in grid view.  The following can be performed on each of column separately or as a group by selecting the columns in a range (use Shift or drag across the column headings).  Select these columns and click the Sum (∑) button on the toolbar.  For each column, a new measure will be created in the Calculation area grid below with the "Sum of" prefix added to the column name.  Do this for the following columns in the Internet Sales table and then repeat the step for the same columns in the Reseller Sales table.

 

Order Quantity

Unit Price

Extended Amount

Total Product Cost

Sales Amount

Tax Amount

Freight

 

Browsing the Model

To query and explore the model, you will use Excel as an integrated data browser.  On the toolbar, click the green Excel icon.  In the Analyze in Excel window, click OK.  Excel opens, creates a connection to the workspace model and adds an empty PivotTable to a worksheet.  From the PivotTable field list on the right side, check to select the Sum of Sales Amount measure in the Internet Sales table.  From the Geography table, drag the Country Region field to the ROWS list below.  From the Product table drag the Product Hierarchy to the COLUMNS list below.  In the Order Date table, expand the Calendar hierarchy, right-click the Calendar Year field and choose Add as Slicer

Figure 5 – Browsing the workspace model in Excel

Notice how easy and fast it is to navigate this data.  Use the drill-down buttons for each product category on columns to explore sales for the Product hierarchy, and use the slicer buttons to select a year to analyze sales in the PivotTable.

In this brief introduction, we've just scratched the surface.  This model is the foundation for future enhancement so save your project and be prepared to continue in the next article in the series.

continue to part 2

Resources

Download sample project for this article

Microsoft SQL Server Data Tools - Business Intelligence for Visual Studio 2012
http://www.microsoft.com/en-us/download/details.aspx?id=36843

Microsoft SQL Server Data Tools - Business Intelligence for Visual Studio 2013
http://www.microsoft.com/en-us/download/details.aspx?id=42313

AdventureWorks sample databases
http://msftdbprodsamples.codeplex.com/

Choosing a Tabular or Multidimensional Modeling Experience in SQL Server 2012 Analysis Services
http://msdn.microsoft.com/en-us/library/hh994774.aspx

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like