Synchronizing Disparate Data with Master Data Services
How to implement a basic MDS system
April 25, 2011
Microsoft first introduced Master Data Services (MDS) in SQL Server 2008 R2 to solve a problem that large companies often have: Multiple systems have the same kinds of records but they contain slightly different values, making it difficult to keep the data synchronized. For example, the states in which customers live might be stored in two systems, with one system using the two-letter abbreviation (e.g., CA) and the other system spelling out the state (e.g., California). MDS helps solve this problem by allowing data mapping between systems and providing a mechanism to keep the data between these different systems in sync, even when the systems have different schemas.
To create a basic MDS implementation, you need to perform the following steps:
Install MDS on the server that will be the MDS server.
Create the database that the MDS service will use to store its information.
Create the web interface that will be used to create the model and later manage the data.
Create a data model for each system whose data you want to synchronize.
Load each system’s data into the MDS database.
Step 1: Installing MDS
Installing MDS in SQL Server 2008 R2 is easy. You simply run the installer found in the MasterDataServicesx641033_ENU folder. Before you run the installer, though, you need to make sure that Windows PowerShell 2.0 and Microsoft IIS are installed. (In addition to the core IIS configuration pieces, the ASP.NET components and the Windows Authentication component of IIS are needed.) After the installer finishes, the Master Data Services Configuration Manager launches automatically.
Installing MDS is even easier in the next release of SQL Server, which is code-named Denali. Launching the standard SQL Server installer is all it takes. When the list of services is shown, you just scroll down to the bottom of the list and select the Master Data Services check box. Like the MDS version in SQL Server 2008 R2, the MDS version in SQL Server Denali requires that PowerShell and IIS be preinstalled.
After the installer finishes, you can bring up the Master Data Services Configuration Manager by selecting Program Files on the Start menu, choosing Microsoft SQL Server Denali CTP 1, selecting Master Data Services, then clicking Configuration Manager. (Although these instructions are for the Community Technology Preview 1—CTP 1—version of SQL Server Denali, they’ll likely be similar in subsequent versions.)
From this point on, the MDS implementation steps are the same for SQL Server 2008 R2 and SQL Server Denali. So, I provide just one set of instructions in the sections that follow.
Step 2: Creating the MDS Database
To create the MDS database, select the Databases tab in the Master Data Services Configuration Manager. Click the Create Database button to launch a wizard that will walk you through configuring the database. This wizard will prompt you for a variety of information, including the name of the MDS database, the name of the SQL Server instance on which that database resides, and the authentication information needed to access that SQL Server instance. The wizard will also prompt you for the name of the Windows account that will run the MDS service and the Windows account that will be the default administrator for the MDS system.
After you provide the necessary information, the wizard will create
the MDS database on the specified instance
a database login and user for the service account
the database records that grant the permissions
Figure 1 shows the configured database for this example.
Figure 1: The configured MDS database
In true Microsoft fashion, the MDS database’s size will be around 9MB with growth in 1MB increments and the transaction log’s size will be around 2MB with a growth rate of 10 percent. In a production system, you’ll likely need to increase the size of the database and transaction log so that they’re sized appropriately for the amount of data that will be loaded into the MDS database. There’s no way to provide specific guidance here on what those files should be resized to, because every installation will be different depending on the amount of data that’s being loaded into the MDS database.
Step 3: Creating the Web Interface
After creating the database, you need to create the web interface. This interface is called the Master Data Manager Web application, which I’ll simply refer to as the web application. Begin by clicking the Web Configuration button in the Master Data Services Configuration Manager. In the Web Configuration page, you need to specify the IIS website in which you want the MDS installer to create the web application and the application pool that will contain the web application. It can be a new website or an existing website.
For this example, let’s create the web application on a new website. To do so, select Create Site and specify the settings that MDS will need to create not only the website but also the web application and application pool. For this example, you can leave the settings at their default values. However, you might need to change the TCP port if you receive an error message saying that the port is already in use.
If you want the web application on an existing website, you need to select that website, click Create Application, and specify the settings that MDS will need to create the web application and application pool. You can find more information about those settings in the SQL Server MDS team's blog article “Creating Web Sites and Applications in Master Data Services Configuration Manager”.
Next, you need to select the MDS database that you want the web application to connect to. You also need to provide the connection information for the SQL Server instance on which that database resides, which was created in the previous steps. For instructions on how to do so, see the Microsoft article “How to: Associate a Master Data Services Database and Web Application”.
If your application will be accessing MDS programmatically, you need to enable the Web services for the MDS system. This will provide HTTP APIs that can be used to query the data from the MDS application. Although the use of these APIs is beyond the scope of the sample application you’re creating here, knowledge that these APIs exist is important so that you know when to enable the Web services in the Web Configuration page in the Master Data Services Configuration Manager.
After you’ve created the MDS database and web interface, you’re done using the Master Data Services Configuration Manager. To create the data model, you need to use the web application you just created.
Step 4: Creating the Data Model
To connect to the web application’s Management Page, navigate to http://ServerName/ApplicationName, where ServerName is the name of the server on which your web application resides and ApplicationName is the name of the application that you specified during configuration. For this example, the URL is http://sql2008r2mds/MDS.
Click the System Administration button to go to the Model Explorer page, where you can create the models for each system whose data you want to synchronize (hereafter referred to as simply the source system). A model is simply a collection of objects, or entities. An entity can be thought of as a SQL Server table. An entity contains other objects, such as attributes and hierarchies. An attribute can be thought of as a column in a SQL Server table. A hierarchy provides a way to navigate through related data, such as navigating from region to state to city.
The entities in a model map to the tables in the source database. The attributes in the entities map to the columns in those tables.
To add a new model, move the mouse cursor over the Manage menu item at the top of the web application and select Models. On the page that loads, click the green plus sign (which is the add button). Specify the name of the model. For this example, name the model Adventure Works.
When you create a new model, you have the option of creating an entity and the option of creating a hierarchy that have the same name as the model. For this example, though, don’t select those options. Instead, move the mouse cursor over the Manage menu item, select Entities, and click the green plus sign to go to the screen in which you can add new entities.
You need to create an entity for each table (assuming that you want to include that table’s data in the MDS system) in the source database. For this example, you just need to create one entity. Name the entity Employee and select No from the Enable explicit hierarchies and collections drop-down list. (Although you can enable explicit hierarchies and collections for an entity, they aren’t needed for this example.) Click the disk icon (which is the save button) to save the entity and go to the list of entities. Select Employee from the list of entities, then click the pencil icon (which is the edit button) to edit the entity.
To add the attributes, click the green plus sign in the Leaf attributes section. In the Add Attribute page that appears, you can enter the name, size, data type, and display size of the new attribute. You need to create an attribute for each column (assuming that you want to include that column’s data in the MDS system) in the source table. For this example, create three free-form attributes named LastName, FirstName, and EmployeeId. The LastName and FirstName attributes should have the data type of Text. The EmployeeId attribute should have the data type of Number. Each attribute’s size (i.e., length) and display size (i.e., display pixel width) should be set to the size of its respective column in the source table. Figure 2 shows the completed Add Attribute page for the LastName attribute.
Figure 2: The configured LastName attribute
Step 5: Loading Data into the MDS Database
Loading data from source systems into the MDS database is pretty straightforward. For this example, you’ll be loading data from the source system into the Employee entity by means of staging tables. When the MDS database is created, several staging tables are automatically created:
Members staging table (mdm.tblStgMember)
Attributes staging table (mdm.tblStgMemberAttribute)
Parent-child relationship staging table (mdm.tblStgRelationship)
These staging tables are used to load data into production tables.
First, you need to add records to the mdm.tblStgMember table. This tells the import process the entities into which it will load data. For this example, you’ll be loading data into three attributes (LastName, FirstName, and EmployeeId), so you need to add three records to the mdm.tblStgMember table, as Listing 1 shows. As you can see in Listing 1, the attribute names are listed within the EntityName column. This is because you can designate which specific entity is to receive data during the specific data feed. After putting these records into the mdm.tblStgMember table, you’ll have a StatusId value of 0, which indicates that the data load hasn’t been processed.
Next, you need to add records to the mdm.tblStgMemberAttribute table, which is basically a name/value pair table. In this table, you need to add six records (three for each employee record that will be stored in the MDS system), as Listing 2 shows. After inserting these values, the StatusId value will still be 0.
If a model and entity are defined with hierarchies (which are optional), the mdm.tblStgRelationship table needs to have data loaded into it. For this example, hierarchies weren’t defined, so you don’t need to load that table.
You process the data by running the mdm.udpStagingSweep stored procedure in the MDS database. Listing 3 shows the call to mdm.udpStagingSweep for this example. Note the @Process parameter in the last line. When this parameter is set to 1, the data is processed in real time from the staging table to the production table. If the data needs to be processed in the background at the next regularly scheduled interval, simply change the @Process parameter to 0. Any errors that occur during background processing will be logged to the ERRORLOG file. The SQL Server Service Broker performs the background processing, so any error messages would be output in the style of a Service Broker output message. If the @Process input parameter’s value is set to 1, the StatusId value should change from 0 to 1, indicating a success. A value of 2 indicates a failure.
Complex But Robust
As this brief look into MDS demonstrates, setting up an MDS system is very complex. However, MDS can be a robust platform for synchronizing data between disparate systems if you’re willing to work through the clunky interface that you need to use to manage the system.
Listing 1: Code That Adds Three Records to the Member Staging Table
INSERT INTO mdm.tblStgMember
(ModelName, HierarchyName, EntityName,
MemberType_ID, MemberName, MemberCode)
VALUES
('Adventure Works', NULL, 'FirstName', 1, '', 1),
('Adventure Works', NULL, 'LastName', 1, '', 2),
('Adventure Works', NULL, 'EmployeeId', 1, '', 3)
Listing 2: Code That Adds Six Records to the Member Staging Table
INSERT INTO mdm.tblStgMemberAttribute
(ModelName, EntityName, MemberType_ID,
MemberCode, AttributeName, AttributeValue)
VALUES
('Adventure Works', 'Employee', 1, 1, 'FirstName', 'Dan'),
('Adventure Works', 'Employee', 1, 2, 'LastName', 'Jones'),
('Adventure Works', 'Employee', 1, 3, 'EmployeeId', '1')
INSERT INTO mdm.tblStgMemberAttribute
(ModelName, EntityName, MemberType_ID,
MemberCode, AttributeName, AttributeValue)
VALUES
('Adventure Works', 'Employee', 1, 1, 'FirstName', 'Donald'),
('Adventure Works', 'Employee', 1, 2, 'LastName', 'Farmer'),
('Adventure Works', 'Employee', 1, 3, 'EmployeeId', '2')
Listing 3: Code That Executes the mdm.udpStagingSweep Stored Procedure
DECLARE @ModelName NVARCHAR(50) = 'Adventure Works'
DECLARE @UserName NVARCHAR(50) = 'SQL2008R2MDSAdministrator'
DECLARE @UserId INT
DECLARE @VersionId INT
SET @UserId = (SELECT ID
FROM mdm.tblUser u
WHERE u.UserName = @UserName )
SET @VersionId = (SELECT MAX(ID)
FROM mdm.viw_SYSTEM_SCHEMA_VERSION
WHERE Model_Name = @ModelName)
EXEC mdm.udpStagingSweep @UserId=@UserId,
@VersionId=@VersionId, @Process=1
About the Author
You May Also Like