Break Through the Data-Access Maze
Choose the smartest path for your data-access challenges.
October 30, 2009
asp:cover story
LANGUAGES: VB .NET
ASP.NET VERSIONS: 1.0 | 1.1
Break Through the Data-Access Maze
Choose the smartest path for your data-accesschallenges.
By Jeffrey Hasan and Kenneth Tu
ASP.NET applications present unique challenges for dataaccess, particularly when they include or access XML Web Services. ADO.NET is adata-access technology that's, by now, familiar to most .NET developers. Itsusage is well documented in numerous books and articles, and there's little newto add as far as introducing the technology. What's less documented, though, ishow to choose the best data-access option among the often confusing array ofchoices ADO.NET provides. The best choice usually depends on two factors: thetype of data being retrieved (or updated), and the required transport mechanismfor returning query (or update) results back to the calling application.
For example, consider one simplescenario: code behind a Web page that queries a table and binds the results toa DataGrid on the page. In this case, the query result may be retrieved in anyway conducive to binding to a DataGrid, including as a DataSet, DataReader, orXML. The query result doesn't need to be transported or serialized, so thedata-access options are flexible.
Now consider a more complex scenario: A Web method queriesa table and returns the results over the wire as XML. In this case, the resultmust be serialized to XML, so the data-access options are more limited; aDataSet is a good choice because it serializes easily to XML. But a DataReaderdoes not, so it's not an option.
At the simplest level, you can optimize data access byconsidering how data needs to be retrieved, used, and transported. Later inthis article, we'll provide a decision-support flow diagram for data access,which can help you choose the best data-access approach for a given scenario.
ASP.NET applications - especially XML Web Services -present challenging data-access scenarios that require difficult designs. Inthis article, we'll discuss the best ways to access data from SQL Serverdatabases and present a decision-support system for choosing optimaldata-access approaches.
Connect to SQL Server 2000
The .NET Framework provides several data providersoptimized for specific data sources. The SQL Server .NET Data Provider is bestfor SQL Server 7.0 and later because it communicates using SQL Server's nativeTabular Data Stream protocol. This data provider is 30 to 40 percent fasterthan a comparable OLE DB data provider because it avoids using an intermediateOLE DB layer. The SQL Server data provider is included in theSystem.Data.SqlClient namespace. (For earlier versions of SQL Server you mustuse .NET's standard OLE DB data provider, which is included in theSystem.Data.OleDbClient namespace.)
Before you can access data, you must open an optimizedconnection to the data source. For SQL Server data sources, you can do this intwo ways. You could set connection-string parameters that specify the datasource address and the transaction context, or you could implement connectionpooling to reuse existing connections. Here's an example of an optimizedconnection string for the Northwind database in a SQL Server instance namedMySQLDB:
server=192.168.1.1MySQLDB;uid=sa;pwd=jy87s5;
database=northwind;enlist=false;
packet size=8192;connect timeout=300;
Figure 1 provides optimizationdetails for several important connection-string parameters.
Name | Default | Description |
---|---|---|
Server | - | The name or network address of the SQL Server instance to which you connect. Use an IP address instead of a DNS name; this avoids name resolution, which increases the time necessary to make the connection. |
Enlist | 'true' | When 'true', the connection pooler enlists the connection automatically in the creation thread's current transaction context. If the application is not using transactions, set to 'false' for better performance. |
Packet Size | 8192 | Size, in bytes, of the network packets used to communicate with an instance of SQL Server. When used with SQL Server, you can fine-tune this parameter for communication performance by adjusting this value based on the size of the data packets being transferred between client and server. Range is 512-32767 bytes. |
Connect Timeout | - | This parameter limits the wait time when a connection attempt fails. Also improves overall performance by avoiding deadlocked connections that are unavailable to participate in a connection pool. |
Figure 1. Here's a summary of optimized parametersettings for a SQL Server database connection string.
Some debate exists about the bestlocation for storing database connection strings in ASP.NET applications. Ourpreference is to store them as custom application settings in the web.configfile:
value="server=192.168.1.1MySQLDB;uid=sa;
pwd=jy87s5;database=northwind;
enlist=false;packet size=8192;
connecttimeout=300;" />
Then you can retrieve the connection string from the codebehind:
Imports System.Data.SqlClient
Dim strConn As String = _
ConfigurationSettings.AppSettings("ConnectionString")
Dim sqlConn As SqlConnection = New SqlConnection(strConn)
sqlConn.Open()
This approach provides the best flexibility because youcan customize the connection string for separate deployments of the sameapplication without rebuilds. For example, the staging copy of an applicationcan point to a development database, whereas the production copy of anapplication can point to the production database. ASP.NET caches the web.configfile, which minimizes lookup times on the connection string.
Pool Your Connections
ASP.NET applications use database connections inherentlyfor small durations of time. These applications typically open a connection,execute a query, return a response quickly, and release the connection back tothe pool for other instances to use as soon as possible. The default number ofconnections is 100, so if the connections are released back to the poolproperly after usage, the default setting should be more than enough for anASP.NET application. You can adjust this number to meet the needs of theapplication. Be forewarned - setting this number too low could cause an emptyconnection pool, such that further requests for a connection will hang yourapplication until one becomes available in the pool. To release a connectionproperly, invoke the Connection object's Close method, which releases theresources used by the connection and returns the connection back to the pool.Figure 2 provides optimization details for several important connection-stringparameters related to connection pooling.
Name | Default | Description |
---|---|---|
Connection Lifetime | 0 | The connection lifetime value for each connection returned to the pool. A value of zero (0) causes pooled connections to have the maximum timeout. |
Connection Reset | 'true' | When set to 'true', the database connection is reset when removed from the pool. |
Max Pool Size | 100 | The maximum number of connections allowed in the pool. |
Min Pool Size | 0 | The minimum number of connections maintained in the pool. |
Pooling | 'true' | When 'true', the connection is drawn from the appropriate pool or, if necessary, created and added to the appropriate pool. |
Figure 2. Here's a summary of optimizedconnection-string parameter settings for SQL Server database-connectionpooling.
Now that you can optimize your database connections, let'slook at the two main data-access objects the .NET Framework provides: theDataReader and DataSet objects. We'll discuss their relative advantages and howto pick the right object for a specific ASP.NET data-access scenario.
DataReader vs. DataSet
The DataReader object provides an unbuffered, sequentialdata stream for fast, efficient, read-only access to a set of records. TheDataSet object takes a virtual snapshot of a database and stores it as adisconnected, in-memory representation of a collection of data tables,relations, and constraints. Most importantly, the DataSet is integrated tightlywith XML, and it can serialize relational data to and from XML while preservingboth data and schema information.
The DataSet object handles more complex data-accessscenarios and provides more data-transport options compared to a DataReaderobject. A big advantage of the DataSet is it may be disconnected from theunderlying data source and operated on without a persistent databaseconnection. Also, it can perform complex operations on data - such as sorting,filtering, and tracking updates - without requiring custom code. Unlike theDataReader, the DataSet may be cached in memory and retrieved at any time. Italso can provide multiple, alternative views of the same data, where eachalternate view is stored as a subset of records in separate DataView objects.Finally, the DataSet offers the distinct advantage of serializing itsrelational data and structures to XML automatically. This lets a clientapplication receive serialized relational data from a Web service method asXML. The data then can be transported across the wire and hydrated on theclient into a new DataSet object.
The main disadvantage of the DataSet object is itsrelatively heavy use of server resources, including memory and processing time,compared to the DataReader. The DataSet can't compete with the efficiency ofthe DataReader for read-only operations, but it does provide multiplecapabilities that the DataReader does not.
The DataReader and DataSet objects provide distinctadvantages for different data-access scenarios. Figure 3 highlights the threemost common data-access scenarios in ASP.NET applications: read-only, update,and data delivery. The DataReader object is appropriate for the first scenario,which requires fast, read-only data access without caching or serialization.DataReader objects aren't designed for transport, nor can they operate without apersistent database connection. The DataSet object is appropriate for thesecond and third scenarios because it supports update data access and enablesrelational data to be represented and transported as XML.
Figure 3. Here are the three most common data-access scenarios inASP.NET applications: read-only data access, update data access, and datadelivery from a Web Service method. This diagram shows the interactions betweendatabase, Web server, and client.
XML Support in the DataSet Object
Three important scenarios might require you to serializerelational data from a DataSet to XML:
Exchanging relational data with XML Web Services, whichincludes both receiving XML from a Web method as well as passing XML back to aWeb method;
validating data against a specific XSD schema;
and generating typed DataSets.
The DataSet object providessupport for all of these scenarios. We'll briefly consider each in turn.
Serialize a DataSet as XML: The .NET Frameworkmakes it trivial to exchange a DataSet between a Web service and a clientapplication because it serializes the DataSet to XML automatically. The Webmethod's return argument simply needs to be a DataSet type:
Public Function GetProductList() As DataSet
' Code to generate a hydrated DataSet
End Function
This approach works if you don't need to work directlywith the DataSet's XML. But if you do, the DataSet provides good support forserializing its data out to XML. The DataSet provides a method named GetXml,which generates an XML representation of the relational data automatically in ahydrated DataSet. In addition, the DataSet object provides a method namedGetXmlSchema, which generates schema-only information without the actual data.This listing shows an example of how to write a DataSet's schema information toa file:
Imports System.IO
Sub GenerateSchemaFile(sqlDS As DataSet)
Try
Dim xsdFile AsFile
IfxsdFile.Exists("C:tempproducts.xsd") Then _
xsdFile.Delete("C:tempproducts.xsd")
Dim sw As StreamWriter
sw = NewStreamWriter("C:tempproducts.xsd")
sqlDS.Namespace ="urn:products-schema"
sw.Write(sqlDS.GetXmlSchema)
sw.Close
Finally
End Try
End Sub
Validate a DataSet against a specific XSD schema:An XSD schema file describes the structure of a relational data set fully,including the data types it contains. XSD schema files are excellent validationtools to ensure a DataSet contains valid relational data. This is especiallyuseful when you are adding additional records to a DataSet and need to verifythat both the data types and field names are valid. The DataSet provides amethod named ReadXmlSchema for importing an XSD schema file. If an invalid datatype or field name is assigned, the DataSet will raise a specific SqlException.Figure 4 shows an example based on the Products table in the Northwinddatabase.
Imports System.Data
Imports System.Data.SqlClient
Imports System.Xml
Imports System.IO
Sub ValidateSchema()
' Purpose: Validate aDataSet using an XSD schema file
Dim sqlDS As DataSet
Try
' Step 1: Createa blank DataSet
sqlDS = NewDataSet()
' Step 2: Import an XSD schema file into the DataSet
Dim myStreamReader As StreamReader = New _
StreamReader("c:tempproducts.xsd")
sqlDS.ReadXmlSchema(myStreamReader)
' Step 3: Manually add a product correctly
Dim dr As DataRow= sqlDS.Tables(0).NewRow()
dr("ProductID") = 200
dr("ProductName") = "Red Hot Salsa"
dr("Discontinued") = False
sqlDS.Tables(0).Rows.Add(dr)
' Step 4: Manually add a product incorrectly
' by setingProductID to a string (expects Int32)
Dim dr As DataRow= sqlDS.Tables(0).NewRow()
dr("ProductID") = "XJ8" ' Invalid assignment
dr("ProductName") = "Red Hot Salsa"
dr("Discontinued") = False
sqlDS.Tables(0).Rows.Add(dr)
Catch sqlError As SqlException
Response.Write(sqlError.Message)
End Try
End Sub
Figure 4. This listing shows how you can validate aDataSet using an XSD schema file.
Step 4 in Figure 4 generates a SqlException because thedata type of the Product ID is set incorrectly, based on the schema definition.The exception message will be:
Couldn't store in ProductID Column.
Expected type is Int32.
Generate typed DataSets: A typed DataSet is a classthat inherits from the DataSet object and incorporates a specific XSD schema.It provides strongly typed accessor methods that correspond to the fields inthe underlying recordset. Typed DataSets reduce programming errors because theyprovide specific field references and enforce the data types for each field.For example, typed DataSets prevent you from assigning a string value to a datefield.
The sample ASP.NET project that accompanies this articleprovides code listings for each of the data-access scenarios outlined in thisarticle, including an example of a typed DataSet.
Use Typed Accessors
We've focused most of our attention on the DataSet object,but the DataReader object deserves one more mention so we can highlight aspecific optimization for reading data from SQL server. If you know theunderlying data type for a column, you can achieve a performance gain by usingthe DataReader's typed accessor methods. These methods - GetSqlDateTime,GetSqlInt32, GetSqlString, to name a few - access the column values in theirnative data types to offer the best performance because they reduce the amountof type-data conversion required to retrieve a column value. Also, they preventcommon type-conversion errors and help keep the data precise. Here is a codeexample of how to use the GetSqlInt32 method:
While (sqlDR.Read())
strResults =sqlDR.GetSqlInt32(0).ToString()
End While
For a complete list of typed accessor methods, refer tothe SqlDataReader Members documentation in the .NET Framework Class LibraryDocumentation or in MSDN Library | .NET Framework SDK | .NET Framework |Reference | Class Library | System.Data.SqlClient | SqlDataReader Class |SqlDataReader Members.
In closing, see Figure 5 for a decision-flow diagram thatcaptures a broad range of data-access scenarios. It breaks down thedecision-making process for choosing the appropriate data-access approach. Designdecisions largely are dictated by the format in which data must be exchanged(that is, XML vs. non-XML), as well as whether the data access must beread-only or requires updates. This diagram can't capture all factors, but itdoes give some guidance when you make a final data-access design decision.
Figure 5. This decision-flow diagram for data access suggests primaryand alternate data-access methods based on a particular scenario.
We've reviewed the most relevant ASP.NET data-accessscenarios and demonstrated how to choose the best approach. The best way tooptimize data access is to recognize the nature of the scenario at hand andhave a full understanding of the relative advantages and disadvantages of agiven approach. For more information, we recommend you reference the chapter onData Access in our recent book, Performance Tuning and Optimizing ASP.NETApplications (Apress).
The sample code in thisarticle is available for download.
Jeffrey Hasan and Kenneth Tu are technicalarchitects and software developers who specialize in Microsoft technologies atInfoQuest Systems (http://www.infoquest.tv),a leading provider of business intelligence applications and services for thetelecommunications and broadband industries. Their primary expertise is in .NETenterprise application development, with a special focus on developingenterprise Web applications using ASP.NET. They recently co-authored Performance Tuning and Optimizing ASP.NETApplications (Apress). Read more about this and other publications at http://www.asptechnology.net. E-mail Jeffrey Hasan at mailto:[email protected].
Free Bonus Content
You can learn more about optimizing your data-access codein Chapter 3 of Jeffrey Hasan and Kenneth Tu's Performance Tuning andOptimizing ASP.NET Applications. Thischapter is available free online (Adobe Acrobat format).
About the Author
You May Also Like