Solving a Common .NET Scalability ProblemSolving a Common .NET Scalability Problem
Michael K. Campbell shows readers how to solve a very common .NET scalability problem that involves the use of .NET applications that use ORMs such as LINQ to SQL and Entity Fraemwork.
September 24, 2013
In the past year or so, I’ve had a handful of clients fall into a very common trap that causes some ugly scalability problems with .NET applications that make use of ORMs such as LINQ to SQL (L2S) and Entity Framework (EF). What makes this problem particularly nefarious is the fact that it only manifests when a site or application comes under heavy load and starts processing lots of simultaneous requests. At this point, exceptions start getting thrown with messages reporting about, "There is already an open DataReader associated with this Command which must be closed first…," along with similarly perplexing problems.
Anatomy of a Problem
In the wild, this problem almost always revolves around a simple encapsulation problem. This is when a specific class or object is tasked with handling database interactions and then developers must use member variables to represent the context for their connection to the database. For example, let’s assume that a very simple repository grabs user information from a L2S context. Note that I could have used an EF, NHibernate, or another ORM for this example. We’d have code similar to Listing 1 where the repository is handling basic database interactions and where the actual LINQ to SQL DataContext is being created in the constructor, which is then used or accessed as a private member variable.Listing 1: Code for simple repository that grabs user information
public class NonScalableUserRepostory : IUserRepository{ private UsersDataContext _userContext; public NonScalableUserRepostory(IServiceConfig config) { this._userContext = new UsersDataContext(config.ConnectionString); } public User GetUserByEmailAddress(string emailAddress) { // check cache/etc. if not in cache, load and put into cache, etc. // db interaction code: User output = this._userContext.LoadUserByEmailAddress(emailAddress) .Single() // watch for nulls - sample code only... .ToUser(); // extension method / translation. return output; } // other methods here/etc - also using _userContext as above}
Although this is a common solution that I see developers using (I even made this mistake once upon a time), the biggest problem with this solution is that it works without any problems when the application is being developed. However, once the application starts to come under load, then you’ll start running into cases where Elmah or other logging software that you’re using starts sending messages about reporting errors such as, “Open DataReaders associated with this Command which must be closed first.” These messages tend to pop up sporadically at first. However, as the system is put under increased load, you’ll see these type of messages with much greater frequency to the point where they can become very problematic.
Correcting the Problem
Unfortunately, one way that I’ve seen this problem corrected in some cases is a kind of stab in the dark approach that involves turning on Multiple Active Result Sets (MARS). This approach basically lets you start nesting multiple operations and activities into a single connection and start overloading TDS packets back and forth between your application and SQL Server. Ultimately, MARS is very cool, but it wasn’t designed to correct this problem. Also, using MARS to correct this problem isn’t just a question of a code smell (i.e., where it could be argued that using this MARS approach would be a bit like using recursion where it wasn’t needed), instead it could also end in disaster if you’ve got explicit transactions going on as MARS can and will cause a connection to lose track of which transaction is which when MARS is ongoing.
Consequently, the correct solution to this problem is to simply stop using context objects (or whatever your ORM is presenting as a black box) as member variables. This results in a much cleaner solution and makes perfect sense when you realize that a member variable effectively acts like a singleton when used in the manner shown above. This approach is also much easier to do than enabling a complex feature like MARS that can introduce serious side effects.
To stop reusing your context objects, you just need to make changes as shown in Listing 2:
public class ScalableUserRepository : IUserRepository{ private string _connectionString; public ScalableUserRepository(IServiceConfig config) { this._connectionString = config.ConnectionString; } public User GetUserByEmailAddress(string emailAddress) { User output = null; // check cache/etc as before // spin up a new 'context' for every thread making // its way through your code: using (UsersDataContext context = new UsersDataContext(this._connectionString)) { output = context.LoadUserByEmailAddress(emailAddress) .Single() // same issue/problem - watch for nulls. .ToUser(); // add to cache if needed } return output; } // other methods here - also using the using() approach... }
Granted, this change might require several changes to your code such as needing to swap out calls to a member object with new using() operations, but that’s really only because your code was effectively busted previously. This fix does stop those dreaded errors and results in much better and more stable operations than using MARS as a hack around this problem.
To be fair to the StackOverflow response listed in the link above, it’s hard to exactly tell what the person requesting help was doing. It’s also not certain that they were running into this problem because they potentially might have implemented the worst practice of reusing their context object. I’m positive, however, that gobs of the people who have upvoted that question and response were running into the problem I’ve outlined and have, sadly, corrected it the wrong way using MARS instead of simply fixing their code to make it more scalable.
Read more about:
MicrosoftAbout the Author
You May Also Like