Understanding Parallel LINQ (PLINQ)
October 30, 2009
Parallel LINQ (PLINQ) is a concurrency execution engine for running Language-Integrated Query (LINQ) queries. PLINQ is part of the Parallel Extensions library (previously known as Parallel Framework Extensions, or PFX), a managed concurrency library that comprises two parts: the Task Parallel Library (TPL) and PLINQ. The former is a task parallelism component, and the latter is a concurrency execution engine built on top of the CLR. This article takes a look at PLINQ and its features. For more information about LINQ, see "LINQed & Layered" and "Understanding the LinqDataSource Control."
PLINQ Prerequisites
To work with PLINQ, you should have one of the following installed on your system:
Visual Studio 2008 with the Parallel Extensions Library
Visual Studio 2010 Beta 1 or later
Also, you should have a good understanding of LINQ and how to use LINQ queries.
What Is PLINQ?
Simply put, PLINQ is a parallel execution engine for running your LINQ queries on multicore systems. The MSDN article "Parallel LINQ: Running Queries On Multi-Core Processors" states: "PLINQ is a query execution engine that accepts any LINQ-to-Objects or LINQ-to-XML query and automatically utilizes multiple processors or cores for execution when they are available."
PLINQ is a programming model for building applications that take advantage of parallel hardware for improved performance and scalability, without requiring you to understand the details of how data parallelism works. The key to PLINQ is parallel execution using multiple threads that run concurrently. Note that a thread is a path of execution within a process and is also the smallest unit of execution within a process. PLINQ is exposed as a set of extension methods, so you opt in to it from ordinary LINQ query syntax.
Parallelizing Your LINQ Queries
To parallelize a LINQ query, you need to do two things: reference the assembly that contains PLINQ at compilation time (System.Concurrency.dll in the early previews; in the released .NET Framework 4, PLINQ ships in System.Core.dll), and call the System.Linq.ParallelEnumerable.AsParallel extension method on your data.
Consider the following code:
var integerList = Enumerable.Range(1, 100);
var data = from x in integerList.AsParallel()
           where x <= 25
           select x;
foreach (var v in data)
{
    Console.WriteLine(v);
}
Notice the call to AsParallel(). It returns an object of type ParallelQuery<int>.
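In the preview releases, the AsParallel extension method is declared as shown in the following abbreviated example (the released .NET Framework 4 returns ParallelQuery<TSource> instead of IParallelEnumerable<T>):
public static class System.Linq.ParallelEnumerable
{
    public static IParallelEnumerable<T> AsParallel<T>(this IEnumerable<T> source);
    // Other standard query operators
}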
Note that the AsParallel method is overloaded: it can also accept an integer argument and a ParallelQueryOptions enumeration value. The integer argument sets the degree of parallelism, that is, the number of threads the query uses. The ParallelQueryOptions enumeration has two values, None and PreserveOrdering. PreserveOrdering keeps the output elements in the same order as the input elements.
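The overloads described above belong to the preview releases. In the released .NET Framework 4, the same two settings are expressed with the WithDegreeOfParallelism and AsOrdered operators instead. The following is a minimal sketch under that assumption; the class name, method name, and maxThreads parameter are made up for illustration.

```csharp
using System;
using System.Linq;

public static class ParallelOptionsSketch
{
    // Returns the numbers 1..25 in source order, using at most
    // maxThreads concurrent workers (maxThreads is a hypothetical name).
    public static int[] FilterInOrder(int maxThreads)
    {
        return Enumerable.Range(1, 100)
            .AsParallel()
            .AsOrdered()                         // plays the role of PreserveOrdering
            .WithDegreeOfParallelism(maxThreads) // plays the role of the integer argument
            .Where(x => x <= 25)
            .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FilterInOrder(2)));
    }
}
```

Preserving order adds bookkeeping, so it is usually worth requesting only when the consumer actually depends on the input order.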
Under the Covers
Note that any PLINQ query that can be parallelized is based on partitioning. PLINQ breaks the input data into pieces and then distributes those pieces to the processing cores on your system. Partitioning is of the following types:
range partitioning
chunk partitioning
striped partitioning
hash partitioning
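To make the differences concrete, here is a rough sketch of how three of these schemes could map an element to a partition number. It is an illustration only, not PLINQ's internal code, and all the names in it are hypothetical. Chunk partitioning is left out because it is dynamic: worker threads repeatedly pull small groups of elements from a shared source, so the assignment depends on timing rather than on a fixed formula.

```csharp
using System;
using System.Collections.Generic;

public static class PartitioningSketch
{
    // Range partitioning: each partition gets one contiguous block of indices.
    public static int RangePartition(int index, int count, int partitions)
    {
        int blockSize = (count + partitions - 1) / partitions; // ceiling division
        return index / blockSize;
    }

    // Striped partitioning: indices are dealt out round-robin.
    public static int StripedPartition(int index, int partitions)
    {
        return index % partitions;
    }

    // Hash partitioning: elements with equal keys land in the same partition.
    public static int HashPartition<TKey>(TKey key, int partitions)
    {
        // Mask off the sign bit so the remainder is never negative.
        int hash = EqualityComparer<TKey>.Default.GetHashCode(key) & 0x7FFFFFFF;
        return hash % partitions;
    }

    public static void Main()
    {
        const int count = 8, partitions = 2;
        for (int i = 0; i < count; i++)
        {
            Console.WriteLine("index {0}: range={1} striped={2} hash={3}",
                i,
                RangePartition(i, count, partitions),
                StripedPartition(i, partitions),
                HashPartition(i, partitions));
        }
    }
}
```

Range partitioning suits indexable sources such as arrays, while hash partitioning matters for operators that must bring equal keys together, such as joins and groupings.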
Processing the Data in Parallel
PLINQ lets you process the results of a query in parallel with the ForAll() operator. It is the PLINQ counterpart of the Parallel.For and Parallel.ForEach loops, which belong to the Task Parallel Library. Here is an example that illustrates how you can use ForAll() to process items:
IEnumerable<int> integerList = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var data = from i in integerList.AsParallel()
           where i <= 5
           select i;
// ForAll runs the delegate on the worker threads, so the output order is not guaranteed.
data.ForAll(i => Console.WriteLine(i));
Here is another example that shows the elapsed time taken by a query that uses the AsParallel() method.
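using System;
using System.Diagnostics;
using System.Linq;

// Fill an array with random integers.
int[] myList = new int[90000];
Random randomListInstance = new Random();
for (int i = 0; i < myList.Length; i++)
    myList[i] = randomListInstance.Next(90000);

Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
// LINQ queries are deferred, so ToArray() forces the query to run inside the timed region.
var results = (from n in myList.AsParallel() select n).ToArray();
stopWatch.Stop();
// ElapsedMilliseconds is the total time; Elapsed.Milliseconds is only the 0-999 component.
Console.WriteLine("Time elapsed is: " + stopWatch.ElapsedMilliseconds + " milliseconds");
Console.Read();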
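A timing on its own says little without a sequential baseline. The sketch below runs the same CPU-bound query with and without AsParallel() so the two timings can be compared. The IsPrime helper and the other names are hypothetical, and the actual speedup depends on the number of cores and on how expensive the per-element work is.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class TimingSketch
{
    // Deliberately CPU-heavy trial-division test (illustrative helper).
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    public static int CountPrimesSequential(int max)
    {
        return Enumerable.Range(1, max).Count(n => IsPrime(n));
    }

    public static int CountPrimesParallel(int max)
    {
        return Enumerable.Range(1, max).AsParallel().Count(n => IsPrime(n));
    }

    public static void Main()
    {
        const int max = 2000000;

        Stopwatch sw = Stopwatch.StartNew();
        int sequential = CountPrimesSequential(max);
        Console.WriteLine("Sequential: {0} primes in {1} ms", sequential, sw.ElapsedMilliseconds);

        sw.Restart();
        int parallel = CountPrimesParallel(max);
        Console.WriteLine("Parallel:   {0} primes in {1} ms", parallel, sw.ElapsedMilliseconds);
    }
}
```

For very cheap per-element work, the cost of partitioning and merging can outweigh the gain, so the parallel version is not guaranteed to be faster.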
You can also handle exceptions thrown by your PLINQ queries. To do so, catch the AggregateException class (System.Threading.AggregateException in the preview releases; System.AggregateException in the released .NET Framework 4). Because several worker threads can fail independently, the details of the actual exceptions are available through its InnerExceptions collection.
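As a minimal sketch of that pattern, assuming the released .NET Framework 4 API (System.AggregateException and its InnerExceptions property), the following query divides by zero for one element and then inspects what PLINQ reports. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class ExceptionSketch
{
    // Runs a query whose delegate throws for one element and returns
    // the number of inner exceptions that were reported.
    public static int RunFaultyQuery()
    {
        int[] values = { 1, 2, 0, 4 };
        try
        {
            // ToArray() forces execution; 100 / 0 throws DivideByZeroException.
            values.AsParallel().Select(v => 100 / v).ToArray();
            return 0;
        }
        catch (AggregateException ae)
        {
            foreach (Exception inner in ae.InnerExceptions)
            {
                Console.WriteLine("{0}: {1}", inner.GetType().Name, inner.Message);
            }
            return ae.InnerExceptions.Count;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Inner exceptions: {0}", RunFaultyQuery());
    }
}
```

Note that the try block has to surround the point where the query executes (here, the ToArray() call), not merely the point where the query is defined.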