Jon Rumsey

An online markdown blog and knowledge repository.



DotNET Task Parallel Library Notes

Notes taken while reviewing MSFT Learn documentation of Task Parallel Library (TPL).


Overview

The Task Parallel Library and the PLINQ execution engine enable developing scalable code without having to work directly with threads or the thread pool.

TPL: Task Parallel Library. Provides parallel versions of the for and foreach constructs, and the Task class.

PLINQ: Parallel LINQ. A parallel implementation of 'LINQ to Objects' with performance enhancements.

Data Structures for Parallel Programming: Provides thread-safe collections, lightweight synchronization types, and types for Lazy initialization.

Parallel Diagnostic Tools: Enables visualization into parallel tasks in Visual Studio and the Concurrency Visualizer.

Data Parallelism: Same operations performed concurrently on elements of a source collection/array.

Interlocked: A class in the System.Threading namespace providing atomic operations for variables shared by multiple threads (see the Interlocked class documentation).

Local State: A per-thread value, seeded by a Func&lt;TLocal&gt; delegate just before a Parallel operation and available until just after execution, for capturing thread-local state, e.g. an accumulating value built up by that thread's iterations.

Loop Logic: A delegate or lambda, e.g. Func&lt;int, ParallelLoopState, long, long&gt;, invoked for each iteration of the loop. Its parameters supply the loop counter, the loop state, and the thread-local value, and its return type is the updated thread-local value. Each thread's final value is passed to the localFinally (type: Action&lt;TLocal&gt;) delegate after that thread's last iteration completes.
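The thread-local overload described above can be sketched with Parallel.For; the array contents and variable names here are illustrative:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int[] nums = new int[1000];
for (int i = 0; i < nums.Length; i++) nums[i] = i;

long total = 0;

// localInit seeds each thread's local value; the body (the
// Func<int, ParallelLoopState, long, long>) returns the updated local;
// localFinally folds each thread's final value into the shared total.
Parallel.For<long>(0, nums.Length,
    () => 0L,                                           // localInit
    (i, loopState, subtotal) => subtotal + nums[i],     // loop body
    subtotal => Interlocked.Add(ref total, subtotal));  // localFinally

Console.WriteLine(total); // 0 + 1 + ... + 999 = 499500
```

Note that Interlocked.Add runs once per thread in localFinally, not once per iteration, which keeps contention low.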

Thread Local: TBD.

Partition Local: Similar to Thread Local but multiple partitions can run within a single Thread.

Reminder: Imperative and Declarative Programming

Imperative Programming: Directs the control flow of the program.

Declarative Programming: Specifies logic and result without directing program flow.

TPL - Task Parallel Library

Public Types and APIs in System.Threading and System.Threading.Tasks namespaces.

Help developers simplify adding parallelism and concurrency to applications.

TPL:

TPL - For and ForEach

Data from the source is partitioned so the iteration processes can happen concurrently.

Task Scheduler manages concurrency and system resources based on workload, and can redistribute work among multiple threads/processors.

// the collection is partitioned and the delegate
// is passed to individual threads for processing
Parallel.ForEach(collection, (item) => { /* process item */ });
Parallel.For(0, itemCount, (index) => { /* process index */ });

// non-generic option:
Parallel.ForEach(nonGenericCollection.Cast<object>(),
  currentElement => {
    // process currentElement
  });

A custom Partitioner or Scheduler can be implemented and supplied to the TPL methods.

When accumulating a result from these looping structures, use Interlocked methods such as Interlocked.Add(ref total, subtotal) to ensure thread-safe, atomic accumulator operations.
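A minimal sketch of an Interlocked accumulator inside Parallel.ForEach (the collection here is illustrative):

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int[] values = Enumerable.Range(1, 100).ToArray();
long total = 0;

// Interlocked.Add performs the read-modify-write atomically,
// so concurrent iterations cannot lose updates.
Parallel.ForEach(values, v => Interlocked.Add(ref total, v));

Console.WriteLine(total); // 1 + 2 + ... + 100 = 5050
```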

For and ForEach Cancellation

Cancellation is supported using Cancellation Tokens.

Supply a CancellationToken to the method via the ParallelOptions parameter and enclose the call within a try-catch block.

Catching an OperationCanceledException allows implementation of cleanup code and any notifications before returning from the parallel operation.

Note: If some other Exception is thrown before Cancel() is called, an AggregateException will be thrown from the loop, containing any pre-cancellation Exceptions and the OperationCanceledException in its inner exception collection.
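A sketch of the cancellation pattern described above; the timings and simulated work are illustrative:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

using var cts = new CancellationTokenSource();
var options = new ParallelOptions { CancellationToken = cts.Token };
bool canceled = false;

// Cancel from another thread shortly after the loop starts.
Task.Run(() => { Thread.Sleep(50); cts.Cancel(); });

try
{
    Parallel.For(0, int.MaxValue, options, i =>
    {
        Thread.SpinWait(10_000); // simulated work
    });
}
catch (OperationCanceledException)
{
    canceled = true; // cleanup / notification code goes here
    Console.WriteLine("Loop was canceled.");
}
```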

For and ForEach Exceptions

Parallel.For and Parallel.ForEach add no special Exception handling of their own; Exceptions thrown from iterations are collected and rethrown by the loop as an AggregateException.
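Exceptions thrown from iterations surface as an AggregateException at the loop call site; a minimal sketch:

```csharp
using System;
using System.Threading.Tasks;

int caught = 0;

try
{
    Parallel.For(0, 10, i =>
    {
        if (i == 5) throw new InvalidOperationException($"bad index {i}");
    });
}
catch (AggregateException ae)
{
    // Each iteration's exception appears in InnerExceptions.
    foreach (var ex in ae.Flatten().InnerExceptions)
    {
        caught++;
        Console.WriteLine(ex.Message); // bad index 5
    }
}
```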

TPL - Create And Run Tasks Implicitly

Use Parallel.Invoke:

Parallel.Invoke(() => GetTheseThings(), () => SetThoseThings());

TPL - Create and Run Tasks Explicitly

Note: Reading the Task.Result property blocks the caller until the Task completes, if the Task is not already in a completed state!
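A minimal sketch of explicit Task creation and the blocking behavior of Task.Result:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Task.Run queues the lambda on the thread pool and starts it.
Task<int> computeTask = Task.Run(() =>
{
    Thread.Sleep(100); // simulated work
    return 21 * 2;
});

// Reading Result blocks this thread until the task completes.
int answer = computeTask.Result;
Console.WriteLine(answer); // 42
```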

TPL - Accessing Values From Lambda Tasks
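One common pattern (a sketch; the captured variable is illustrative) is returning a value from a lambda via Task&lt;TResult&gt; and reading locals captured by the lambda's closure:

```csharp
using System;
using System.Threading.Tasks;

string name = "TPL"; // captured by the lambda's closure

Task<string> greetTask = Task.Run(() => $"Hello from {name}");

// Result blocks until the lambda's return value is available.
Console.WriteLine(greetTask.Result); // Hello from TPL
```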

TPL: Potential Pitfalls

TPL - Concurrent Collections

Namespace System.Collections.Concurrent.

Provide thread-safe Add() and Remove() methods.

Avoids locks.

Leverages fine-grained locking when necessary.

User code does not need to explicitly take locks to get these to work.

Multiple threads can add and remove items from these collections.
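A minimal sketch with ConcurrentQueue&lt;T&gt;; no explicit locks are needed in user code:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var queue = new ConcurrentQueue<int>();

// Many threads enqueue concurrently without explicit locking.
Parallel.For(0, 1000, i => queue.Enqueue(i));

int count = 0;
while (queue.TryDequeue(out _)) count++;
Console.WriteLine(count); // 1000
```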

Existing Concurrent Collection Classes

System.Collections.Concurrent.BlockingCollection&lt;T&gt;:

System.Collections.Concurrent.ConcurrentBag<T>:

System.Collections.Concurrent.ConcurrentDictionary<TKey,TValue>:

System.Collections.Concurrent.ConcurrentQueue&lt;T&gt;:

System.Collections.Concurrent.ConcurrentStack<T>:

System.Collections.Concurrent.IProducerConsumerCollection&lt;T&gt;:

Thread Safe Collections.

TAP - Task Creation Options

Long running task expected? Use new Task(() => MyLongRunningMethod(), TaskCreationOptions.LongRunning | TaskCreationOptions.PreferFairness);.

Each thread has an associated Thread.CurrentCulture and Thread.CurrentUICulture. These affect formatting, parsing, sorting, and string comparison operations, and are used in resource lookups. The default setting is inherited from the default system culture; a new thread does not inherit the culture settings of the thread that launched it.

TAP - Creating Detached Child Tasks

Any Task created without specifying AttachedToParent will not be synchronized with the parent Task in any way.

AKA 'Detached Nested Task' or 'Detached Child Task'.

The parent Task does not wait for the detached child Task to finish (assume Started but not Completed nor Faulted).

The option TaskCreationOptions.DenyChildAttach prevents other Tasks from attaching to the configured (parent) task.
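A sketch contrasting attached and detached children (console output order is nondeterministic, so no output is shown):

```csharp
using System;
using System.Threading.Tasks;

var parent = Task.Factory.StartNew(() =>
{
    // Attached child: the parent will not complete until it finishes.
    Task.Factory.StartNew(() => Console.WriteLine("attached child"),
        TaskCreationOptions.AttachedToParent);

    // Detached child: the parent does not wait for it.
    Task.Run(() => Console.WriteLine("detached child"));
});

parent.Wait(); // returns only after the attached child has finished
```

Note that Task.Run uses TaskCreationOptions.DenyChildAttach internally, which is one reason a Task.Run task behaves as a detached child.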

TAP Attaching State To A Task

Do not inherit from System.Threading.Tasks.Task to do this.

Do use AsyncState property to associate the data with the Task.
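A sketch of attaching state via the AsyncState property; the JobInfo type is a hypothetical example:

```csharp
using System;
using System.Threading.Tasks;

var info = new JobInfo(7, "indexing");

// The state object passed here is exposed later via Task.AsyncState.
var task = Task.Factory.StartNew(
    state => Console.WriteLine($"running {((JobInfo)state!).Name}"),
    info);

task.Wait();
Console.WriteLine(((JobInfo)task.AsyncState!).Id); // 7

// hypothetical state type, for illustration only
record JobInfo(int Id, string Name);
```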

PLINQ - Overview

A parallel implementation of LINQ (Language-Integrated Query).

PLINQ - ParallelEnumerable Class

Class System.Linq.ParallelEnumerable.

Opt-in to TPL by invoking ParallelEnumerable.AsParallel extension method on the data source.

// example from Parallel LINQ (PLINQ) documentation on Learn.Microsoft.com
// in subsection "The Opt-in Model"
var source = Enumerable.Range(1, 10000);

// opt-in using AsParallel
var evenNums = from num in source.AsParallel()
               where num % 2 == 0
               select num;

Console.WriteLine("{0} even numbers out of {1} total",
                  evenNums.Count(), source.Count());

// expected output: 5000 even numbers out of 10000 total

PLINQ - Behaviors Overview

PLINQ is conservative and picks safety over speed, falling back to sequential LINQ when parallel execution appears to be expensive compared to sequential execution.

PLINQ will try to use all processors on the host PC. This can be limited to 'no more than n' using WithDegreeOfParallelism(n).

AsOrdered() may still operate in parallel but will preserve the ordering of the source sequence. This is slower than AsUnordered().

PLINQ will run sequential queries when specific operations require it. Use AsSequential() to ensure sequential operation on the PLINQ operation.
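A sketch of mixing modes: the filter runs in parallel, then AsSequential switches back to sequential execution for the order-sensitive Take (the source range is illustrative):

```csharp
using System;
using System.Linq;

var source = Enumerable.Range(1, 100);

// The Where runs in parallel; AsOrdered preserves source order, and
// AsSequential forces the remaining operators to run sequentially.
var firstEvens = source.AsParallel()
                       .AsOrdered()
                       .Where(n => n % 2 == 0)
                       .AsSequential()
                       .Take(3)
                       .ToList();

Console.WriteLine(string.Join(",", firstEvens)); // 2,4,6
```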

PLINQ can be configured to mix sequential and parallel processing.

ParallelMergeOptions enumeration is used to tell PLINQ how to merge results from each Thread back into the main thread result variable.

Execution is deferred until the query is enumerated, unless a method like ToList(), ToArray(), or ToDictionary() is used.

Cancellation is supported in PLINQ by using WithCancellation(CancellationToken token). Canceling the token (via its CancellationTokenSource) causes PLINQ to stop execution on all threads and throw an OperationCanceledException. Note: iterations already executing may still run to completion!
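A sketch of the WithCancellation pattern; the timings and simulated work are illustrative:

```csharp
using System;
using System.Linq;
using System.Threading;

using var cts = new CancellationTokenSource();
cts.CancelAfter(50); // cancel shortly after the query starts
bool canceled = false;

try
{
    var results = Enumerable.Range(0, int.MaxValue)
        .AsParallel()
        .WithCancellation(cts.Token)
        .Select(n => { Thread.SpinWait(10_000); return n; }) // simulated work
        .ToList();
}
catch (OperationCanceledException)
{
    canceled = true;
    Console.WriteLine("PLINQ query canceled.");
}
```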

PLINQ uses AggregateException to capture Exceptions thrown on threads other than the querying thread. Use a single try-catch block and capture AggregateException just like when using TPL and TAP.

Custom Partitioners can be written for PLINQ, instantiated with the item source, then executed with AsParallel().Select(...) etc.

PLINQ - Measuring Performance

The overhead of setting up partitioning might not be worth running the work in parallel, in which case PLINQ executes the query sequentially.

Use Parallel Performance Analyzer in Visual Studio Team Server (ed: HA!) to compare query performance, locate bottlenecks.

Also check out the Concurrency Visualizer.

References

Parallel Programming article on MSFT Learn.

Data Structures for Parallel Programming.

Return to Conted Index.

Return to Root README.