2011-09-16

Rx and Async

I saw some rather shocking tweets yesterday from the BUILD conference:


The author of that original tweet followed up with a blog post with some interesting Rx-related quotes from Anders Hejlsberg: "I don't know if we've decided [whether Rx will be included in future versions of .NET]." and "Personally I've found the stuff we've done with async allows you to do a lot more [than Rx]." (emphasis mine).

Interesting. I tried to take a listen for myself, but the Channel9 live interview was no longer available. Note that these remarks were made during a live interview, and were not part of a presentation; I'm hoping that Anders just answered off the cuff and didn't mean it.

One reason I found those quotes controversial is that parallel programming (TPL/PLINQ), background operations (async/await), and asynchronous streams (Rx) all address different problems. In particular, Async only supports background operations and does not support asynchronous streams. Rx supports both, but Async will become the default solution for background operations because it's easier to use than Rx.

So, I agree with Anders that Async is easier to use, but I totally disagree that Async is more powerful. Rx can do everything Async can do, and can do some things that Async can't do.

It comes down to the difference between asynchronous operations and asynchronous events. An asynchronous operation is something that my program can start, and it will complete some time later. An asynchronous event stream is something that is happening all the time independent of my program; my program can subscribe and unsubscribe, but it does not cause the events. This is an important distinction if you consider an event stream that produces events in quick bursts (e.g., mouse movement); Rx allows collating all of those events, but an async-based solution may miss some (because it has to restart the operation each time it completes).
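Here's a minimal sketch of that "missed events" problem. It does not use Rx itself; a plain .NET event subscription stands in for what an Rx subscription would do, and BurstSource, NextMoveAsync, and RunBurst are hypothetical names invented for the illustration.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical event source, standing in for something like mouse movement.
public sealed class BurstSource
{
    public event Action<int> Moved;

    public void Raise(int value)
    {
        Action<int> handler = Moved;
        if (handler != null)
            handler(value);
    }
}

public static class EventsVersusOperations
{
    // Wraps "the next event" as a one-shot asynchronous operation.
    public static Task<int> NextMoveAsync(BurstSource source)
    {
        var tcs = new TaskCompletionSource<int>();
        Action<int> handler = null;
        handler = value =>
        {
            source.Moved -= handler; // An operation completes only once.
            tcs.TrySetResult(value);
        };
        source.Moved += handler;
        return tcs.Task;
    }

    // Returns { events seen by the subscription, value captured by the operation }.
    public static int[] RunBurst()
    {
        var source = new BurstSource();

        // Subscription style: stays attached and sees every event.
        int seenBySubscription = 0;
        source.Moved += _ => seenBySubscription++;

        // Operation style: has to be restarted after each completion.
        Task<int> next = NextMoveAsync(source);

        // A burst arrives before the consumer gets a chance to restart.
        source.Raise(1);
        source.Raise(2);
        source.Raise(3);

        return new[] { seenBySubscription, next.Result };
    }
}
```

The subscription counts all three events, while the one-shot operation completes with the first value and never observes the other two; to see them, the consumer would have had to call NextMoveAsync again before they were raised.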

Historically, asynchronous events have been a blind spot for Microsoft. Consider a condensed history of asynchronous support:

  1. Asynchronous Programming Model (APM). In the beginning, there was only IAsyncResult (Begin/End). The APM was everywhere, even baked into delegate types. The thing to note about APM is that it is purely an asynchronous operation; no asynchronous events are supported. The program starts the operation, which has a single point of completion.
  2. Event-Based Asynchronous Pattern (EAP). Way back in .NET 2.0, the EAP was introduced. EAP works by capturing the current SynchronizationContext and then raising events on that context. This was the first asynchronous pattern that supported both asynchronous operations and asynchronous events. Unfortunately, the documentation assumed that EAP objects are only implementing asynchronous operations, and completely ignored the EAP support for asynchronous events. In addition, the most famous EAP implementation (BackgroundWorker) was just an asynchronous operation. However, the Nito.Async library included some helpers for EAP components, and included sample socket components using EAP in an asynchronous event fashion.
  3. Rx. Supporting .NET 3.5 and up, the Rx libraries are all about asynchronous events (and they also support asynchronous operations, which are just a singleton asynchronous event). Rx is also more powerful than EAP because it has a very flexible execution context, while EAP ties everything through a single SynchronizationContext. However, the learning curve for Rx is steep.
  4. Async/await and the Task-Based Asynchronous Pattern (TAP). These extensions to the language allow for a very natural and easy way to deal with asynchronous operations, but they do not support asynchronous events.

In terms of power and flexibility, TAP is approximately equivalent to APM (less powerful than EAP and Rx). The only reason it's a step forward is that it is so easy to learn and use. Some simple programs may use only TAP, but other programs will need both TAP and Rx.
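To make the APM/TAP comparison concrete, here's a rough sketch of the same read expressed both ways. It assumes the released Task type and its Task<int>.Factory.FromAsync wrapper; ApmAndTap, ReadWithApm, and ReadWithTap are hypothetical names used only for this example.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ApmAndTap
{
    // APM: Begin starts the operation; End retrieves the result
    // at the operation's single point of completion.
    public static int ReadWithApm(Stream stream, byte[] buffer)
    {
        IAsyncResult asyncResult = stream.BeginRead(buffer, 0, buffer.Length, null, null);
        return stream.EndRead(asyncResult); // Blocks until the read completes.
    }

    // TAP: the same Begin/End pair, wrapped as a single Task<int>.
    public static Task<int> ReadWithTap(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead, stream.EndRead, buffer, 0, buffer.Length, null);
    }
}
```

Both methods describe one operation with one completion. The Task version is easier to consume (it can be awaited), but neither shape has anywhere to put a second or third notification, which is what an event stream needs.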

Rx is a very welcome (and necessary) addition to our toolset. Async does not and cannot replace it.

(P.S. All of this - and much more - is covered in my "Thread is Dead" talk, which has been submitted for consideration at a couple of conferences in the next few months. I'll update this space when it's accepted.)

2011-09-09

Nito.AsyncEx Available

Nito.AsyncEx is now available.

Just as the Nito.Async library helps you work with the Event-Based Asynchronous Pattern (and its underlying concepts such as SynchronizationContext), the Nito.AsyncEx library helps you work with the Task-Based Asynchronous Pattern (and its underlying concepts such as Task).

2011-09-01

The Async CTP "Why Do the Keywords Work THAT Way" Unofficial FAQ

There's a lot of interest in the Async CTP, with good reason. The Async CTP will make asynchronous programming much, much easier than it has ever been. It's somewhat less powerful but much easier to learn than Rx.

The Async CTP introduces two new keywords, async and await. Asynchronous methods (or lambda expressions) must return void, Task, or Task<TResult>.

This post is not an introduction to the Async CTP; there are plenty of tutorials available out there. This post is an attempt to bring together the answers to a few common questions that programmers have when they start using the Async CTP.

Inferring the Return Type

When returning a value from an async method, the method body returns the value directly, but the method itself is declared as returning a Task<TResult>. There is a bit of "disconnect" when you declare a method returning one type and have to return another type:

// Actual syntax
public async Task<int> GetValue()
{
  await TaskEx.Delay(100);
  return 13; // Return type is "int", not "Task<int>"
}

Question: Why can't I write this:

// Hypothetical syntax
public async int GetValue()
{
  await TaskEx.Delay(100);
  return 13; // Return type is "int"
}

Consider: How will the method signature look to callers? Async methods that return a value must have an actual result type of Task<TResult>. So GetValue will show up in IntelliSense as returning Task<int>, even though it was declared as returning int (this would also be true for the object browser, Reflector, etc).

Inferring the return type was considered during the initial design, but the team concluded that keeping the "disconnect" within the async method was better than spreading the "disconnect" throughout the code base. The "disconnect" is still there, but it's smaller than it could be. The consensus is that a consistent method signature is preferred.

Consider: There is a difference between async void and async Task.

An async Task method is just like any other asynchronous operation, only without a return value. An async void method acts as a "top-level" asynchronous operation. An async Task method may be composed into other async methods using await. An async void method may be used as an event handler. An async void method also has another important property: in an ASP.NET context, it informs the web server that the page is not completed until it returns (see my MSDN article for more information on how this works).
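A small sketch of that distinction, written against the released Task.Delay rather than the CTP's TaskEx.Delay (an assumption on my part; the shape is the same). VoidVersusTask and its members are hypothetical names for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class VoidVersusTask
{
    public static int SaveCount;

    // async Task: an asynchronous operation that callers can await.
    public static async Task SaveAsync()
    {
        await Task.Delay(10);
        SaveCount++;
    }

    // async Task methods compose into other async methods using await.
    public static async Task SaveTwiceAsync()
    {
        await SaveAsync();
        await SaveAsync();
    }

    // async void: a "top-level" operation. It matches the EventHandler
    // signature, but nothing can await it.
    public static async void OnClick(object sender, EventArgs e)
    {
        await SaveTwiceAsync();
    }
}
```

An assignment such as `EventHandler handler = VoidVersusTask.OnClick;` compiles because the delegate returns void; SaveTwiceAsync would not fit there, since it returns Task. Going the other way, a caller can await SaveTwiceAsync but has no way to await OnClick.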

Inferring the return type would remove the distinction between async void and async Task; either all async methods would be async void (preventing composability), or they would all be async Task (preventing them from being event handlers, and requiring an alternative solution for ASP.NET support).

Async Return

There is still a "disconnect" between the method declaration return type and the method body return type. Another option that has been suggested is to add a keyword to return to indicate that the value given to return is not really what's being returned, e.g.:

// Hypothetical syntax
public async Task<int> GetValue()
{
  await TaskEx.Delay(100);
  async return 13; // "async return" means the value will be wrapped in a Task
}

Consider: Converting large amounts of code from synchronous to asynchronous.

The async return keyword was also considered, but it wasn't compelling enough. This is particularly true when converting a lot of synchronous code to asynchronous code (which will be common over the next few years); forcing people to add async to every return statement just seemed like "needless busy-work." It's easier to get used to the "disconnect".

Inferring "async"

The async keyword must be applied to a method that makes use of await. However, the compiler also gives a warning if async is applied to a method that does not make use of await.

Question: Why can't async be inferred based on the presence of await:

// Hypothetical syntax
public Task<int> GetValue()
{
  // The presence of "await" implies that this is an "async" method.
  await TaskEx.Delay(100);
  return 13;
}

Consider: Backwards compatibility and code readability.

Eric Lippert has the definitive post on the subject. It's also been discussed in blog comments, Channel9, and forums.

To summarize, a single-word await keyword would be too big of a breaking change. The choice was between a multi-word await (e.g., await for) or a keyword on the method (async) that would enable the await keyword just within that method. Explicitly marking methods async is easier for both humans and computers to parse, so they decided to go with the async/await pair.
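Here's a short sketch of what that compatibility choice looks like in practice, assuming the compiler treats await as a contextual keyword in the way described above. ContextualKeyword and its methods are hypothetical names, and Task.Delay is the released equivalent of the CTP's TaskEx.Delay.

```csharp
using System;
using System.Threading.Tasks;

public static class ContextualKeyword
{
    // Not marked async, so "await" is just an ordinary identifier here.
    // Existing code that used the name keeps compiling.
    public static int LegacyMethod()
    {
        int await = 13;
        return await;
    }

    // Marked async, so "await" is a keyword within this method only.
    public static async Task<int> NewMethodAsync()
    {
        await Task.Delay(10);
        return 13;
    }
}
```

If await had been inferred as a keyword everywhere, LegacyMethod is the kind of code that would have stopped compiling.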

Inferring "await"

Question: Since it makes sense to explicitly include async (see above), why can't await be inferred based on the presence of async:

// Hypothetical syntax
public async Task<int> GetValue()
{
  // "await" is implied, since this is an "async" method.
  TaskEx.Delay(100);
  return 13;
}

Consider: Parallel composition of asynchronous operations.

At first glance, inferring await appears to simplify basic asynchronous operations. As long as all waiting is done in serial (i.e., one operation is awaited, then another, and then another), this works fine. However, it falls apart when one considers parallel composition.

Parallel composition in the Async CTP is done using TaskEx.WhenAny and TaskEx.WhenAll methods. Here's a simple example which starts two operations immediately and asynchronously waits for both of them to complete:

// Actual syntax
public async Task<int> GetValue()
{
  // Asynchronously retrieve two partial values.
  // Note that these are *not* awaited at this time.
  Task<int> part1 = GetValuePart1();
  Task<int> part2 = GetValuePart2();

  // Wait for both values to arrive.
  await TaskEx.WhenAll(part1, part2);

  // Calculate our result.
  int value1 = await part1; // Does not actually wait.
  int value2 = await part2; // Does not actually wait.
  return value1 + value2;
}

In order to do parallel composition, we must have the ability to say we're not going to await an expression.
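To see the difference without relying on timing, here's a sketch that logs when each operation starts and ends. It assumes the released Task.Delay and Task.WhenAll (the CTP equivalents are on TaskEx); Composition, PartAsync, SerialAsync, and ParallelAsync are hypothetical names for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class Composition
{
    // Hypothetical operation that logs when it starts and when it finishes.
    static async Task<int> PartAsync(string name, int value, ConcurrentQueue<string> log)
    {
        log.Enqueue("start " + name);
        await Task.Delay(50);
        log.Enqueue("end " + name);
        return value;
    }

    // Serial: each operation is awaited before the next one is started.
    public static async Task<string[]> SerialAsync()
    {
        var log = new ConcurrentQueue<string>();
        await PartAsync("1", 6, log);
        await PartAsync("2", 7, log);
        return log.ToArray();
    }

    // Parallel: both operations are started, and only then awaited together.
    public static async Task<string[]> ParallelAsync()
    {
        var log = new ConcurrentQueue<string>();
        Task<int> part1 = PartAsync("1", 6, log); // Deliberately not awaited yet.
        Task<int> part2 = PartAsync("2", 7, log); // Deliberately not awaited yet.
        await Task.WhenAll(part1, part2);
        return log.ToArray();
    }
}
```

In the serial log, "end 1" always comes before "start 2". In the parallel log, the first two entries are both starts; the order of the two end entries is not guaranteed. If await were implied on every call inside an async method, there would be no way to write ParallelAsync.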