2011-12-01

Out for a Bit

There are several things that I was planning to do over the last week or two:

  • Collect my slides, notes, and demos for my "Thread is Dead" talk recently given at GRDevDay.
  • Update Nito.AsyncEx to support Silverlight 5 and possibly also Windows Phone.
  • Finish my "command line parsing" series of blog posts, and start a new series looking at "async in the real world" - essentially my "Thread is Dead" talk broken up into a couple dozen posts.

Unfortunately, sometimes the unexpected happens. My two-year-old son has been diagnosed with Leukemia, and I am writing this far from home in his hospital room.

I do still plan to do all of the things listed above, but they won't get done as soon as I was hoping. My apologies especially to the GRDevDay people!

2011-11-26

Virtualizing Two Machines over Thanksgiving

Like many "computer people," I do a lot of admin work for friends and family. Over the last few years, I've worked with my church to get them out of the dark ages of computing. The process is almost complete; I only have one more machine to replace, and then they will all be 64-bit dual-core 4GB systems running Pro editions of Windows. Next year I hope to (finally) put in a domain.

It turns out that two of the old machines have some outdated software that's critical to weekly operations. I'm working on replacements for the software, but in the meantime, the old machines were just sitting around, taking up space in the church office.

I decided to try to virtualize these machines on Friday (the day after Thanksgiving). This blog entry is just a "lessons learned" from this adventure.

The Challenge, and the Plan

The old "server" (XP Home with 192 MB RAM) and the old office machine (XP Home with 256 MB RAM) both needed virtualization. Due to the way the weekly process is done in the office, the old server would have to be virtualized onto the new server, and the old office machine would have to be virtualized onto the new office machine.

I'm most familiar with VMware products (particularly VMware Workstation), and I highly recommend them. However, I wanted to see if it was possible to virtualize these machines without incurring a licensing cost. My budget at Landmark Baptist isn't comparable to most IT departments. ;) So, I decided to try Hyper-V or Virtual PC, falling back on VirtualBox if necessary (it wasn't).

The server was the first machine to be replaced, so unfortunately at this point the new server has the most outdated hardware/OS. It's running Server 2008 but without Hyper-V... or even CPU virtualization support. :( Furthermore, according to what I've read, Hyper-V doesn't support USB, which IMO is a significant limitation (and a showstopper for the old "server").

So, I decided to try using Virtual PC for both virtual machines. The new office machine runs Win7 Pro, which is fully supported by the current version of Virtual PC ("Windows Virtual PC"). I was a bit apprehensive about the new server; Server 2008 isn't an officially supported platform for the previous version of Virtual PC ("Microsoft Virtual PC 2007 SP1"), but it turned out to work fine. Microsoft still has Virtual PC 2007 available for download, and SP1 added support for machines without virtualization hardware (which is just what I needed).

One limitation with Virtual PC is that it can only handle 127 GB hard drives. In my case, both machines had hard drives much smaller than that, so it wasn't a problem.

The plan at this point was to virtualize each machine to a different version of Virtual PC (running on different OSes and hardware). We'll see how well this worked in a moment, but first I'll mention the tool which kicked off this whole adventure.

Sysinternals has a great tool called disk2vhd, which can create a virtual disk from a physical disk - even storing the virtual disk image on the physical disk it's imaging, while the physical disk is running the OS running disk2vhd. If you think about it, that's pretty cool.

Disk2vhd can take quite a while (8-10 hours) to run, so I planned for it to run overnight. Once I had each machine in a VHD image, I should be able to create a Virtual PC machine using that image for a hard drive. VirtualBox also supports VHD, so my fallback would be ready just in case.

There are several articles on the Internet where others have successfully converted a physical XP machine to a virtual PC on Windows 7. The steps are straightforward: Create a disk image using disk2vhd; copy the image to the host PC; set up a new virtual machine in Virtual PC; re-activate Windows on the virtual machine; and install Integration Components/Services.

One final note: during my preparations, I discovered that XP can run into a stop 0x7B when backing up to a disk image and restoring on different hardware (which is very similar to what I'm doing with disk2vhd). The steps to fix this are in KB314082. I did not run into this issue, but I'm including it here for others who may.

On Wednesday (the day before Thanksgiving), I had done all the research and established my plan. I downloaded disk2vhd, VirtualBox, and both versions of Virtual PC onto my USB drive and left for Petoskey. That night, I started both machines running disk2vhd and went over to my Mom's for Thanksgiving.

A Snag: OEM OS

I popped in to check the status on Thursday morning. The server disk2vhd failed; my external USB drive had a faulty power adapter and it had shorted out overnight. So I restarted it with my other USB drive, and turned my attention to the office machine.

I had noticed on the disk2vhd download page that OEM OS licenses prevent virtualization. Turns out the office machine was XP Home OEM. The VHD came out fine, but it was not possible to re-activate Windows on the virtual machine. I did have a spare XP Home Retail key, but apparently you can't activate an OEM install with a Retail key. I also tried the original OEM key, but that didn't work since it's keyed to the BIOS which is different in a virtual machine.

Re-installing the OS was out of the question (if I actually had the install media for the outdated programs, I would have installed them on the new machine and we wouldn't need to virtualize in the first place). In desperation, I searched online for any way to convert OEM to Retail in-place. Most of the articles recommended running a repair from a different CD, but that seemed hokey to me (how would that affect updates already installed?).

Finally, I discovered the Product Key Update Tool. I ran it on the old office machine, converting it from OEM to Retail, and then re-started disk2vhd. This time, I ran disk2vhd with the output disk image going directly over the network to the new host PC; this worked just fine and I highly recommend it.

During my searching, I also discovered Sysprep. The Product Key Update Tool changes the old key to a new key, whereas Sysprep removes the existing key, requiring the user to type it in the next time the computer boots. I used the Update Tool, but Sysprep would probably also work.

Another Snag: Remote Control

I was hoping to do most of the work on Friday from the comfort of my Mom's living room, eating Thanksgiving leftovers and watching the kids play with their uncles. Unfortunately, I could not get mouse capture to work at all remotely before Integration Services were installed.

It doesn't appear to be possible to set up a new virtual machine remotely. At least not using LogMeIn, which is my remote control software of choice; in the past I've used pcAnywhere, UltraVNC, and Windows Live Mesh, but I've now settled solidly on LogMeIn.

I also tried to LogMeIn into another computer and Remote Desktop to the Virtual PC host; however, the mouse capture was still funky (the scale was messed up). Once I got Integration Services installed, remotely controlling a host PC worked fine.

So, I ended up having to physically be present for the initial virtual machine setup, which was disappointing.

Another Snag: Networking

When I brought up the old "server" as a virtual machine on the new server, the networking didn't work. Since I only had the Windows Activation UI available, it wasn't possible to diagnose. By default, Virtual PC will share the host's network card (using a network switch in software). The new server had a static IP, but this shouldn't have caused a problem. When I switched it to use NAT (a network router in software), the problem went away.

I've always used NAT for my VMware virtual machines, so this was a natural step.

Issue: Slow Initial Boot

The "server" disk2vhd process never finished. I'm not entirely sure why; the disk file was approximately the correct size, but disk2vhd never completed. Eventually I just exited the program and decided to try to use the file anyway.

When starting the old server as a virtual machine for the first time, it took about an hour to get from initial startup, through Windows activation, and to the desktop. Virtual PC was pegging the CPU the entire time. I'm unsure of the reason for this; the host PC does not have virtualization hardware, the vhd could be incomplete, the vhd is dynamic, ...

Once I installed Integration Components and rebooted, the CPU problems disappeared. I can't say whether the resolution was due to the installation or the rebooting.

Issue: Integration Components on XP Home

The virtualized XP office machine is running on Windows Virtual PC under a Windows 7 Pro host. Normally, this situation allows a really neat trick: you can set up a program on the virtual machine so it looks like a program on the host, with its own Start menu entry, running in a regular window instead of a full virtual machine desktop, etc.

Unfortunately, that does not work if the virtual machine is XP Home. Apparently, the Integration Components use RDP (Remote Desktop) for that functionality. The auto-login feature is also not available.

Conclusion

The project was completed, though it took longer than I expected. I'll find out next week if everything works sufficiently on the virtualized machines.

Lessons learned:

  • You cannot virtualize an OEM install. You have to change it to a Retail install first, using the Product Key Update Tool.
  • Disk2vhd can target a vhd image over the network.
  • You must be physically present to set up the virtual machines, at least until the point that Integration Services are installed.
  • If you're having problems getting the virtual machine on the network, try using NAT.
  • Some Integration Components features do not work if the guest is XP Home.

2011-10-13

Option Parsing: Case Sensitivity

By default, all option parsing is case-sensitive:

class Program
{
  private sealed class Options : OptionArgumentsBase
  {
    [Option("name", 'n')]
    public string Name { get; private set; }
  }

  static int Main(string[] args)
  {
    try
    {
      var options = OptionParser.Parse<Options>();
      Console.WriteLine("Name: " + options.Name);
      return 0;
    }
    catch (OptionParsingException ex)
    {
      Console.Error.WriteLine(ex.Message);
      return 2;
    }
    catch (Exception ex)
    {
      Console.Error.WriteLine(ex);
      return 1;
    }
  }
}

> CommandLineParsingTest.exe /name Bob
Name: Bob

> CommandLineParsingTest.exe /Name Bob
Unknown option  Name  in parameter  /Name

This is normal for Unix users, but Windows users expect case-insensitivity. You can pass your own StringComparer to the Parse method to support case-insensitivity:

class Program
{
  private sealed class Options : OptionArgumentsBase
  {
    [Option("name", 'n')]
    public string Name { get; private set; }
  }

  static int Main(string[] args)
  {
    try
    {
      var options = OptionParser.Parse<Options>(stringComparer: StringComparer.CurrentCultureIgnoreCase);
      Console.WriteLine("Name: " + options.Name);
      return 0;
    }
    catch (OptionParsingException ex)
    {
      Console.Error.WriteLine(ex.Message);
      return 2;
    }
    catch (Exception ex)
    {
      Console.Error.WriteLine(ex);
      return 1;
    }
  }
}

> CommandLineParsingTest.exe /name Bob
Name: Bob

> CommandLineParsingTest.exe /Name Bob
Name: Bob
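
The effect of the comparer can be seen in isolation with a plain set lookup. This is only a sketch of the idea, not how Nito.KitchenSink matches option names internally; the OptionLookupSketch class and its IsKnownOption method are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration; not part of Nito.KitchenSink.
static class OptionLookupSketch
{
  // Reports whether "name" matches the single known option under the given comparer.
  public static bool IsKnownOption(string name, StringComparer comparer)
  {
    // The comparer decides whether "Name" and "name" count as the same entry.
    var known = new HashSet<string>(comparer) { "name" };
    return known.Contains(name);
  }
}
```

With StringComparer.Ordinal, IsKnownOption("Name", ...) returns false; with StringComparer.CurrentCultureIgnoreCase, it returns true.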

2011-10-06

Option Parsing: Positional Arguments

"Positional arguments" are any arguments not associated with an option. When using the Nito.KitchenSink option parsing library, positional arguments must come after any options and their arguments.

Individual Positional Arguments

You can use the PositionalArgumentAttribute to specify positional arguments in your options class. This attribute takes a single integral parameter, the 0-based index of the positional argument.

Positional arguments support the entire range of parsing possibilities, including SimpleParserAttribute.

This example uses a regular Level option along with a Name positional parameter.

class Program
{
  private sealed class Options : OptionArgumentsBase
  {
    [Option('l')]
    public int? Level { get; set; }

    [PositionalArgument(0)]
    public string Name { get; set; }
  }

  static int Main()
  {
    try
    {
      var options = OptionParser.Parse<Options>();

      Console.WriteLine("Level: " + options.Level);
      Console.WriteLine("Name: " + options.Name);

      return 0;
    }
    catch (OptionParsingException ex)
    {
      Console.Error.WriteLine(ex.Message);
      return 2;
    }
    catch (Exception ex)
    {
      Console.Error.WriteLine(ex);
      return 1;
    }
  }
}

> CommandLineParsingTest.exe
Level:
Name:

> CommandLineParsingTest.exe Bob
Level:
Name: Bob

> CommandLineParsingTest.exe -l 13
Level: 13
Name:

> CommandLineParsingTest.exe -l 13 Bob
Level: 13
Name: Bob

> CommandLineParsingTest.exe Bob -l 13
Unknown parameter  -l

The last test above shows that positional arguments must come after all regular options.

If you need to pass a positional argument that starts with a dash (-) or forward slash (/), you can pass the special option "--", which forces all remaining command-line arguments to be interpreted as positional arguments:

> CommandLineParsingTest.exe -Negative
Unknown option  N  in parameter  -Negative

> CommandLineParsingTest.exe -- -Negative
Level:
Name: -Negative
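
The "--" rule described above can be sketched as a simple split of the argument array. This is a hypothetical illustration, not the Nito.KitchenSink implementation; DoubleDashSketch and ForcedPositionals are names invented here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration; not the Nito.KitchenSink implementation.
static class DoubleDashSketch
{
  // Returns every argument after the first "--". These are treated as
  // positional even if they start with '-' or '/'.
  public static List<string> ForcedPositionals(string[] args)
  {
    int marker = Array.IndexOf(args, "--");
    return marker < 0
      ? new List<string>()
      : args.Skip(marker + 1).ToList();
  }
}
```

For the arguments "-- -Negative", the sketch returns a list containing only "-Negative"; for "-Negative" alone, it returns an empty list, leaving that argument to be parsed as an option.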

The Positional Argument Collection

Every options class must have one property that can receive "extra" positional arguments. Extra positional arguments are any positional arguments after those defined by PositionalArgumentAttribute.

Most programs do not need this functionality, so the OptionArgumentsBase class provides a simple collection called AdditionalArguments. By default, OptionArgumentsBase.Validate will throw an UnknownOptionException if any positional arguments end up in that collection.

A program may make use of the AdditionalArguments collection by overriding Validate:

class Program
{
  private sealed class Options : OptionArgumentsBase
  {
    [PositionalArgument(0)]
    public string Name { get; set; }

    public override void Validate()
    {
    }
  }

  static int Main()
  {
    try
    {
      var options = OptionParser.Parse<Options>();

      Console.WriteLine("Name: " + options.Name);
      Console.WriteLine("ArgList: " + string.Join(", ", options.AdditionalArguments));

      return 0;
    }
    catch (OptionParsingException ex)
    {
      Console.Error.WriteLine(ex.Message);
      return 2;
    }
    catch (Exception ex)
    {
      Console.Error.WriteLine(ex);
      return 1;
    }
  }
}

> CommandLineParsingTest.exe
Name:
ArgList:

> CommandLineParsingTest.exe Bob
Name: Bob
ArgList:

> CommandLineParsingTest.exe Bob 17
Name: Bob
ArgList: 17

> CommandLineParsingTest.exe Bob -l 13
Name: Bob
ArgList: -l, 13

> CommandLineParsingTest.exe -- Bob
Name: Bob
ArgList:

Alternatively, an options class may provide its own collection, marked with the PositionalArgumentsAttribute (note the plural "Arguments"). When it does this, the options class may not derive from OptionArgumentsBase; rather, it should implement the IOptionArguments interface.

The property does not have to be List<string> (which is used by OptionArgumentsBase). The only requirement on the collection is that it have exactly one method named Add, and that this method take a single parameter. The parameter does not have to be string; it can be any type, and the standard parsing rules apply.

This means that PositionalArgumentsAttribute can even be placed on a property of dictionary type, as long as a matching parser is provided.

Here's an example of a program taking any number of integer parameters:

class Program
{
  private sealed class Options : IOptionArguments
  {
    public Options()
    {
      this.Integers = new List<int>();
    }

    [PositionalArguments]
    public List<int> Integers { get; private set; }

    public void Validate()
    {
    }
  }

  static int Main()
  {
    try
    {
      var options = OptionParser.Parse<Options>();

      Console.WriteLine("Integers: " + string.Join(", ", options.Integers));

      return 0;
    }
    catch (OptionParsingException ex)
    {
      Console.Error.WriteLine(ex.Message);
      return 2;
    }
    catch (Exception ex)
    {
      Console.Error.WriteLine(ex);
      return 1;
    }
  }
}

> CommandLineParsingTest.exe
Integers:

> CommandLineParsingTest.exe 13
Integers: 13

> CommandLineParsingTest.exe 13 7
Integers: 13, 7

> CommandLineParsingTest.exe 13 7 Bob
Could not parse  Bob  as Int32
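
The "single Add method" requirement described above suggests that any type with a suitable Add can act as the collection. The following sketch shows one way such a method could be found and called using reflection. It is an assumption about the general technique, not the Nito.KitchenSink implementation; RunningTotal, AddMethodSketch, and AddParsed are hypothetical names, and Convert.ChangeType stands in for the library's own parsing rules.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Reflection;

// Hypothetical collection: its only "Add" method takes a single int.
sealed class RunningTotal
{
  public int Total { get; private set; }
  public void Add(int value) { Total += value; }
}

// Hypothetical illustration; not the Nito.KitchenSink implementation.
static class AddMethodSketch
{
  // Finds the collection's single-parameter Add method, converts the text
  // to that parameter's type, and invokes Add with the converted value.
  public static void AddParsed(object collection, string text)
  {
    MethodInfo add = collection.GetType().GetMethods()
      .Single(m => m.Name == "Add" && m.GetParameters().Length == 1);
    Type parameterType = add.GetParameters()[0].ParameterType;
    object value = Convert.ChangeType(text, parameterType, CultureInfo.InvariantCulture);
    add.Invoke(collection, new[] { value });
  }
}
```

Calling AddParsed on a RunningTotal with "13" and then "7" leaves Total at 20; passing "Bob" fails in the conversion step, much like the last test run above.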

2011-09-16

Rx and Async

I saw some rather shocking tweets yesterday from the BUILD conference:


The author of that original tweet followed up with a blog post with some interesting Rx-related quotes from Anders Hejlsberg: "I don't know if we've decided [whether Rx will be included in future versions of .NET]." and "Personally I've found the stuff we've done with async allows you to do a lot more [than Rx]." (emphasis mine).

Interesting. I tried to take a listen for myself, but the Channel9 live interview was no longer available. Note that these remarks were made during a live interview, and were not part of a presentation; I'm hoping that Anders just answered off the cuff and didn't mean it.

One reason I found those quotes controversial is that parallel programming (TPL/PLINQ), background operations (async/await), and asynchronous streams (Rx) all address different problems. In particular, Async only supports background operations and does not support asynchronous streams. Rx supports both, but Async will become the default solution for background operations because it's easier to use than Rx.

So, I agree with Anders that Async is easier to use, but I totally disagree that Async is more powerful. Rx can do everything Async can do, and can do some things that Async can't do.

It comes down to the difference between asynchronous operations and asynchronous events. An asynchronous operation is something that my program can start, and it will complete some time later. An asynchronous event stream is something that is happening all the time independent of my program; my program can subscribe and unsubscribe, but does not cause the events. This is an important distinction if you consider an event stream that produces in quick bursts (e.g., mouse movement); Rx allows collating all of those events, but an async-based solution may miss some (because it has to restart the operation each time it completes).
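
The "restart" gap can be shown without Rx at all. The sketch below uses a plain .NET event and TaskCompletionSource; the Ticker class and the method names are hypothetical, and the persistent subscription is only a stand-in for what an Rx subscription would provide. One subscriber stays attached and sees every event, while a one-shot "await the next event" operation misses whatever arrives before it is restarted.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical event source standing in for something like mouse movement.
sealed class Ticker
{
  public event Action<int> Tick;
  public void Raise(int value) { Tick?.Invoke(value); }
}

static class EventVersusOperationSketch
{
  // Wraps "the next Tick" as a one-shot asynchronous operation.
  public static Task<int> NextTickAsync(Ticker ticker)
  {
    var tcs = new TaskCompletionSource<int>();
    Action<int> handler = null;
    handler = value =>
    {
      ticker.Tick -= handler; // one-shot: unsubscribe after the first event
      tcs.TrySetResult(value);
    };
    ticker.Tick += handler;
    return tcs.Task;
  }

  // Returns what a persistent subscriber saw and what the one-shot loop saw.
  public static (List<int> Streamed, List<int> Awaited) Compare()
  {
    var ticker = new Ticker();
    var streamed = new List<int>();
    var awaited = new List<int>();
    ticker.Tick += streamed.Add; // persistent subscription sees every event

    Task<int> next = NextTickAsync(ticker);
    ticker.Raise(1);
    ticker.Raise(2);          // burst: arrives before the operation is restarted
    awaited.Add(next.Result); // already completed, so this does not block

    next = NextTickAsync(ticker);
    ticker.Raise(3);
    awaited.Add(next.Result);

    return (streamed, awaited);
  }
}
```

Compare returns 1, 2, 3 for the persistent subscriber but only 1, 3 for the one-shot operation; the event raised during the burst is lost.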

Historically, asynchronous events have been a blind spot for Microsoft. Consider a condensed history of asynchronous support:

  1. Asynchronous Programming Model (APM). In the beginning, there was only IAsyncResult (Begin/End). The APM was everywhere, even baked into delegate types. The thing to note about APM is that it is purely an asynchronous operation; no asynchronous events are supported. The program starts the operation, which has a single point of completion.
  2. Event-Based Asynchronous Pattern (EAP). Way back in .NET 2.0, the EAP was introduced. EAP works by capturing the current SynchronizationContext and then raising events on that context. This was the first asynchronous pattern that supported both asynchronous operations and asynchronous events. Unfortunately, the documentation assumed that EAP objects are only implementing asynchronous operations, and completely ignored the EAP support for asynchronous events. In addition, the most famous EAP implementation (BackgroundWorker) was just an asynchronous operation. However, the Nito.Async library included some helpers for EAP components, and included sample socket components using EAP in an asynchronous event fashion.
  3. Rx. Supporting .NET 3.5 and up, the Rx libraries are all about asynchronous events (and they also support asynchronous operations, which are just a singleton asynchronous event). Rx is also more powerful than EAP because it has a very flexible execution context, while EAP ties everything through a single SynchronizationContext. However, the learning curve for Rx is steep.
  4. Async/await and the Task-Based Asynchronous Pattern (TAP). These extensions to the language allow for a very natural and easy way to deal with asynchronous operations, but they do not support asynchronous events.

In terms of power and flexibility, TAP is approximately equivalent to APM (less powerful than EAP and Rx). The only reason it's a step forward is that it is so easy to learn and use. Some simple programs may use only TAP, but other programs will need both TAP and Rx.

Rx is a very welcome (and necessary) addition to our toolset. Async does not and cannot replace it.

(P.S. All of this - and much more - is covered in my "Thread is Dead" talk, which has been submitted for consideration at a couple of conferences in the next few months. I'll update this space when it's accepted.)

2011-09-09

Nito.AsyncEx Available

Nito.AsyncEx is now available.

Just as the Nito.Async library helps you work with the Event-Based Asynchronous Pattern (and its underlying concepts such as SynchronizationContext), the Nito.AsyncEx library helps you work with the Task-Based Asynchronous Pattern (and its underlying concepts such as Task).

2011-09-01

The Async CTP "Why Do the Keywords Work THAT Way" Unofficial FAQ

There's a lot of interest in the Async CTP, with good reason. The Async CTP will make asynchronous programming much, much easier than it has ever been. It's somewhat less powerful but much easier to learn than Rx.

The Async CTP introduces two new keywords, async and await. Asynchronous methods (or lambda expressions) must return void, Task, or Task<TResult>.

This post is not an introduction to the Async CTP; there are plenty of tutorial resources available out there. This post is an attempt to bring together the answers to a few common questions that programmers have when they start using the Async CTP.

Inferring the Return Type

When returning a value from an async method, the method body returns the value directly, but the method itself is declared as returning a Task<TResult>. There is a bit of "disconnect" when you declare a method returning one type and have to return another type:

// Actual syntax
public async Task<int> GetValue()
{
  await TaskEx.Delay(100);
  return 13; // Return type is "int", not "Task<int>"
}

Question: Why can't I write this:

// Hypothetical syntax
public async int GetValue()
{
  await TaskEx.Delay(100);
  return 13; // Return type is "int"
}

Consider: How will the method signature look to callers? Async methods that return a value must have an actual result type of Task<TResult>. So even if it were declared as async int, GetValue would show up in IntelliSense as returning Task<int> (this would also be true for the object browser, Reflector, etc.).

Inferring the return type was considered during the initial design, but the team concluded that keeping the "disconnect" within the async method was better than spreading the "disconnect" throughout the code base. The "disconnect" is still there, but it's smaller than it could be. The consensus is that a consistent method signature is preferred.

Consider: There is a difference between async void and async Task.

An async Task method is just like any other asynchronous operation, only without a return value. An async void method acts as a "top-level" asynchronous operation. An async Task method may be composed into other async methods using await. An async void method may be used as an event handler. An async void method also has another important property: in an ASP.NET context, it informs the web server that the page is not completed until it returns (see my MSDN article for more information on how this works).

Inferring the return type would remove the distinction between async void and async Task; either all async methods would be async void (preventing composability), or they would all be async Task (preventing them from being event handlers, and requiring an alternative solution for ASP.NET support).
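
The difference between the two kinds of methods can be sketched as follows. The class and method names are hypothetical, and the sketch assumes the released Task.Delay rather than the CTP's TaskEx.Delay.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class AsyncKindsSketch
{
  // "async Task": an asynchronous operation without a result; callers can await it.
  public static async Task StepAsync(List<string> log, string name)
  {
    await Task.Delay(10);
    log.Add(name);
  }

  // Composition: an async Task method awaits other async Task methods.
  public static async Task RunStepsAsync(List<string> log)
  {
    await StepAsync(log, "first");
    await StepAsync(log, "second");
  }

  // "async void": matches the EventHandler signature, so it can be attached
  // to an event, but callers get no Task to await or compose.
  public static async void OnClicked(object sender, EventArgs e)
  {
    await RunStepsAsync(new List<string>());
  }
}
```

RunStepsAsync can itself be awaited by another async method, which is the composability that would be lost if every async method were async void. OnClicked can be assigned to an EventHandler, which would not be possible if every async method returned Task.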

Async Return

There is still a "disconnect" between the method declaration return type and the method body return type. Another option that has been suggested is to add a keyword to return to indicate that the value given to return is not really what's being returned, e.g.:

// Hypothetical syntax
public async Task<int> GetValue()
{
  await TaskEx.Delay(100);
  async return 13; // "async return" means the value will be wrapped in a Task
}

Consider: Converting large amounts of code from synchronous to asynchronous.

The async return keyword was also considered, but it wasn't compelling enough. This is particularly true when converting a lot of synchronous code to asynchronous code (which will be common over the next few years); forcing people to add async to every return statement just seemed like "needless busy-work." It's easier to get used to the "disconnect".

Inferring "async"

The async keyword must be applied to any method that makes use of await. However, the compiler also gives a warning if async is applied to a method that does not make use of await.

Question: Why can't async be inferred based on the presence of await:

// Hypothetical syntax
public Task<int> GetValue()
{
  // The presence of "await" implies that this is an "async" method.
  await TaskEx.Delay(100);
  return 13;
}

Consider: Backwards compatibility and code readability.

Eric Lippert has the definitive post on the subject. It's also been discussed in blog comments, Channel9, and forums.

To summarize, a single-word await keyword would be too big of a breaking change. The choice was between a multi-word await (e.g., await for) or a keyword on the method (async) that would enable the await keyword just within that method. Explicitly marking methods async is easier for both humans and computers to parse, so they decided to go with the async/await pair.

Inferring "await"

Question: Since it makes sense to explicitly include async (see above), why can't await be inferred based on the presence of async:

// Hypothetical syntax
public async Task<int> GetValue()
{
  // "await" is implied, since this is an "async" method.
  TaskEx.Delay(100);
  return 13;
}

Consider: Parallel composition of asynchronous operations.

At first glance, inferring await appears to simplify basic asynchronous operations. As long as all waiting is done in serial (i.e., one operation is awaited, then another, and then another), this works fine. However, it falls apart when one considers parallel composition.

Parallel composition in the Async CTP is done using TaskEx.WhenAny and TaskEx.WhenAll methods. Here's a simple example which starts two operations immediately and asynchronously waits for both of them to complete:

// Actual syntax
public async Task<int> GetValue()
{
  // Asynchronously retrieve two partial values.
  // Note that these are *not* awaited at this time.
  Task<int> part1 = GetValuePart1();
  Task<int> part2 = GetValuePart2();

  // Wait for both values to arrive.
  await TaskEx.WhenAll(part1, part2);

  // Calculate our result.
  int value1 = await part1; // Does not actually wait.
  int value2 = await part2; // Does not actually wait.
  return value1 + value2;
}

In order to do parallel composition, we must have the ability to say we're not going to await an expression.
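
The difference is easiest to see by recording when each operation starts. The sketch below is an illustration under stated assumptions: it uses the released Task.Delay and Task.WhenAll rather than the CTP's TaskEx methods, and PartAsync is a hypothetical stand-in for GetValuePart1 and GetValuePart2.

```csharp
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

static class ParallelCompositionSketch
{
  // Hypothetical operation: logs when it starts, then completes after a delay.
  static async Task<int> PartAsync(int value, ConcurrentQueue<string> log)
  {
    log.Enqueue("start " + value);
    await Task.Delay(50);
    return value;
  }

  // Serial: the second operation is not started until the first completes.
  public static async Task<int> SerialAsync(ConcurrentQueue<string> log)
  {
    int value1 = await PartAsync(1, log);
    log.Enqueue("between");
    int value2 = await PartAsync(2, log);
    return value1 + value2;
  }

  // Parallel: both operations are started before either is awaited.
  public static async Task<int> ParallelAsync(ConcurrentQueue<string> log)
  {
    Task<int> part1 = PartAsync(1, log);
    Task<int> part2 = PartAsync(2, log);
    log.Enqueue("between");
    int[] values = await Task.WhenAll(part1, part2);
    return values.Sum();
  }
}
```

Both methods produce 3, but the serial log reads "start 1, between, start 2" while the parallel log reads "start 1, start 2, between". If await were implied on every call in an async method, only the serial form could be written.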