2009-05-28

MSBuild: Factorial!

I've been doing some exploring of MSBuild as a programming language. There are some interesting results regarding mutability/immutability, but that's for another post.

This post is about functions. In particular, a Target may be invoked using the MSBuild task, so I'm exploring using Targets as functions. MSBuild can pass parameters to a Target by sending it Properties. Property changes are not propagated back to the caller, though, so getting a return value is a bit trickier.

It turns out that MSBuild does return one bit of information from a Target: its Outputs. It's possible to set the Outputs of a Target to a Property, and have that Target depend on another Target that sets that Property. In this way, it is possible to create a pair of Targets that can "calculate" the outer Target's Outputs.

By combining these approaches (setting Properties for arguments, and using the Target's Outputs as a return value), it is possible to treat a Target as a function.

To demonstrate, I wrote this program, which uses MSBuild to recursively calculate the factorial of the $(Input) property. Have fun playing!

<Project ToolsVersion="3.5" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="$(MSBuildExtensionsPath)\ExtensionPack\MSBuild.ExtensionPack.tasks"/>
 
  <!-- Factorial program using MSBuild recursively -->
 
  <Target Name="Default">
    <!-- Display usage -->
    <Error Condition="'$(Input)' == ''" Text="Usage: msbuild factorial.proj [/nologo] [/clp:v=minimal] /p:Input=nnn"/>
 
    <!-- Argument error checking -->
    <MSBuild.ExtensionPack.Science.Maths TaskAction="Compare" P1="$(Input)" P2="1" Comparison="LessThan">
      <Output TaskParameter="LogicalResult" PropertyName="InputCheck"/>
    </MSBuild.ExtensionPack.Science.Maths>
    <Error Condition="'$(InputCheck)' != 'False'" Text="Input cannot be less than 1."/>
 
    <!-- Invoke the Factorial target with the current Input property -->
    <MSBuild Projects="$(MSBuildProjectFile)" Targets="Factorial" Properties="Input=$(Input)">
      <Output TaskParameter="TargetOutputs" ItemName="FactorialResult"/>
    </MSBuild>
 
    <!-- Display the result -->
    <Message Importance="high" Text="Result: @(FactorialResult)"/>
  </Target>
 
  <!-- The Factorial target uses FactorialCore to do the calculation, storing the result in FactorialResult -->
  <Target Name="Factorial" DependsOnTargets="FactorialCore" Outputs="$(FactorialResult)" />
 
  <Target Name="FactorialCore">
    <!-- If the input is 1, then the factorial is 1 -->
    <PropertyGroup Condition="'$(Input)' == '1'">
      <FactorialResult>1</FactorialResult>
    </PropertyGroup>
 
    <!-- If we don't know the result yet (i.e., the input is not 1), then calculate the factorial -->
    <CallTarget Condition="'$(FactorialResult)' == ''" Targets="CalculateFactorial"/>
  </Target>
 
  <Target Name="CalculateFactorial">
    <!-- Subtract 1 from $(Input) -->
    <MSBuild.ExtensionPack.Science.Maths TaskAction="Subtract" Numbers="$(Input);1">
      <Output TaskParameter="Result" PropertyName="InputMinus1"/>
    </MSBuild.ExtensionPack.Science.Maths>
 
    <!-- Determine the factorial of $(Input) - 1 -->
    <MSBuild Projects="$(MSBuildProjectFile)" Targets="Factorial" Properties="Input=$(InputMinus1)">
      <Output TaskParameter="TargetOutputs" ItemName="SubResult"/>
    </MSBuild>
 
    <!-- Multiply !($(Input) - 1) by $(Input) to get the result-->
    <MSBuild.ExtensionPack.Science.Maths TaskAction="Multiply" Numbers="@(SubResult);$(Input)">
      <Output TaskParameter="Result" PropertyName="FactorialResult"/>
    </MSBuild.ExtensionPack.Science.Maths>
  </Target>
 
  <!-- Maybe I just have way too much time on my hands... -->
</Project>

msbuild factorial.proj /nologo /clp:v=minimal /p:Input=5

Default:
  Result: 120

msbuild factorial.proj /nologo /clp:v=minimal /p:Input=7

Default:
  Result: 5040

Useless, but cool nonetheless.

MSBuild: ItemGroup Metadata Inversion

Sometimes it's useful to treat a piece of metadata as though it were the actual item. This is particularly true if the metadata refers to a file location, so one could pull well-known metadata off the metadata value.

MSBuild does not support metadata having metadata. However, an "inversion" can be performed, where a new ItemGroup is created with the metadata as the primary item entry. The example below also places the original ItemGroup Identity as metadata on the new ItemGroup entries, creating a bidirectional mapping.

<Project ToolsVersion="3.5" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="$(MSBuildExtensionsPath)\ExtensionPack\MSBuild.ExtensionPack.tasks"/>
 
  <ItemGroup>
    <ProjectDefinitions Include="First">
      <ProjectFile>one.sln</ProjectFile>
    </ProjectDefinitions>
    <ProjectDefinitions Include="Second">
      <ProjectFile>two.sln</ProjectFile>
    </ProjectDefinitions>
    <ProjectDefinitions Include="Third">
      <ProjectFile>three.sln</ProjectFile>
    </ProjectDefinitions>
  </ItemGroup>
 
  <Target Name="Default">
    <ItemGroup>
      <ProjectFiles Include="%(ProjectDefinitions.ProjectFile)">
        <ProjectDefinition>@(ProjectDefinitions->'%(Identity)')</ProjectDefinition>
      </ProjectFiles>
    </ItemGroup>
    <Message Text="Project files: @(ProjectFiles) (definitions: @(ProjectFiles->'%(ProjectDefinition)'))"/>
  </Target>
</Project>

Project files: one.sln;two.sln;three.sln (definitions: First;Second;Third)

Note that you do have to watch your grouping; if the metadata being inverted is not unique for all entries in the original ItemGroup, then some entries in the resulting ItemGroup will have multi-valued metadata for their "original Identity" values.

2009-05-27

MSBuild: Filtering an ItemGroup based on a Property

I started playing with MSBuild this weekend. It's a little under-documented for my taste, but seems rather powerful. It has a strange combination of functional and procedural styles that makes some simple tasks relatively complex.

This is the first in what I hope will be a series of posts of solutions that I've worked through for MSBuild. Keep in mind that I am still an MSBuild beginner, so there may be a better way to solve these problems.

The Problem

Given one ItemGroup (including metadata), how can one choose a subset of the items, keeping metadata intact? The subset is determined by a property that is actually a list of keys.

The Solution

<Project ToolsVersion="3.5" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="$(MSBuildExtensionsPath)\ExtensionPack\MSBuild.ExtensionPack.tasks"/>
 
  <ItemGroup>
    <ProjectDefinitions Include="First">
      <Argument>1</Argument>
    </ProjectDefinitions>
    <ProjectDefinitions Include="Second">
      <Argument>2</Argument>
    </ProjectDefinitions>
    <ProjectDefinitions Include="Third">
      <Argument>3</Argument>
    </ProjectDefinitions>
  </ItemGroup>
 
  <PropertyGroup>
    <!-- By default, only build the first and third projects; this property may be overridden on the command line with the "/p" argument -->
    <Projects>First;Third</Projects>
  </PropertyGroup>
 
  <Target Name="Default" DependsOnTargets="DetermineProjectsToBuild">
    <Message Text="Projects to build: @(Projects) (arguments: @(Projects->'%(Argument)'))"/>
  </Target>
  
  <!--
  Determines which projects to build, based off the ProjectDefinitions items and the Projects property. Calculates the following item group:
    Projects - containing all ProjectDefinitions specified in the Projects property, with all metadata intact.
  -->
  <Target Name="DetermineProjectsToBuild">
    <!-- Split the Projects property up into an item group ProjectNamesToBuild that has one entry per item name -->
    <MSBuild.ExtensionPack.Framework.MSBuildHelper TaskAction="StringToItemCol" ItemString="$(Projects)" Separator=";">
      <Output TaskParameter="OutputItems" ItemName="ProjectNamesToBuild"/>
    </MSBuild.ExtensionPack.Framework.MSBuildHelper>
 
    <!-- Build the Projects item group by looking up the project names in the ProjectDefinitions item group -->
    <FindInList CaseSensitive="false" List="@(ProjectDefinitions)" ItemSpecToFind="%(ProjectNamesToBuild.Identity)">
      <Output TaskParameter="ItemFound" ItemName="Projects"/>
    </FindInList>
  </Target>
</Project>

Projects to build: First;Third (arguments: 1;3)

By default, this example decides to build the first and third projects. However, passing /p:Projects="First;Second" will change to the first and second projects (shown below). The metadata is preserved, as shown by displaying the arguments.

Projects to build: First;Second (arguments: 1;2)

Using Socket as a Server (Listening) Socket

(This post is part of the TCP/IP .NET Sockets FAQ)

Normally, server sockets may accept multiple client connections. Conceptually, a server socket listens on a known port. When an incoming connection arrives, the listening socket creates a new socket (the "child" socket) bound to a local port, and establishes the connection on the child socket. The listening socket is then free to resume listening on the same port, while the child socket has an established connection with the client that is independent from its parent.

One result of this architecture is that the listening socket never actually performs a read or write operation. It is only used to create connected sockets.

The listening socket usually proceeds through the operations below.

  1. Construct. Socket construction is identical for all TCP/IP sockets; see Socket Operations for details.
  2. Bind. Binding for listening sockets is usually done only on the port, setting the IP address parameter to IPAddress.Any (MSDN). A Bind failure is usually due to another process already bound to that port (possibly another instance of the server process).
  3. Listen. The listening socket actually begins listening at this point. It is not yet accepting connections, but the OS may accept connections on its behalf.
    The confusing "backlog" parameter. The "backlog" parameter to Socket.Listen is how many connections the OS may accept on behalf of the application. This is not the total number of active connections; it is only how many connections will be established if the application "gets behind". Once connections are Accepted, they move out of the backlog queue and no longer "count" against the backlog limit.
    The value to pass for the "backlog" parameter. Historically, this has been restricted to a maximum of 5, though modern systems have a cap of 200. Specifying a backlog higher than the maximum is not considered an error; the maximum value is used instead. The .NET docs fail to mention that int.MaxValue can be used to invoke the "dynamic backlog" feature (Windows Server systems only), essentially leaving it up to the OS. It is tempting to set this value very high (e.g., always passing int.MaxValue), but this would hurt system performance (on non-server machines) by pre-allocating a large amount of scarce resources. This value should be set to a reasonable amount (usually between 2 and 5), based on how many connections one is realistically expecting and how quickly they can be Accepted.
  4. (repeat) Accept. When a socket connection is accepted by the listening socket, a new socket connection is created. The listening socket should continue listening on the same port by re-starting the Accept operation as soon as it completes. The result of a completed Accept operation is a new, connected socket that has already bound to a local port. This new socket may be used for reading and writing. For more information on using connected sockets, see Using Socket as a Connected Socket. The new socket is completely independent from the listening socket; closing either socket does not affect the other socket.
  5. Close. Since the listening socket is never actually connected (it only accepts connected sockets), there is no Disconnect operation. Rather, closing a listening socket simply informs the OS that the socket is no longer listening and frees those resources immediately.
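
The operations above can be sketched in a few lines of C#. This is only a minimal, synchronous illustration, not a production server: the class and method names are made up for this example, the listener binds to the loopback address with an OS-chosen port (instead of IPAddress.Any and a known port) so that it can run on its own, and a stand-in client is created in the same method so that Accept has something to return.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

public static class ListeningSocketSketch
{
    // Walks once through Construct, Bind, Listen, Accept, and Close,
    // returning the text sent by the stand-in client.
    public static string AcceptOneConnection()
    {
        // 1. Construct.
        Socket listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        try
        {
            // 2. Bind. A real server would normally use IPAddress.Any and a known port;
            //    loopback and port 0 (OS-chosen) are assumptions that keep the demo self-contained.
            listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));

            // 3. Listen. The OS may now establish up to "backlog" connections on our behalf.
            listener.Listen(2);
            int port = ((IPEndPoint)listener.LocalEndPoint).Port;

            // Stand-in client; its connection waits in the backlog until it is Accepted.
            using (Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
            {
                client.Connect(new IPEndPoint(IPAddress.Loopback, port));
                client.Send(Encoding.ASCII.GetBytes("hello"));
                client.Shutdown(SocketShutdown.Send); // no more data from the client

                // 4. Accept. The result is a new, connected socket; a real server
                //    would re-start the Accept as soon as it completes.
                using (Socket child = listener.Accept())
                {
                    StringBuilder text = new StringBuilder();
                    byte[] buffer = new byte[16];
                    int received;
                    while ((received = child.Receive(buffer)) > 0)
                        text.Append(Encoding.ASCII.GetString(buffer, 0, received));
                    return text.ToString();
                }
            }
        }
        finally
        {
            // 5. Close. There is no Disconnect for a listening socket.
            listener.Close();
        }
    }

    public static void Main()
    {
        Console.WriteLine(AcceptOneConnection());
    }
}
```

Note that the client is able to connect and send before Accept is ever called; that is the backlog queue at work.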

There are a few common variations on the above theme:

  1. A listening socket may choose to bind to an actual IP address in addition to a port. This is normally done for security reasons. If this is done, then the Bind operation may fail if the network cable is unplugged or the wireless router is down.
  2. A listening socket may choose not to bind (actually, the socket is still bound; it is just bound to an OS-chosen port). This is extremely rare, and only found in very old protocols such as non-PASV FTP. This requires an application protocol that can notify the other side of the port that the OS chose to bind, and this tight coupling of the application protocol (e.g., FTP) with the transport mechanism (e.g., TCP) is not recommended. One reason is that it requires any NAT'ing (network address translating) devices to monitor the protocol and dynamically predict the necessary port forwarding.


2009-05-23

On a Lighter Note: Airport Error

Being the fan of error messages that I am, I have of course seen pictures of bluescreens in airports. I even saw one myself, but that was several years ago (before I had my digital camera, sadly).

So, I was thrilled when I saw an error message on our way back from Tennessee! It's not quite a bluescreen, but does raise some vendor concerns.

The text is difficult to see without zooming in, so here it is: the failing application is "radpinit.exe"; caption: "DLL Initialization Failed"; message: "The application failed to initialize because the window station is shutting down." I think we've all seen that one before. :)

However, we haven't all seen it from a component of Hewlett Packard's Business Technology Optimization platform. Supposedly, "radpinit" is in charge of automated software deployment and updates.

It's been many, many years since I've bought a computer and not wiped the entire hard drive within the first 24 hours. Sorry, but OEM software is just buggy to the hilt, and usually ad-ridden to boot. The usual procedure is to create the recovery disks, back up all the drivers (yes, I have bought computers where the recovery disks did not include the correct drivers!), copy all license keys, and then wipe everything.

INI File Reader in C#

Most .NET applications do not need access to the old INI file format, so Microsoft decided not to include it in the .NET framework. Multiple other options are available, from .config files to the Registry. However, there are a handful of situations where an old INI file must be read.

I wrote this class while doing some test development on an ini2reg-style program (a program that would read an existing INI file and then write appropriate Registry entries so that the information is read out of the registry instead; see MSDN: GetPrivateProfileString, MSDN: INF Ini2Reg, and MSDN: NT Resource Kit, Chapter 26).

Note that this is not a particularly well-designed class; I wrote it quickly. It's posted here as an example of moderately difficult interop; specifically, how to read a multi-string value, where "multi-string" means a single buffer that is double-null-terminated and may contain embedded (single) nulls.
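
To make the multi-string layout concrete, here is a small managed-only sketch that splits such a buffer into its individual strings. It is separate from the class below (which takes a shortcut with String.Split), and the names MultiStringSketch and Parse are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class MultiStringSketch
{
    // Splits a double-null-terminated buffer such as "a\0b\0\0" into { "a", "b" }.
    // Anything after the double null (or after the last single null) is ignored.
    public static string[] Parse(char[] buffer)
    {
        List<string> result = new List<string>();
        int start = 0;
        for (int i = 0; i < buffer.Length; ++i)
        {
            if (buffer[i] != '\0')
                continue;

            // A null found exactly where a string would start is the second
            // null of the terminator: the list is complete.
            if (i == start)
                break;

            result.Add(new string(buffer, start, i - start));
            start = i + 1;
        }
        return result.ToArray();
    }

    public static void Main()
    {
        char[] buffer = "First\0Second\0\0leftover".ToCharArray();
        foreach (string value in Parse(buffer))
            Console.WriteLine(value); // prints "First", then "Second"
    }
}
```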

using System.Runtime.InteropServices;
using System.Security;

public sealed class IniReader
{
    [DllImport("kernel32.dll", EntryPoint="GetPrivateProfileStringW", CharSet=CharSet.Unicode, ExactSpelling=true, SetLastError=true), SuppressUnmanagedCodeSecurity]
    private static extern int GetPrivateProfileString(string lpAppName, string lpKeyName, string lpDefault,
        [MarshalAs(UnmanagedType.LPArray, SizeParamIndex=4)] char[] lpReturnedString, int nSize, string lpFileName);
 
    private static string GetPrivateProfileString(string fileName, string sectionName, string keyName)
    {
        char[] ret = new char[256];
 
        while (true)
        {
            int length = GetPrivateProfileString(sectionName, keyName, null, ret, ret.Length, fileName);
            if (length == 0)
                Marshal.ThrowExceptionForHR(Marshal.GetHRForLastWin32Error());
 
            // This function behaves differently if both sectionName and keyName are null
            if (sectionName != null && keyName != null)
            {
                if (length == ret.Length - 1)
                {
                    // Double the buffer size and call again
                    ret = new char[ret.Length * 2];
                }
                else
                {
                    // Return simple string
                    return new string(ret, 0, length);
                }
            }
            else
            {
                if (length == ret.Length - 2)
                {
                    // Double the buffer size and call again
                    ret = new char[ret.Length * 2];
                }
                else
                {
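                    // Return multistring, dropping the last string's terminating null;
                    // an empty result (length == 0) has no terminator to drop
                    return new string(ret, 0, length > 0 ? length - 1 : 0);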
                }
            }
        }
    }
 
    public static string[] SectionNames(string fileName)
    {
        return GetPrivateProfileString(fileName, null, null).Split('\0');
    }
 
    public static string[] KeyNames(string fileName, string sectionName)
    {
        return GetPrivateProfileString(fileName, sectionName, null).Split('\0');
    }
 
    public static string Value(string fileName, string sectionName, string keyName)
    {
        return GetPrivateProfileString(fileName, sectionName, keyName);
    }
}

Using Socket as a Client Socket

(This post is part of the TCP/IP .NET Sockets FAQ)

A client socket connects to a known server socket. To do so, it usually proceeds through the operations below.

  1. Construct. Socket construction is identical for all TCP/IP sockets; see Socket Operations for details.
  2. Connect. The client socket connects to the server socket by specifying the server's IP address and port. If the connection fails, the application may choose to notify the user and/or retry the connection (usually after a timeout), as appropriate; see error handling for details. Once the connection completes, the client socket is a connected socket.
  3. (repeat) Read and Write. Asynchronous sockets normally have an active read at all times, and write as necessary; synchronous sockets must choose whether to read or write based on the application protocol. See Using Socket as a Connected Socket for details.
  4. Close. Closing the socket releases the OS resources. By default, this will perform a graceful disconnect from the server in the background.
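
The operations above can be sketched as follows. This is a minimal synchronous illustration rather than a recommended design: the names ClientSocketSketch, SendAndReceive, and RunDemo are made up for this example, and RunDemo starts a throwaway echo server on the loopback address purely so that the client has something to connect to.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

public static class ClientSocketSketch
{
    // Connects to the server, writes one message, reads one reply, and closes.
    public static string SendAndReceive(IPEndPoint server, string message)
    {
        // 1. Construct.
        using (Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            // 2. Connect. A SocketException here means the connection failed.
            client.Connect(server);

            // 3. Write, then read (a synchronous socket must pick an order).
            client.Send(Encoding.ASCII.GetBytes(message));
            byte[] buffer = new byte[256];
            int received = client.Receive(buffer);
            return Encoding.ASCII.GetString(buffer, 0, received);
        } // 4. Close: disposing the socket closes it.
    }

    // Demo harness (an assumption of this sketch): a one-shot echo server on loopback.
    public static string RunDemo()
    {
        using (Socket listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
            listener.Listen(1);

            Thread server = new Thread(() =>
            {
                using (Socket child = listener.Accept())
                {
                    byte[] buffer = new byte[256];
                    int received = child.Receive(buffer);
                    child.Send(buffer, received, SocketFlags.None); // echo it back
                }
            });
            server.Start();

            string reply = SendAndReceive((IPEndPoint)listener.LocalEndPoint, "ping");
            server.Join();
            return reply;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RunDemo());
    }
}
```

A real client would also loop on Receive according to its application protocol, since a single read is not guaranteed to return a complete message.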

There are a couple of common variations for the above theme:

  1. A client application may wish to know when the disconnect from the server has completed. One reason for this is to prevent a graceful disconnect from getting promoted to an abortive disconnect, which can happen if the client exits shortly after closing its socket. In this case, the client may Disconnect the socket before Closing it, and delay exiting the process until the Disconnect has completed.
  2. A client application may wish to specify the network used for communication. This is normally done for security reasons, but there are other valid reasons to specify a network as well. If a client application wishes to control the network used for the connection, it may Bind before it Connects. In this case, usually only the IP address is specified in the Bind, allowing the OS to choose the port (the port number is set to 0, meaning "any available port"). Bind is normally a server socket operation and is covered in detail in Using Socket as a Server (Listening) Socket.
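
Both variations can be combined in one short sketch: the client Binds to a specific local IP address with port 0 before it Connects, and calls Disconnect before it is closed. The names BoundClientSketch, ConnectFrom, and RunDemo are invented for this example, and the loopback listener in RunDemo exists only so the code can run without a real server.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class BoundClientSketch
{
    // Connects to the server from the given local address and returns
    // the local endpoint (address and OS-chosen port) that was used.
    public static IPEndPoint ConnectFrom(IPAddress localAddress, IPEndPoint server)
    {
        using (Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            // Bind only the IP address; port 0 means "any available port".
            client.Bind(new IPEndPoint(localAddress, 0));
            client.Connect(server);
            IPEndPoint local = (IPEndPoint)client.LocalEndPoint;

            // Disconnect explicitly before the socket is closed (by the using block).
            client.Disconnect(false);
            return local;
        }
    }

    // Demo harness (an assumption of this sketch): a loopback listener stands in for the server.
    public static IPEndPoint RunDemo()
    {
        using (Socket listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
            listener.Listen(1);
            return ConnectFrom(IPAddress.Loopback, (IPEndPoint)listener.LocalEndPoint);
        }
    }

    public static void Main()
    {
        Console.WriteLine("Connected from " + RunDemo());
    }
}
```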


2009-05-18

WiX Version Lying, Take Two

In an earlier post, I described how a friend of mine solved an installer update problem by version lying about a data file. Just yesterday they ran into a similar but more sinister problem.

The Problem, In a Nutshell

One of their libraries depended on an ocx control. This was added as a COM reference to the library. They built and distributed the first version of this software, being sure to include the "COM Interop" DLL that is automatically created by the compiler. They followed the Windows Installer component guidelines and made the interop dll and the dependent library separate components. It worked; no problems.

As part of a company initiative to sign their distributed code, they strong-named their library for the next release. Since strong-named assemblies may only load other strong-named assemblies, the compiler automatically strong-named its COM interop DLL on the next build.

The problem: The new (signed) COM interop DLL had the same version as the old (unsigned) COM interop DLL. The company realized while testing their release that if they upgraded a previous installation instead of performing a clean install, then their library would fail to load.

Of course, this is because during upgrades, Windows Installer will examine the file version, see that there are no differences, and skip installing that file. It ignores the last-modified time. The end result is that after an upgrade, the newer (signed) library was trying to load the new (signed) COM interop DLL, but only the old (unsigned) COM interop DLL was present.

Attempted Solutions

Set File.DefaultVersion. This is the old "version lying" trick, which worked before. However, WiX will always ignore File.DefaultVersion as long as the file actually has version information (and in this case it does; the compiler always adds version information to the COM interop DLL). WiX can be instructed to ignore all file version information, but this would require major (and ugly) changes to the installer files - essentially, they would have to do version lying on every single file in the msi. They decided this was not an acceptable option.

Mess with COMReference. The (poorly-documented) COMReference element in the MSBuild file does have VersionMajor and VersionMinor child elements, which do control the version of the COM interop DLL. Unfortunately, they also control the version of the COM/ActiveX object that is loaded. If they are set to anything other than the correct COM object version, the build fails; so, they cannot be used to set the COM interop DLL version.

Write an installer transform. Windows Installer does support the notion of "installer transforms", which are databases of changes to the installer database. An installer transform could do the version lying. That way, MSBuild would create the COM interop DLL, WiX would place it into the installer database along with its version information, and then the installer transform would overwrite the version information in the database. This is the cleanest solution, but ended up getting rejected due to lack of experience with installer transforms.

The Accepted Solution

A few days ago, I posted an interesting message I found in a Windows executable while testing out a Win32 resource manager. I talked my friend into letting me have a crack at their problem.

With a few modifications of this very, very pre-release code, I created a small console program that could change the version number of a PE/PE+ file ("PE/PE+" means EXEs, DLLs, OCXs, etc., either x86/AnyCPU or x64). They included it into their build process, and it worked quite nicely.

Note: what they ended up doing is unconventional and not recommended, but it is a useful workaround for WiX upgrade scenarios. They do plan to change this in a future version of the installer.

This works because Windows Installer only considers the version information on disk (for the original file) and in the installer database (for the updated file). In contrast, the .NET loader verifies the strong name signature against the .NET assembly version (AssemblyVersionAttribute) and ignores the file version (AssemblyFileVersionAttribute).
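
The two version numbers can be inspected side by side with a few lines of C#. This sketch is not the version-changing tool described above; it only reads the values, and the class and method names are made up for the illustration.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class VersionSketch
{
    // The Win32 file version stored in the file's version resource
    // (the kind of version information Windows Installer compares).
    public static string FileVersion(string path)
    {
        return FileVersionInfo.GetVersionInfo(path).FileVersion;
    }

    // The .NET assembly version stored in the assembly's metadata.
    public static Version AssemblyVersion(string path)
    {
        return AssemblyName.GetAssemblyName(path).Version;
    }

    public static void Main(string[] args)
    {
        // Inspect the file named on the command line, or the core library by default.
        string path = args.Length > 0 ? args[0] : typeof(object).Assembly.Location;
        Console.WriteLine("File:             " + path);
        Console.WriteLine("File version:     " + FileVersion(path));
        Console.WriteLine("Assembly version: " + AssemblyVersion(path));
    }
}
```

Running this against a file before and after changing its file version should show the first number change while the second stays the same.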

On a Lighter Note: Internet Explorer / MSDN Forums Error Message

The other day (on May 2nd, to be exact), I was using Internet Explorer (version 8, no less) to answer some questions on the MSDN forums site. Suddenly, I was confronted with this "error non-message".

Generally speaking, this site works very well... as long as you've got a high-speed Internet connection that never gets interrupted. However, if you're trying to do a download in the background on a wireless connection to a shared Internet connection that is the slowest of the "high-speeds", then the MSDN forums pages begin to show their quirks.

I find error messages - especially bad ones - to be interesting. I especially like the warning icon; it just brings real meaning to this screenshot. :)

Sample Code: Getting the Local IP Addresses

(This post is part of the TCP/IP .NET Sockets FAQ)

The sample code below enumerates all the adapters on a machine, and then enumerates all IPv4 addresses for each adapter. This is necessary because a computer may have multiple IP addresses.

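// Requires the System.Net, System.Net.NetworkInformation, System.Net.Sockets,
// System.Text, and System.Windows.Forms namespaces.
/// <summary>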
/// This utility function displays all the IP (v4, not v6) addresses of the local computer.
/// </summary>
public static void DisplayIPAddresses()
{
    StringBuilder sb = new StringBuilder();
 
    // Get a list of all network interfaces (usually one per network card, dialup, and VPN connection)
    NetworkInterface[] networkInterfaces = NetworkInterface.GetAllNetworkInterfaces();
 
    foreach (NetworkInterface network in networkInterfaces)
    {
        // Read the IP configuration for each network
        IPInterfaceProperties properties = network.GetIPProperties();
 
        // Each network interface may have multiple IP addresses
        foreach (IPAddressInformation address in properties.UnicastAddresses)
        {
            // We're only interested in IPv4 addresses for now
            if (address.Address.AddressFamily != AddressFamily.InterNetwork)
                continue;
 
            // Ignore loopback addresses (e.g., 127.0.0.1)
            if (IPAddress.IsLoopback(address.Address))
                continue;
 
            sb.AppendLine(address.Address.ToString() + " (" + network.Name + ")");
        }
    }
 
    MessageBox.Show(sb.ToString());
}


Getting the Local IP Address

(This post is part of the TCP/IP .NET Sockets FAQ)

One common question is how to get the local IP address of the computer. The question itself is wrong: a computer may easily have multiple IP addresses. A computer may have multiple network adapters, each of which may have multiple addresses. A single network card may have multiple IP addresses as long as they are on separate logical networks; this is called "multihoming" and is sometimes done for security reasons. Of course, a computer may have multiple network cards as well, especially when one considers virtual networks.

True (but boring) story: my laptop (on which I am writing this) currently has seven network "adapters": one physical, one wireless, one dialup, two VPN, and two for virtual machine networks. This is not including the Teredo virtual adapter, and others may also install the loopback adapter, which is commonly seen on testing machines.

The moral of this (short) FAQ entry? Never, ever assume that a computer only has one IP address. A lot of sample code for retrieving the local IP address makes this faulty assumption. In contrast, the code in "Sample Code: Getting the Local IP Addresses" displays a list of IP addresses.


2009-05-16

On a Lighter Note: Interesting Message in AutoChk

A lot of my recent blog entries for the last couple weeks have been almost articles, and I'm a bit tired. Writing is hard work!

So, this blog post is a bit in a "lighter" vein, just a little curiosity I found mucking about.

I'm doing some testing on a Win32 resource manager written purely in managed code. One of the tests is to load various resources from operating system files, and make sure everything looks "right". (Don't get me started on how undocumented some of this stuff is...)

Anyway, an unexpected message was found in Vista x64's "autochk.exe" (that thing that runs at bootup if your hard drive needs repairing). It's in the MessageTable resource, with message ID 0x427:

This never gets printed.

How interesting.

Future interesting blog posts will doubtless follow.

Detection of Half-Open (Dropped) Connections

(This post is part of the TCP/IP .NET Sockets FAQ)

There is a three-way handshake to open a TCP/IP connection, and a four-way handshake to close it. However, once the connection has been established, if neither side sends any data, then no packets are sent over the connection. TCP is an "idle" protocol, happy to assume that the connection is active until proven otherwise.

TCP was designed this way for resiliency and efficiency. This design enables a graceful recovery from unplugged network cables and router crashes. For example, a client may connect to a server, an intermediate router may be rebooted, and after the router comes back up, the original connection still exists (provided that no data was sent across the connection while the router was down). This design is also efficient, since no "polling" packets are sent across the network just to check whether the connection is still OK, which reduces unnecessary network traffic.

TCP does have acknowledgments for data, so when one side sends data to the other side, it will receive an acknowledgment if the connection is still active (or an error if it is not). Thus, broken connections can be detected by sending out data. It is important to note that the act of receiving data is completely passive in TCP; a socket that only reads cannot detect a dropped connection.

This leads to a scenario known as a "half-open connection". At any given point in most protocols, one side is expected to send a message and the other side is expecting to receive it. Consider what happens if an intermediate router is suddenly rebooted at that point: the receiving side will continue waiting for the message to arrive; the sending side will send its data, and receive an error indicating the connection was lost. Since broken connections can only be detected by sending data, the receiving side will wait forever. This scenario is called a "half-open connection" because one side realizes the connection was lost but the other side believes it is still active.

Terminology alert: "half-open" is completely different from "half-closed". A half-closed connection is one where one side has performed a Shutdown operation on its socket, shutting down only the sending (outgoing) stream. See Socket Operations for more details on the Shutdown operation.

Causes of Half-Open Connections

Half-open connections are in that annoying list of problems that one seldom sees in a test environment but that commonly happen in the real world. This is because if the socket is shut down with the normal four-way handshake (or even if it is abruptly closed), the half-open problem will not occur. Some of the common causes of a half-open connection are described below:

  • Process crash. If a process shuts down normally, it usually sends out a "FIN" packet, which informs the other side that the connection has been lost. However, if a process crashes or is terminated (e.g., from Task Manager), this is not guaranteed. It is possible that the OS will send out a "FIN" packet on behalf of a crashed process; however, this is up to the OS.
  • Computer crash. If the entire computer (including the OS) crashes or loses power, then there is certainly no notification to the other side that the connection has been lost.
  • Router crash/reboot. Any of the routers along the route from one side to the other may also crash or be rebooted; this causes a loss of connection if data is being sent at that time. If no data is being sent at that exact time, then the connection is not lost.
  • Network cable unplugged. Any network cables unplugged along the route from one side to the other will cause a loss of connection without any notification. This is similar to the router case; if there is no data being transferred, then the connection is not actually lost. However, computers usually will detect if their specific network cable is unplugged and may notify their local sockets that the network was lost (the remote side will not be notified).
  • Wireless devices (including laptops) moving out of range. A wireless device that moves out of its access point range will lose its connection. This is an often-overlooked but increasingly common situation.

In all of the situations above, it is possible that one side may be aware of the loss of connection, while the other side is not.

Is Explicit Detection Necessary?

There are some situations in which detection is not necessary. A "poll" system (as opposed to a "subscription/event" system) already has a timer built in (the poll timer), and sends data across the connection regularly. So the polling side does not need to explicitly check for connection loss.

The necessity of detection must be considered separately for each side of the communication. For example, if the protocol is based on a polling scheme, then the side doing the polling does not need explicit keepalive handling, but the side responding to the polling likely does need explicit keepalive handling.

True Story: I once had to write software to control a serial device that operated through a "bridge" device that exposed the serial port over TCP/IP. The company that developed the bridge implemented a simple protocol: they listened for a single TCP/IP connection (from anywhere), and - once the connection was established - sent any data received from the TCP/IP connection to the serial port, and any data received from the serial port to the TCP/IP connection. Of course, they only allowed one TCP/IP connection (otherwise, there could be contention over the serial port), so other connections were refused as long as there was an established connection.

The problem? No keepalives. If the bridge ever ended up in a half-open situation, it would never recover; any connection requests would be rejected because the bridge would believe the original connection was still active. Usually, the bridge was deployed to a stationary device on a physical network; presumably, if the device ever stopped working, someone would walk over and perform a power cycle. However, we were deploying the bridge onto mobile devices on a wireless network, and it was normal for our devices to pass out of and back into access point coverage. Furthermore, this was part of an automated system, and people weren't near the devices to perform a power cycle. Of course, the bridge failed during our prototyping; when we brought the root cause to the other company's attention, they were unable to implement a keepalive (the embedded TCP/IP stack didn't support it), so they worked with us in developing a method of remotely resetting the bridge.

It's important to note that we did have keepalive testing on our side of the connection (via a timer), but this was insufficient. It is necessary to have keepalive testing on both sides of the connection.

This bridge was in full production, and had been for some time. The company that made this error was a billion-dollar global corporation centered around networking products. The company I worked for had four programmers at the time. This just goes to show that even the big guys can make mistakes.

Wrong Methods to Detect Dropped Connections

There are a couple of wrong methods to detect dropped connections. Beginning socket programmers often come up with these incorrect solutions to the half-open problem. They are listed here only for reference, along with a brief description of why they are wrong.

  • A second socket connection. A new socket connection cannot determine the validity of an existing connection in all cases. In particular, if the remote side has crashed and rebooted, then a second connection attempt will succeed even though the original connection is in a half-open state.
  • Ping. Sending a ping (ICMP) to the remote side has the same problem: it may succeed even when the connection is unusable. Furthermore, ICMP traffic is often treated differently than TCP traffic by routers.

Correct Methods to Detect Dropped Connections

There are several correct solutions to the half-open problem. Each one has its pros and cons, depending on the problem domain. This list is in order from best solution to worst solution (IMO):

  1. Add a keepalive message to the application protocol framing (an empty message). Length-prefixed and delimited systems may send empty messages (e.g., a length prefix of "0 bytes" or a single "end delimiter").
    Advantages. The higher-level protocol (the actual messages) is not affected.
    Disadvantages. This requires a change to the software on both sides of the connection, so it may not be an option if the application protocol is already specified and immutable.
  2. Add a keepalive message to the actual application protocol (a "null" message). This adds a new message to the application protocol: a "null" message that should just be ignored.
    Advantages. This may be used if the application protocol uses a non-uniform message framing system. In this case, the first solution could not be used.
    Disadvantages. (Same as the first solution) This requires a change to the software on both sides of the connection, so it may not be an option if the application protocol is already specified and immutable.
  3. Explicit timer assuming the worst. Have a timer and assume that the connection has been dropped when the timer expires (of course, the timer is reset each time data is transferred). This is the way HTTP servers work, if they support persistent connections.
    Advantages. Does not require changes to the application protocol; in situations where the code on the remote side cannot be changed, the first two solutions cannot be used. Furthermore, this solution causes less network traffic; it is the only solution that does not involve sending out keepalive (i.e., "are you still there?") packets.
    Disadvantages. Depending on the protocol, this may cause a high number of valid connections to be dropped.
  4. Manipulate the TCP/IP keepalive packet settings. This is a highly controversial solution that has complex arguments for both pros and cons. It is discussed in depth in Stevens' book, chapter 23. Essentially, this instructs the TCP/IP stack to send keepalive packets periodically on the application's behalf. There are two ways that this can be done:
    1. Set SocketOptionName.KeepAlive. The MSDN documentation isn't clear that this uses a 2-hour timeout, which is too long for most applications. This can be changed (system-wide) through a registry key, but changing this system-wide (i.e., for all other applications) is greatly frowned upon. This is the old-fashioned way to enable keepalive packets.
    2. Set per-connection keepalives. Keepalive parameters can be set per-connection only on Windows 2000 and newer, not the old 9x line. This has to be done by issuing I/O control codes to the socket: pass IOControlCode.KeepAliveValues along with a structure to Socket.IOControl; the necessary structure is not covered by the .NET documentation but is described in the unmanaged documentation for WSAIoctl (SIO_KEEPALIVE_VALS).
    Advantages. Once the code to set the keepalive parameters is working, there is nothing else that the application needs to change. The other solutions all have timer events that the application must respond to; this one is "set and forget".
    Disadvantages. RFC 1122, section 4.2.3.6 indicates that acknowledgements for TCP keepalives without data may not be transmitted reliably by routers; this may cause valid connections to be dropped. Furthermore, TCP/IP stacks are not required to support keepalives at all (and many embedded stacks do not), so this solution may not translate to other platforms.
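
To make the first solution concrete, here is a minimal sketch of a length-prefixed framing layer that treats an empty message as a keepalive. It is written in Python purely for illustration (it is not the .NET code), and it assumes a 4-byte big-endian length prefix; the names encode_frame, FrameDecoder, and KEEPALIVE_FRAME are made up for this example. It also assumes the application protocol never needs a real zero-length message, since that would be indistinguishable from a keepalive.

```python
import struct

# Assumption: every message is prefixed with a 4-byte big-endian length.
LENGTH_PREFIX = struct.Struct(">I")

# A keepalive is simply a frame whose length prefix is 0 (no payload).
KEEPALIVE_FRAME = LENGTH_PREFIX.pack(0)


def encode_frame(payload):
    """Wrap an application message in a length prefix."""
    return LENGTH_PREFIX.pack(len(payload)) + payload


class FrameDecoder:
    """Reassembles frames from a byte stream and hides keepalives from the caller."""

    def __init__(self):
        self._buffer = bytearray()
        self.keepalives_seen = 0

    def feed(self, data):
        """Add received bytes; return the complete application messages so far."""
        self._buffer.extend(data)
        messages = []
        while len(self._buffer) >= LENGTH_PREFIX.size:
            (length,) = LENGTH_PREFIX.unpack_from(self._buffer, 0)
            end = LENGTH_PREFIX.size + length
            if len(self._buffer) < end:
                break  # the rest of this frame has not arrived yet
            payload = bytes(self._buffer[LENGTH_PREFIX.size:end])
            del self._buffer[:end]
            if length == 0:
                # Empty frame: a keepalive, never passed up to the application.
                self.keepalives_seen += 1
                continue
            messages.append(payload)
        return messages
```

The sender writes KEEPALIVE_FRAME whenever its keepalive timer fires. If the connection is half-open, that write (or a later one) fails, which is exactly the detection we are after; the receiver's message handling never sees the empty frames.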

Each side of the application protocol may employ different keepalive solutions, and even different keepalive solutions at different states in the protocol. For example, the client side of a request/response style protocol may choose to send "null" requests when there is not a request pending, and switch to a timeout solution while waiting for a response.
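
The timeout solution needs very little bookkeeping. The sketch below (Python, for illustration only) shows an idle timer that is reset whenever data is transferred and reports when the connection should be assumed dropped. The IdleTimer name and the injectable clock parameter are inventions for this example; the clock is only there so the timer can be exercised without actually waiting.

```python
import time


class IdleTimer:
    """Tracks how long a connection has been idle ("explicit timer assuming the worst")."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_activity = clock()

    def reset(self):
        """Call this every time data is sent or received on the connection."""
        self._last_activity = self._clock()

    def expired(self):
        """True once no data has been transferred for the whole timeout period."""
        return self._clock() - self._last_activity >= self._timeout
```

A client in the "waiting for a response" state would call reset() when it sends the request and close the socket if expired() becomes true before the response arrives.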

However, when designing a new protocol, it is best to employ one of the solutions consistently.
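
For reference, the structure used by the per-connection TCP keepalive option (SIO_KEEPALIVE_VALS, from the fourth solution above) is small: the unmanaged tcp_keepalive structure is three 32-bit unsigned values (on/off, keepalive time, keepalive interval), with both times in milliseconds. The sketch below uses Python as a stand-in to show that layout and the equivalent socket calls. The helper names and the 10-second / 1-second values are only illustrative assumptions, not recommendations.

```python
import socket
import struct


def pack_keepalive_vals(enabled, time_ms, interval_ms):
    """Build the 12-byte tcp_keepalive buffer: onoff, keepalivetime, keepaliveinterval.

    Each field is a 32-bit unsigned integer; little-endian is assumed here
    because that is the byte order Windows runs with.
    """
    return struct.pack("<III", 1 if enabled else 0, time_ms, interval_ms)


def enable_keepalive(sock, time_ms=10000, interval_ms=1000):
    """Turn on TCP keepalives; tune the timing per connection where supported."""
    # The old-fashioned switch: uses the system-wide keepalive timing.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-connection timing is Windows-only in Python; skipped elsewhere.
    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        sock.ioctl(socket.SIO_KEEPALIVE_VALS, (1, time_ms, interval_ms))
```

In .NET, a byte array with the same 12-byte layout is what gets passed as the input buffer to Socket.IOControl along with IOControlCode.KeepAliveValues.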

(This post is part of the TCP/IP .NET Sockets FAQ)

2009-05-14

Error Handling

(This post is part of the TCP/IP .NET Sockets FAQ)

Generally speaking, one should expect any socket operation to have a possibility of failure. Even the immediate operations (see Socket Operations) may fail. A socket operation error is uniquely identified by its error code (MSDN: Windows Sockets Error Codes).

Some methods (such as Socket.EndReceive) can report an error in one of two ways, depending on the overload. Overloads with a SocketError out parameter store the error from the operation (if any) in that parameter. Overloads without a SocketError out parameter raise a SocketException with its ErrorCode set to the SocketError value. The SocketError overloads were added purely for performance reasons, and are not necessary for the vast majority of socket applications.
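
The same "error code or exception" split exists in other socket APIs. As a rough analogy (in Python, not .NET), socket.connect raises an exception on failure while socket.connect_ex returns the error code instead. The sketch below wraps both styles. The helper names are made up for this example, and closed_loopback_port is just a trick to find a local port that nothing is listening on.

```python
import socket


def closed_loopback_port():
    """Find a loopback port with no listener: bind to an ephemeral port, then release it."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]
    finally:
        probe.close()


def connect_reporting_code(address):
    """Error-code style: returns (socket, 0) on success or (None, error_code) on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    code = sock.connect_ex(address)
    if code != 0:
        sock.close()  # a failed socket is closed, never reused
        return None, code
    return sock, 0


def connect_raising(address):
    """Exception style: returns the socket on success or raises OSError on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect(address)
    except OSError:
        sock.close()
        raise
    return sock
```

Both helpers report the same underlying error code; the only difference is how it reaches the caller.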

Response to Errors

A connected socket should be immediately closed when any Read, Write, or Disconnect operation error is detected. Socket errors usually indicate a problem with the underlying connection (or possibly the network itself), and the socket should be considered unstable and be closed.

Closing a socket almost never raises an exception. Only the "fatal" exceptions (OutOfMemory, StackOverflow, ThreadAbort, and possibly others in future CLR versions) can ever be raised from Socket.Close. This makes it safe to call without requiring a try/catchall.

Bind and Connect failures are not uncommon. Depending on the application, one may either inform the user and exit, or retry at a later time (see below).

Listen or Shutdown failures are extremely rare but still possible. These failures may indicate a shortage of OS resources. For the Listen operation, consider notifying the user and then exiting; alternatively, close the listening socket and retry at a later time (see below). For the Shutdown operation, close the socket.

Accept operations may also fail (though this may be surprising to some). In this case, the server should simply continue accepting new connections. This may be caused by a client socket program unexpectedly exiting.
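
A minimal sketch of an accept loop that survives a failed Accept is shown here (Python, for illustration only). The accept_loop function and its handle_connection, should_stop, and log parameters are hypothetical names for this example; listener can be any object with a socket-style accept() method.

```python
def accept_loop(listener, handle_connection, should_stop, log=print):
    """Keep accepting connections; one failed accept must not stop the server."""
    while not should_stop():
        try:
            connection, remote_address = listener.accept()
        except OSError as exc:
            # The client probably went away before the accept completed.
            # The listening socket itself is still fine, so just keep going.
            log("accept failed (%s); still listening" % exc)
            continue
        handle_connection(connection, remote_address)
```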

Retry Timers

It is important not to retry socket operations immediately, since not all errors are the result of network communication. Even the Connect operation may fail immediately if the network cable is unplugged. Retrying socket operations immediately may result in high CPU usage or an exhaustion of OS socket resources.

A long-running server (or client) program should have a built-in automatic "retry timer". When any error is detected, the socket should be closed and the retry timer should be started. When the retry timer goes off, then the operation may be attempted again. The timer does not have to be very long: usually 1 second will suffice.

There are only a couple of socket errors that may skip the retry timer and immediately retry: SocketError.TimedOut (WSAETIMEDOUT/10060) and SocketError.ConnectionRefused (WSAECONNREFUSED/10061). Both of these error codes indicate that an actual network timeout (WSAETIMEDOUT) or network round-trip (WSAECONNREFUSED) has taken place, so a further "retry timeout" is unnecessary.
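
The retry rules above can be sketched as follows (Python, for illustration only). The function names, the injectable sleep parameter, and the max_attempts limit are assumptions made for this example; a long-running program would typically retry indefinitely. The connect_once callable is expected to close its own socket before raising, as described above.

```python
import errno
import time

# These errors already cost a network timeout or a round trip, so no extra wait.
IMMEDIATE_RETRY_ERRORS = {errno.ETIMEDOUT, errno.ECONNREFUSED}
RETRY_DELAY_SECONDS = 1.0  # "usually 1 second will suffice"


def retry_delay(error_code):
    """How long to wait before retrying after the given socket error."""
    return 0.0 if error_code in IMMEDIATE_RETRY_ERRORS else RETRY_DELAY_SECONDS


def connect_with_retry(connect_once, max_attempts, sleep=time.sleep):
    """Call connect_once() until it succeeds, waiting between failed attempts.

    connect_once must return the connected object, or close its socket and
    raise OSError. The last failure is re-raised once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return connect_once()
        except OSError as exc:
            if attempt == max_attempts:
                raise
            delay = retry_delay(exc.errno)
            if delay > 0:
                sleep(delay)
```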

Common Errors and Their Causes

There are a lot of possible WinSock errors, but it's not clear from the MSDN documentation which errors are "normal". The most common errors and their most common causes are below.

SocketError.AddressNotAvailable / WSAEADDRNOTAVAIL / 10049 - Indicates a bad or invalid address (e.g., "255.255.255.255").

SocketError.TimedOut / WSAETIMEDOUT / 10060 - This happens when trying to connect to a valid address that doesn't respond (e.g., a powered-off server or intermediate router). This may also be caused by a firewall on the remote side.

SocketError.ConnectionRefused / WSAECONNREFUSED / 10061 - Indicates that the connection request got to a valid address that is powered on, but there is no program listening on that port. Usually, this is an indication that the server software is not running, though the computer is on. This may also be caused by a firewall, though most firewalls drop the packet (causing WSAETIMEDOUT) instead of actively refusing the connection (causing WSAECONNREFUSED).

SocketError.ConnectionReset / WSAECONNRESET / 10054 - The remote side has abortively closed the connection. This is commonly caused by the remote process exiting or the remote computer being shut down. However, some software (especially server software) is written to abortively close connections as a normal practice, since this does reclaim server resources more quickly than a graceful close. Therefore, this is not necessarily indicative of an actual error condition; if the communication was complete (and the socket was about to be closed anyway), then this error should just be ignored.

SocketError.NoBufferSpaceAvailable / WSAENOBUFS / 10055 - Technically this means that the OS has run out of buffer space for a socket. However, it's usually an indicator that the application is trying to use too many temporary ports. This may be caused by a retry rate that is too high (i.e., the retry timer timeout is too short).

Other errors may be seen occasionally, especially when a network is in the process of coming online or going offline (e.g., the computer is in the process of connecting to a wireless network).
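
For logging and diagnostics, the errors above fit naturally into a small lookup table. The sketch below (Python, for illustration only) uses the WinSock codes and names listed above; the describe and is_ignorable_reset helpers are hypothetical names for this example.

```python
# WinSock error code -> (WinSock name, most common cause), as listed above.
COMMON_SOCKET_ERRORS = {
    10049: ("WSAEADDRNOTAVAIL", "bad or invalid address"),
    10054: ("WSAECONNRESET", "remote side abortively closed the connection"),
    10055: ("WSAENOBUFS", "too many temporary ports in use (retry rate too high?)"),
    10060: ("WSAETIMEDOUT", "valid address that does not respond, or a firewall"),
    10061: ("WSAECONNREFUSED", "no program listening on that port"),
}

WSAECONNRESET = 10054


def describe(error_code):
    """Return a log-friendly description of a WinSock error code."""
    name, cause = COMMON_SOCKET_ERRORS.get(error_code, ("unknown", "uncommon error"))
    return "%d (%s): %s" % (error_code, name, cause)


def is_ignorable_reset(error_code, communication_complete):
    """A reset after the conversation is finished is not a real error condition."""
    return error_code == WSAECONNRESET and communication_complete
```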


2009-05-12

Dealing with WiX data files

I am not an installer guru. The story below is how another company overcame one of their installer upgrade difficulties. The solution was found by their installer guru, a friend of mine.

Splitting up an application into components is a pretty straightforward process - usually, resource files are thrown into a directory-wide component. Apparently, the ideal setup for ".config" files is to be in the same component as their .exe, with their CompanionFile set to the .exe, like this: http://wix.mindcapers.com/wiki/Companion_File.

That's nice. Now, what to do if your previous installs didn't do this?

Unfortunately, our situation was even worse. We had the .config file being installed as its own component, with a util:XmlFile modifying the file at the end of the install. This has the unfortunate side effect of the installer sometimes not wanting to update this file (since it's an XML file, it has no version information, and the modification date may be newer since it was modified at install time by the previous installer).

(BTW, for other users of util:XmlFile, there is an attribute PreserveModifiedDate. If our previous installer had set this to "yes", then we wouldn't have had these problems. But it didn't, so the modified date is changed, and we ended up where we are today.)

The solution we adopted is called "version lying". We added a DefaultVersion attribute to the File element to force the new installer to overwrite the old file. Of course, if the end user had changed the .config file after installing, then the new install would blow that away.

WiX doesn't really like version lying a lot: it will give you a warning. However, it works. We are using an environment variable for the build version, and we just set DefaultVersion to "$(env.MY_BUILD_VERSION)". This way the fake version will stay in sync with the final build version.

For our next major update, we're going to use PreserveModifiedDate and either not support automatic upgrades (forcing the user to uninstall the old version first) or upgrade to a different directory. We can then drop the version lying, and our installers will be kosher from then on.

2009-05-05

Socket Operations

(This post is part of the TCP/IP .NET Sockets FAQ)

There are a few logical operations that may be performed on a TCP/IP socket, regardless of whether the socket is synchronous or asynchronous. Each of the operations below is marked "immediate" (meaning it is completed immediately) or "delayed" (meaning it depends on the network for completion).

Constructing (immediate) - TCP/IP sockets use the InterNetwork (for IPv4) or InterNetworkV6 (for IPv6) AddressFamily, the Stream SocketType, and the Tcp ProtocolType.
MSDN links: Socket

Binding (immediate) - A socket may be locally bound. This is normally done only on the server (listening) socket, and is how a server chooses the port it listens on. See Using Socket as a Server (Listening) Socket for details.
MSDN links: Bind

Listening (immediate) - A bound socket notifies the OS that it is almost ready to receive connections by listening. In spite of the term "listening", this operation only notifies the OS that the socket is about to accept connections; it does not actually begin accepting connections, though the OS may accept a connection on behalf of the socket. See Using Socket as a Server (Listening) Socket for details.
MSDN links: Listen

Accepting (delayed) - A listening socket may accept an incoming connection. When an incoming connection is accepted, a new socket is created that is connected to the remote side; the listening socket continues listening. The new socket (which is connected) may be used for sending and receiving. See Using Socket as a Server (Listening) Socket for details.
MSDN links: Accept, BeginAccept, EndAccept, AcceptAsync

Connecting (delayed) - A (client) socket may connect to a (server) socket. TCP has a three-way handshake to complete the connection, so this operation is not instantaneous. Once a socket is connected, it may be used for sending and receiving. See Using Socket as a Client Socket for details.
MSDN links: Connect, BeginConnect, EndConnect, ConnectAsync

Reading (delayed) - Connected sockets may perform a read operation. Reading takes incoming bytes from the stream and copies them into a buffer. A 0-byte read indicates a graceful closure from the remote side. See Using Socket as a Connected Socket for details.
MSDN links: Receive, BeginReceive, EndReceive, ReceiveAsync

Writing (delayed) - Connected sockets may perform a write operation. Writing places bytes in the outgoing stream. A successful write may complete before the remote OS acknowledges that the bytes were received. See Using Socket as a Connected Socket for details.
MSDN links: Send, BeginSend, EndSend, SendAsync

Disconnecting (delayed) - TCP/IP has a four-way handshake to terminate a connection gracefully: each side shuts down its own outgoing stream and receives an acknowledgment from the other side.
MSDN links: Disconnect, BeginDisconnect, EndDisconnect, DisconnectAsync

Shutting down (immediate) - Either the receiving stream or sending stream may be clamped shut. For receives, this is only a local operation; the other end of the connection is not notified. For sends, the outgoing stream is shut down (the same way Disconnect does it), and this is acknowledged by the other side; however, there is no notification of this operation completing.
MSDN links: Shutdown

Closing (immediate or delayed) - The actual socket resources are reclaimed when the socket is disposed (or closed). Normally, this appears to be immediate but is actually delayed: it performs a graceful disconnect in the background and then reclaims the socket resources when the disconnect completes. Socket.LingerState may be set to change Close to be a synchronous disconnect (delayed, but always synchronous), or an immediate shutdown (always immediate).
MSDN links: Close, LingerState
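
To tie the operations together, here is a minimal single-threaded sketch that walks through all of them over the loopback interface. It uses Python's socket module as a stand-in for the .NET Socket class (the operations map one-to-one onto the same underlying calls); the loopback_round_trip name and the "ping"/"pong" payloads are made up for this example. Note that Connect completes before Accept is called: the OS accepts the connection on behalf of the listening socket.

```python
import socket


def read_until_closed(sock):
    """Read until a 0-byte read, which signals a graceful close from the remote side."""
    received = bytearray()
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return bytes(received)
        received.extend(chunk)


def loopback_round_trip():
    # Constructing: IPv4 address family, stream socket type, TCP protocol.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)
    try:
        listener.bind(("127.0.0.1", 0))        # Binding (port 0: let the OS pick a port)
        listener.listen(1)                     # Listening
        client.connect(listener.getsockname()) # Connecting (OS completes the handshake)
        server_side, _ = listener.accept()     # Accepting: a new, connected socket
        try:
            client.sendall(b"ping")            # Writing
            client.shutdown(socket.SHUT_WR)    # Shutting down the sending stream only
            request = read_until_closed(server_side)  # Reading, ends on a 0-byte read
            # The connection is now half-closed: the server can still send.
            server_side.sendall(b"pong")
            server_side.shutdown(socket.SHUT_WR)
            reply = read_until_closed(client)
        finally:
            server_side.close()                # Closing
    finally:
        client.close()
        listener.close()
    return request, reply
```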


2009-05-04

TCP/IP Resources

(This post is part of the TCP/IP .NET Sockets FAQ)

There are two books that any TCP/IP network programmer needs to have. Unfortunately, they were both written well before .NET, so they only deal with unmanaged code - specifically, the WinSock API. However, the .NET Socket class methods directly correspond to WinSock function calls, so knowledge can be gleaned from these books and directly applied to managed code.

  • TCP/IP Illustrated, Volume 1 (The Protocols), by Stevens. Make a copy of pg 241 (the TCP State Transition Diagram), which is one of the most important pages ever printed. A good understanding of Chapter 18 is also important. Note that volume 1 is the only one most people need; volumes 2 and 3 delve into details about implementing TCP/IP stacks and specific (and rare) application protocols. However, volume 2 does have on the inside front cover a copy of the TCP State Transition Diagram updated with timeout events, which is nice to have.
  • Network Programming for Microsoft Windows, by Jones and Ohlund. Chapter 5 has an excellent overview of the various I/O models available, which helps socket programmers understand how the BCL code is using asynchronous calls under the hood. This entire book should be read by TCP/IP programmers.

Note that when reading the unmanaged socket documentation, there are some potentially confusing terms:

  • The terms blocking and nonblocking do not mean the same as the terms synchronous and asynchronous. Nonblocking sockets are a special quasi-asynchronous socket mode that is maintained only for backwards compatibility. Most modern WinSock programs (including .NET programs) use blocking sockets.
  • TCP is a byte stream, connection-oriented protocol. Ignore any remarks specifically for message-based or connectionless protocols; they do not apply to TCP sockets.

There is a command-line utility that comes with Windows named netstat which displays TCP/IP endpoints. Other useful tools from Microsoft (that are not built in to the OS) are TCPView (a GUI version of netstat), Process Explorer (which also displays TCP/IP endpoints for each process), and DbgView (which displays trace statements from the Debug and default TraceSource classes in realtime).
