2011-03-28

Exception Types

A while ago, Eric Lippert wrote an excellent blog entry called Vexing Exceptions. He defines four categories of exceptions, along with recommendations for how to handle each of them. I summarized this information in a Word document, which I've printed out and posted at my desk:

Download - View Online in Microsoft Office Live - View Online in Google Docs Viewer

I've also started to use Mr. Lippert's terminology in my regular work, and I see it becoming more common in the programming communities. A brief summary of the terminology is below:

  • Fatal - exceptions that you can't prevent and cannot handle in a reasonable manner, e.g., out of memory, thread aborted.
  • Boneheaded - exceptions that are bugs in the code, e.g., argument null, index out of range.
  • Vexing - exceptions thrown in non-exceptional situations, e.g., parsing errors.
  • Exogenous - exceptions due to external influences, e.g., file not found.

The question of what exactly constitutes a "non-exceptional situation" is not addressed; this is an age-old debate.

Eric Lippert's post is mainly concerned with client-side code; that is, how to handle the different exception categories. When I write code, I always strive to write it as if it were going to be a library (I find that a little thought about API design goes a long way towards code reusability, even if the code never becomes an actual library). Therefore, in my Word document, I added design recommendations for each exception category as well.

2011-03-24

The Border Case of Abs

Things have been busy here, adjusting to two children in the house and trying to hold down two jobs! One problem I ran into on my day job was a boundary condition that I'd never seen before...

Long story short: it turns out that the result of abs(0x80000000) is undefined in C, because the most negative 32-bit int (INT_MIN) has no positive counterpart that fits in an int (in C#, Math.Abs(int.MinValue) throws an OverflowException). In the C library used by my firmware, abs(0x80000000) is actually a negative number (!).
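As a minimal sketch of one way to sidestep this boundary condition, the magnitude can be returned as an unsigned value so that every int input has a representable result. The helper name abs_magnitude is hypothetical (it is not part of the C library or of my firmware), and the sketch assumes an ordinary two's complement int:

```c
#include <limits.h>

/* Hypothetical helper: magnitude of x as an unsigned int.
 * The negation is done in unsigned arithmetic, which wraps by definition,
 * so there is no signed overflow even when x == INT_MIN. */
static unsigned int abs_magnitude(int x)
{
    return (x < 0) ? (0u - (unsigned int)x) : (unsigned int)x;
}
```

With a 32-bit int, abs_magnitude(INT_MIN) yields 0x80000000 as an unsigned value instead of invoking undefined behavior.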

In my case, this caused the wrong logic path to be taken; the call to abs was in an expression like if (abs(large_unsigned - smaller_unsigned) < (signed_value)). Another programmer had added the call to abs because the compiler was complaining about comparing a signed integer to an unsigned integer. As it turns out, the comparison the compiler performed when it gave that warning was an unsigned one, which was the correct behavior for this code. By adding the abs, the programmer had introduced a very subtle bug: whenever the difference between large_unsigned and smaller_unsigned was exactly 0x80000000, the result of abs would be negative, causing the branch to be taken when it shouldn't be.

I removed the abs (and cast signed_value to unsigned to - correctly - get rid of the compiler warning). All told, it was a rather expensive mistake: two failures were seen at a customer site, and we had to set up a test bench here with several people working for several days just to find the problem.
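A rough sketch of the shape of that fix, not the actual firmware code: the function name diff_below and the fixed-width types are assumptions for illustration. The guard for a non-positive limit is also my own addition here; the cast on its own is only right when the signed value is known to be non-negative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when (a - b), computed modulo 2^32,
 * is strictly below a signed limit. */
static bool diff_below(uint32_t a, uint32_t b, int32_t limit)
{
    if (limit <= 0) {
        return false;              /* no unsigned difference is below zero */
    }
    uint32_t diff = a - b;         /* unsigned subtraction wraps, no abs needed */
    return diff < (uint32_t)limit; /* explicit unsigned comparison, no warning */
}
```

With a difference of exactly 0x80000000, diff stays a large unsigned value, so the comparison is false and the branch is not taken.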

Lessons learned:

  • Don't "just make the compiler shut up". I'm a big proponent of warning-free code, but the correct way to get there is to first understand the warning, and then correct the code.
  • There is an interesting boundary condition around abs. I've already searched through the rest of the source for similar occurrences. :)