Wrapping Up Error Handling Techniques

The C/C++ Users Journal, February, 1999

Back when I was in law school I took a seminar course on criminal defense. One of the best pieces of advice I got in that course was "When you’re preparing to interview a client who’s charged, say, with burglary, begin by reading the burglary statute. And when you’re finished with the interview, read the burglary statute again. I don’t care how many thousand times you’ve read it before, chances are that you’ll spot something new that’s pertinent to your client’s case." It’s been fifteen years since I stopped practicing law, but I can still recite Maine’s burglary statute from memory.

Here in The Journeyman’s Shop we try to follow that advice, too. When you’re about to start implementing some part of a project, read the project specification, and review any relevant literature that you have available that relates to what you’re about to do. After you’ve written the code, read the specification again, and review the literature again. Chances are that you’ll spot something new that’s pertinent to the code that you’re writing.

Reviewing the literature after writing the code may sound a bit odd, but it works. You’ve just been through the intellectual exercise of making your code work, and you’re more familiar with the problems and issues you’ve run into than you’ll ever be again. That’s the perfect time to review what others have written about the area that you’ve just been working on. Often the result will be that you see a better way to write your code. Don’t be shy about going back and rewriting it with your newfound insight. The goal is to produce the best code to solve the problem that you’re working on, not to prove that what you produced in your first pass can be made to work.

On the assumption that what I’ve been writing about error handling may be of some use to you out there in the field, I’ve included at the end of this installment a brief outline of the subject as I’ve covered it in my last three columns. If you find this series of columns helpful, consider using the outline as a checklist when you’re working on a project that involves error handling. Look it over before you begin, and look it over again when you’ve finished. It might trigger something that saves you a great deal of work.


Two months ago we talked about handling errors in the code that detects the problem. The main issues there are recognizing that an error has occurred and choosing the best way to handle the error. Last month we began talking about what to do when you cannot handle the error in the function that detects it, and in particular we talked about ways of notifying that function’s caller that something has gone wrong. This month we’ll continue discussing how to handle errors that cannot be handled in the function that detects them, but this time we’ll be looking at more complicated transfers of control, that is, transfers that aren’t simply a function call or a return from a function.

Transfer of Control

One of the most common beginner’s mistakes in writing C or C++ code is forgetting to check the return value of a function for an error indication. How many times have you seen a question on comp.lang.c that goes something like this:

#include <stdio.h>

int main()
{
    FILE *fp;
    char data[20];
    fp = fopen("input.dat", "r");
    fread(data, 1, 20, fp);    /* fp is never checked */
    return 0;
}

Of course, the call to fread will fail, often disastrously, if fopen was unable to open the specified file. The code above should explicitly check whether fopen returned a null pointer. Even then, it’s not unheard of for a programmer to take the wrong action once they’ve noticed a problem:

#include <stdio.h>

int main()
{
    FILE *fp;
    char data[20];
    if ((fp = fopen("input.dat", "r")) == NULL)
        printf("Error opening input file\n");
    fread(data, 1, 20, fp);    /* still executed when fp is null */
    return 0;
}

Here, the programmer has properly tested for a null pointer, then gone ahead and used the pointer anyway. This is probably an oversight, and the programmer meant to add an else-clause, so that the call to fread would only be executed if fopen succeeded.
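
One way to write the check so that fread is never reached with a null pointer is sketched below. The helper name read_header and its return convention (-1 when the file cannot be opened) are assumptions made for illustration; they are not part of the example above.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: reads up to `count` bytes from the named file.
// Returns the number of bytes read, or -1 if the file could not be opened.
long read_header(const char *path, char *data, std::size_t count)
{
    std::FILE *fp = std::fopen(path, "rb");
    if (fp == NULL) {
        std::fprintf(stderr, "Error opening %s\n", path);
        return -1;                  // leave without touching fp
    }
    std::size_t got = std::fread(data, 1, count, fp);
    std::fclose(fp);
    return static_cast<long>(got);
}
```

Returning early on failure does the same job as the missing else-clause: the code that uses the pointer cannot be reached unless the open succeeded.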

Now, most of us recognize both of these errors when we see them in the simple form that I’ve presented them in here. The danger is that we won’t recognize them when we make them in code that’s more complex than these simple examples. There are two approaches to avoiding this kind of mistake. We can look for them in code reviews (either in the form of bench checking or as part of a more formal review program), or we can design our error handling to avoid requiring users of our code to remember to check for errors. Both approaches make good sense: if you’re developing high quality software you should have some sort of code review mechanism in place, and adding to your review checklist isn’t a big job. On the other hand, sometimes taking the burden of checking for errors away from the programmer can simplify the resulting code.

When we take responsibility for error checking away from the users of our functions we must add in a mechanism to handle errors ourselves. In BASIC we’d use On Error GoTo, or some other such bludgeon. In C and C++ we have much more flexible mechanisms available for transferring control to an error handler without requiring our caller to check for errors. We can use callouts, signals, longjmp, and exceptions.

Callouts

Let’s put on our application architect hats for a moment, and think about ways to make memory allocation safer. For example, let’s suppose that we know from looking at our design specification that no function called from the UI code will ever need to allocate more than 320 bytes. The user can call several of these functions, of course, but any one call will succeed if there are 320 bytes available for allocation when the call is made. One approach to safer memory management would be to allocate a block of 320 bytes when the program starts up, and if any allocation fails, free up that block and try again. This guarantees that any one call from the UI code will succeed. Once that has been done, the UI code has to worry about the possibility of running out of memory, but none of the code called by the UI needs to concern itself with that possibility.

One way to implement this is to replace malloc with a different function that implements this strategy. For example:

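#include <stdio.h>
#include <stdlib.h>

void *safety_block;

void *init_safety_block(void)
{
    return safety_block = malloc(320);
}

void *safe_malloc(size_t sz)
{
    void *res = malloc(sz);
    if (res == NULL) {
        if (safety_block == NULL) {
            printf("Fatal: out of memory\n");
            exit(EXIT_FAILURE);
        }
        free(safety_block);     /* release the reserve, then retry */
        safety_block = NULL;
        res = malloc(sz);
    }
    return res;
}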

Now all that’s necessary to make memory allocation much safer is to replace calls to malloc with calls to safe_malloc. Whenever an attempted allocation fails the safety block will be freed, the required memory allocated, and the application can continue to execute normally. At some point we need to check whether safety_block has been set to NULL and take corrective action, but we’ve reduced the burden of checking for allocation failures.

For example, suppose we’re writing the code for an application that allows the user to select operations from a menu. Each operation will finish executing and return to our menu handling system before the user gets a chance to select a new operation. We could use our safe_malloc function like this:

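int main()
{
    for (;;) {
        switch (get_choice()) {   /* get_choice: hypothetical menu reader */
            case ADD:    add_data();    break;
            case REMOVE: remove_data(); break;
            case QUIT:   return 0;
        }
        if (safety_block == NULL
            && init_safety_block() == NULL)
            return EXIT_FAILURE;  /* cannot rebuild the reserve */
    }
}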

With this approach we’ve localized the handling of out of memory conditions to our main function. The code in add_data and remove_data can simply allocate memory as needed with safe_malloc, and not check whether the allocation succeeded. If an allocation request initially failed and the safety block had to be released in order to satisfy it, the code after the switch statement attempts to recover. If add_data and remove_data release any memory that doesn’t need to be used across calls, we can reallocate the safety block and resume execution. If not, we terminate the application.

This approach is somewhat intrusive, particularly if we’re adapting existing code. It probably wasn’t written with this strategy in mind, so we’ll have to go through it looking for calls to malloc, calloc, and realloc, and replacing them with calls to safe_malloc. That can usually be done with macros, but it’s a chore.

In C++ there’s a hook that lets you write this sort of code without having to change any of the affected code. When you use operator new to create an object, if the attempted allocation fails, operator new calls a function known as a new-handler, which can attempt to recover memory. This is invisible to the code that is creating the object, so the programmer doesn’t have to worry about whether it is in use. We can provide a callout to handle low memory situations in much the same way as we did in the previous example: pre-allocate a block of memory, write a function to free it, and register that function as the function to be called by operator new if it cannot allocate the requested amount of memory. Like this:

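#include <new>
#include <stdio.h>
#include <stdlib.h>

void *safety_block;

void free_safety_block()
{
    if (safety_block == NULL) {
        printf("Fatal: out of memory\n");
        exit(EXIT_FAILURE);
    }
    free(safety_block);
    safety_block = NULL;
    std::set_new_handler(NULL);     // nothing more we can release
}

void *init_safety_block(void)
{
    safety_block = malloc(320);
    if (safety_block != NULL)
        std::set_new_handler(free_safety_block);
    return safety_block;
}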

We can use the same main function as before. The call to init_safety_block allocates the safety block and installs free_safety_block as the callback that operator new will use if it is unable to satisfy an allocation request. The call to set_new_handler in init_safety_block tells the runtime system that it should call free_safety_block if operator new is unable to allocate memory. The call to set_new_handler in free_safety_block tells the runtime system that there is no longer any function to call if operator new is unable to allocate memory. We need to do this because operator new will keep calling the new-handler as long as it keeps failing. Once we’ve released the safety block there’s nothing further we can do, so we need to unregister our handler (see note 1). Now we don’t have to check whether any of our code that allocates memory succeeded, because we’ve guaranteed that it will (provided, as I said earlier, that we don’t try to allocate more memory than the safety block contains).

Creating a global policy like this is easy with callout functions like new-handler. We can do this in our own code just as easily. All that we have to add is the code to use the callout function. In the case of operator new, the code might look something like this:

void *operator new(size_t sz, const nothrow_t&)
{
    void *res = malloc(sz);
    while (res == NULL && new_handler != NULL) {
        new_handler();          // give the callout a chance to free memory
        res = malloc(sz);
    }
    return res;
}

This is typical of code that uses a callout to handle errors. It first tries to do whatever it is supposed to do. If the attempt fails and there is a callout function installed it calls the callout function, then tries again. We use a pointer to the callout function here instead of hard-coding a call to a particular function to make the code more flexible. This version of operator new can be used in any application, with any recovery strategy that is appropriate to the application. The recovery strategy can be changed on the fly, by calling set_new_handler with the address of a function that implements the new strategy.
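
For comparison, here is a minimal sketch that registers a handler through the standard library’s std::set_new_handler and puts the previous handler back once the reserve is gone. The names release_safety_block, install_safety_block, previous_handler, and handler_calls are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

static void *safety_block = NULL;
static std::new_handler previous_handler = NULL;
static int handler_calls = 0;       // only here so the effect can be observed

// New-handler: give back the reserve, then step aside so that operator new
// falls through to whatever handler (possibly none) was installed before us.
void release_safety_block()
{
    ++handler_calls;
    std::free(safety_block);
    safety_block = NULL;
    std::set_new_handler(previous_handler);
}

// Allocate the reserve and register the handler, remembering the old one.
bool install_safety_block(std::size_t bytes)
{
    safety_block = std::malloc(bytes);
    if (safety_block == NULL)
        return false;
    previous_handler = std::set_new_handler(release_safety_block);
    return true;
}
```

Because the handler replaces itself with its predecessor, operator new calls it at most once per installation. A request that still cannot be satisfied after the reserve is released is passed on to the earlier handler, or fails in the usual way if there was none.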

This is one of the most important reasons for using a callout rather than simply calling a function when an error occurs. Although you haven’t seen its definition in the code snippets we’ve been looking at, there’s actually a pointer to a function involved here. It’s held in the variable named new_handler, and its value is set by each call to set_new_handler. By going through this layer of indirection, the library code that implements operator new becomes much more flexible. It allows the application to set the error recovery strategy without having to rewrite library code or replace functions in possibly non-portable ways.
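
The same layer of indirection is easy to build into our own code. The sketch below is only an illustration: the names recover_fn, set_recover_handler, acquire, and release_reserve are invented, and, unlike a new-handler, this callout returns a bool to say whether it managed to free anything.

```cpp
#include <cstddef>

typedef bool (*recover_fn)();               // returns true if it freed something

static recover_fn recover_handler = NULL;   // the callout pointer
static int available = 0;                   // units left in a hypothetical pool
static int reserve = 1;                     // units held back for emergencies

// Install a new callout and return the one it replaces.
recover_fn set_recover_handler(recover_fn fn)
{
    recover_fn old = recover_handler;
    recover_handler = fn;
    return old;
}

static bool try_acquire()
{
    if (available == 0)
        return false;
    --available;
    return true;
}

// Try, call the callout on failure, then try again.
bool acquire()
{
    while (!try_acquire()) {
        if (recover_handler == NULL || !recover_handler())
            return false;                   // nobody can help: report failure
    }
    return true;
}

// One possible recovery strategy: hand the reserve over to the pool.
bool release_reserve()
{
    if (reserve == 0)
        return false;
    available += reserve;
    reserve = 0;
    return true;
}
```

An application would call set_recover_handler(release_reserve) at startup, and it could later install a different strategy without touching acquire.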

One drawback with this particular approach is that new_handler is a global variable. This could cause problems in a multi-threaded application. Of course, it’s possible to minimize these problems by providing a semaphore lock around access to new_handler, but that’s slow. It’s also possible to provide a separate function pointer for each thread, provided you’re willing to deal with the portability problems that creating per-thread data structures poses. Another possibility is to pass a pointer to an error handler as an argument to a function. Obviously this has a fairly high overhead, because that argument must be passed to every function that might need access to the error handler. On the other hand, it provides a great deal of flexibility, particularly if the error handler is not just a single function, but a structure that contains pointers to multiple error handlers. This lets your library distinguish, for example, between warnings, errors, and fatal errors, as some tools insist on doing (see note 2). The application can set the policy for handling each of these cases by creating an appropriate error handler and passing its address down into the library.
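
A structure of handlers passed down as an argument might be sketched like this. The names ErrorHandlers and parse_percentage, and the three reporting functions, are hypothetical and serve only to show the shape of the technique.

```cpp
#include <cstdio>

// Hypothetical handler table: one callout per severity level.
struct ErrorHandlers {
    void (*warning)(const char *msg);
    void (*error)(const char *msg);
    void (*fatal)(const char *msg);
};

// Application-side policy: count and report.
static int warnings_seen = 0;
static int errors_seen = 0;

void count_warning(const char *msg)
{
    ++warnings_seen;
    std::fprintf(stderr, "warning: %s\n", msg);
}

void count_error(const char *msg)
{
    ++errors_seen;
    std::fprintf(stderr, "error: %s\n", msg);
}

void report_fatal(const char *msg)
{
    std::fprintf(stderr, "fatal: %s\n", msg);
}

// Library-side function: the policy arrives as an argument, so the library
// itself keeps no global handler. Returns 0 on success, -1 on error.
int parse_percentage(int value, const ErrorHandlers *eh)
{
    if (value < 0) {
        eh->error("negative percentage");
        return -1;
    }
    if (value > 100)
        eh->warning("percentage above 100");
    return 0;
}
```

Each thread, or each call, can pass its own table, which sidesteps the shared global pointer at the cost of one extra argument.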

Signals

C has had a similar, but much more limited, mechanism for a long time. You use it through the library functions signal and raise. The function signal registers a callout function known as a signal handler that will be called when a particular kind of error occurs. The function raise announces that an error has occurred. If there is a signal handler registered for that error, the signal handler will be called.

You may have heard that signals are dangerous. Like most powerful tools, they are if you don’t know what you’re doing with them. To understand signals you have to know the difference between synchronous signals and asynchronous signals. Synchronous signals are those that occur when your code calls the raise function. Asynchronous signals are those that occur in other cases. For example, many compilers allow you to install a signal handler that will be invoked on a floating point error such as an attempt to divide by 0. Since such a signal does not occur as a result of a call to raise, it is an asynchronous signal. Another example of an asynchronous signal is the kill signal, which occurs when the user tells the operating system to kill an application.

Asynchronous signals can occur more or less without warning. Because of this, there are severe restrictions on what the handler for an asynchronous signal can do: it cannot call any function in the standard library other than signal, and the only objects of static storage duration that it may touch are those of type volatile sig_atomic_t, which it may only assign to. The type sig_atomic_t is defined in <signal.h>. As a practical matter, this means that an asynchronous signal can only be used to set a flag to indicate that an error occurred. There may be other things that your compiler and operating system permit, but those are extensions and cannot be relied on when you move to a different compiler or operating system.
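
A handler that stays within those limits does nothing but set a flag, and the rest of the program polls the flag at a convenient point. The following is a minimal sketch; the names interrupted, note_interrupt, and run_units are assumptions for the sake of the example.

```cpp
#include <csignal>

// The only object the handler touches.
static volatile std::sig_atomic_t interrupted = 0;

// Signal handler: record that the signal arrived and return.
extern "C" void note_interrupt(int)
{
    interrupted = 1;
}

// Hypothetical work loop that checks the flag between units of work.
// Returns the number of units completed before an interrupt was noticed.
int run_units(int max_units)
{
    int done = 0;
    while (done < max_units && !interrupted)
        ++done;                 // a real program would do its work here
    return done;
}
```

The handler would be registered with a call such as signal(SIGINT, note_interrupt). Everything that actually responds to the interrupt happens in ordinary code, where the usual library functions are available.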

Synchronous signals are a different matter. They can be used to handle problems in much the same way as the callout functions that we discussed in the previous section. In fact, their operation is quite similar to the code that we talked about. However, when you call raise you can only raise error codes that have been defined by the implementation of the standard library that you are using. The C language definition requires compilers to support six signals, all of which can occur asynchronously, so they are not particularly useful for general-purpose error handling. However, some compilers permit a handful of user-defined signals as well, so if you are not concerned about portability, you can use signals as callbacks.

To install a signal handler, you call the signal function with two arguments: an integer value that indicates which signal you are registering a handler for, and a pointer to a function which should be called when the corresponding signal is received. The signal function returns a pointer to the previously registered handler for that signal. For example, let’s continue our memory management example, using a signal handler instead of a new-handler.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

void *safety_block;

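void free_safety_block(int sig)
{
    if (safety_block == NULL) {
        printf("Fatal: out of memory\n");
        exit(EXIT_FAILURE);
    }
    free(safety_block);
    safety_block = NULL;
}

struct C
{
    int i;
};

struct C *create_c(void)
{
    struct C *res = malloc(sizeof(struct C));
    if (res == NULL) {
        raise(SIGUSR1);     /* ask the handler to free the reserve */
        res = malloc(sizeof(struct C));
    }
    return res;
}

int main()
{
    safety_block = malloc(320);
    signal(SIGUSR1, free_safety_block);
    while (safety_block != NULL)
        create_c();
    return 0;
}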

In main, the call to signal registers the function free_safety_block as the function to be called when a SIGUSR1 signal is raised (see note 3). In create_c, when an attempted allocation fails, we call raise to see if we can get more memory. In free_safety_block, which handles our signal, we don’t have to unregister our signal handler: on implementations that reset a signal to its default handling when it is delivered, the library does this for us.

You can see that when we’re using synchronous signals, signal and raise can be used in a way that is similar to the callouts that we discussed earlier. The only advantage that signal and raise have over hand-coded callouts is that we don’t have to create a function pointer to keep track of the callout function. Given the severe portability limitations of signal and raise, this advantage is not at all compelling. If you’re going to use callouts, don’t bother with signal and raise. They’re more trouble than they’re worth.

setjmp and longjmp

Another form of transfer of control is longjmp, which is often described as a non-local goto. You use longjmp in conjunction with setjmp when you want to transfer control back to a function higher up in your calling chain, without going through the normal sequence of return statements. You create the target of the jump with setjmp, and when the time comes to jump back to it, you call the function longjmp. The connection between a call to longjmp and a particular invocation of setjmp is made through a data structure whose type is jmp_buf. To use these facilities, include the header <setjmp.h>.

You may have noticed that I’ve been careful not to use the word "function" in discussing setjmp. That’s because it isn’t a function, it’s a macro. Its job is to save the program’s current execution state in the jmp_buf. If you think about it, that would be hard to do from inside a function, because setjmp would have to figure out where its calling function was on the stack in order to get the data needed to jump back to there. So setjmp is a macro. Its code executes directly in the code that you’re going to jump back to, making it easier to store the necessary information.

When you call setjmp you get back the value 0. When you jump back to the point of the execution of setjmp you get back a non-zero value that was passed as the argument to longjmp. This means that you typically use setjmp as the expression in an if statement or a switch statement. Like this:

#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

jmp_buf buf;

void f(void);

int main()
{
    if (setjmp(buf) != 0) {
        printf("Error occurred\n");
        return EXIT_FAILURE;
    }
    f();
    return 0;
}

As execution of main begins, the call to setjmp initializes buf so that a subsequent call to longjmp can jump back into the if statement. This call returns 0, so the block that the if statement controls is not executed. Instead, execution proceeds to the call to f. If this call returns normally, execution continues to the return statement, and the program terminates.

If the function f determines that an error occurred, it can call longjmp to jump back into main.

extern jmp_buf buf;

void f(void)
{
    longjmp(buf, 1);
}

When f is called, execution will jump back to the point of the setjmp call in main. The value of setjmp will then be 1, because that’s the value that we called longjmp with. Since the value is not zero, the block controlled by the if statement will be executed, printing an error message and terminating the program.

You have to be careful when you use setjmp. I said earlier that it saves the program’s current execution state. That’s not quite true. It saves enough information to jump back to where it was invoked, and it might save values of auto variables. However, it also might not. Technically, if you change the value of an auto variable after the call to setjmp and before the call to longjmp, after the call to longjmp that variable has an indeterminate value, unless it was declared volatile. For example, let’s change our main function a bit:

int main()
{
    int auto_var = 0;
    if (setjmp(buf) != 0) {
        /* auto_var's value is indeterminate */
        printf("Error occurred\n");
        return EXIT_FAILURE;
    }
    auto_var = 1;
    f();
    return 0;
}

I’ve added an auto variable named auto_var, and initialized it to the value 0. Just before the call to f its value is changed to 1. When f executes longjmp, the value of auto_var becomes indeterminate. This means that there is no guarantee that it will have a meaningful value when the body of the block of code controlled by the if statement is executed. This, in turn, means that you cannot use the value of auto_var in any way after getting back to main through a longjmp.
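
If the value really is needed after the jump, declaring the variable volatile is the standard way out, since the rule about indeterminate values applies only to auto variables that are not volatile-qualified. Here is a minimal sketch; run_steps, fail_now, and step_buf are names invented for the example.

```cpp
#include <csetjmp>

static std::jmp_buf step_buf;

static void fail_now()
{
    std::longjmp(step_buf, 1);
}

// Returns the step that was in progress when the error occurred,
// or 0 if everything ran to completion.
int run_steps(bool fail_in_step_two)
{
    volatile int step = 0;      // volatile: the value survives the longjmp
    if (setjmp(step_buf) != 0)
        return step;            // well defined because step is volatile
    step = 1;
    // ... first step would go here ...
    step = 2;
    if (fail_in_step_two)
        fail_now();
    return 0;
}
```

The price is that every access to step goes through memory, so this is best kept to the few variables the recovery code actually needs.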

Although this example uses a global buffer, that’s not required. All that’s necessary is to be able to pass the buffer to longjmp. You can use a pointer to a buffer if that’s more convenient. Also, although we’ve only looked at an example with a single buffer, there’s no reason you can’t have more than one buffer in an application. All that’s required is that the buffer you pass to longjmp must have been initialized by a call to setjmp, that no intervening call to longjmp was made with that buffer, and that the function that invoked setjmp has not returned in the meantime.

Finally, note that setjmp and longjmp should be used with caution, if at all, in C++ code. The C++ language definition allows their use, but it makes no promise that destructors of auto objects in the functions skipped over during the longjmp will be run; formally, the behavior is undefined when such objects exist. For example,

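#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

class C {
public:
    ~C() { printf("C destroyed\n"); }   // illustrative body
};

jmp_buf buf;

void f() { longjmp(buf, 1); }

void g()
{
    C c;
    f();
}

int main()
{
    if (setjmp(buf) != 0) {
        printf("Error occurred\n");
        return EXIT_FAILURE;
    }
    g();
    return 0;
}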

Whether the destructor of the auto object c in the function g will be run when the longjmp call in f forces execution to return abruptly to main is not something the language guarantees: formally the behavior is undefined, and in practice it varies from one implementation to another. This unpredictability means that longjmp and destructors simply do not mix. In C++ we’d always use exceptions rather than setjmp and longjmp. You have to be careful, though, if you’re mixing C and C++ code. Be sure that any longjmp calls from the C code are handled by a setjmp call with no intervening auto objects. Otherwise you’re deep in the realm of unpredictability.

When you use setjmp and longjmp to notify your caller of an error, you should use the integer value that you pass to longjmp to indicate the nature of the error. This is pretty much the same as returning an integer value to indicate what went wrong, which we talked about last month. The only difference is that with setjmp and longjmp you can return to a function higher up in your call chain. Other than that, all the same considerations apply.
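
Using setjmp as the controlling expression of a switch statement makes it easy to dispatch on that value. In this sketch the error codes ERR_NO_MEMORY and ERR_BAD_INPUT and the functions process and run_checked are hypothetical.

```cpp
#include <csetjmp>

enum { ERR_NO_MEMORY = 1, ERR_BAD_INPUT = 2 };  // illustrative error codes

static std::jmp_buf recover_point;

// Deep in the call chain: report bad input by jumping out.
static void process(int value)
{
    if (value < 0)
        std::longjmp(recover_point, ERR_BAD_INPUT);
    // normal processing would go here
}

// Returns 0 on success, otherwise the code that was passed to longjmp
// (or -1 for a code this function does not know about).
int run_checked(int value)
{
    switch (setjmp(recover_point)) {
    case 0:                     // direct return from setjmp
        process(value);
        return 0;
    case ERR_BAD_INPUT:
        return ERR_BAD_INPUT;
    case ERR_NO_MEMORY:
        return ERR_NO_MEMORY;
    default:
        return -1;
    }
}
```

Zero has to stay reserved for the direct return from setjmp, so error codes start at 1.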

Exceptions

When we’re writing code in C++ or in Java we can use exceptions to transfer control when an error occurs. Exceptions are similar to setjmp and longjmp in that code that detects an error can jump back up the call stack without knowing exactly where the error will be handled. Exceptions are significantly different from setjmp and longjmp in several ways, however, so don’t press this analogy too hard. Exceptions in Java differ significantly from exceptions in C++ in the way that stack unwinding is handled. We’ll come to that a little later. We’ll begin, though, by looking at how to indicate that an error occurred and how to provide code to handle errors. In this regard Java and C++ are quite similar. We’ll work mostly with code examples in C++, and when there are significant differences in Java I’ll mention them.

The fundamental idea behind exceptions is that code that detects an error should be able to say "I give up, I don’t know how to handle this," without having to know whether some other piece of code can handle the problem. This is a weakness in setjmp and longjmp, because in order to call longjmp there must have been a preceding call to setjmp to set up the jump buffer. Without that call, longjmp will fail disastrously and unpredictably. With exceptions, on the other hand, if there is no handler for the exception that has been thrown the program will terminate in a fairly well defined way.

Throwing and Catching

When our code detects an error it can throw an exception. In C++ an exception can be any type: an int, a float, an object, a pointer. In Java an exception must be an object of a type derived from java.lang.Throwable, usually by way of java.lang.Exception. You throw an object with the keyword throw:

void f()
{
    if (error)
        // throw an int:
        throw 3;
}

This throw-expression says to throw the integer 3. For some other type, simply create an object of that type and use it as the argument to throw:

void g()
{
    if (error)
        // throw a C-style string:
        throw "An error occurred";
}

class Error {};

void h()
{
    if (error)
        // throw an object of type Error:
        throw Error();
}

void j()
{
    if (error)
        // throw a pointer to Error:
        throw new Error();
}

The last example is the only correct form if you are coding in Java (see note 4): it creates an object and throws it.

The code that is intended to handle the exception announces its readiness to do so with a try block and a catch clause:

int main()
{
    try {
        f();
    }
    catch(int i) {
        cerr << "Exception caught: " << i << '\n';
    }
    return 0;
}

The try block tells the compiler to generate code to handle any exceptions thrown as a result of executing code contained in the block. In this example, the catch clause says that the next block of code should be executed whenever an exception of type int has been thrown in the try block.

You can have multiple catch clauses to handle different types of exceptions:

int main()
{
    try {
        g();
        h();
        j();
    }
    catch(const char *str) {    // a string literal is thrown as const char *
        cerr << "Exception caught: " << str << '\n';
    }
    catch(Error& e) {
        cerr << "Exception caught: Error object\n";
    }
    catch(Error *e) {
        cerr << "Exception caught: pointer to Error\n";
        delete e;
    }
    return 0;
}

When we write a catch clause that can handle a pointer or a reference to a type, that catch clause will also be used for a pointer or reference to a type derived from the type that it handles. For example:

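// virtual destructor so that delete through an Error * is safe
class Error { public: virtual ~Error() {} };
class DerivedError : public Error {};

int main()
{
    try {
        j();
    }
    catch(Error *e) {
        cerr << "Exception caught: pointer to Error\n";
        delete e;
    }
    return 0;
}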

The catch clause here will be executed whenever code executed inside the try block throws an exception of type pointer to Error or of type pointer to DerivedError. There is no best match rule here: the first catch clause that can handle the type thrown wins, even if a later one looks better:

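// virtual destructor so that delete through an Error * is safe
class Error { public: virtual ~Error() {} };
class DerivedError : public Error {};

int main()
{
    try {
        j();
    }
    catch(Error *e) {
        cerr << "Exception caught: pointer to Error\n";
        delete e;
    }
    catch(DerivedError *e) {    // never reached: the clause above wins
        cerr << "Exception caught: pointer to DerivedError\n";
        delete e;
    }
    return 0;
}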

In this case, if j throws a pointer to DerivedError, the first catch clause is the one that will be executed.

Cleaning up the Stack

One of the promises that C++ tries to keep is that if you have created an object of a type that has a destructor, when that object is no longer available its destructor will be run. For an auto object this means that when the function where it was created returns, its destructor will be run. This provides a powerful model for resource control: obtain resources in constructors, and release them in destructors. We’re probably all used to doing this for memory: constructors allocate memory and destructors release it. Any scarce resource should be handled in the same way.

Keeping this promise requires that destructors be run when a function stops execution early because of an exception. For example:

class C
{
public:
    C() { data = new char[320]; }
    ~C() { delete [] data; }
private:
    char *data;
};

void f()
{
    C c;
    throw 3;
    // other processing here
}
    // other processing here

At the point of the throw, the compiler must generate code to destroy c. Otherwise, when the exception is thrown there would be a memory leak.

Similarly, a function that doesn’t throw any exceptions itself but calls a function that can throw an exception must be prepared to run destructors if an exception is thrown:

void g()
{
    C c;
    f();
    // other processing here
}

Again, the compiler must generate code to destroy c if f throws an exception. You don’t have to write any code to make this happen: the compiler does it for you in order to keep the promise that C++ makes, to destroy objects when they become inaccessible.
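
The same guarantee covers resources other than memory. The sketch below uses a counter, open_handles, as a stand-in for some scarce resource; the names Handle, risky, and try_risky are likewise invented for the illustration.

```cpp
static int open_handles = 0;        // stands in for a scarce resource

class Handle {
public:
    Handle()  { ++open_handles; }   // acquire in the constructor
    ~Handle() { --open_handles; }   // release, also during stack unwinding
};

// Acquires a handle, then possibly fails by throwing an int.
void risky(bool fail)
{
    Handle h;
    if (fail)
        throw 3;
    // other processing here
}

// Returns 0 on success, or the value that was thrown.
int try_risky(bool fail)
{
    try {
        risky(fail);
    }
    catch (int code) {
        return code;
    }
    return 0;
}
```

Whether risky returns normally or throws, open_handles is back to its starting value by the time try_risky returns, and no cleanup code had to be written at the throw site.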

In Java, garbage collection makes things both simpler and more complicated. The resource that most of us deal with most often is memory. Java’s garbage collector handles releasing unused memory, so we don’t need to worry about it. This means that most of what we’d do in a C++ destructor is unnecessary. However, that doesn’t mean that there is never any cleanup needed when an exception is thrown. Java doesn’t have destructors, so you have to write code to handle any necessary cleanup in your functions. You do this with a finally block:

// Java only:
void f()
{
    try {
        // allocate a resource
        // use the resource
    }
    finally {
        // release the resource
    }
}

A finally clause will be executed whenever execution of the try block ends, whether because execution simply reached the end of the block or because an exception was thrown.

Unhandled Exceptions

In both C++ and Java, the runtime system ends up handling exceptions that aren’t handled by any user-written code. When this happens the application is terminated. In applications that we write here at The Journeyman’s Shop, we try to write our code so that this never happens. Running into an unhandled exception is a serious error. Of course, when we’re writing a part of a larger application, we can’t control what the rest of the application does with exceptions, so we document the exceptions that we throw and the circumstances under which we throw them, and we leave it to the application architect to make sure that they are properly handled.
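
One common way to keep an exception from reaching the runtime system is a last-resort handler at the outermost level of the program. This is a sketch of that idea, not a prescription; run_guarded and the three sample tasks are hypothetical names.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>
#include <stdexcept>

// Runs a task and converts any escaping exception into a failure code,
// so that nothing propagates past this point.
int run_guarded(void (*task)())
{
    try {
        task();
        return EXIT_SUCCESS;
    }
    catch (const std::exception& e) {
        std::fprintf(stderr, "Unhandled exception: %s\n", e.what());
    }
    catch (...) {
        std::fprintf(stderr, "Unhandled exception of unknown type\n");
    }
    return EXIT_FAILURE;
}

// Sample tasks for illustration.
void ok_task() {}
void bad_task() { throw 3; }
void failing_task() { throw std::runtime_error("disk full"); }
```

A main function could then consist of little more than return run_guarded(real_main), where real_main is whatever function does the application’s work.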

Transmitting Error Information with Exceptions

In C++, as we’ve seen, we can throw an object of type int. If you decide to do this, you can encode errors with integral values, as we discussed last time. However, since we can throw objects and catch them according to their type, we have a much more powerful mechanism that we can use: we create different types to represent different errors. Java programmers are already familiar with this: there are over fifty different exception types that can be used to indicate what went wrong in a program. In the code that catches an exception, each catch clause catches a different type, and can then handle the error indicated by that type appropriately.
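
In C++ that might look like the following sketch. The types AppError, FileError, and ParseError and the functions load and describe are invented for the illustration.

```cpp
#include <string>

// Hypothetical error types: one class per kind of failure.
class AppError { public: virtual ~AppError() {} };
class FileError : public AppError {};
class ParseError : public AppError {};

// Pretend operation that fails in a way selected by `kind`.
void load(int kind)
{
    if (kind == 1) throw FileError();
    if (kind == 2) throw ParseError();
    if (kind == 3) throw AppError();
}

// Each catch clause handles one type. Because the first matching clause
// wins, the base class has to come last.
std::string describe(int kind)
{
    try {
        load(kind);
    }
    catch (const FileError&)  { return "file"; }
    catch (const ParseError&) { return "parse"; }
    catch (const AppError&)   { return "other"; }
    return "ok";
}
```

Catching by reference to const avoids copying the exception object and keeps a derived exception from being sliced down to its base class.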

Coming Up

Next time we’ll talk about a subject that we mentioned in passing here: initialization and cleanup. This is an area that often leads to mysterious program bugs, mostly because we don’t give it enough thought.

Outline of Error Handling
Types of errors
    Bad arguments
    Resources not available
    Security violations
    Coding errors
Decide how to detect errors
Decide how to handle errors
Reporting techniques
    Return values
        boolean value
        special values
        item counts
    Transfer of control

1. In actual use, each time we register our own new-handler function we need to make a note of the function that was previously registered, and if our function cannot fix the problem, it should call the previously registered handler. This is easy to do, because set_new_handler returns a pointer to the previous handler. When we unregister our handler we also must restore the previous handler.

2. In particular, linkers tend to distinguish errors, which result in an executable file that might not work right, from fatal errors, which result in no executable file.

3. This code uses the extensions provided by Borland’s C++Builder. Check your library documentation for implementation-specific signals.

4. Yes, yes, I know: there’s no throw specifier, and Java is very picky about them. For the moment let’s stick to throw expressions and not worry about the surrounding context too much.