tzimmermann dot org
Jun 30, 2017 • 8 min read

Everything About errno; plus Transactions

This post is part of a series about building a transaction manager in C. In this entry, we’re going to take a look at errno and how it works behind the scenes. Afterwards we’re going to add support for errno to our transaction manager simpletm.

If you missed earlier entries in the series, you might want to go back and read the installments so far. Entries are self-contained as much as possible, but knowing the big picture might be helpful in some cases.

Communicating Problems

The C and POSIX standards specify a multitude of ways for reporting errors to a program.

If you do something critical, such as accessing unmapped memory regions, the operating system delivers a signal to your program; for example SIGSEGV in the case of unmapped memory. Usually these critical signals cannot be ignored. You have to handle them, or the operating system will abort the program.

But signals are rather uncommon. Most interfaces in C or POSIX return the integer value of -1 on errors and provide an additional way of retrieving information about the actual error.

For floating-point operations, there are floating-point exceptions. These can be queried with fegetexceptflag() or, more conveniently, fetestexcept(). Divide by zero and you’ll get FE_DIVBYZERO, for example.

One of the oldest methods for communicating errors is by setting an error code. Traditional error codes in C are provided in the header file errno.h. There you’ll find short identifiers for all kinds of possible errors: ENOMEM for out-of-memory situations, EACCES when file access has been denied, or EINVAL if an incorrect argument was given to a function. Each identifier maps to a positive integer value, but the actual value is system-dependent.

Some more-recent interfaces, such as POSIX threads, return error codes directly, but most POSIX interfaces store them in the variable named errno.
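To make the two reporting styles concrete, here is a minimal sketch. It assumes a POSIX system, and the helper names error_via_errno() and error_via_return_value() are made up for illustration. close() follows the errno convention, while posix_memalign() returns its error code directly, in the same way the POSIX threads functions do.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Style 1: the call returns -1 and stores the error code in errno.
 * (Hypothetical helper, not part of simpletm.) */
static int
error_via_errno(void)
{
    if (close(-1) == -1) {  /* -1 is never a valid file descriptor */
        return errno;       /* read errno right after the failing call */
    }
    return 0;
}

/* Style 2: the call returns the error code as its result.
 * (Hypothetical helper, not part of simpletm.) */
static int
error_via_return_value(void)
{
    void* ptr = NULL;

    /* An alignment of 3 is not a power of two, so the call is rejected. */
    int err = posix_memalign(&ptr, 3, 64);
    if (!err) {
        free(ptr);
    }
    return err;
}
```

On a POSIX system, the first helper should report EBADF and the second one EINVAL. Only the first style depends on reading errno immediately after the failing call.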

The History of errno

The variable errno is quite old. I searched through the source code of The Unix Heritage Society and found its first in-code appearance in Unix Release V5. According to the git history, the commit was made on Nov 27 1974! Back then, there was no file named errno.h. Instead errno was implemented as part of perror(), a function that outputs a descriptive string for each error code.

The error codes are even older than errno. The earliest error codes I could find were in the source code of Unix Release V4. This file dates back to Aug 31 1973. The error codes were stored in the field u_error of struct user. Even though the errno variable was not present in the source code of V4, it’s already mentioned on the man page for perror(). This file dates back to Nov 6 1973, so errno must have been introduced at some point between Aug 31 and Nov 6 of 1973.

If you compare these old error codes to what is present on today’s Unix-like systems, you’ll find that they are mostly the same. In fact all of the old error codes are still present. New releases and standards only added more error codes when required.

So back in Unix Release V5, errno was an integer value. Error codes were still returned in the field u_error of struct user, but automatically copied to errno.

C was first standardized in the late 1980s and errno was part of this standard. It was specified to be a modifiable lvalue of type int, just as it was in Unix.

Compared to its prominent place in Unix, in the C Standard errno feels as if it is either under-used or out of place. Only 3 error codes are part of the C standard: EDOM, EILSEQ and ERANGE. Many traditional Unix functions don’t have error codes specified. For example, even though malloc() is part of the C Standard Library, its error code ENOMEM is not.

The implementation of errno remained of type int for roughly 22 years, and then… multithreading happened.

In a multithreaded program, having a global variable that signals the error code is obviously a problem. In the time between one thread observing an error (i.e., -1 being returned from a function call) and reading the error code from errno, an error on another thread could change the value of errno. A global errno variable of type int is basically useless in this environment.

The solution to this problem is illustrated in the example code shown below.

#ifdef	_THREAD_SAFE
extern	int *		__error();
#define	errno		(* __error())
#else
extern int errno;			/* global error number */
#endif

In a thread-safe program, errno is defined as a macro that resolves to a function call. The function returns a pointer to a thread-local copy of the error variable and the errno macro internally dereferences the pointer. This way, accessing errno returns a thread-local error code.

The code snippet above has been taken from a rather confusing commit to FreeBSD. The commit was made on Jan 22 1996 and first shipped in FreeBSD 2.2.0.

Other systems, such as GNU or Linux, traditionally used different implementations of POSIX and the C Standard Library, but have gone through a similar transition.
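The macro-plus-function pattern can be reproduced in a few lines of C11. The following is only a sketch of the general shape, not the actual FreeBSD or GNU implementation; the names my_errno, my_error(), my_errno_value and set_and_get_my_errno() are invented for this example, and it assumes a compiler that supports _Thread_local.

```c
/* Each thread gets its own copy of this variable (C11 _Thread_local). */
static _Thread_local int my_errno_value;

/* Returns the address of the calling thread's error variable. */
static int*
my_error(void)
{
    return &my_errno_value;
}

/* The macro dereferences the pointer, so my_errno behaves like a
 * modifiable lvalue of type int, just like a plain global would. */
#define my_errno (*my_error())

/* Small usage helper: writes through the macro and reads the value back. */
static int
set_and_get_my_errno(int code)
{
    my_errno = code;
    return my_errno;
}
```

Because the macro expands to a dereferenced pointer, existing code that assigns to or reads from the variable keeps compiling unchanged. That is presumably why this approach was attractive for retrofitting thread safety.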

Transactional errno

After this long excursion into history, let’s get back to transactions. For errno to be transaction-safe, all we have to do is save its value before any function changes it. If a transaction aborts, we reset errno to its original value, so the transaction always restarts with the same error code. As the variable itself is already thread-local, no concurrency control is required.

Building upon malloc() and free() support from the previous blog entry, let us first add an interface to save errno during the transaction.

void save_errno(void);

Too easy.

For the implementation, we add an errno value to the transaction context as well as a flag showing the value’s validity.

struct _tm_tx {
    jmp_buf env;

    unsigned long        log_length;
    struct _tm_log_entry log[256];

    bool errno_saved;
    int errno_value;
};

We further have to save errno and restore its value during a rollback.

void
save_errno(void)
{
    struct _tm_tx* tx = _tm_get_tx();

    if (tx->errno_saved) {
        return;
    }

    tx->errno_value = errno;
    tx->errno_saved = true;
}

void
tm_restart()
{
    release_resources(arraybeg(g_resource),
                      arrayend(g_resource), false);

    struct _tm_tx* tx = _tm_get_tx();

    /* Revert logged operations */
    undo_log(tx->log, tx->log + tx->log_length);
    tx->log_length = 0;

    /* Restore errno */
    if (tx->errno_saved) {
        errno = tx->errno_value;
        tx->errno_saved = false;
    }

    /* Jump to the beginning of the transaction */
    longjmp(tx->env, 1);
}

We only save errno the first time save_errno() gets called in a transaction. That’s the value we’re interested in, so later invocations must not replace it. During a transaction’s rollback and restart, we check whether we saved errno and, if so, restore it to its original value.
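The save-once, restore-on-rollback logic can also be tried in isolation, without the rest of the transaction manager. The sketch below is a stand-alone illustration; struct errno_snapshot and the snapshot_*() and demo_rollback() functions are hypothetical stand-ins for the two fields in struct _tm_tx and the code in save_errno() and tm_restart().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the errno fields of struct _tm_tx. */
struct errno_snapshot {
    bool saved;
    int  value;
};

/* Records errno once; later calls leave the first value untouched. */
static void
snapshot_save(struct errno_snapshot* snap)
{
    if (snap->saved) {
        return;
    }
    snap->value = errno;
    snap->saved = true;
}

/* Puts the recorded value back, if there is one, and clears the flag. */
static void
snapshot_restore(struct errno_snapshot* snap)
{
    if (!snap->saved) {
        return;
    }
    errno = snap->value;
    snap->saved = false;
}

/* Simulates a transaction that changes errno twice and then rolls back. */
static int
demo_rollback(void)
{
    struct errno_snapshot snap = { false, 0 };

    errno = EINVAL;          /* value from before the transaction */
    snapshot_save(&snap);    /* first call records EINVAL */
    errno = ENOMEM;          /* some call inside the transaction fails */
    snapshot_save(&snap);    /* second call must not overwrite EINVAL */
    errno = EACCES;          /* another failure */
    snapshot_restore(&snap); /* rollback */

    return errno;
}
```

After the simulated rollback, errno holds EINVAL again, which is the value the transaction started with.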

So far, our transaction manager only supports one function that could potentially modify errno, which is malloc_tx(). Modifying malloc_tx() to save errno is trivial.

void*
malloc_tx(size_t size)
{
    save_errno();

    void* ptr = malloc(size);

    append_to_log(NULL, undo_malloc_tx, (uintptr_t)ptr);

    return ptr;
}

That’s the same implementation as in the previous blog post, extended by a call to save_errno().
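To see why malloc_tx() needs the call at all, it helps to watch malloc() change errno. The following sketch assumes a POSIX-style C library, where a failed malloc() sets errno to ENOMEM; as noted earlier, the C Standard itself does not require this. The helper errno_after_malloc() is made up for this example.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns the error code left in errno by a failed malloc(), or 0 if
 * the allocation succeeded. (Hypothetical helper, not part of simpletm.) */
static int
errno_after_malloc(size_t size)
{
    errno = 0;

    /* volatile keeps the compiler from optimizing the allocation away */
    void* volatile ptr = malloc(size);
    int code = errno;   /* capture errno before any other call */
    int failed = (ptr == NULL);

    free(ptr);          /* free(NULL) is a no-op */

    return failed ? code : 0;
}
```

Requesting SIZE_MAX bytes cannot succeed, so on such a system the helper should report ENOMEM, while a small request should report 0. Without save_errno(), a transaction that hits the failing case and restarts would begin its second attempt with a different errno than its first.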

Summary

We’ve looked at errno and how to support it in a system transaction.

As usual, you can find the full source code for this blog post on GitHub. If you’re interested in a more sophisticated C transaction manager, take a look at picotm.

If you like this series about writing a transaction manager in C, please subscribe to the RSS feed, follow on Twitter or share on social networks.

Post by: Thomas Zimmermann
