May 26, 2017 • 14 min read

Transactional Access to Arbitrary Memory Locations

In this blog post we are going to implement transactional semantics for arbitrary locations in main memory. So far our simpletm transaction manager can only handle integer values that we store in a dedicated data structure. At the end of this installment, we’ll be able to work with values of arbitrary types that are located (almost) anywhere in memory.

In the previous installments we finished our transaction manager simpletm to contain miminal functionality for detecting and resolving conflicts among transactions, and we separated the transaction manager from the application logic by creating a generic C interface.

Loads and Stores for Integer Resources

For transactional memory access, our example program executes load and store operations on dedicted data structures of type struct int_resource, which is shown below.

struct int_resource {
    int             value;
    int             local_value;
    unsigned long   flags;
    pthread_t       owner;
    pthread_mutex_t lock;
};

The shared integer value is stored in the field value. Each instance of the structure is owned by a single transaction that stores in the field local_value and loads from either value or local_value, depending whether a local value is present. The flags field tells us about this. During the owning transaction’s commit the local value is copied to the actual value, on rollback the local value is discarded.

The corresponding load/store interface for an application looks like this.

void
load_int(struct int_resource* res, int* value);

void
store_int(struct int_resource* res, int value);

It’s specific to integer values and the application has to know about the struct int_resource data structure. Instead we should prefer an interface that can store a value of any type at any location in memory.

Working With Arbitrary Types

To support arbitrary types, let us rework struct int_resource a bit. We call this new structure struct resource.

#define RESOURCE_BITSHIFT   (3)
#define RESOURCE_NBYTES     (1ul << RESOURCE_BITSHIFT)

/**
 * An integer value with an associated owner.
 */
struct resource {
    uintptr_t       base;
    uint8_t         local_value[RESOURCE_NBYTES];
    uint16_t        local_bits;
    pthread_t       owner;
    pthread_mutex_t lock;
};

First of all, we remove the field value. Instead the base field tells us where in memory the resource is located. The value in base is the address of the first byte of a transactional value.

The field local_value is now a fixed-size byte array. It still stores the local value, but it’s not any longer type specific. In our case the array is 8 byte in size, so we can operate on values up to 8 bytes. Further below we will see how to arrange the resources such that values of arbitrary size can be supported.

Finally we replaced the flags field by local_bits. The latter is a bit field, where each bit signals the present of a transaction-local value in local_value. So if bit 0 is set local_value[0] contains a local value, if bit 1 is set local_value[1] contains a local value, and so on.

Working With Arbitrary Memory Locations

Our local_value stores 8 bytes and base could be any address. If we align base such that it always contains an address at an 8-byte boundary, its lowest three bytes will always be zero. For a memory address, these bits contain the index into local_value.

Here are the bits of a 32-bit memory address with their meaning for access to struct resource.

    | 31 .. 13  12 11 10  9  8  7  6  5  4  3 |  2  1  0 |
    |                  Base                   |   Index  |

For mapping a memory address to a byte in a resource structure, first we have to make sure that the base values of address and resource are the same, then use the address’ lowest three bits as index into local_value.

So far we’ve had two values of type struct int_resource. If we had a single resource structure in our transaction manager, we could load and store 8 consecutive bytes for the transactions. To load and store more memory, we have to add more instances; let’s say 1024.

The code fragment below creates an array of 1024 resource structures.

#define NRESOURCES_BITSHIFT (10)
#define NRESOURCES          (1ul << NRESOURCES_BITSHIFT)

struct resource g_resource[NRESOURCES];

We could assign a base address to each array element and search in the array whenever we want to map an address to a resource with corresponding base address. While this would work, there’s an easy way to compute the correct element directly.

Storing the position of element of the array requires exactly 10 bit. These 10 bit are located in the memory address directly before the index; or in other words, they are the next higher bits above the index.

    | 31 .. 13 | 12 11 10  9  8  7  6  5  4  3 |  2  1  0 |
    |          |             Element           |          |
    |                   Base                   |   Index  |

For mapping a memory address to a byte in a resource structure, we can extract the element position (bits 3 to 12), get the element from the element array we declared, make sure that the base values of address and resource are the same, and then use the address’ lowest three bits as index into local_value.

This look-up scheme is very fast and ensures that resources are consecutive in memory as well as the bytes of each resource. It is therefore possible to operate on values that are larger than 8 bytes. For example, if we load or store a value of 16 bytes, we first operate on the resource structure covering the value’s first bytes, then operate on the resource structure covering the next bytes, and so on.

Loading and Storing

Before we go into the details of the load and store functions, let us look at the new interfaces. Because we don’t use int directly any more, the new interfaces take memory addresses and buffers.

void
load(uintptr_t addr, void* buf, size_t siz);

void
store(uintptr_t addr, const void* buf, size_t siz)

The value addr is the address of the shared value. The application logic is not required to know about the resource any more. The transaction manager maintains these internally. The values buf and siz give a transaction-local buffer and the number of bytes in the buffer.

The complexity is now in the implementation of the load and store function. Here is store().

void
store(uintptr_t addr, const void* buf, size_t siz)
{
    const uint8_t* mem = (const uint8_t*)buf;

    while (siz) {

        struct resource* res = acquire_resource(addr & BASE_BITMASK);
        if (!res) {
            tm_restart();
        }

        unsigned long index = addr & RESOURCE_BITMASK;
        unsigned long bits = 1ul << index;

        uint8_t* beg = arraybeg(res->local_value) + index;
        uint8_t* end = arrayend(res->local_value);

        while (siz && (beg < end)) {
            *beg = *mem;
            res->local_bits |= bits;

            bits <<= 1;
            --siz;
            ++addr;
            ++mem;
            ++beg;
        }
    }
}

store() transactionally stores all bytes in buf to the resources that cover addr and successiv memory locations. The function call to acquire_resource() acquires the resource for a specific base address and returns the resource’s data structure. If the resource cannot be acquired, the transaction is in conflict, NULL is returned and the transaction aborts.

Once we have the resource, we start filling its local_value field with the contents of buf. For each byte we copy, we set the corresponding bit in local_bits. We do this until we reach the end of either buf or local_value. If we reached the end of local_value and there are still bytes left in buf we continue the outer while loop with the next resource.

The load function is identical, except for the actual copy operation.

void
load(uintptr_t addr, void* buf, size_t siz)
{
    uint8_t* mem = (uint8_t*)buf;

    while (siz) {

        struct resource* res = acquire_resource(addr & BASE_BITMASK);
        if (!res) {
            tm_restart();
        }

        unsigned long index = addr & RESOURCE_BITMASK;
        unsigned long bits = 1ul << index;

        uint8_t* beg = arraybeg(res->local_value) + index;
        uint8_t* end = arrayend(res->local_value);

        while (siz && (beg < end)) {
            if (res->local_bits & bits) {
                *mem = *beg;
            } else {
                *mem = *((uint8_t*)addr);
            }

            bits <<= 1;
            --siz;
            ++addr;
            ++mem;
            ++beg;
        }
    }
}

Here we walk over the local values of the involved resources and copy into buf. In the inner while loop, we check if there’s a local value and either copy this or the globally shared value.

Our current setup can support 8192 bytes of transactional memory. If this limit is reached or if multiple memory locations map to the same resource, the transaction manager aborts the process. This would be detected by acquire_resource(). As a workaround to this limit the size of the resource array can be doubled until all variables are covered. A real-world implementation could also replace the array by a sparse look-up tree. That’s how it’s done in picotm.

For convenience, we can keep the load_int() and store_int() functions around. They are now mere wrappers around load() and store().

int
load_int(const int* addr)
{
    int value;
    load((uintptr_t)addr, &value, sizeof(value));
    return value;
}

void
store_int(int* addr, int value)
{
    store((uintptr_t)addr, &value, sizeof(value));
}

With helpers as simple as these, there can be plenty of type-specific functions that load and store float, long, complex double, or any other type.

Converting the Application

The old producer code looked like this.

struct int_resource g_int_resource[2];

static void
producer_func(void)
{
    unsigned int seed = 1;

    tm_save int i0 = 0;

    while (true) {

        sleep(1);

        ++i0;
        int i1 = rand_r(&seed);

        printf("Storing i0=%d, i1=%d\n", i0, i1);

        tm_begin

            store_int(g_int_resource + 0, i0);
            store_int(g_int_resource + 1, i1);

        tm_commit
    }
}

This translates into the following code.

static int g_i0;
static int g_i1;

static void
producer_func(void)
{
    unsigned int seed = 1;

    tm_save int i0 = 0;

    while (true) {

        sleep(1);

        ++i0;
        int i1 = rand_r(&seed);

        printf("Storing i0=%d, i1=%d\n", i0, i1);

        tm_begin

            store_int(&g_i0, i0);
            store_int(&g_i1, i1);

        tm_commit
    }
}

The global values of type struct int_resource are gone. Instead we work directly on integer types.

The consumer side looks similar.

static void
consumer_func(void)
{
    while (true) {

        sleep(1);

        int i0, i1;

        tm_begin

            i1 = load_int(&g_i1);
            i0 = load_int(&g_i0);

            verify_load(i0, i1);

        tm_commit

        printf("Loaded i0=%d, i1=%d\n", i0, i1);
    }
}

As in the previous post, we added a certain complexity to the transaction manager, but the application logic got a lot simpler. And just as before, we had to implement the transaction manager exactly once. Further application-specific changes are not required.

Summary

In this blog post, we removed the dedicated resource data structures from the interface of simpletm. The new interface loads and stores at arbitrary memory locations and all resource management is done internally.

We can load and store arbitrary types.
We can protect concurrent access to arbitrary memory locations.
Applications don’t have to know about dedicated data structures for resource management. The interface for transactional memory access uses memory addresses of the involved variables.
We can further optimize transactional memory access without interfering with application logic.

You’ll find the full source code for this blog post on GitHub. If you’re interested in a more sophisticated C transaction manager, take a look at picotm.

If you like this series about writing a transaction manager in C, please subscribe to the RSS feed, follow on Twitter or share on social networks.

Post by: Thomas Zimmermann