Thread: Memory pools, object lifetimes and strict aliasing

  1. #1
    Registered User
    Join Date
    May 2014
    Posts
    121

    Memory pools, object lifetimes and strict aliasing

    I've been doing some reading on how to implement a memory manager without breaking the strict aliasing rule and I'm getting very conflicting answers. I'm interested in what the standard says in both C and C++ when it comes to strict aliasing.

    This post on Stack Overflow does a good job of explaining the situation: Shared memory buffers in C++ without violating strict aliasing rules - Stack Overflow

    There's a follow-up discussion to that topic here: memory - C/C++ strict aliasing, object lifetime and modern compilers - Stack Overflow

    I can think of two ways to implement a memory pool.

    1) Global memory or on the stack:
    Code:
    unsigned char memory_pool[alignof(max_align_t) * 1000];
    2) Heap memory:
    Code:
    unsigned char* memory_pool = malloc(alignof(max_align_t) * 1000); // C
    Are you allowed to write (and then read) different objects (any type) to that unsigned char array without breaking the strict aliasing rule? Is there any difference between case 1) and 2)?
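    For reference, a minimal C11 sketch of the two cases, assuming `alignof` is what sizes the pool and `alignas` is what aligns the static array. The names POOL_SIZE, static_pool, make_heap_pool and is_max_aligned are made up for illustration. It only shows how to get suitably aligned storage; it does not answer the aliasing question.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas, alignof (C11) */
#include <stddef.h>    /* max_align_t */
#include <stdint.h>    /* uintptr_t */
#include <stdlib.h>    /* malloc */

/* Hypothetical pool size: 1000 units of the strictest fundamental alignment. */
#define POOL_SIZE (alignof(max_align_t) * 1000)

static_assert(POOL_SIZE % alignof(max_align_t) == 0,
              "pool size is a multiple of the alignment");

/* Case 1: static storage, explicitly aligned for any fundamental type. */
static alignas(max_align_t) unsigned char static_pool[POOL_SIZE];

/* Case 2: allocated storage; malloc returns memory aligned for any
 * fundamental type. May return NULL; the caller must check and free. */
unsigned char *make_heap_pool(void)
{
    return malloc(POOL_SIZE);
}

/* Returns nonzero if p is aligned for max_align_t on this implementation. */
int is_max_aligned(const void *p)
{
    return (uintptr_t)p % alignof(max_align_t) == 0;
}
```

    Without the `alignas`, a plain unsigned char array is only guaranteed an alignment of 1, so case 1 and case 2 already differ before aliasing comes into it.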

    The C++ FAQ entry on the placement new syntax seems to think that it's valid but I've read in many other places that it isn't (at least for C). Is this a case where C++ differs from C? Does it matter if the memory is dynamically allocated or not?

    The discussion about the lifetime of objects (second link from the top in this post) is interesting because it concerns how you would use a memory pool. The main question seems to be: Is this allowed (in C and C++)?

    Code:
    unsigned long* a = (unsigned long*)memory_pool;
    *a = 42;
    float* b = (float*)memory_pool;
    *b = 42; // Does this write end the lifetime of a and avoid the strict aliasing problem?
    This is obviously a really simple example. In a real example you'd probably use placement new and have more complex objects like classes. I realize that there are a lot of alignment problems that can show up but that's another discussion.
    Last edited by MOS-6581; 05-22-2014 at 03:07 PM. Reason: Used quote instead of code tags

  2. #2
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    Are you allowed to write (and then read) different objects (any type) to that unsigned char array without breaking the strict aliasing rule?
    O_o

    You can't really ever write and read different objects to the memory without violating some or other rule.

    The question is, why would you need to do that?

    You would, in practice, actually be writing and then possibly reading one object of a given type followed by writing and then possibly reading a different object of a given type.

    If your pool is referenced through `unsigned char', which the aliasing rules make an exception for, while enforcing proper alignment, you will be able to "return" different objects without knowing about the type.

    You'll notice that, in C, the return value of `malloc' is implicitly coerced into a different pointer type? Literally nothing in the standard prevents two `malloc' calls separated by a `free' from having the same value.

    Code:
    {
        int * s = malloc(sizeof(*s));
        free(s);
    }
    {
        float * s = malloc(sizeof(*s));
        free(s);
    }
    In practice, both `s' objects may have the same location in memory. If that failed because of reordered reads and writes (the point of the "strict aliasing" rules is to allow reasonably stable expectations regarding reordering across different environments), you would have a major bug: you couldn't safely write anything complicated without wasting a lot of memory.

    Soma
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  3. #3
    Registered User
    Join Date
    May 2014
    Posts
    121
    I basically want to have N number of memory blocks of a fixed size that I can use to implement my own memory manager for performance and reliability reasons. Let's say you have a game for example where you're constantly creating and destroying small objects. Having to call malloc/free every time you create and destroy an object like in your example would not be acceptable for performance reasons. There may also be platforms where you're not even allowed to use malloc/free so in that case you'd need a static memory pool that you can use for dynamic objects.
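    A rough C11 sketch of such a pool, with N fixed-size blocks in static storage and a free list threaded through the unused blocks. BLOCK_SIZE, BLOCK_COUNT, pool_init, pool_alloc and pool_free are hypothetical names and sizes. It only shows the bookkeeping; whether a client may then store arbitrary types into statically declared storage like this is the open question of the thread.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define BLOCK_SIZE  64    /* hypothetical payload size in bytes */
#define BLOCK_COUNT 128   /* hypothetical number of blocks (N)  */

/* While a block is free, its first bytes hold the free-list link. */
typedef union block {
    union block *next;
    alignas(max_align_t) unsigned char bytes[BLOCK_SIZE];
} block;

static block pool_storage[BLOCK_COUNT];
static block *free_list;

/* Push every block onto the free list; block 0 ends up at the head. */
void pool_init(void)
{
    free_list = NULL;
    for (size_t i = BLOCK_COUNT; i-- > 0; ) {
        pool_storage[i].next = free_list;
        free_list = &pool_storage[i];
    }
}

/* Pop the head of the free list, or return NULL if the pool is exhausted. */
void *pool_alloc(void)
{
    block *b = free_list;
    if (b == NULL)
        return NULL;
    free_list = b->next;
    return b->bytes;
}

/* Push a block back onto the head of the free list (constant time). */
void pool_free(void *p)
{
    if (p == NULL)
        return;
    block *b = p;
    assert(b >= pool_storage && b < pool_storage + BLOCK_COUNT);
    b->next = free_list;
    free_list = b;
}
```

    Both operations are a couple of pointer moves, and the block freed last is the one handed out next, so the same address is routinely reused for objects of different types.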

  4. #4
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    I googled strict aliasing and one simple explanation of the concept was : Strict aliasing is an assumption, made by the C (or C++) compiler, that dereferencing pointers to objects of different types will never refer to the same memory location (i.e. alias each other.)

    That sounds simple enough to implement.
    Code:
    const unsigned buffer_size = 1024;
    
    char *buffer = new char[buffer_size];
    size_t back = 0;
    
    int *x = new(buffer + back) int(4);
    back += sizeof(*x);
    
    float *f = new(buffer + back) float(1.234);
    back += sizeof(*f);
    I don't think those should overlap.
    x covers bytes 0, 1, 2 and 3; f covers 4, 5, 6 and 7 (assuming a 4-byte int and a 4-byte float).

    Unless I horribly misunderstood something and then subsequently need to do some research (I'm still learning too).
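    One thing the snippet above leaves out is alignment: advancing the offset by sizeof only works while every object happens to land on a suitable boundary. Below is a small C sketch of the offset arithmetic with rounding added. The names align_up, arena and arena_alloc are invented for illustration, and it assumes the base pointer is at least as strictly aligned as any requested alignment and that alignments are powers of two.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* A bump ("stack") allocator over caller-provided storage. */
typedef struct {
    unsigned char *base;  /* start of the buffer, assumed suitably aligned */
    size_t size;          /* total bytes in the buffer */
    size_t used;          /* bytes handed out so far, including padding */
} arena;

/* Returns n bytes aligned to align, or NULL if the arena is full. */
void *arena_alloc(arena *a, size_t n, size_t align)
{
    size_t start = align_up(a->used, align);
    if (start > a->size || n > a->size - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}
```

    With this, a 1-byte allocation followed by a 4-byte one aligned to 4 puts the second object at offset 4 rather than offset 1.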

    Edit : You could also add a way of keeping free blocks as well. You'd need the base of the stack, the top of it and the current holes in it, as well as the size of each hole.
    Last edited by MutantJohn; 05-22-2014 at 05:48 PM.

  5. #5
    Registered User
    Join Date
    May 2014
    Posts
    121
    The main problem is that I can't figure out if this is legal or not:
    Code:
    unsigned char memory_pool[sizeof(int) * 1000];
    int *p = (int *)memory_pool;
    *p = 42; // Aliasing violation in C and/or C++?
    The C++ FAQ entry on placement new seems to think that it's okay to do this if you use placement new: What is "placement new" and why would I use it?, C++ FAQ

    There's also a similar problem with dynamically allocated memory (C code below):
    Code:
    unsigned char *memory_pool = malloc(alignof(max_align_t) * 1000);
    int *a = (int *)memory_pool;
    *a = 42; // Is this okay?
    float *f = (float *)memory_pool;
    *f = 42; // What about this?
    This discussion about object lifetimes is intrinsically linked to these questions: memory - C/C++ strict aliasing, object lifetime and modern compilers - Stack Overflow

  6. #6
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    By its very nature it would then seem to break aliasing rules but who cares? The key is, don't break aliasing rules within the pool itself. Using a pool by its nature would seem to break strict aliasing but yolo.

  7. #7
    Registered User
    Join Date
    May 2014
    Posts
    121
    Quote Originally Posted by phantomotap
    You can't really ever write and read different objects to the memory without violating some or other rule.
    I wasn't talking about type punning but more about doing something like this:

    Write type A -> Do some operations on A
    Write type B -> Do some operations on B
    Write type C -> Do some operations on C

    A, B and C will all start at the same memory address here but only one type will be accessed at a time. It would basically be like a union except I wouldn't know beforehand what the type is of all the objects that could be stored there.

    Basically, are you allowed to change the effective type of the object (the part of the memory it occupies) by writing a new value (type) there? I found a similar discussion here but the answers were inconclusive: c - Strict aliasing rules for allocated objects - Stack Overflow
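    For comparison, this is roughly what the union version looks like in C when the set of types is known up front. It is only a sketch: slot, slot_kind and the accessor names are hypothetical, and int/float stand in for A and B. The tag records which member was written last so that only that member is ever read.

```c
#include <assert.h>

typedef enum { SLOT_EMPTY, SLOT_INT, SLOT_FLOAT } slot_kind;

typedef struct {
    slot_kind kind;                  /* which member of `as` is live */
    union { int i; float f; } as;    /* both members share one address */
} slot;

void slot_set_int(slot *s, int v)
{
    s->kind = SLOT_INT;
    s->as.i = v;
}

void slot_set_float(slot *s, float v)
{
    s->kind = SLOT_FLOAT;
    s->as.f = v;
}

/* Reads are only allowed through the member that was written last. */
int slot_get_int(const slot *s)
{
    assert(s->kind == SLOT_INT);
    return s->as.i;
}

float slot_get_float(const slot *s)
{
    assert(s->kind == SLOT_FLOAT);
    return s->as.f;
}
```

    A pool has to provide the same "one type at a time" discipline without being able to list the member types in advance.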

  8. #8
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    @MOS-6581:

    The code, as is, from post #5 does break the "Strict Aliasing" rule. The "Strict Aliasing" rule exists to allow the compiler to make assumptions about reads and writes. Specifically, the compiler is allowed to assume that pointers and references to "incompatible types" do not address the same memory location so that code may be ordered in such a way to benefit from whatever environment is under consideration.

    You have allocated some memory. The compiler has not stored a value at the memory address provided by the allocation. The compiler has not inserted an instruction for reading at the provided memory address. You have not "told" the compiler to read the provided memory address. You write a value to the memory address provided. You then write a different value to the memory address provided. Your code does indeed not read, "access", a stored value from an "incompatible type". However, the "Strict Aliasing" rule comes into play because the compiler may reorder the assignments. In other words, the "imaginary" next line which may read from either `a' or `f' exhibits undefined behavior because the compiler may have written instructions such that the integer value `42' or the floating-point value `42.0f' exists at the provided memory address. Again, the actual value at the provided memory address is undefined because the code violates the "Strict Aliasing" rule.

    As I implied before you ever posted such code, that code is irrelevant with respect to a custom allocator. You are not responsible for how a client uses your memory allocator. You can do nothing if the client fails to observe the "Strict Aliasing" rule. Your job is to write a memory allocator.

    Code:
    void * sPool = malloc(sizeof(int) + sizeof(float)); /* just being safe with the size */
    if(rand() % 2)
    {
        int * s = sPool;
        // use `*s' only as an `int'
        *s = 0;
        /* ... */
        printf("%d", *s);
    }
    else
    {
        float * s = sPool;
        // use `*s' only as a `float'
        *s = 0.0f;
        /* ... */
        printf("%f", *s);
    }
    Such code as this does not violate the "Strict Aliasing" rule. Yes, we do have two references to the allocated memory in both paths. However, we have only stored the address; we are not reading/writing to the same memory allocation with "incompatible types".

    Again, the "Strict Aliasing" rule comes into play with pointers and references because of the referenced memory as related to code attempting reads and writes with "incompatible types".

    You could have a million different pointers to the same location. If the code only ever reads the memory address as a specific object with a compatible type the code does not violate the "Strict Aliasing" rule.

    Of course, the code may violate other rules such as alignment related issues, and you will have to account for such other rules. Again, you are so near the forest your vision is full of nothing but bark. You need to step back from the specific issue of "Strict Aliasing" so that you may see how all the related rules, optimizations, and how compilers ignore some of those things interrelate.

    Yes. You are allowed to "construct" a new object at a specific memory address. Again, look at my code in post #2; the standard does not guarantee that the memory address referenced by both `s' variables is different. If you were not able to "construct" a different object at a specific memory address the code would exhibit "undefined behavior". Such code cannot exhibit undefined behavior. (Edit: Some ancient environments didn't check the `free' parameter for null which could cause "undefined behavior", but those environments are not really relevant.) The "Strict Aliasing" rule only comes into play when you've used "aliases", different names for "incompatible types", to read and write the same address with pointers and references.

    Code:
    void * sPool = malloc(sizeof(int) + sizeof(float)); /* just being safe with the size */
    int * s1 = sPool;
    float * s2 = sPool;
    Such code as this example does not violate the "Strict Aliasing" rule. The actual object, the "value" at the memory address, hasn't been read or written.

    Code:
    void * sPool = malloc(sizeof(int) + sizeof(float)); /* just being safe with the size */
    int * s1 = sPool;
    float * s2 = sPool;
    *s2 = 0.0f;
    if(*s1 == 0) /* attempt to check binary interpretation of `0.0f' for equivalence to binary 0 */
    {
        /* do something */
    }
    Now such code as this example violates the "Strict Aliasing" rule.
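    If the goal really is to look at the binary representation of a float, the usual way to do it without reading through an incompatible pointer is to copy the bytes with memcpy. A small C sketch follows; float_bits is a made-up helper name, and it assumes float is 32 bits wide.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

static_assert(sizeof(uint32_t) == sizeof(float),
              "sketch assumes a 32-bit float");

/* Copy the object representation of f into an integer of the same size.
 * No float object is ever read through an integer lvalue. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

    On an implementation using IEEE 754 single precision, float_bits(0.0f) is 0 and float_bits(1.0f) is 0x3F800000.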

    Soma

  9. #9
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    Quote Originally Posted by MOS-6581
    I basically want to have N number of memory blocks of a fixed size that I can use to implement my own memory manager for performance and reliability reasons. Let's say you have a game for example where you're constantly creating and destroying small objects. Having to call malloc/free every time you create and destroy an object like in your example would not be acceptable for performance reasons. There may also be platforms where you're not even allowed to use malloc/free so in that case you'd need a static memory pool that you can use for dynamic objects.
    The only platforms I've heard of where there's no dynamic allocation are embedded systems because there ain't no OS (or is this a kernel-specific attribute?). Is that what you're potentially writing for?

    Using a fixed size buffer is a common practice because yes, it allows for better performance with allocations. Well, technically constructions because this is C++.

    I feel like phantom already explained much of this but don't do silly things. If you need an int and a float, just use different spots in the buffer. If you no longer want the int, destruct the region and then set the pointer to null. Keep track of this free spot as well as the size of it. This allows you to create a buffer which supports multiple data types of variable size.

    Aside from that, I don't see the point in allocating an int and then just writing over it. Why would you even do that in the first place?

  10. #10
    Registered User
    Join Date
    May 2014
    Posts
    121
    I still don't understand strict aliasing to be honest. I understand your last example but everything else you wrote is hard to understand without context and raises more questions than it answers. Let me give another example that we can discuss:

    This is apparently a strict aliasing violation:
    Code:
    void *mem = malloc(sizeof(int) + sizeof(float));
    int *a = mem;
    *a = 42;
    float *b = mem;
    *b = 42;
    Now let us assume that malloc and free are implemented in such a way that a call to malloc will always return a pointer to the last freed memory. What magic does free do if the following is allowed but the above isn't?
    Code:
    void *mem = malloc(sizeof(int) + sizeof(float));
    int *a = mem;
    *a = 42;
    free(mem);
    mem = malloc(sizeof(int) + sizeof(float)); // Assume that mem has the same value as before
    float *b = mem;
    *b = 42;
    This is exactly what a custom memory allocator would do if you implement it as a linked list where a deallocation leads to the memory chunk being put back at the front of the list (where memory is allocated from). There's an example of such a memory allocator here: #AltDevBlog » Alternatives to malloc and new

    Quote Originally Posted by phantomotap
    Code:
    void * sPool = malloc(sizeof(int) + sizeof(float)); /* just being safe with the size */
    if(rand() % 2)
    {
        int * s = sPool;
        // use `*s' only as an `int'
        *s = 0;
        /* ... */
        printf("%d", *s);
    }
    else
    {
        float * s = sPool;
        // use `*s' only as a `float'
        *s = 0.0f;
        /* ... */
        printf("%f", *s);
    }
    Such code as this does not violate the "Strict Aliasing" rule. Yes, we do have two references to the allocated memory in both paths. However, we have only stored the address; we are not reading/writing to the same memory allocation with "incompatible types".
    The last sentence of that quote is very confusing to me, since the code does write to the same memory location using two different types, but the program is logically structured so that only one of the two paths can run (determined by rand). Would changing your code into an infinite loop that calls your code block make it a strict aliasing violation?

  11. #11
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by MOS-6581
    This is apparently a strict aliasing violation:
    Code:
    void *mem = malloc(sizeof(int) + sizeof(float));
    int *a = mem;
    *a = 42;
    float *b = mem;
    *b = 42;
    Yes, because of what is done with a and b (they are both aliasing the same thing) not because of the malloc() call.

    Which renders the rest of your discussion about magic in malloc() or free() - or in custom allocators - moot. They don't contribute to your example running afoul of the strict aliasing rule, and no magic in them can change that.
    Right 98% of the time, and don't care about the other 3%.

    If I seem grumpy or unhelpful in reply to you, or tell you you need to demonstrate more effort before you can expect help, it is likely you deserve it. Suck it up, Buttercup, and read this, this, and this before posting again.

  12. #12
    Registered User
    Join Date
    May 2014
    Posts
    121
    I do not follow your logic. It is certainly true that the first example violates the strict aliasing rules and that's because I'm dereferencing two pointers of different types that point to the same memory location. That's also exactly what I'm doing in the second example though.

  13. #13
    Registered User
    Join Date
    May 2014
    Posts
    121
    This is the strict aliasing rule from C99:
    An object shall have its stored value accessed only by an lvalue expression that has one of the following types
    — a type compatible with the effective type of the object,
    — a qualified version of a type compatible with the effective type of the object,
    — a type that is the signed or unsigned type corresponding to the effective type of the
    object,
    — a type that is the signed or unsigned type corresponding to a qualified version of the
    effective type of the object,
    — an aggregate or union type that includes one of the aforementioned types among its
    members (including, recursively, a member of a subaggregate or contained union), or
    — a character type.
    An object is defined as the following:
    region of data storage in the execution environment, the contents of which can represent values
    The effective type is defined as the following:
    The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
    Footnote:
    Allocated objects have no declared type.

    The reason why the second example, with the call to free followed by a call to malloc, does not violate the strict aliasing rule is that the two pointers are not referring to the same object even though they are referring to the same memory address. Objects have lifetimes, and calling free on an object that was allocated by malloc ends the lifetime of that object. So when b is dereferenced it is accessing a completely different object from the one the pointer a was accessing.
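    Read literally, the effective type paragraph quoted above also seems to cover allocated storage that is simply written again with a different type: the store through the new lvalue sets the effective type for that access and for the reads that follow it. The C sketch below shows that reading. It is one interpretation of the quoted text, not a settled answer for this thread, and reuse_allocated_storage is a made-up name.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if both values read back as written, 0 if malloc failed. */
int reuse_allocated_storage(void)
{
    size_t n = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *mem = malloc(n);          /* allocated storage: no declared type */
    if (mem == NULL)
        return 0;

    int *a = mem;
    *a = 42;                        /* store through an int lvalue */
    int seen_int = *a;              /* read back with the same type */
    assert(seen_int == 42);

    float *b = mem;
    *b = 42.0f;                     /* store through a float lvalue */
    float seen_float = *b;          /* read back with the same type */

    /* Reading *a at this point would access the stored float through an
     * int lvalue, which the access rule quoted above does not allow. */

    free(mem);
    return seen_float == 42.0f;
}
```

    Note that this applies to storage with no declared type; a declared unsigned char array keeps its declared type under the same paragraph.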

    This example does still not make sense to me:
    Quote Originally Posted by phantomotap
    Code:
    void * sPool = malloc(sizeof(int) + sizeof(float)); /* just being safe with the size */
    if(rand() % 2)
    {
        int * s = sPool;
        // use `*s' only as an `int'
        *s = 0;
        /* ... */
        printf("%d", *s);
    }
    else
    {
        float * s = sPool;
        // use `*s' only as a `float'
        *s = 0.0f;
        /* ... */
        printf("%f", *s);
    }
    Such code as this does not violate the "Strict Aliasing" rule. Yes, we do have two references to the allocated memory in both paths. However, we have only stored the address; we are not reading/writing to the same memory allocation with "incompatible types".
    The last sentence of that quote does not seem true to me. Both paths are by definition accessing the value of the object that s points to, and since it's the same object in both cases I don't see how this doesn't violate the strict aliasing rule, given that int and float are not compatible types.

  14. #14
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    That's because you're using the strict aliasing rule to reason about things it has no connection with.

    Dereferencing x after calling free(x) results in undefined behaviour. Period. That has nothing to do with the strict aliasing rule.

    Let's say we do this
    Code:
    int *a = malloc(sizeof(*a));    /* assume malloc() does not fail */
    int *b = a;
    *b = 42;     /* well defined */
    *a = 31;     /* well defined */
    free(a);
    *a = 1;      /* undefined behaviour */
    *b = 2;      /* undefined behaviour */
    a = malloc(sizeof(*a));     /* assume this also succeeds */
    *a = 3;      /* defined behaviour */
    *b = 4;      /* undefined behaviour */
    The last statement has undefined behaviour because it is dereferencing a pointer that has been freed. It doesn't matter if the second malloc() call returns the same value as the first. The last statement STILL has undefined behaviour.

    It might happen that the two malloc() calls return the same value. That doesn't make the last statement have defined behaviour. It also doesn't prevent your program behaving as you expect - even if your expectations are invalid. Such is undefined behaviour - sometimes the observed results can SEEM perfectly sensible and well defined.

  15. #15
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    The last sentence of that quote does not seem true to me. Both paths are by definition accessing the value of the object that s points to, and since it's the same object in both cases I don't see how this doesn't violate the strict aliasing rule, given that int and float are not compatible types.
    O_o

    That is because you do not know what you are talking about.

    The code paths do not access the same object. Both paths access the same memory location.

    The object is either an `int' OR a `float'; the object is never treated as both.

    One final time: stop reading only one single entry out of the entire standard.

    *shrug*

    Actually, do whatever you like; I'm done with you anyway. You seem desperate to read every aspect of the standard in isolation from other parts of the standard as well as how compiler vendors interpret the standard. That is never going to work, and I am tired of trying to convince you to do otherwise.

    Soma
