Thread: Thread safety for tiny simple functions

  1. #1
    Kiss the monkey. CodeMonkey's Avatar
    Join Date
    Sep 2001
    Posts
    937

    Thread safety for tiny simple functions

    Hello,
    I've found that dereferencing elements of an array can be done safely, in parallel, by different threads, as long as the threads are referring to different elements. That is,
    Code:
    //Thread 1....
    for(int i = 0; i < 100; ++i)
        array[i] = 3*i;
    //end Thread 1
    // . . . .
    //Thread 2 .....
    for(int i = 100; i < 200; ++i)
        array[i] = i % 3;
    //end Thread 2
    is safe. Am I guaranteed the same for the following?
    Code:
    struct array
    {
       type& operator()(unsigned int index) { return data[index]; }
       // . . .
    }; 
    
    array gyar;
    
    //Thread 1....
    for(int i = 0; i < 100; ++i)
        gyar(i) = 3*i;
    //end Thread 1
    // . . . .
    //Thread 2 .....
    for(int i = 100; i < 200; ++i)
        gyar(i) = i % 3;
    //end Thread 2
    If they're inlined then it's the same. What about if not? Is calling the same function ok? Will one have to wait? I'm looking for an answer in terms of standard behavior (if such a thing exists for threads).

    Thanks.
    "If you tell the truth, you don't have to remember anything"
    -Mark Twain

  2. #2
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    This isn't specifically covered in C/C++ standards (out of scope) but is addressed in multithreading specifications (eg posix).

    The practical rule is that, if any thread is modifying the data at a specific area of memory, your code needs to ensure that no other thread can access (read or modify) that same area of memory at the same time.

    Your code is doing that, as long as the ranges of i in the two loops do not overlap, the variables i do not co-exist (eg i is not a shared global), and as long as the data array does not get reallocated (eg resized) during the loop.

    Inlining is not likely to break things, if your code guarantees there is no concurrent access of any memory location. If you don't provide that guarantee, all bets are off.
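
    A minimal sketch of the rule above, assuming a C++11 or later compiler (std::thread post-dates this thread). The Array type and the fill_in_two_threads function are hypothetical stand-ins for the wrapper in post #1, not the original poster's actual class. Both threads call the same non-inlined-or-inlined operator(), but each touches a disjoint range of elements and the storage is never resized while they run:

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the wrapper in post #1.
struct Array {
    explicit Array(std::size_t n) : data(n) {}

    // Calling this from several threads is fine by itself;
    // what matters is which element each caller then writes.
    double& operator()(std::size_t index) {
        assert(index < data.size());
        return data[index];
    }

    std::vector<double> data;
};

// Fills [0,100) from one thread and [100,200) from another.
// No element is touched by both threads, so no lock is used.
Array fill_in_two_threads() {
    Array gyar(200);  // sized up front, never resized while threads run

    std::thread t1([&gyar] {
        for (std::size_t i = 0; i < 100; ++i) gyar(i) = 3 * i;
    });
    std::thread t2([&gyar] {
        for (std::size_t i = 100; i < 200; ++i) gyar(i) = i % 3;
    });

    t1.join();  // results may only be read after both joins
    t2.join();
    return gyar;
}
```

    Each loop variable i is local to its own thread, which covers the "i is not a shared global" condition. On most Linux toolchains this needs to be built with -pthread.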

  3. #3
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Function calls are irrelevant to thread safety. Only data access matters.

    Note that there are some peculiar rules there. Some architectures cannot access areas smaller than a machine word. For example, early DEC Alpha CPUs have no byte or 16-bit store instructions. If you write something smaller than a word, the generated code will load the enclosing word, replace the relevant part, and write the whole word again. In practice, this means that if two threads access adjacent values that lie within the same word, they may overwrite each other's changes.
    The x86 architecture can access single bytes, so you don't need to worry about that on a typical PC. More interesting is cache coherency. If I remember correctly, x86 is a cache-coherent architecture, which means that the architecture guarantees that multiple cores/CPUs have the same view of system memory. What one CPU writes, another reads, if the read is indeed later than the write. This makes life for the programmer a lot easier than on architectures without coherency, where a read might read the old value until the cache is flushed.
    However, this convenience comes at a cost. When a CPU writes a value, it must signal all other CPUs, and those must then discard their cached versions of the memory area and reload it. Thus, if you have multiple CPUs mucking about a single cache line, you may see a massive performance drop as each CPU's cache is constantly flushed.
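
    One common way to avoid that slowdown is to give each thread's data its own cache line. The following is only a sketch, assuming C++11 and a 64-byte cache line (typical for x86, but not guaranteed); PaddedCounter, kAssumedCacheLine and count_in_two_threads are names made up for this example:

```cpp
#include <cstddef>
#include <thread>

// Assumption: 64-byte cache lines. Check your target CPU.
constexpr std::size_t kAssumedCacheLine = 64;

// alignas pads and aligns each counter so that two counters
// never share a cache line.
struct alignas(kAssumedCacheLine) PaddedCounter {
    long value = 0;
};

// Each thread increments only its own counter, so there is no
// data race, and the padding keeps the two cores from fighting
// over the same cache line.
long count_in_two_threads(long iterations) {
    PaddedCounter counters[2];

    auto work = [&counters, iterations](int slot) {
        for (long i = 0; i < iterations; ++i) ++counters[slot].value;
    };

    std::thread t1(work, 0);
    std::thread t2(work, 1);
    t1.join();
    t2.join();

    return counters[0].value + counters[1].value;
}
```

    Without the alignas, the two long values would sit next to each other and the result would still be correct, only potentially slower. The padding trades memory for speed.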

    Just things to be aware of.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  4. #4
    Kiss the monkey. CodeMonkey's Avatar
    Join Date
    Sep 2001
    Posts
    937
    >> Function calls are irrelevant to thread safety. Only data access matters.
    Perfect. Thank you two for all of the useful information.

  5. #5
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Simultaneous writing to array elements is not guaranteed to be safe, even if the elements are at different addresses. This is colloquially known as "word tearing."

    Take for instance the SPARC architecture, where all memory access is done in 32-bit words. Writing to adjacent elements of a char array will probably involve accesses to the same underlying memory word, even though you are accessing different parts of that word. The read-alter-write cycle will cause data corruption if you don't properly protect these accesses.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  6. #6
    Kiss the monkey. CodeMonkey's Avatar
    Join Date
    Sep 2001
    Posts
    937
    Fortunately I will be dealing with doubles, which are generally 4 (*edit* 8) chars big, which is nice since that's two words for x86 and one for x64. So no problem, right?
    Last edited by CodeMonkey; 12-29-2008 at 10:13 PM.

  7. #7
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by CodeMonkey View Post
    Fortunately I will be dealing with doubles, which are generally 4 chars big, which is nice since that's two words for x86 and one for x64. So no problem, right?
    You can safely declare "no problem" for any particular platform, but not for all platforms in general. In practice, on your typical PC hardware with an array of doubles, I'd say "no problem."

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by CodeMonkey View Post
    Fortunately I will be dealing with doubles, which are generally 4 chars big, which is nice since that's two words for x86 and one for x64. So no problem, right?
    "doubles" are generally 8 chars big, but yes, on x86_(32, 64) it would work fine.

    However, as mentioned before, cache-coherency can cause processing inefficiency - in fact one of the major limitations to performance on LARGE Opteron systems is the fact that each processor must at all times be told about every memory access of every other processor - and if the memory access causes the cache to be emptied, then the next access of that area (64 or 128 bytes) of memory will mean a read of it.

    Also be aware that many floating point (double, float or long double) operations, as well as some integer operations, may be decomposed by the compiler into more than one instruction, e.g.

    Code:
    // Global variable. 
    double a;
       ...
       a *= 3.0;
    this code may come to:
    Code:
        fld    a
        fld    constant_3_0
        fmulp
        fstp  a
    Now, if another thread is ALSO modifying a, there is absolutely no way to tell whether a has or hasn't been multiplied by 3 when the other thread modifies it. Nothing prevents the other thread from doing its operation in between, after which the result of fmulp is stored to a and overwrites the other thread's change.
    (The above example applies to arrays too - I'm just making it REALLY simple)

    You must MAKE SURE that you are never operating on the same data in two threads. If you walk through the same array from two different threads, you WILL not be sure which thread has gone how far - it may work the first two, three or hundred times you try it out. Then something changes (you add one more if-condition, another variable, or something else) and it breaks the whole thing. Or you run on a different model of processor where some particular operation is quicker, or ... I can go on for almost forever with possible scenarios - there are so many variables. There are only two safe options:
    1. Your data is protected by some sort of lock (and locks are made safe by the fact that the processor has specific instructions that are safe in these places).
    2. Your data is known to be at clearly different locations.
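
    Option 1 can be sketched as follows, assuming C++11's std::mutex (which post-dates this thread; at the time you would use pthreads or the Win32 API instead). SharedValue and scale_from_two_threads are hypothetical names for this example. The lock makes the load, multiply and store of a *= factor indivisible as far as the other thread is concerned:

```cpp
#include <mutex>
#include <thread>

// A shared double together with the lock that guards it.
struct SharedValue {
    double a = 1.0;
    std::mutex lock;  // must be held for every read or write of a
};

// Two threads each perform `a *= factor` `times` times.
// Without the lock_guard, one thread's store could overwrite
// the other's update between its load and its store.
double scale_from_two_threads(double factor, int times) {
    SharedValue s;

    auto work = [&s, factor, times] {
        for (int i = 0; i < times; ++i) {
            std::lock_guard<std::mutex> guard(s.lock);
            s.a *= factor;  // load, multiply, store under the lock
        }
    };

    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();

    return s.a;  // safe to read: both threads have finished
}
```

    With factor 2.0 and times 5, every one of the ten multiplications takes effect, whichever thread runs first.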

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #9
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    If you don't use some type of lock in your code you stand the chance of introducing some very hard to find bugs.

    Best bet is to ensure it cannot happen by using a lock mechanism rather than finding when and where you can get away with not locking. You will eventually get caught and therefore good coding practice would be to follow the multi thread guidelines as they have been laid out by others in this thread.

  10. #10
    Registered User
    Join Date
    Oct 2007
    Posts
    166
    Using OpenMP, would this simple example be safe:

    Code:
    std::vector<int> arr(3000);
    #pragma omp parallel for
    for (int i = 0; i < 3000; i++)
    {
        arr[i] = i;
    }
    ...since only different indices will be written to at the same time?

    From the MSDN help about vector thread safety:

    If a single object is being written to by one thread, then all reads and writes to that object on the same or other threads must be protected. For example, given an object A, if thread 1 is writing to A, then thread 2 must be prevented from reading from or writing to A.
    ...what is the "object" here, is it the whole vector or each item in the vector?

    http://msdn.microsoft.com/en-us/libr...3b(VS.80).aspx

  11. #11

  12. #12
    Registered User
    Join Date
    Oct 2007
    Posts
    166
    Quote Originally Posted by Codeplug View Post
    The whole vector.

    gg
    Are you sure? Why does it work that way for arrays then and not vector? From what I have read people are saying that vector is thread safe since all functions are synchronized, something like that.

    I need a second opinion.

  13. #13
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    Quote Originally Posted by Codeplug View Post
    The whole vector.
    I agree, except in the case where it is a vector of pointers, because you're not manipulating the data *in* the vector, just what it points to. I use this strategy in some of my code. I know it's not safe to directly manipulate data stored in a vector, so I allocate memory on the heap for my objects and store a pointer to the object in the vector. I can then pass that pointer to a thread procedure and read/write that object all I want.

  14. #14
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    It's the whole vector because the documentation says so. In reality, depending on the implementation, the thread safety might be the same as a simple dynamic array, i.e. element accesses are independent as long as you don't do something to the whole array. (Note that various operations on the vector can affect everything.)
    However, this reality is implementation-dependent, and as long as the implementor doesn't guarantee independent element-wise access, you don't have it. MS talks of STL objects in their threading guarantees, so you don't have the guarantee for individual elements.

    However, the next standard guarantees that access to individual elements of any standard container does not conflict. See 23.1.2p2 in the current draft.
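
    Under that guarantee, the loop from post #10 can be split across threads by hand. This is a sketch only, assuming a compiler that implements the finished C++11 standard and using std::thread rather than OpenMP; fill_vector_in_halves is a made-up name. Note that the element-wise guarantee excludes vector<bool>, whose elements are packed into shared words:

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Writes arr[i] = i for all i, with the lower and upper halves
// filled by different threads. The vector is allocated before the
// threads start and is not resized while they run.
std::vector<int> fill_vector_in_halves(std::size_t n) {
    std::vector<int> arr(n);
    const std::size_t mid = n / 2;

    auto fill = [&arr](std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i) {
            arr[i] = static_cast<int>(i);  // distinct element per index
        }
    };

    std::thread lower(fill, std::size_t{0}, mid);
    std::thread upper(fill, mid, n);
    lower.join();
    upper.join();

    return arr;
}
```

    Anything that affects the whole container, such as push_back or resize, still has to be kept away from the worker threads or protected by a lock.
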

  15. #15
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> except in the case where it is a vector of pointers. because ...
    You're referring to the thread safety of the objects within the container - not the container itself. Using invalidated references to container objects is bad with or without threads.

    Since the C++03 standard doesn't address threading at all, any type of thread-safety within a standard library implementation has to be documented by the implementer. The above link to MSDN is the documentation for the MS implementation. It doesn't apply to anyone else's implementation. Here's STLport's documentation for example.

    >> Are you sure?
    The following thread safety rules apply to the [MS] Standard C++ Library:
    Container Classes and complex
    The container classes are vector, deque, list, queue, stack, priority_queue, valarray, map, hash_map, multimap, hash_multimap, set, hash_set, multiset, hash_multiset, basic_string, and bitset.

    A single object is thread safe for reading from multiple threads. For example, given an object A, it is safe to read A from thread 1 and from thread 2 simultaneously.

    If a single object is being written to by one thread, then all reads and writes to that object on the same or other threads must be protected. For example, given an object A, if thread 1 is writing to A, then thread 2 must be prevented from reading from or writing to A.

    It is safe to read and write to one instance of a type even if another thread is reading or writing to a different instance of the same type. For example, given objects A and B of the same type, it is safe if A is being written in thread 1 and B is being read in thread 2.
    I don't see where the confusion is. It's obvious to me that the "objects A and B" refer to one of the listed container classes or complex.

    gg
