Thread: To Volatile, or not to Volatile?

  1. #1
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262

    To Volatile, or not to Volatile?

    Hello everybody,

    I'm a bit confused by the keyword "volatile". My biggest issue is: everybody seems to be saying different things.

    I've written a class with two threads, one of which runs a loop for as long as a certain flag in the class is true. The regular thread sets this flag in the constructor and destructor of the class.
    The pseudo-code:
    Code:
    class SomeClass {
      public:
        SomeClass()
        : isRunning_(true)
        {
            createThread();
        }
    
        ~SomeClass()
        {
            isRunning_ = false;
            waitForThread();
        }
    
      protected:
        void threadFunction()
        {
            while(isRunning_) {
                ...
            }
        }
    
        ...
    
      private:
        [volatile??] bool isRunning_;
    };
    Now my question is: does isRunning_ need to be volatile? If not, I could imagine the compiler optimizing the "while(isRunning_)" bit to "if(isRunning_) while(true)", so I would say I need it. However, many sources on the internet say volatile is useless in any case for threading (and no, they're not just talking about making a technique to implement critical sections).

    So should I use volatile here? And why, or why not?


    Thanks in advance,
    Evoex

  2. #2
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    You should absolutely use `volatile' for variables that are updated from the great "elsewhere".

    A lot of compilers, at the highest optimization levels in any event, will still ignore `volatile'. However, the majority of compilers will cache a variable they "know" isn't updated anyway, so `volatile' should only ever help.

    It is true, `volatile' may not prevent this issue in all cases, but without `volatile' you can pretty much bank on having such issues.

    As for everyone saying different things, I can explain part of that. The `volatile' keyword is generally brought up by people trying to force a sequence point. (An example, as you suggested, would be a portable attempt at critical sections. My favorite example is "double checked locking" without serialization primitives, a problem `volatile' was once thought to solve.) Only a few compilers honor that ideal. The rest of the uncertainty comes from compilers and the standard, both of which are seemingly uncertain.

    You aren't trying to introduce a sequence point. You are only saying "Please mister compiler, read this variable every time!".

    By the way, this exact example (memory that might be modified by the great "elsewhere") is one of the few cases where `volatile' is useful. It has nothing to do with threads. If your code was using signal handlers capable of interrupting execution in a way that would write to `isRunning_' you would still need (and want) `volatile'.

    So, yep, `volatile' is perfectly useless for threading (outside of a few very specific cases for a few compilers); that doesn't have anything to do with you because you are doing it to prevent caching. Your use of threads is irrelevant.

    [Edit]
    Full disclosure: this is still not going to work the way you are expecting because your reads and writes aren't atomic simply because of `volatile'.

    In other words, you need `volatile' for the compiler; you also need a mechanism to guarantee atomicity for the processor.

    And some more: if you are using a primitive that is protected from compiler ordering shenanigans by the API level you would not need nor want `volatile'. You would use `volatile' here because you are using a simple integer. If you were using an atomic from a library (like standard POSIX implementations) that atomic would not need `volatile'.

    See? There isn't so much a confusing "Some say this; some say that!" situation as there is a confusing reality.
    [/Edit]

    Soma
    Last edited by phantomotap; 05-11-2012 at 10:12 AM.

  3. #3
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Thanks for your answer! It makes sense, now, though it makes a lot of articles on the internet make less sense...

    You're right that I need mutexes in accessing isRunning_, I do have that in my code, I just didn't bother showing that in the pseudocode.

  4. #4
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    I've edited that post like ten times.

    Might want to make sure you've read the latest version. ^_^

    Soma

  5. #5
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    There were some changes I hadn't read ;-). But what exactly do you mean by "an atomic"?

    Also, if I would have an object in the class, say:
    Code:
    std::queue<Task*> tasks_;
    Would this need to be volatile as well?


    Thanks!

  6. #6
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    Okay.

    Let's start with "I have no idea what you are doing with `isRunning_' so I assumed the code you posted was fairly realistic."

    I have no idea what you are doing with `isRunning_' so I assumed the code you posted was fairly realistic. ^_^

    You say you are protecting reads and writes to `isRunning_' (Curse you for making me type '_' at the end!) with a "mutex". If that "mutex" is a full mutually exclusive lock the variable `isRunning_' does not need to be `volatile' in the context you've used it in the original post. The lock protects writes from the great "elsewhere" while reading so can't be updated beyond that scope so the cached value may be safely assumed to be correct.

    I assumed, incorrectly as it happens, that the code was a realistic thing. I thought you were setting up a class that can "spinlock" itself with an atomic to protect some of its components, or some other similar concept. (This is a thing I've done.) In that case, the variable `isRunning_' would need to be `volatile' because the compiler mustn't cache the value because the value itself is being used to control, by atomic operations, the flow of threads.

    The distinction here, I'm sure it has been lost as I have a fever, is that in the first case the variable is part of state that is protected while in the second case the variable is part of the machine doing the protection.

    Would this need to be volatile as well?
    That's almost certainly unnecessary; it would probably pointlessly slow down anything that uses that variable.

    However, the important thing is, how is it used?

    Is it, somehow, part of the machinery you are building to protect state? Are you using this queue to control communication? If so, it would probably need to be `volatile' to keep the compiler from doing certain optimizations.

    Or is it, as is most likely, part of the state that needs to be protected? Are you using this queue as communication? If so, other facilities are protecting the queue so `volatile' isn't necessary.
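
    A minimal sketch of that second case, assuming C++11 is available. The names TaskQueue, push, tryPop and demoTaskQueue are made up for illustration, and plain int tasks stand in for the Task* from the question. The queue is ordinary state; the mutex is the facility doing the protecting, so nothing here is `volatile'.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical wrapper: the queue is plain state, the mutex protects it.
class TaskQueue {
  public:
    void push(int task)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push(task);
    }

    // Returns false when the queue is empty; otherwise pops into `out`.
    bool tryPop(int& out)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty())
            return false;
        out = tasks_.front();
        tasks_.pop();
        return true;
    }

  private:
    std::mutex mutex_;
    std::queue<int> tasks_;   // no volatile: every access holds mutex_
};

// Two producers push 1000 tasks each; returns how many can be popped afterwards.
inline int demoTaskQueue()
{
    TaskQueue queue;
    auto producer = [&queue] {
        for (int i = 0; i < 1000; ++i)
            queue.push(i);
    };
    std::thread a(producer);
    std::thread b(producer);
    a.join();
    b.join();

    int popped = 0;
    int task = 0;
    while (queue.tryPop(task))
        ++popped;
    return popped;
}
```

    Calling demoTaskQueue() should give back all 2000 tasks; the lock is what makes that reliable, not any qualifier on tasks_.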

    Soma

  7. #7
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    But what exactly do you mean by "an atomic"?
    I'm using "an atomic" in this case to mean any facility that guarantees reads and writes are sufficiently ordered to prevent "partial reads" or "partial writes".

    It could be an intrinsic (like the X86 assembler instructions) or a higher level facility wrapping a value (like the new `std::atomic<???>' templates).
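
    To make "an atomic" concrete, here is a small sketch using the C++11 `std::atomic<int>' template. The function name demoAtomicCounter is an assumption made for this example. Two threads bump one counter; because each fetch_add is an indivisible read-modify-write, no increment is lost.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Two threads each add `perThread` to one shared counter.
inline int demoAtomicCounter(int perThread)
{
    std::atomic<int> counter(0);
    auto work = [&counter, perThread] {
        for (int i = 0; i < perThread; ++i)
            counter.fetch_add(1);   // indivisible read-modify-write
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    return counter.load();
}
```

    With a plain int (even a `volatile' one) in place of the atomic, the two threads would race and the total could come up short.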

    Soma

  8. #8
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Quote Originally Posted by phantomotap
    You say you are protecting reads and writes to `isRunning_' (Curse you for making me type '_' at the end!) with a "mutex". If that "mutex" is a full mutually exclusive lock the variable `isRunning_' does not need to be `volatile' in the context you've used it in the original post. The lock protects writes from the great "elsewhere" while reading so can't be updated beyond that scope so the cached value may be safely assumed to be correct.
    I guess I didn't understand volatile as well as I thought, then (sorry for my confusion). The code more accurately looks like this (should've written this in the first place):
    Code:
    class SomeClass {
      public:
        SomeClass()
        : isRunning(true)
        {
            createThread();
        }
     
        ~SomeClass()
        {
            pthread_mutex_lock(...);
            isRunning = false;
            pthread_mutex_unlock(...);
            waitForThread();
        }
     
      protected:
        void threadFunction()
        {
            while(1) {
                pthread_mutex_lock(...);
                if(!isRunning)
                    break;
    
                pthread_mutex_unlock(...);
    
                ...
            }
        }
     
        ...
     
      private:
        [volatile??] bool isRunning;
    };
    (I know, I'm not unlocking the mutex, again, laziness, but I did remove the underscore for you ;-).

    However, I don't see how that would make a difference. As I understood it, the compiler could still recognize that the isRunning variable is not modified in the function and thus must always remain the same, allowing it to cache the value, or store it in a register, rather than reading it from memory every time it's accessed.

    So what's the error in my logic, or my understanding of volatile?

    Thanks!

    Evoex
    Last edited by EVOEx; 05-11-2012 at 12:44 PM.

  9. #9
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    So what's the error in my logic, or my understanding of volatile?
    ^_^

    I'm not sure that you have any logic error beyond trying to resolve fact (what `volatile' can usually do) with opinion (what `volatile' can't do).

    Try it this way: focus on the protection from optimizing compilers that `volatile' does usually provide (preventing certain caching) and ignore everything else that `volatile' may or may not provide (atomicity).

    For that code, you would want to use `volatile'.

    I'll try to think of a better way to explain the core issue later but for now read again the last bits of post six. True, "yes threads" or "no threads" is still irrelevant. The threads are simply a part of the great "elsewhere". That code is definitely a case for `isRunning' being `volatile' because it may be modified by the great "elsewhere" so the compiler mustn't cache the value.

    You are using a lock to serialize reads and writes to `isRunning'; that lock protects `isRunning' from the processors playing with ordering, cache, and values. However, `isRunning' is itself a part of the machinery used to control the flow of threads so it still needs `volatile' to protect it from the compiler playing with ordering, cache, and values. In this case, you are introducing flow control protecting `this'; the instance (`this') mustn't die until the thread ends and `isRunning' partially governs that behavior.

    Please don't get confused here, nothing of the reason and rationale has changed. You only have a different situation than what I had understood. The fact that you are using threads is still irrelevant. The issue here is, as I'm certain that you correctly understood, whether or not the compiler can make assumptions about ordering, cache, and values.

    All of that out of the way, this situation is a really bad use of `volatile' and full mutually exclusive locks. (I'm literally talking about the mechanism from the code you posted. If the mechanism from the real code you have is different this may not apply.) An atomic integer is the correct way to go for this mechanism. You would use the appropriate API every time for reading and writing the atomic integer, allowing you to remove `volatile' and the mutually exclusive lock. (In case you are wondering, every real API for atomic integers protects the variable from the compiler by forcing a sequence point beyond the view of the optimizer, usually by implementing the underlying operations directly in assembler.)
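
    A rough sketch of that mechanism, assuming a C++11 compiler. It uses `std::atomic<bool>' as the atomic and `std::thread' in place of the createThread()/waitForThread() calls from the posted code; Worker, stop(), iterations() and demoWorkerStops() are names invented for the example, not anything from the real code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

class Worker {
  public:
    Worker()
    : isRunning_(true), iterations_(0)
    {
        // Start the thread only after the flag has been initialised.
        thread_ = std::thread(&Worker::threadFunction, this);
    }

    ~Worker() { stop(); }

    void stop()
    {
        isRunning_.store(false);    // atomic write: no mutex, no volatile
        if (thread_.joinable())
            thread_.join();
    }

    long iterations() const { return iterations_.load(); }

  private:
    void threadFunction()
    {
        while (isRunning_.load()) { // atomic read on every pass
            iterations_.fetch_add(1);
            std::this_thread::yield();
        }
    }

    std::atomic<bool> isRunning_;
    std::atomic<long> iterations_;
    std::thread thread_;
};

// Starts a worker, waits until it has looped at least once, then stops it.
inline bool demoWorkerStops()
{
    Worker worker;
    while (worker.iterations() == 0)
        std::this_thread::yield();
    worker.stop();
    long after = worker.iterations();
    return after > 0 && worker.iterations() == after;
}
```

    The compiler may not hoist an atomic load out of the loop the way it could with a plain bool, which is the caching worry from the first post.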

    Soma

  10. #10
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    Honestly, if you're doing threading in C++, you should really be using boost::thread or, if your compiler supports C++11, std::thread. std::thread became available to G++ users with version 4.5 or 4.6; I'm not sure which exactly. If you're using a relatively recent release of most any Linux distribution, you should be able to use those features. The command-line option for G++ is -std=c++0x or -std=gnu++0x. The newer versions also include more C++-like methods for locking and unlocking mutexes, etc.
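
    As an illustration of those more C++-like locking facilities, here is a sketch of the mutex-based loop from post #8 using `std::mutex' and `std::lock_guard'. It assumes C++11; LockedWorker, iterations() and demoLockedWorker() are hypothetical names. Because the guard releases the lock in its destructor, the early `break' can no longer leave the mutex held.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

class LockedWorker {
  public:
    LockedWorker()
    : isRunning_(true), iterations_(0)
    {
        thread_ = std::thread(&LockedWorker::threadFunction, this);
    }

    ~LockedWorker()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            isRunning_ = false;
        }                       // unlocked here, before waiting for the thread
        thread_.join();
    }

    long iterations()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return iterations_;
    }

  private:
    void threadFunction()
    {
        for (;;) {
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (!isRunning_)
                    break;      // the guard unlocks on the way out
                ++iterations_;
            }
            std::this_thread::yield();
        }
    }

    std::mutex mutex_;
    bool isRunning_;            // plain bool: every access holds mutex_
    long iterations_;
    std::thread thread_;
};

// Runs a worker until it has looped at least once, then lets it shut down.
inline long demoLockedWorker()
{
    LockedWorker worker;
    while (worker.iterations() == 0)
        std::this_thread::yield();
    return worker.iterations();
}
```

    With G++ of that era this would be built with something like -std=c++0x -pthread.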

  11. #11
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    There are 2 simple rules for "when to use volatile":
    1) if accessing a global from within a signal handler, that global must be defined as "volatile sig_atomic_t".
    2) if reading or writing to memory mapped I/O (device driver type of stuff)

    If you don't fall under one of those, volatile isn't needed and adding it does nothing but hinder your compiler's optimizer.
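
    Rule 1 can be sketched like this. The names gotSignal, onSignal and demoSignalFlag are made up for the example, and SIGINT is an arbitrary choice of signal. The program raises the signal at itself so the handler runs synchronously.

```cpp
#include <cassert>
#include <csignal>

// A global written from a signal handler, declared as rule 1 requires.
volatile std::sig_atomic_t gotSignal = 0;

extern "C" void onSignal(int)
{
    gotSignal = 1;               // the handler does nothing but set the flag
}

// Installs the handler, raises SIGINT at ourselves, and reports the flag.
inline int demoSignalFlag()
{
    gotSignal = 0;
    std::signal(SIGINT, onSignal);
    std::raise(SIGINT);          // the handler runs before raise returns
    std::signal(SIGINT, SIG_DFL);
    return gotSignal;
}
```

    Here `volatile' tells the compiler that gotSignal can change behind the back of the normal control flow, and sig_atomic_t is the type that can be written without being interrupted halfway.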

    gg

  12. #12
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    If you don't fall under one of those, volatile isn't needed and adding it does nothing but hinder your compiler's optimizer.
    That is way too much of an oversimplification.

    I'm not saying it is wrong really; you are just attempting to eliminate thought in favor of a simple rule which is pretty much on the same side of silly as thinking `volatile' works as a memory fence.

    Actually, a memory fence offers a good example. If you are using a cheap form of access serialization to a normal integer (one that isn't atomic) that doesn't come with a full memory barrier, `volatile' is necessary for most compilers.

    It still has nothing to do with threading. It still may not be sufficient. It is still better to use atomic operations on the target variable or primitives with a memory fence. That's still too much of an oversimplification though; perhaps at least say "reach for a locking primitive with a memory fence" as well.
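
    A small sketch of "atomic operations on the target variable" acting as the fence, assuming C++11 `std::atomic'. The names demoPublish, payload, ready and seen are invented for this example. The release store and the acquire load pair up so that the ordinary, non-volatile payload is visible once the flag is seen.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One thread publishes a plain int through an atomic flag; the other reads it.
inline int demoPublish()
{
    int payload = 0;                    // ordinary, non-atomic, non-volatile
    std::atomic<bool> ready(false);
    int seen = -1;

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire))
            std::this_thread::yield();
        seen = payload;                 // ordered after the acquire load
    });

    payload = 42;
    ready.store(true, std::memory_order_release);   // publishes payload
    consumer.join();
    return seen;
}
```

    The ordering guarantee comes from the acquire/release pair; marking payload `volatile' would add nothing to it.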

    [Edit]
    This has already been seen so I will not remove anything.

    Still, if the above came off as rude I apologize. That was not my intention.
    [/Edit]

    Soma
    Last edited by phantomotap; 05-11-2012 at 03:44 PM.

  13. #13
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    @EVOEx: it's not normally a good idea to make anything other than primitives volatile. A "volatile std::queue" won't do what you think it does, because the queue variable will be volatile, but the majority of its data will be on the heap, and that will not be volatile. You could, I suppose, say "std::queue<volatile int>", but if you're doing that you really need to re-think your design.
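
    A related wrinkle, sketched under the assumption that the standard library's std::queue has no volatile-qualified member functions (none are specified, as far as I know): a `volatile std::queue' cannot even have size() called on it. The helper names make_void, CanCallSize and volatileQueueIsUsable are made up for this check.

```cpp
#include <cassert>
#include <queue>
#include <type_traits>
#include <utility>

// C++11-friendly stand-in for std::void_t.
template <class...> struct make_void { typedef void type; };

// Detects whether `q.size()` compiles for an lvalue of type Q.
template <class Q, class = void>
struct CanCallSize : std::false_type {};

template <class Q>
struct CanCallSize<Q, typename make_void<decltype(std::declval<Q&>().size())>::type>
    : std::true_type {};

static_assert(CanCallSize<std::queue<int> >::value,
              "a plain queue lets you call size()");
static_assert(!CanCallSize<volatile std::queue<int> >::value,
              "a volatile queue does not");

// Same answer at run time, for anyone who prefers a function call.
inline bool volatileQueueIsUsable()
{
    return CanCallSize<volatile std::queue<int> >::value;
}
```

    So besides not covering the heap data, the qualifier makes the object unusable without casting it away again.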

    Everyone seems to know this more or less but I thought I'd state it again: volatile is just meant to tell the compiler that the value of this variable could change without warning, outside of the normal program flow control. Hence, don't cache read-values for the variable because it could have changed. There are many situations in which this could arise, and certainly multithreaded flags are a common example.

    You do not need to say volatile if you're for example spinning on a variable (while(var)), but could change the variable in the loop. The compiler will notice this with definition-use chains and kill the value used by the while loop, and it won't do the wrong thing with e.g.
    Code:
    while(flag) {
        if(std::rand() % 100) flag = false;
    }
    Compiler optimizations are supposed to preserve the functionality of the program. But the compiler can only do this properly if it has full information, and one of the assumptions it makes is it can see changes to a variable within a function or whatever. Normally, this is perfectly fine. It's only when you're doing something outside of normal flow control that you may need to use volatile -- here "normal flow control" includes loops, conditionals, exceptions, function calls, etc. (Probably doesn't include longjmp though, because longjmp sucks.)

    Cheers, dwk.

  14. #14
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Okay, makes sense now... Thanks very much, guys!

  15. #15
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    Well, to satisfy my own curiosity about modern compilers, `volatile', and memory barriers I ran a bunch of tests last night while I was asleep.

    My testbed includes:

    nine modern compilers.
    compiled for 32 bit and 64 bit targets where possible.
    compiled for "GNU/Linux" and "Windows" where possible.
    compiled with and without optimizations where possible.
    poor code written assuming `volatile' does provide some atomic and ordering guarantees beyond reality.
    simple code written without `volatile' that uses locks with memory barriers.
    simple code written with `volatile' that uses locks with memory barriers.
    simple code written without `volatile' that uses locks that have no memory barrier.
    simple code written with `volatile' that uses locks that have no memory barrier.
    obfuscated code written without `volatile' that uses locks with memory barriers to try and force the compiler to do something "kinky".
    obfuscated code written with `volatile' that uses locks with memory barriers to try and force the compiler to do something "kinky".
    obfuscated code written without `volatile' that uses locks that have no memory barrier to try and force the compiler to do something "kinky".
    obfuscated code written with `volatile' that uses locks that have no memory barrier to try and force the compiler to do something "kinky".

    Unsurprisingly `volatile' didn't do anything it wasn't designed to do.
    Unsurprisingly `volatile' didn't prevent the optimizer from doing something "kinky".
    Unsurprisingly the lack of `volatile' didn't break code using locks with memory barriers.
    Unsurprisingly the lack of `volatile' didn't break obfuscated code in that it was also broken with `volatile'.

    Surprisingly, to me at least, only three tests presented a situation where `volatile' actually worked to prevent correct code using locks that have no memory barrier from breaking. (These tests were focused on whether or not `volatile' could prevent the compiler from caching an old value.) The majority of tests from this category were successful even without `volatile'.

    With my own evidence bashing me over the head I can't help but change my suggestion. You should not reach for `volatile' by default where a variable may be mutated beyond the view of the compiler by the great "elsewhere" as I asserted. You should simply not bother with `volatile' unless you've managed to prove that the lack of `volatile' is significant. It isn't so much that it can't help in a few very specific situations; it absolutely can and does. It just happens that these cases are likely to be extremely rare so reaching for something that can cripple optimizations by default is severely flawed.

    Soma
