This thread developed quickly - everyone writes like Twitter, it seems (I'm just an old man complaining about the 4 or 5 posts that appeared while I was writing this).
If you have a library that isn't itself thread-safe, and you must use it, you can't use it with threads; you must resort to processes and IPC if required, or replace the library.
I've been writing threaded applications for years (OS/2, and Unix when it finally supported them - Unix was all about processes in its earlier versions, not threads). To the question - "I don't know if I'm using 'thread-safe' here properly, but I've definitely hit upon a problem of concurrency that deserves to have a name. So what is it?" - here's my take.
In all that time I've seen the lexicon change, and I'm not convinced it is as well defined as other regions of this science.
Concurrency does have a more settled definition - two or more things happening at once - but it doesn't really imply safety. In fact, it is an error to allow concurrent access to a common resource without some form of protection; that doesn't stop the concurrency, it's just a bug, like the one you've observed.
The fact is that the C library was designed at a time when Unix (the OS alongside which C grew up, as I read that history) did NOT have threads. It had processes, but not threads within a process - at least not in the 'standard' distributions I had seen. The library assumed 'ownership' such that functions like localtime and asctime return pointers to a static structure - a design assumption that may have been fine at the time, but an example of a lack of foresight (though that was, what, 40 years ago).
Indeed, you must select APIs carefully in threaded development because of this, and I submit one should prefer to avoid C library use as much as is practicable if you intend to use threads. That doesn't mean not to use it at all, but to employ a simple C++ axiom - encapsulate so you can control it.
In your example, foo and goo CAN (though likely won't) call asctime at the same moment. Obviously this could be a problem in some contexts, but since they're being called at the same moment, it may be of little consequence in reality. They'll gain access to the same string, and it will have 'about' the same time in it when they're done. The overwrite isn't dangerous - bytes in the character array will simply be replaced with the same bytes.
The problem comes in when the application code depends upon the char * it receives as the return. If it doesn't copy the data immediately, it could be that ANOTHER call is made which overwrites the time with a newer one, and that is a bug, though again, only as serious as the consequence to your application's functioning. It won't be a crash, I'm fairly certain.
The same thing can happen in standard C use in non-threaded work.
A better example of the problems related to thread safety involves resource ownership (like memory allocation) and, obviously, overwriting data (which extends to incrementing/decrementing/altering flags and the like outside the control of a lock).
Any function that operates entirely on locally scoped data is thread safe; there is no need to control access to the function itself. Such functions are also re-entrant.
Any member function that acts on member data (it probably shouldn't be a non-static member function if it doesn't - read Stroustrup on this point) will have to be considerate of thread safety in either its implementation or its use context.
Since there is no concept of threading in the definition of the C++ language itself (at least prior to C++11), all management of thread safety is a function of the discipline of the developer, combined with the minimal support provided by the operating system and the underlying libraries. For example, when you select the multi-threaded runtime library, that's a non-standard option implemented by whatever vendor provided the library, and dependent upon the operating system and CPU services that support it. As such, you can expect that C++ itself gives you limited leverage for enforcing thread safety, though important features like private data and private functions do help considerably.
Many take the view, for example, that one of the reasons we've been urged to make all data private, and then use get/set functions for access, is so we can protect them with a lock within the get/set function. This is a naive but effective approach, if performance isn't paramount and the complexity of the object or its use is limited.
Sometimes the concept of the boundary of thread safety is broader.
There may be times when you need to perform a series of operations, calling other related functions, as one logical operation - the boundary of thread safety is then the entire object. This simply means the object should be locked before the 'outer' function (the public interface to the world) is called; all of the functions IT calls should be interior to the implementation (private). As long as the subject of the lock is one object, you only need to consider the lock at the API exposure - all of the internal functions, which are not available to API-level code using the object, can run under the assumption that the object is already secured under lock.
If, on the other hand, you need to operate on a container of objects that is shared among threads, it is not the objects IN the container that must be locked, but the container itself. This can be complicated by the fact that, depending on the design, the objects in the container may also be shared OUTSIDE of the container. That is not problematic if those objects are themselves held by smart pointers (reference counted, like shared_ptr) and the relationship between the container and the objects it contains is one way - the objects don't 'talk' to their container.
Otherwise, the "standard" use of STL containers (that is, not smart pointers to objects, but just objects in the container) can be made safe simply by protecting the container. This can be one of the uses of an iterator, perhaps a locking iterator, wrapped around the standard STL iterator you're accustomed to.
Care should also be taken with regard to naive methods of "overprotection" of data. In the "private data/get-set" approach, where get and set lock at each access, there is the potential for deadlock. This can happen if two locks are required, one after the other, in the situation where one function calls another through a get or set.
The situation comes about like this. There are two resources to be accessed (they could be ints, strings, whatever). Resource A has its own mutex (or critical section); resource B has its own mutex.

Thread 1 accesses resource A, and now has a lock on it. It will soon require access to resource B, but will do so while the lock on A is still being held.

Thread 2 accesses resource B, and now has a lock on it. It will soon require access to resource A, but will do so while the lock on B is still being held.

These two threads will deadlock, because thread 1 will wait on B's release while thread 2 waits on A's release - forever.
For this reason, the very concept of locking should have an application-level policy so that the order of resource acquisition can be controlled. If thread 2 were forced to acquire the resources in the A-B order, instead of the B-A order, the deadlock would be avoided. If that's not practical, it must instead detect the potential deadlock by TRYING the lock on A (its second step), releasing B on failure, and then scheduling a retry - which isn't practical for application-level coding.
This implies that it is best (and I assure you the results are spectacular) to use or develop objects that help in the implementation of threaded development work. Before the STL and boost, smart pointers were not as well known, but I found them absolutely indispensable for threaded development.
A great amount of literature covers "lock free" approaches to threaded development. In some cases this is as simple as making certain you can't access a common piece of data from multiple threads. If, for example, you threaded a simple convolution on a bitmap to, say, change its contrast - you could hand 1/4 of the bitmap to each of 4 threads, knowing that while it IS a single image, no two threads will overlap by even 1 pixel. This needs only a simple synchronization at the end - so that whatever is to happen AFTER the contrast adjustment doesn't START until all of the threads finish their jobs (sometimes joinable threads are used, sometimes other techniques).
Another example of a lock-free (or, perhaps one might say pseudo-lock-free) approach is the use of atomic operations. Most modern CPUs offer a few assembler instructions guaranteed to complete without interruption under any circumstance. An example is a decrement with a compare against zero. If you're performing reference counting on a smart pointer, you could use a 'lock' like a critical section or mutex to protect the integer that counts the references, but that is much slower than an atomic decrement/increment instruction. Instead, the atomic decrement (when releasing) guarantees that the decrement and the comparison of the result to zero occur without the possibility of interruption, such that IF the result IS zero, you are known to be safe to delete the resource being managed. This is much faster than a lock/decrement/check/release combination. I call foul because there is actually a lock of sorts going on here - a refusal to allow the operation to be interrupted - but it's about as fast as an assembler decrement and a subsequent compare, and mutexes aren't required. It is a thread-safe technique, and it is used in the shared_ptr from boost and tr1. Some may not call this a lock-free approach, but I've found others that do.
Obviously the functions new/delete and malloc/free must be from a 'threadsafe' library if threaded development is to be possible at all. This happens to be the case when you select a multi-threaded library, but absolute disaster would result if you were able to launch and use threads without that.
The reason there are C library calls still available that are NOT carefully thread-safe or thread aware is simply that they are part of the old standards, and it would break convention (and thus old code) if they were altered to be remotely compliant with threaded development (as in asctime and the like).
One must simply either choose alternatives or encapsulate in such a way that you aren't even tempted to use an interface of such archaic design (one that violates the notion of threaded access to the function) from your application code. Just because it's there doesn't mean it's appropriate for application code to call it. It is part of professional discipline to insist on a coding standard that says so.