Help MutantJohn understand threads!

**MutantJohn** · 02-19-2015

Okay, so I realized that I've never really sat down to think about threads and what they are at a lower level, which is kind of strange considering how much I talk about CUDA and that I've multithreaded in C++ before.

Anyway, as I understand it, executables are a sequence of instructions for a processor. I'm cool with a processor being like a giant circuit that, given the proper series of relatively high and low charges, does something unique.

I think threads are OS specific. Or rather, it's a thing that's created as an abstraction by the OS to handle multiple executables on the same machine. I mean, could you imagine a computer that can only handle one executable at a time?

I've heard the word "context" used before and I think this ties in: an OS abstracts an executable as a thread, something it can track and schedule things around. This way, you'd be able to execute a memory read instruction, have the executable pause, and then some other executable's instruction gets processed while the original thread waits for the memory to come across the motherboard.

I guess that would be why people would say, "context" as each thread would be in its own little "universe", so to speak.

Or maybe I botched all of that because I just don't know, man. Was I right though?

**manasij7479** · 02-19-2015

You are talking about processes.
They exist in their own little universes.

A single process can have multiple threads, which generally share the address space.

(This distinction is not very boolean in Linux, as you can use the clone function to configure the behaviour of new processes/threads in a very granular manner.)
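To make the difference concrete, here is a minimal sketch, assuming a POSIX system with pthreads. The function names `counter_after_thread` and `counter_after_fork` are made up for this illustration. A thread increments the same global the caller sees, while a forked child process only changes its own copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;   /* one global, used by both experiments */

static void *bump(void *arg)
{
    (void)arg;
    counter += 1;          /* writes to the memory shared with the creator */
    return NULL;
}

/* Hypothetical helper: value of counter after a thread increments it. */
int counter_after_thread(void)
{
    pthread_t t;

    counter = 0;
    if (pthread_create(&t, NULL, bump, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return counter;        /* the thread's write is visible here */
}

/* Hypothetical helper: value of counter after a child process increments it. */
int counter_after_fork(void)
{
    pid_t pid;

    counter = 0;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter += 1;      /* changes only the child's private copy */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return counter;        /* the parent's copy is untouched */
}
```

Called from `main`, `counter_after_thread()` should return 1 and `counter_after_fork()` should return 0. On older toolchains the program may need to be built with `-pthread`.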

**Elkvis** · 02-19-2015

A context is basically the environment in which a given thread runs. It typically includes the stack, the registers, and anything else required to store the state of the thread. The OS CPU scheduler can save the state of a thread, restore the state from another, and run multiple processes or threads sequentially on a single core. On a multi-core system, the scheduler can run threads or processes concurrently on multiple cores.

This is often done with a combination of cooperative and preemptive scheduling. Cooperative scheduling may switch contexts any time a thread makes a system call. The state of the first thread is saved, and another thread is resumed, while the OS waits for I/O or whatever other long-running (relatively - this all happens in milliseconds, or even microseconds) task is being requested. Once the operation is done, the OS can resume the first thread, and that thread will see the results of the system call. Preemptive scheduling will forcibly interrupt a running thread, often because it is monopolizing CPU time, and switch to another thread. Most modern operating systems do a combination of the two. The kernel will enforce a timeout, so that once a process's time slice has expired, and it hasn't made a system call, it gets cut off. If the running thread makes a system call before the timeout happens, the timeout will be reset, upon resuming the thread.

**brewbuck** · 02-19-2015

The context is the set of register state and page tables that define the thread/process. The level of distinction between thread and process depends on the architecture and OS, as manasij7479 said.

"Context switching" is swapping out the CPU state for a different state, one that corresponds to some other thread. This is assisted by hardware to various extents.
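The idea of saving one execution state and resuming another can be tried out in user space. The sketch below uses the `ucontext` functions (`getcontext`, `makecontext`, `swapcontext`), assuming a platform that still ships them, such as glibc on Linux. This is cooperative switching inside one thread, not what the kernel scheduler literally does, and names like `run_context_demo` are hypothetical.

```c
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;   /* two saved register/stack states */
static char task_stack[64 * 1024];      /* separate stack for the task */
static int trace[4];
static int trace_len;

static void task(void)
{
    trace[trace_len++] = 2;
    swapcontext(&task_ctx, &main_ctx);  /* save task state, resume main */
    trace[trace_len++] = 4;
    /* returning resumes the context named in uc_link (main_ctx) */
}

/* Hypothetical demo: records the order in which the two contexts run. */
int run_context_demo(void)
{
    trace_len = 0;
    if (getcontext(&task_ctx) != 0)
        return -1;
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;
    makecontext(&task_ctx, task, 0);

    trace[trace_len++] = 1;
    swapcontext(&main_ctx, &task_ctx);  /* run task until it switches back */
    trace[trace_len++] = 3;
    swapcontext(&main_ctx, &task_ctx);  /* resume task where it left off */

    /* Encode the recorded order as a single number for easy checking. */
    return trace[0] * 1000 + trace[1] * 100 + trace[2] * 10 + trace[3];
}
```

The two contexts alternate, so the recorded order is 1, 2, 3, 4 and the function returns 1234. A kernel does the same kind of save and restore, but also swaps things a user program cannot touch, such as page tables when switching between processes.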

I've worked on a DSP before that was hardware threaded with a thread count of 3, and the threads interleaved with each other instruction-wise, so the processor executed ABCABCABCABC... whether you wanted things to be like that or not. In other words, there are always three threads running and you get equal timeslices for each of them, so you HAVE to design your algorithm to be 3-way threaded to get the most performance out of the processor. Writing code for that kind of... sucked.

**Elysia** · 02-19-2015

Well, a process is not necessarily a thread (but that may depend on the OS). In today's common operating systems, a thread is not a process, though. You can see a thread as a schedulable unit of execution. It has a stack, registers and other state associated with it, basically the context which others mention. A process typically contains multiple threads and is more heavyweight than a thread. But yeah, the point of threads is to ensure it "seems" as if multiple tasks are running concurrently.

Consider typing in a word processor. There's one thread formatting and drawing what you type. There's another that checks your spelling as you type. These tasks occur concurrently, so they execute in different threads and are often (but not necessarily) scheduled directly by the OS kernel. Of course, the OS is also allowed to schedule processes and give each thread in a process a time slice before switching to another process. Some operating systems (e.g. Windows, Linux) just schedule threads directly without caring which process they belong to.

Finally, the OS can pre-empt or halt threads, but it doesn't do so at a level as low as waiting for memory access. Waiting for data from the motherboard may block, as that's a system-level call (e.g. a call to the OS). Things that are handled by the process directly, such as memory accesses, are typically handled by the processor, which tries to hide the latency by executing other instructions or, if it runs out of them, stalls. But it usually doesn't notify the OS (it's much too slow!).

**Nominal Animal** · 02-19-2015

Originally Posted by MutantJohn

Anyway, as I understand it, executables are a sequence of instructions for a processor.

No, not quite. An executable (a binary, in the POSIX world) is much more than that. It also contains initialized data, linkage information (references to dynamic libraries, for example), and other non-executable-code stuff too.

A process is the logical container. It has specific privileges: user, group, and supplementary groups in Unix world, and in Linux, possibly extra capabilities (like whether the process is allowed to change privileges and so on). If any thread changes those, the changes take immediate effect in other threads too. All memory (including page tables and any other virtual memory mapping mechanisms), file descriptors, sockets, streams and so on belong to the process.

A thread is the logical entity used to execute code. All threads belong to a process. (In Linux, there are kernel threads too, but these are actually closer to kernel-created processes.) In POSIX systems, each thread has its own stack (but visible in the process memory space), thread-local variables (__thread; these are also visible in the process memory space, the address of the variable just depends on which thread is executing), CPU registers including the address of the current/next instruction to be executed, and the signal mask.

Signals are the POSIX way of interrupting currently executing code, by temporarily jumping to a specific function (a signal handler). Each signal has exactly one signal handler, process-wide, although the signal handler can check which thread was used to deliver the signal, and thus do thread-specific additional work. Due to this interruption, not all functions are safe in a signal handler: only those listed as async-signal safe are safe to use in a signal handler function.

Standard POSIX signals are not queued, and therefore it is easy to miss one. Fortunately, POSIX realtime signals are queued, and you can even piggy-back one pointer or integer of data along with the signal. The originator and the delivery mechanism details are available in a siginfo structure, if the signal handler was installed requesting it.
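As a rough illustration of a queued realtime signal carrying an integer, here is a sketch assuming a POSIX system that provides `sigqueue` and `SIGRTMIN`. The names `on_rt_signal` and `queue_to_self` are invented for the example. The handler only stores a value in a `sig_atomic_t`, which keeps it within what is safe to do in a signal handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t received = -1;

/* Three-argument handler, selected by SA_SIGINFO; info carries the payload. */
static void on_rt_signal(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    received = info->si_value.sival_int;
}

/* Hypothetical helper: queue SIGRTMIN to this process with a non-negative
 * payload and return what the handler saw. Uses sigprocmask, so it assumes
 * a single-threaded caller (threaded code would use pthread_sigmask). */
int queue_to_self(int payload)
{
    struct sigaction action;
    sigset_t blocked, previous, waiting;
    union sigval value;

    memset(&action, 0, sizeof action);
    action.sa_sigaction = on_rt_signal;
    action.sa_flags = SA_SIGINFO;        /* request the siginfo structure */
    sigemptyset(&action.sa_mask);
    if (sigaction(SIGRTMIN, &action, NULL) != 0)
        return -1;

    /* Block the signal first so delivery happens at a known point. */
    sigemptyset(&blocked);
    sigaddset(&blocked, SIGRTMIN);
    if (sigprocmask(SIG_BLOCK, &blocked, &previous) != 0)
        return -1;

    received = -1;
    value.sival_int = payload;
    if (sigqueue(getpid(), SIGRTMIN, value) != 0) {
        sigprocmask(SIG_SETMASK, &previous, NULL);
        return -1;
    }

    /* Atomically unblock and wait; returns once the handler has run. */
    waiting = previous;
    sigdelset(&waiting, SIGRTMIN);
    sigsuspend(&waiting);

    sigprocmask(SIG_SETMASK, &previous, NULL);
    return received;
}
```

For example, `queue_to_self(123)` should return 123, because the pending signal is delivered as soon as `sigsuspend` unblocks it.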

If one thread opens a new file or socket, or allocates more memory, it belongs to the process, not to the thread. Thus, it does not matter which thread closes or releases it. Thread handling primitives -- mutexes, rwlocks, barriers and so on -- are the only ones that care about exactly which thread is doing the operation. In POSIX, it is perfectly okay to take the address of a thread-local (__thread) variable; there are no virtual memory tricks involved, just a dedicated CPU register that points to the TLS storage for the currently executing thread.
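A small sketch of the thread-local behaviour described here, assuming GCC or Clang (which accept `__thread`) and pthreads. The names `tls_value`, `worker` and `tls_is_per_thread` are hypothetical. Each thread gets its own copy of the variable at its own address.

```c
#include <pthread.h>
#include <stdint.h>

/* __thread is a GCC/Clang extension; C11 spells it _Thread_local. */
static __thread int tls_value = 0;

struct report {
    int value;          /* what the worker stored in its own copy */
    uintptr_t address;  /* where the worker's copy lived */
};

static void *worker(void *arg)
{
    struct report *r = arg;

    tls_value = 42;                      /* touches only this thread's copy */
    r->value = tls_value;
    r->address = (uintptr_t)&tls_value;  /* taking the address is fine */
    return NULL;
}

/* Hypothetical check: returns 1 if the worker had a separate copy. */
int tls_is_per_thread(void)
{
    struct report r = {0, 0};
    pthread_t t;

    tls_value = 7;                       /* the calling thread's copy */
    if (pthread_create(&t, NULL, worker, &r) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;

    return tls_value == 7
        && r.value == 42
        && r.address != (uintptr_t)&tls_value;
}
```

The caller's copy keeps its value of 7 even though the worker wrote 42 to a variable with the same name, so `tls_is_per_thread()` should return 1.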

When the OS creates a new process, it sets up the necessary memory regions, and creates the initial thread. In POSIX systems, the only thing special about this initial thread is that its stack might grow automatically as needed; the stack size is fixed for all other threads.

(In Linux, the default thread stack size is rather large, something like 8 megabytes. This is much more than typically needed, and although it is just the virtual memory that is reserved, not actual RAM, this is what limits the number of usable threads at least on 32-bit architectures. The workaround is trivial: just create a set of thread attributes that defines a smaller stack size, and use it when creating the threads. It takes about ten lines of code, and the same attributes can be used for creating any number of threads and immediately destroyed; it's basically just a "note" attached to the thread creation request. Yet, most choose not to do this. Bah.)
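The attribute setup mentioned above might look like the sketch below, assuming pthreads. The 256 KiB figure, the cap of 16 threads and the names `SMALL_STACK`, `square` and `run_small_stack_threads` are arbitrary choices for the illustration, not recommendations.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

#define SMALL_STACK ((size_t)256 * 1024)  /* assumed to be enough for square() */
#define MAX_THREADS 16

static void *square(void *arg)
{
    int *n = arg;

    *n = *n * *n;
    return NULL;
}

/* Hypothetical helper: run up to MAX_THREADS threads on small stacks and
 * return how many finished with the right answer, or -1 on setup failure. */
int run_small_stack_threads(int count)
{
    pthread_attr_t attrs;
    pthread_t tid[MAX_THREADS];
    int value[MAX_THREADS];
    int i, started = 0, ok = 0;

    if (count > MAX_THREADS)
        count = MAX_THREADS;

    if (pthread_attr_init(&attrs) != 0)
        return -1;
    if (pthread_attr_setstacksize(&attrs, SMALL_STACK) != 0) {
        pthread_attr_destroy(&attrs);
        return -1;
    }

    /* The same attribute object is reused for every thread. */
    for (i = 0; i < count; i++) {
        value[i] = i;
        if (pthread_create(&tid[i], &attrs, square, &value[i]) != 0)
            break;
        started++;
    }

    /* Destroying the attributes does not affect threads already created. */
    pthread_attr_destroy(&attrs);

    for (i = 0; i < started; i++)
        if (pthread_join(tid[i], NULL) == 0 && value[i] == i * i)
            ok++;
    return ok;
}
```

A call such as `run_small_stack_threads(8)` should return 8. Threads that recurse deeply would need a larger value than the one assumed here.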

The process dies when the last thread exits; the initial thread is really not that special. All memory and file and socket descriptors are released at that point. Only the process identifier, and the identity (privileges) it had when the last thread exited, and some statistics that depend on the OS/kernel are saved. At this point, the dead process is called a zombie.

When some process checks to see if any of its children have exited, this is called reaping. When this happens, the dead/zombie process is reaped, and it no longer exists in any sense. Since only the parent process can reap child processes, processes are reparented to the init process (process ID 1) when their parent dies. The init process is responsible for reaping dead child processes, and if there is anything funky happening at this point, you're in trouble; this is one reason why many of us hate the SystemD design, adding all sorts of extra stuff to the init process.
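Reaping can be sketched with `fork` and `waitpid`, assuming a POSIX system. The function name `spawn_and_reap` is hypothetical. The child exits right away and stays a zombie until the parent collects its status. A second `waitpid` on the same PID then fails with `ECHILD`, because nothing is left to reap.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` (0..255),
 * reap it, and return the exit status the parent observed. */
int spawn_and_reap(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);               /* child: dies at once, becomes a zombie */

    /* Parent: this call reaps the zombie and collects its status. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status))
        return -1;

    /* After reaping, the child no longer exists as far as we can tell. */
    if (waitpid(pid, NULL, 0) != -1 || errno != ECHILD)
        return -1;

    return WEXITSTATUS(status);
}
```

For instance, `spawn_and_reap(7)` should return 7.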

Objections? Questions?

**Elysia** · 02-19-2015

Originally Posted by Nominal Animal

The workaround is trivial: just create a set of thread attributes that defines a smaller stack size, and use it when creating the threads. It takes about ten lines of code, and the same attributes can be used for creating any number of threads and immediately destroyed; it's basically just a "note" attached to the thread creation request. Yet, most choose not to do this. Bah.)

Yet doing so means you have to use platform-specific code instead of the standard facilities in C/C++ to create threads, so why sacrifice portability if not necessary?

Originally Posted by Nominal Animal

The process dies when the last thread exits; the initial thread is really not that special.

I'll also just quickly point out that while this may be true in Linux, this is not true in Windows. Each process is associated with a main thread. When the main thread exits, all other threads are terminated.

**Nominal Animal** · 02-19-2015

Originally Posted by Elysia

Yet doing so means you have to use platform-specific code instead of the standard facilities in C/C++ to create threads, so why sacrifice portability if not necessary?

Why do you insist on conflating two different languages? Just because C and C++ have common origins, they are two friggin' separate languages!

As far as I know, no C compiler and operating system yet implements C11 thread support, and C11 is the first C standard to define it at all. In other words, there is no "standard C thread facilities" available yet.

On the other hand, POSIX C thread support is quite stable, non-platform-specific, very well standardized, and supports the thread attributes just fine. It actually works just about everywhere except Windows. Also, the C11 thread support is implementable as simple wrappers around the corresponding POSIX thread functions -- for most functions, only a preprocessor macro is needed for the translation, they're that similar.
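To give a feel for how thin such a wrapper layer could be, here is a sketch, not taken from any real C11 implementation. All names are prefixed with `x_` and are hypothetical, so they cannot clash with a real `<threads.h>`. The mutex calls map through a macro, while thread creation needs a small trampoline because a C11 thread function returns `int` and a POSIX one returns `void *`.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for thrd_t, thrd_start_t, mtx_t and result codes. */
typedef pthread_t x_thrd_t;
typedef int (*x_thrd_start_t)(void *);
typedef pthread_mutex_t x_mtx_t;
enum { x_thrd_success = 0, x_thrd_error = 1 };

/* Mutex operations translate almost one to one. */
#define x_mtx_lock(m)   (pthread_mutex_lock(m) == 0 ? x_thrd_success : x_thrd_error)
#define x_mtx_unlock(m) (pthread_mutex_unlock(m) == 0 ? x_thrd_success : x_thrd_error)

struct x_start {
    x_thrd_start_t fn;
    void *arg;
};

/* Adapts an int-returning start function to the pthread signature. */
static void *x_trampoline(void *p)
{
    struct x_start s = *(struct x_start *)p;

    free(p);
    return (void *)(intptr_t)s.fn(s.arg);
}

int x_thrd_create(x_thrd_t *thr, x_thrd_start_t fn, void *arg)
{
    struct x_start *s = malloc(sizeof *s);

    if (s == NULL)
        return x_thrd_error;
    s->fn = fn;
    s->arg = arg;
    if (pthread_create(thr, NULL, x_trampoline, s) != 0) {
        free(s);
        return x_thrd_error;
    }
    return x_thrd_success;
}

int x_thrd_join(x_thrd_t thr, int *result)
{
    void *ret;

    if (pthread_join(thr, &ret) != 0)
        return x_thrd_error;
    if (result != NULL)
        *result = (int)(intptr_t)ret;
    return x_thrd_success;
}

/* Demo: run add_one in a wrapped thread and return its result. */
static int add_one(void *arg)
{
    return *(int *)arg + 1;
}

int x_demo(int n)
{
    x_thrd_t t;
    int result = -1;

    if (x_thrd_create(&t, add_one, &n) != x_thrd_success)
        return -1;
    if (x_thrd_join(t, &result) != x_thrd_success)
        return -1;
    return result;
}
```

With these definitions, `x_demo(41)` should return 42.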

Knowing Microsoft, I would be extremely surprised if they actually implement C11 thread support in their compiler. (It might require some kernel modifications, for one.)

As to my comment, "Yet, most choose not to do this. Bah.", I meant that most choose not to bother adding the seven lines or so (plus one line for each customized stack size per thread created) to fine-tune the stack size, when using POSIX threads. It's just better to set it within the program, since some threads may need very little stack, and others may need to do some recursive stuff and therefore can use a bigger stack. It's something that changing the default stack size for the entire process (ulimit -s kBytes) does not address.

Originally Posted by Elysia

I'll also just quickly point out that while this may be true in Linux, this is not true in Windows. Each process is associated with a main thread. When the main thread exits, all other threads are terminated.

It is true for all POSIX systems (ref), and also for C11 thread support, not just Linux. (C11 7.26.5.5p3 explicitly states "The program shall terminate normally after the last thread has been terminated.")
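The POSIX behaviour can be observed with a sketch like the following, assuming pthreads and a single-threaded caller (so that creating a thread in the forked child is safe). The names `late_writer` and `survives_initial_thread_exit` are made up. In the child process the initial thread calls `pthread_exit` while a second thread is still sleeping. The second thread later writes a byte to a pipe, and the process then ends with status 0. Note that this is about the initial thread exiting via `pthread_exit`; returning from `main` behaves like calling `exit` and ends the whole process.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static int pipe_fd[2];

static void *late_writer(void *arg)
{
    struct timespec delay = {0, 50 * 1000 * 1000};   /* 50 ms */

    (void)arg;
    nanosleep(&delay, NULL);            /* outlive the initial thread */
    if (write(pipe_fd[1], "x", 1) != 1)
        _exit(2);
    return NULL;                        /* last thread ends, process exits */
}

/* Hypothetical check: returns 1 if the child process kept running after
 * its initial thread exited, and then terminated with status 0. */
int survives_initial_thread_exit(void)
{
    char byte = 0;
    int status;
    ssize_t n;
    pid_t pid;

    if (pipe(pipe_fd) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_t t;

        close(pipe_fd[0]);
        if (pthread_create(&t, NULL, late_writer, NULL) != 0)
            _exit(1);
        pthread_exit(NULL);             /* only the initial thread exits */
    }

    close(pipe_fd[1]);
    n = read(pipe_fd[0], &byte, 1);     /* blocks until the late write */
    close(pipe_fd[0]);
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return n == 1 && byte == 'x'
        && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

On a POSIX system `survives_initial_thread_exit()` should return 1.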

Again, Windows is the odd system out. The rest of the operating systems follow pretty closely to the POSIX model, even when not POSIX compliant.

**Yarin** · 02-19-2015

Originally Posted by Nominal Animal

Again, Windows is the odd system out. The rest of the operating systems follow pretty closely to the POSIX model, even when not POSIX compliant.

Windows may be the odd one out by not following the POSIX model. But that only proves that the POSIX model is wrong.

**Codeplug** · 02-19-2015

>> It actually works just about everywhere except Windows.
Depends on which winpthreads lib you're using: http://sourceforge.net/p/mingw-w64/c...thread.c#l1465

gg

**gemera** · 02-19-2015

Originally Posted by Nominal Animal

Why do you insist on conflating two different languages? Just because C and C++ have common origins, they are two friggin' separate languages!

As far as I know, no C compiler and operating system yet implements C11 thread support, and C11 is the first C standard to define it at all. In other words, there is no "standard C thread facilities" available yet.

Pelles C supports C11 threads.

**Alpo** · 02-19-2015

I have a question (it might be stupid, I don't know).

How is the initial thread (or process) created on booting?

I mean at that point, it seems like the computer doesn't have enough self awareness to know what a thread context looks like, or what the format of an executable even is. So I was wondering how a thread could be created under these circumstances? Or am I thinking of it wrong, and such things as "threads" and "executables" only come later in booting after something puts it in a state where these things are possible?

**phantomotap** · 02-20-2015

How is the initial thread (or process) created on booting?

O_o

You are talking about bootstrapping.

The machine in question, to be brief, is configured to look at a specific device which needs to match certain requirements. The machine loads a specific chunk of data, which also includes instructions, from the specific device into a specific memory address. (We are talking a bootloader or similar.) The machine then initializes some state and jumps into executing the instructions at the specific memory address. The instructions executed are expected to know what to do next such as finding a more elaborate process to execute.

Soma

**rcgldr** · 02-20-2015

In the case of Windows (not sure about *nix versions), each process has its own virtual address space, while the threads of a process share a common virtual address space. Context means the state of a process or thread: the registers, stack, virtual address space, processor state, privilege level, and so on. A context switch between threads doesn't require switching the virtual address space, but a context switch between processes does require switching the virtual address space. I don't think Windows uses co-operative context switches; instead, processes or threads at the same priority level are cycled through on a time slice basis based on the ticker, which defaults to 64 Hz, but can be set to as fast as 1000 Hz (or slower than 64 Hz). Processes or threads at higher priority are preemptively switched to the moment their state changes from waiting to runnable. Processes or threads at lower priority are only run when all processes or threads at higher priority are in a wait (like sleep) state. As mentioned previously, on a multi-core processor, each core runs a process or thread in parallel.

**MutantJohn** · 02-20-2015

Interesting. Very interesting.

This was a good topic. Thank you, you guys.
